Similar literature
1.
Multiple‐trait and random regression models have multiplied the number of equations needed for the estimation of variance components. To avoid inversion or decomposition of a large coefficient matrix, we propose estimation of variance components by Monte Carlo expectation maximization restricted maximum likelihood (MC EM REML) for multiple‐trait linear mixed models. Implementation is based on full‐model sampling for calculating the prediction error variances required for EM REML. Performance of the analytical and the MC EM REML algorithms was compared using a simulated and a field data set. For the field data, results from both algorithms corresponded well even with one MC sample within an MC EM REML round. The magnitude of the standard errors of estimated prediction error variances depended on the formula used to calculate them and on the MC sample size within an MC EM REML round. Sampling variation in MC EM REML did not impair the convergence behaviour of the solutions compared with analytical EM REML analysis. A convergence criterion that takes into account the sampling variation was developed to monitor convergence for the MC EM REML algorithm. For the field data set, MC EM REML proved far superior to analytical EM REML in both computing time and memory requirements.

2.
Simulated horse data were used to compare multivariate estimation of genetic parameters and prediction of breeding values (BV) for categorical, continuous and molecular genetic data using linear animal models via residual maximum likelihood (REML) and best linear unbiased prediction (BLUP) and mixed linear-threshold animal models via Gibbs sampling (GS). Simulation included additive genetic values, residuals and fixed effects for one continuous trait, liabilities of four binary traits, and quantitative trait locus (QTL) effects and genetic markers with different recombination rates and polymorphism information content for one of the liabilities. Analysed data sets differed in the number of animals with trait records and availability of genetic marker information. Consideration of genetic marker information in the model resulted in marked overestimation of the heritability of the QTL trait. If information on 10,000 or 5,000 animals was used, bias of heritabilities and additive genetic correlations was mostly smaller, correlation between true and predicted BV was always higher, and identification of genetically superior and inferior animals was (for the moderately heritable traits, in many cases) more reliable with GS than with REML/BLUP. If information on only 1,000 animals was used, neither GS nor REML/BLUP produced genetic parameter estimates with relative bias of at most 50% for all traits. Selection decisions for binary traits should be based on GS rather than on REML/BLUP breeding values.

3.
This data set consisted of over 29 245 field records from 24 herds of registered Nelore cattle born between 1980 and 1993, with calves sired by 657 sires and 12 151 dams. The records were collected in south‐eastern and midwestern Brazil and animals were raised on pasture in a tropical climate. Three growth traits were included in these analyses: 205‐ (W205), 365‐ (W365) and 550‐day (W550) weight. The linear model included fixed effects for contemporary groups (herd‐year‐season‐sex) and age of dam at calving. The model also included random effects for direct genetic, maternal genetic and maternal permanent environmental (MPE) contributions to observations. The analyses were conducted using single‐trait and multiple‐trait animal models. Variance and covariance components were estimated by restricted maximum likelihood (REML) using a derivative‐free algorithm (DFREML) for multiple traits (MTDFREML). Bayesian inference was obtained by a multiple trait Gibbs sampling algorithm (GS) for (co)variance component inference in animal models (MTGSAM). Three different sets of prior distributions for the (co)variance components were used: flat, symmetric, and sharp. The shape parameters (ν) were 0, 5 and 9, respectively. The results suggested that the shape of the prior distributions did not affect the estimates of (co)variance components. From the REML analyses, for all traits, direct heritabilities obtained from single trait analyses were smaller than those obtained from bivariate analyses and by the GS method. Estimates of genetic correlations between direct and maternal effects obtained using REML were positive but very low, indicating that genetic selection programs should consider both components jointly. GS produced similar but slightly higher estimates of genetic parameters than REML; however, the greater robustness of GS makes it the method of choice for many applications.

4.
Genetic evaluation using carcass field data in Japanese Black cattle has been carried out with an animal model, using restricted maximum likelihood (REML) estimation of additive genetic and residual variances. Because the official data sets are growing rapidly and therefore require more memory, an alternative approach to REML estimation could be useful. The purpose of this study was to investigate Gibbs sampling conditions for single-trait variance component estimation using the carcass field data. As prior distributions, uniform and normal distributions and independent scaled inverted chi-square distributions were used for macro-environmental effects, breeding values, and the variance components, respectively. Using data sets of different sizes, the influences of Gibbs chain length and thinning interval were investigated, after the burn-in period was determined using the coupling method. As would be expected, chain length had a clearly larger effect on the posterior means than thinning interval. The posterior means calculated using every 10th sample from 90 000 samples, after 10 000 samples were discarded as the burn-in period, were all considered to be reasonably comparable to the corresponding REML estimates.
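
The sampling scheme described in this abstract (discard 10 000 draws as burn-in, then keep every 10th of the remaining 90 000) can be sketched as follows. This is only an illustration: the function name `thin_chain` is hypothetical, and the "chain" is random noise standing in for real Gibbs draws of a variance component, not output from an animal model.

```python
import random


def thin_chain(chain, burn_in, thin):
    """Drop the first `burn_in` draws, then keep every `thin`-th remaining draw."""
    kept = chain[burn_in:]
    return kept[thin - 1::thin]


# Toy stand-in for a Gibbs chain (assumption: real draws would come from a sampler).
rng = random.Random(1)
chain = [rng.gauss(0.3, 0.05) for _ in range(100_000)]

retained = thin_chain(chain, burn_in=10_000, thin=10)
posterior_mean = sum(retained) / len(retained)

print(len(retained))  # 9000 draws contribute to the posterior mean
```

With these settings 9 000 draws remain, so thinning mainly reduces storage and autocorrelation, while the total chain length determines how much information is available.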

5.
We investigated the effects of different strategies for genotyping populations on variance components and heritabilities estimated with an animal model under restricted maximum likelihood (REML), genomic REML (GREML), and single‐step GREML (ssGREML). A population with 10 generations was simulated. Animals from the last one, two or three generations were genotyped with 45,116 SNP evenly distributed on 27 chromosomes. Animals to be genotyped were chosen randomly or based on EBV. Each scenario was replicated five times. A single trait was simulated with three heritability levels (low, moderate, high). Phenotypes were simulated for only females to mimic dairy sheep and also for both sexes to mimic meat sheep. Variance component estimates from genomic data and phenotypes for one or two generations were more biased than from three generations. Estimates in the scenario without selection were the most accurate across heritability levels and methods. When selection was present in the simulations, the best option was to use genotypes of randomly selected animals. For selective genotyping, heritabilities from GREML were more biased compared to those estimated by ssGREML, because ssGREML was less affected by selective or limited genotyping.

6.
Mixed model (co)variance component estimates by REML and Gibbs sampling for two traits were compared for base populations and control lines of Red Flour Beetle (Tribolium castaneum). Two base populations (1296 records in the first replication, 1292 in the second) were sampled from laboratory stock. Control lines were derived from corresponding base populations with random selection and mating for 16 generations. The REML estimate of each (co)variance component for both pupa weight and family size was compared with the mean and 95% central interval of the particular (co)variance estimated by Gibbs sampling with three different weights on the given priors: ‘flat’, smallest, and 3.7% degrees of belief. Results from Gibbs sampling showed that flat priors gave a wider and more skewed marginal posterior distribution than the other two weights on priors for all parameters. In contrast, the 3.7% degree of belief on priors provided reasonably narrow and symmetric marginal posterior distributions. Estimation by REML does not have the flexibility of changing the weight on prior information as does the Bayesian analysis implemented by Gibbs sampling. In general, the 95% central intervals from the three different weights on priors in the base populations were similar to those in control lines. Most REML estimates in base populations differed from REML estimates in control lines. Insufficient information from the data, and confounding of random effects contributed to the variability of REML estimates in base populations. Evidence is presented showing that some (co)variance components were estimated with less precision than others. Results also support the hypothesis that REML estimates were equivalent to the joint mode of posterior distribution obtained from a Bayesian analysis with flat priors, but only when there was sufficient information from data, and no confounding among random effects.

7.
Multivariate estimation of genetic parameters involving more than a handful of traits can be afflicted by problems arising through substantial sampling variation. We present a review of underlying causes and proposals to improve estimates, focusing on linear mixed model‐based estimation via restricted maximum likelihood (REML). Both full multivariate analyses and pooling of results from overlapping subsets of traits are considered. It is suggested to impose a penalty on the likelihood designed to reduce sampling variances at the expense of a little additional bias. Simulation results are discussed which demonstrate that this can yield REML estimates that are on average closer to the population values than their unpenalized counterparts. Suitable penalties can be obtained based on assumed prior distributions of selected parameters. Necessary choices of penalty functions and of the stringency of penalization are examined. We argue that scale‐free penalty functions lend themselves to a simple scheme imposing a mild, default penalty which can yield “better” estimates without being likely to incur detrimental effects.

8.
Conventional selective genotyping, which uses the extreme phenotypes (EP), was compared with alternative criteria for finding the most informative animals to genotype with respect to mapping quantitative trait loci (QTL). Alternative sampling strategies were based on minimizing the sampling error of the estimated QTL effect (MinERR) and maximizing the likelihood ratio test (MaxLRT) using both phenotypic and genotypic information. For comparison, animals were genotyped at random either within or across families. One hundred data sets were simulated, each with 30 half-sib families and 120 daughters per family. The strategies were compared in these data sets with respect to the estimated effect and position of a QTL within a previously defined genomic region when 10, 20 or 30% of the animals were genotyped. Combined linkage disequilibrium linkage analysis (LDLA) was applied in a variance component approach. Power to detect QTL was significantly higher for both MinERR and MaxLRT compared with EP and random genotyping methods (either across or within family), for all the proportions of genotyped animals. Power to detect significant QTL (alpha = 0.01) with 20% genotyping for MinERR and MaxLRT was 80 and 75% of that obtained with complete genotyping, compared with 70 and 38% for EP within and across families, respectively. With 30% genotyping, the powers were 78, 83, 78 and 58%, respectively. The estimated variance components were unbiased in EP strategies (within and across family) only when at least 30% was genotyped. To decrease the number of genotyped individuals, either MinERR or MaxLRT could be considered. With 20% genotyping in MinERR, the estimated QTL variance components did not differ significantly from those obtained with complete genotype information, but all studied strategies at 20% genotyping overestimated the QTL effect. Results showed that combining the phenotypic and genotypic information in selective genotyping (e.g. MinERR and MaxLRT) is better than using only the EPs, and the combined methods can be considered as alternative approaches to decrease genotyping costs, with unbiased QTL effects, decreased sampling variance of the QTL variance component and increased power of QTL detection.
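
As a rough illustration of the conventional EP strategy this abstract uses as its baseline, the sketch below picks the animals in the two phenotypic tails of one family. The function name `extreme_phenotypes`, the equal split between the low and high tails, and the toy phenotypes are all assumptions made for the example; the abstract does not state how the tails were balanced, and the MinERR and MaxLRT criteria are not shown.

```python
def extreme_phenotypes(phenotypes, fraction):
    """Return the ids of the animals in the two phenotypic tails.

    `phenotypes` maps animal id -> phenotype; `fraction` is the share of
    animals to genotype, split evenly between the low and high tails
    (the even split is an assumption of this sketch).
    """
    n_total = round(len(phenotypes) * fraction)
    n_low = n_total // 2
    n_high = n_total - n_low
    ranked = sorted(phenotypes, key=phenotypes.get)
    return set(ranked[:n_low]) | set(ranked[len(ranked) - n_high:])


# One half-sib family of 120 daughters with made-up phenotypes.
family = {animal: float(animal) for animal in range(120)}
chosen = extreme_phenotypes(family, fraction=0.20)
print(len(chosen))  # 24 animals: the 12 lowest and the 12 highest
```

Applying the function to each family separately corresponds to selection within families; applying it once to the pooled animals corresponds to selection across families.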

9.
Quantitative trait loci (QTL) analysis in designed experiments is investigated using a mixed model framework through the modification of segment mapping techniques. Allele effects are modelled in the F1 generation allowing the estimation of additive substitution effects while accounting for QTL segregation within lines and differences in mean QTL effects between lines. The resulting approach is called F1 segment mapping. Simulation is used to illustrate the method and its properties. F1 segment mapping has advantages over F2 segment mapping in the derivation of exact additive genetic covariances and in the computation time for variance component estimation.

10.
Monte Carlo (MC) methods have been found useful in estimation of variance parameters for large data and complex models with many variance components (VC), with respect to both computer memory and computing time. A disadvantage has been a fluctuation in round‐to‐round values of estimates that makes the assessment of convergence challenging. Furthermore, with Newton‐type algorithms, the approximate Hessian matrix might have sufficient accuracy, but the inaccuracy in the gradient vector exaggerates the round‐to‐round fluctuation to an intolerable level. In this study, the reuse of the same random numbers within each MC sample was used to remove the MC fluctuation. Simulated data with six VC parameters were analysed by four different MC REML methods: expectation‐maximization (EM), Newton–Raphson (NR), average information (AI) and Broyden's method (BM). In addition, field data with 96 VC parameters were analysed by MC EM REML. In all the analyses with reused samples, the MC fluctuations disappeared, but the final estimates by the MC REML methods differed from the analytically calculated values more than expected, especially when the number of MC samples was small. The difference depended on the random numbers generated, and based on repeated MC AI REML analyses, the VC estimates were on average unbiased. The advantage of reusing MC samples is more apparent in the NR‐type algorithms. Smooth convergence opens the possibility of using the fast‐converging Newton‐type algorithms. However, a disadvantage of reusing MC samples is a possible “bias” in the estimates. To attain acceptable accuracy, a sufficient number of MC samples needs to be generated.
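
The idea of reusing the same random numbers in every round can be illustrated with a toy Monte Carlo estimate. This is not the MC REML algorithm itself: the quantity estimated, E[(theta + z)^2] with z standard normal, and the names `mc_estimate` and `run_rounds` are assumptions chosen only to show the effect of fixing the seed.

```python
import random


def mc_estimate(theta, n_samples, seed):
    """Monte Carlo estimate of E[(theta + z)^2], z ~ N(0, 1); exact value is theta^2 + 1."""
    rng = random.Random(seed)
    return sum((theta + rng.gauss(0.0, 1.0)) ** 2 for _ in range(n_samples)) / n_samples


def run_rounds(theta, rounds, n_samples, reuse):
    """Evaluate the estimate once per round, with fresh or reused random numbers."""
    return [
        mc_estimate(theta, n_samples, seed=0 if reuse else round_no)
        for round_no in range(rounds)
    ]


fresh = run_rounds(theta=1.0, rounds=5, n_samples=20, reuse=False)
reused = run_rounds(theta=1.0, rounds=5, n_samples=20, reuse=True)

print(max(fresh) - min(fresh))    # non-zero: round-to-round MC fluctuation
print(max(reused) - min(reused))  # 0.0: fluctuation removed by reusing the draws
print(reused[0] - 2.0)            # fixed offset from the exact value, set by the seed
```

With reused draws the estimate becomes a deterministic function of theta, which is what makes smooth convergence possible; the price is the fixed, seed-dependent offset from the exact value, which shrinks only as the number of samples grows.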

11.
Gilmour, Thompson, and Cullis (Biometrics, 1995, 51, 1440) presented the average information residual maximum likelihood (REML) algorithm for efficient variance parameter estimation in the linear mixed model. That paper dealt specifically with traditional variance component models, but the algorithm was quickly applied to more general models and implemented in several REML packages including ASReml (Gilmour et al., Biometrics, 2015, 51, 1440). This paper outlines the theory with respect to these more general models, and describes the main issues encountered in fitting these models and how they have been addressed in the ASReml software. The issues covered are the basic steps in the implementation of the algorithm, keeping parameters within the parameter space, maximizing sparsity, avoiding issues associated with unstructured variance matrices by using the factor‐analytic structure and handling singularities in marker‐based relationship matrices and current work.

12.
The amount of variance captured in genetic estimations may depend on whether a pedigree‐based or genomic relationship matrix is used. The purpose of this study was to investigate the genetic variance as well as the variance of predicted genetic merits (PGM) using pedigree‐based or genomic relationship matrices in Brown Swiss cattle. We examined a range of traits in six populations amounting to 173 population‐trait combinations. A main aim was to determine how using different relationship matrices affects variance estimation. We calculated ratios between different types of estimates and analysed the impact of trait heritability and population size. The genetic variances estimated by REML using a genomic relationship matrix were always smaller than the variances that were similarly estimated using a pedigree‐based relationship matrix. The variances from the genomic relationship matrix became closer to estimates from a pedigree relationship matrix as heritability and population size increased. In contrast, variances of predicted genetic merits obtained using a genomic relationship matrix were mostly larger than variances of genetic merit predicted using a pedigree‐based relationship matrix. The ratio of the genomic to pedigree‐based PGM variances decreased as heritability and population size rose. The increased variance among predicted genetic merits is important for animal breeding because this is one of the factors influencing genetic progress.

13.
The multiple-trait derivative-free REML set of programs was written to handle partially missing data for multiple-trait analyses as well as single-trait models. Standard errors of genetic parameters were reported for univariate models and for multiple-trait analyses only when all traits were measured on animals with records. In addition to estimating (co)variance components for multiple-trait models with partially missing data, this paper shows how the multiple-trait derivative-free REML set of programs can also estimate SE by augmenting the data file when not all animals have all traits measured. Although the standard practice has been to eliminate records with partially missing data, that practice uses only a subset of the available data. In some situations, the elimination of partial records can result in elimination of all the records, such as one trait measured in one environment and a second trait measured in a different environment. An alternative approach requiring minor modifications of the original data and model was developed that provides estimates of the SE using an augmented data set that gives the same residual log likelihood as the original data for multiple-trait analyses when not all traits are measured. Because the same residual vector is used for the original data and the augmented data, the resulting REML estimators along with their sampling properties are identical for the original and augmented data, so that SE for estimates of genetic parameters can be calculated.

14.
The objective was to compare the performance of a recently derived, new method of estimating variances and covariances with any mixed linear model and any pattern of missing data with that of restricted maximum likelihood. For each of 96 combinations of six three-herd × four-sire unbalanced designs of 39 offspring each, four heritability values, two ratios of sire variance to interaction variance, and two distributions (multivariate normal and multivariate χ², 3 df), 15,000 vectors (n = 39) were generated. Least squares Lehmann-Scheffé (LSLS) estimators of sire variance, interaction variance, and heritability were compared to those of REML with the performance measures of percentage of estimates (of the 15,000) that were positive, mean square error, variance, percentage of estimates within ±50% of the parameter, bias, maximum value, skewness, and kurtosis. The LSLS method vastly outperformed REML in almost all 96 combinations. Averaged over the 48 combinations with multivariate normal data, the average percentage that REML estimators of heritability performed relative to those of LSLS for the first five of the above listed eight performance measures was -100%. The number of times LSLS was better than REML was 235 out of 240. The analogous values for the 48 combinations with multivariate χ², 3-df data were -90% and 230 out of 240. The REML maximum values were always larger than the LSLS values. The LSLS skewness and kurtosis values were about the same as those for REML, with the exception of LSLS heritability kurtosis values, which were notably less than those for REML. The explicit expectations of the LSLS estimators showed that the LSLS estimators were surprisingly unbiased given the paucity of data. Explicit coefficients for calculating mean square errors, variances, and biases squared of the LSLS estimators of the three variances were obtained for each design. The LSLS advantage was not quite so large with the multivariate χ², 3-df data as with the multivariate normal data. Results with a symmetric multinomial distribution were the same as with the multivariate normal. The overall result was that the LSLS estimators produced substantially more non-zero estimates than REML estimators and these more abundant positive estimates were substantially grouped closer to their respective parameters. Results justify efforts to make the LSLS procedure computationally available.

15.
Choosing families to sample for a quantitative trait locus mapping experiment is a critical component of experimental design because only heterozygous families contribute information to the analysis. Additive genetic variance of a paternal half-sib family can be partitioned into two parts: a variance component of maternal source that is constant across different families and a variance component of paternal source that is defined as an index of heterozygosity of a sire. This index is shown to be an upper limit of variance among marker genotypes of a half-sib family and can be used to identify highly heterozygous sires, thus improving the power of detecting QTL in detection studies. Simulated progeny phenotypic data were used to estimate sire's heterozygosity index via an ANOVA method, and accuracy of the estimation was evaluated with the correlation coefficient between the true and estimated index summarized both as the correlation and by the correct ranking of results as measured by the ratio of the true average heterozygosity index of experimentally selected parents to average heterozygosity of all sires. Positive but small correlation can be achieved in the estimation of a sire's heterozygosity when based on the daughters' phenotypic data, and accuracy was improved when progeny-tested sons were used to estimate their grandsire's heterozygosity index, depending on the genetic model of a trait and the size and structure of families.

16.
Estimation of genetic variance in populations under selection involves assumptions on base animals. Base animals are often considered unselected and it also has been proposed to treat selected base animals as fixed. The consequences of assumptions on base animals in the estimation of genetic variance in selected populations are not fully understood. Variance decompositions are introduced for simple designs to quantify the differences between models that treat base animals in different ways. Independent contrasts were constructed and REML estimates of variance components were compared for different designs and selection rules. The method shows how selection is accounted for in a complete model and why estimation of variance components can become biased when base animals are treated as fixed.

17.
The widespread use of the set of multiple-trait derivative-free REML programs for prediction of breeding values and estimation of variance components has led to significant improvement in traits of economic importance. The initial version of this software package, however, was generally limited to pedigree-based relationships. With continued advances in genomic research and the increased availability of genotyping, relationships based on molecular markers are obtainable and desirable. The addition of a new program to the set of multiple-trait derivative-free REML programs is described that allows users the flexibility to calculate relationships using standard pedigree files or an arbitrary relationship matrix based on genetic marker information. The strategy behind this modification and its design is described. An application is illustrated in a QTL association study for canine hip dysplasia.

18.
Bayesian estimation via Gibbs sampling, REML, and Method R were compared for their empirical sampling properties in estimating genetic parameters from data subject to parental selection using an infinitesimal animal model. Models with and without contemporary groups, random or nonrandom parental selection, two levels of heritability, and none or 15% randomly missing pedigree information were considered. Nonrandom parental selection caused similar effects on estimates of variance components from all three methods. When pedigree information was complete, REML and Bayesian estimation were not biased by nonrandom parental selection for models with or without contemporary groups. Method R estimates, however, were strongly biased by nonrandom parental selection when contemporary groups were in the model. The bias was empirically shown to be a consequence of not fully accounting for gametic phase disequilibrium in the subsamples. The joint effects of nonrandom parental selection and missing pedigree information caused estimates from all methods to be highly biased. Missing pedigree information did not cause biased estimates in random mating populations. Method R estimates usually had greater mean square errors than did REML and Bayesian estimates.

19.
The principle of interval mapping for quantitative trait loci (QTL) was originally developed for the analysis of single backcross data but it has been increasingly applied to more complicated experimental designs and data structures. It is important to study whether accounting for the heterogeneity of variance would improve the precision of QTL mapping based on data of multiple populations or families. This study compared homogeneous and heterogeneous maximum likelihood approaches for QTL mapping. The data consisted of 433 sons from six sire families with 69 microsatellite markers distributed over 12 chromosomes. The results of this study indicate that the heterogeneous approach generally produced a smaller residual variance and thus provided a better fit to the data than the homogeneous approach, meaning that the heterogeneous approach offers better precision in estimating both positions and effects of QTL. The results further showed that accounting for the heterogeneity of residual variance led to different statistical inferences from ignoring the heterogeneity of variance in QTL mapping. The heterogeneous approach is useful for QTL mapping based on the joint data of diverse reference populations or heteroscedastic data obtained from crossing animals with different genetic backgrounds.

20.
Summary: Restricted maximum likelihood (REML) was used to determine the choice of statistical model, additive genetic maternal and common litter effects and consequences of ignoring these effects on estimates of variance–covariance components under random and phenotypic selection in swine using computer simulation. Two closed herds of different size and two traits, (i) pre‐weaning average daily gain and (ii) litter size at birth, were considered. Three levels of additive direct and maternal genetic correlations (rdm) were assumed for each trait. Four mixed models (denoted as GRM1 through GRM4) were used to generate data sets. Model GRM1 included only additive direct genetic effects, GRM2 included only additive direct genetic and common litter effects, GRM3 included only additive direct and maternal genetic effects and GRM4 included all the random effects. Four mixed animal models (defined as EPM1 through EPM4) were defined for estimating genetic parameters similar to GRM. Data from each GRM were fitted with EPM1 through EPM4. The most biased estimates of additive genetic variance were obtained when EPM1 was fitted to data generated assuming the presence of either additive maternal genetic, common litter effects or a combination thereof. The bias of estimated additive direct genetic variance (VAd) increased and that of residual variance (VE) decreased with an increase in level of rdm when GRM3 was used. EPM1, EPM2 and EPM3 resulted in biased estimation of the direct genetic variances. EPM4 was the most accurate in each GRM. Phenotypic selection substantially increased bias of estimated additive direct genetic effect and its mean square error in trait 1, but decreased those in trait 2 when ignored in the statistical model. For trait 2, estimates under phenotypic selection were more biased than those under random selection. It was concluded that statistical models for estimating variance components should include all random effects considered to avoid bias.
