Similar literature
Found 20 similar articles; search took 351 ms.
1.
The purpose of this study was to examine the accuracy of genomic selection via single-step genomic BLUP (ssGBLUP) when the direct inverse of the genomic relationship matrix (G) is replaced by an approximation of G⁻¹ based on recursions for young genotyped animals conditioned on a subset of proven animals, termed the algorithm for proven and young animals (APY). With an efficient implementation, the cost of this algorithm is cubic in the number of proven animals and linear in the number of young animals. Ten replicate data sets mimicking a dairy cattle population were simulated. In a first scenario, genomic information for 20k genotyped bulls, divided into 7k proven and 13k young bulls, was generated for each replicate. In a second scenario, 5k genotyped cows with phenotypes were included in the analysis as young animals. Accuracies (average for the 10 replicates) in regular EBV were 0.72 and 0.34 for proven and young animals, respectively. When genomic information was included, they increased to 0.75 and 0.50. No differences between genomic EBV (GEBV) obtained with the regular G⁻¹ and the approximated G⁻¹ via the recursive method were observed. In the second scenario, accuracies in GEBV (0.76, 0.51 and 0.59 for proven bulls, young males and young females, respectively) were also higher than those in EBV (0.72, 0.35 and 0.49). Again, no differences between GEBV with regular G⁻¹ and with recursions were observed. With the recursive algorithm, the number of iterations to achieve convergence was reduced from 227 to 206 in the first scenario and from 232 to 209 in the second scenario. Cows can be treated as young animals in APY without reducing the accuracy. The proposed algorithm can be implemented to reduce computing costs and to overcome current limitations on the number of genotyped animals in the ssGBLUP method.
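The recursion behind APY can be illustrated with a small dense NumPy sketch. This is not the implementation used in the study: the function name `apy_inverse`, the toy genotypes, and the 5% identity added to keep G positive definite are assumptions made only for this example. It uses the commonly published form of the APY inverse, in which only the core (proven) block is inverted directly and each noncore (young) animal contributes a single diagonal residual term.

```python
import numpy as np

def apy_inverse(G, core):
    """APY approximation of inv(G); returned as a dense array for clarity."""
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)
    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])   # only the core block is inverted
    Gcn = G[np.ix_(core, noncore)]
    P = Gcn.T @ Gcc_inv                              # regression of noncore on core animals
    # One residual variance per noncore animal: g_ii - g_ic Gcc^-1 g_ci
    m = np.diag(G)[noncore] - np.einsum("ij,ji->i", P, Gcn)
    m_inv = 1.0 / m
    Ginv = np.zeros((n, n))
    Ginv[np.ix_(core, core)] = Gcc_inv + P.T @ (m_inv[:, None] * P)
    Ginv[np.ix_(core, noncore)] = -(P.T * m_inv)
    Ginv[np.ix_(noncore, core)] = -(m_inv[:, None] * P)
    Ginv[noncore, noncore] = m_inv                   # diagonal noncore block
    return Ginv

# Toy data (hypothetical): 30 animals, 200 SNP, first 10 animals used as core.
rng = np.random.default_rng(1)
Z = rng.integers(0, 3, size=(30, 200)).astype(float)
Z -= Z.mean(axis=0)
G = Z @ Z.T / 200 + 0.05 * np.eye(30)
Ginv_apy = apy_inverse(G, np.arange(10))
```

In this sketch the matrix implied by the approximation reproduces the core block, the core-by-noncore block, and the diagonal of G exactly; only the relationships among noncore animals are approximated.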

2.
Reliabilities for a multiple-trait maternal model were obtained by combining reliabilities obtained from single-trait models. Single-trait reliabilities were obtained using an approximation that supported models with additive and permanent environmental effects. For the direct effect, the maternal and permanent environmental variances were assigned to the residual. For the maternal effect, variance of the direct effect was assigned to the residual. Data included 10,550 birth weight, 11,819 weaning weight, and 3,617 postweaning gain records of Senepol cattle. Reliabilities were obtained by generalized inversion and by using single-trait and multiple-trait approximation methods. Some reliabilities obtained by inversion were negative because inbreeding was ignored in calculating the inverse of the relationship matrix. The multiple-trait approximation method reduced the bias of approximation when compared with the single-trait method. The correlations between reliabilities obtained by inversion and by multiple-trait procedures for the direct effect were 0.85 for birth weight, 0.94 for weaning weight, and 0.96 for postweaning gain. Correlations for maternal effects for birth weight and weaning weight were 0.96 to 0.98 for both approximations. Further improvements can be achieved by refining the single-trait procedures.

3.
The Algorithm for Proven and Young (APY) enables the implementation of single-step genomic BLUP (ssGBLUP) in large, genotyped populations by separating genotyped animals into core and non-core subsets and creating a computationally efficient inverse for the genomic relationship matrix (G). As APY became the choice for large-scale genomic evaluations in BLUP-based methods, a common question is how to choose the animals in the core subset. We compared several core definitions to answer this question. Simulations comprised a moderately heritable trait for 95,010 animals and 50,000 genotypes for animals across five generations. Genotypes consisted of 25,500 SNP distributed across 15 chromosomes. Genotyping errors and missing pedigree were also mimicked. Core animals were defined based on individual generations, equal representation across generations, and at random. For a sufficiently large core size, core definitions had the same accuracies and biases, even if the core animals had imperfect genotypes. When genotyped animals had unknown parents, accuracy and bias were significantly better (p ≤ .05) for random and across generation core definitions.

4.
We investigated the importance of SNP weighting in populations with 2,000 to 25,000 genotyped animals. Populations were simulated with two effective sizes (20 or 100) and three numbers of QTL (10, 50 or 500). Pedigree information was available for six generations; phenotypes were recorded for the four middle generations. Animals from the last three generations were genotyped for 45,000 SNP. Single-step genomic BLUP (ssGBLUP) and weighted ssGBLUP (WssGBLUP) were used to estimate genomic EBV using a genomic relationship matrix (G). The WssGBLUP performed better in small genotyped populations; however, any advantage for WssGBLUP was reduced or eliminated when more animals were genotyped. WssGBLUP had greater resolution for genome-wide association (GWA) as did increasing the number of genotyped animals. For few QTL, accuracy was greater for WssGBLUP than ssGBLUP; however, for many QTL, accuracy was the same for both methods. The largest genotyped set was used to assess the dimensionality of genomic information (number of effective SNP). The number of effective SNP was considerably less in weighted G than in unweighted G. Once the number of independent SNP is well represented in the genotyped population, the impact of SNP weighting becomes less important.
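To make the idea of a weighted G concrete, the sketch below builds a VanRaden-style relationship matrix with a diagonal matrix of SNP weights. It is only an illustration under stated assumptions: the function name `weighted_g` is hypothetical, the weights are taken as given (the iterative derivation of weights from SNP effects used in WssGBLUP is not shown), and rescaling the weights to average 1 is one possible convention, not necessarily the one used in the study.

```python
import numpy as np

def weighted_g(M, weights=None):
    """M: animals x SNP genotypes coded 0/1/2; weights: one non-negative value per SNP."""
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0                  # allele frequencies from the sample
    Z = M - 2.0 * p                           # center each SNP column
    if weights is None:
        weights = np.ones(M.shape[1])         # unweighted G
    d = np.asarray(weights, dtype=float)
    d = d * d.size / d.sum()                  # assumption: weights rescaled to average 1
    return (Z * d) @ Z.T / (2.0 * np.sum(p * (1.0 - p)))
```

With equal weights the function returns the ordinary unweighted G; concentrating the weights on a few SNP lets those markers dominate the relationships, which is consistent with the smaller number of effective SNP reported for the weighted matrix.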

5.
Joint Nordic (Denmark, Finland, Sweden) genetic evaluation of female fertility is currently based on the multiple-trait multilactation animal model (BLUP). Here, a single-step genomic model (ssGBLUP) was applied to the Nordic Red dairy cattle fertility evaluation. The 11 traits comprised nonreturn rate and days from first to last insemination in heifers and the first three parities, and days from calving to first insemination in the first three parities. Traits had low heritabilities (0.015–0.04), but moderately high genetic correlations between the parities (0.60–0.88). Phenotypic data included 4,226,715 animals with records, and the pedigree included 5,445,392 animals. Unknown parents were assigned to 332 phantom parent groups (PPG). In the mixed model equations, animals were associated with PPG effects through the pedigree or through both the pedigree and genomic information. Genotype information of 46,914 SNPs was available for 33,969 animals in the pedigree. When PPG used pedigree information only, BLUP converged after 2,420 iterations whereas the ssGBLUP evaluation needed over ten thousand iterations. When the PPG effects were solved accounting for both the pedigree and the genomic information, the ssGBLUP model converged after 2,406 iterations. Also, with the latter model, breeding values by ssGBLUP and BLUP became more consistent and genetic trends followed each other well. Models were validated using forward prediction of the young bulls. Reliabilities and variance inflation of predicted genomic breeding values (values for parent averages in brackets) for the 11 traits ranged 0.22–0.31 (0.10–0.27) and 0.81–0.95 (0.83–1.06), respectively. The ssGBLUP model always gave higher validation reliabilities than BLUP, but the largest increases were for the cow fertility traits.

6.
The number of genotyped animals has increased rapidly, creating computational challenges for genomic evaluation. In animal model BLUP, candidate animals without progeny and phenotype do not contribute information to the evaluation and can be discarded. In theory, genotyped candidate animals without progeny can bring information into single-step BLUP (ssGBLUP) and affect the estimation of other breeding values. We studied the effect of including or excluding genomic information of culled bull calves on genomic breeding values (GEBV) from ssGBLUP. In particular, GEBVs of genotyped bulls with daughters and GEBVs of young bulls selected into AI to be progeny tested (test bulls) were studied. The ssGBLUP evaluation was computed using the Nordic test day (TD) model and TD data for the Nordic Red Dairy Cattle. The results indicate that genomic information of culled bull calves does not affect the GEBVs of progeny tested reference animals, but if genotypes of the culled bulls are used in the TD ssGBLUP, the genetic trend in the test bulls is considerably higher compared to the situation when genomic information of the culled bull calves is excluded. It seems that by discarding genomic information of culled bull calves without progeny, upward bias of GEBVs of test bulls is reduced.

7.
Genomic information has a limited dimensionality (number of independent chromosome segments [Me]) related to the effective population size. Under the additive model, the persistence of genomic accuracies over generations should be high when the nongenomic information (pedigree and phenotypes) is equivalent to Me animals with high accuracy. The objective of this study was to evaluate the decay in accuracy over time and to compare the magnitude of decay with varying quantities of data and with traits of low and moderate heritability. The dataset included 161,897 phenotypic records for a growth trait (GT) and 27,669 phenotypic records for a fitness trait (FT) related to prolificacy in a population with dimensionality around 5,000. The pedigree included 404,979 animals from 2008 to 2020, of which 55,118 were genotyped. Two single-trait models were used with all ancestral data and sliding subsets of 3-, 2-, and 1-generation intervals. Single-step genomic best linear unbiased prediction (ssGBLUP) was used to compute genomic estimated breeding values (GEBV). Estimated accuracies were calculated by the linear regression (LR) method. The validation population consisted of single generations succeeding the training population and continued forward for all generations available. The average accuracy for the first generation after training with all ancestral data was 0.69 and 0.46 for GT and FT, respectively. The average decay in accuracy from the first generation after training to generation 9 was −0.13 and −0.19 for GT and FT, respectively. The persistence of accuracy improves with more data. Old data have a limited impact on the predictions for young animals for a trait with a large amount of information but a bigger impact for a trait with less information.

8.
Genomic selection has been adopted nationally and internationally in different livestock and plant species. However, understanding whether genomic selection has been effective or not is an essential question for both industry and academia. Once genomic evaluation started being used, estimation of breeding values with pedigree best linear unbiased prediction (BLUP) became biased because this method does not consider selection using genomic information. Hence, the effective starting point of genomic selection can be detected in two ways: by the divergence of genetic trends and by the divergence of realized Mendelian sampling (RMS) trends obtained with BLUP and single-step genomic BLUP (ssGBLUP). This study aimed to find the start date of genomic selection for a set of economically important traits in three livestock species by comparing trends obtained using BLUP and ssGBLUP. Three datasets were used for this purpose: 1) a pig dataset with 117k genotypes and 1.3M animals in the pedigree, 2) an Angus cattle dataset with ~842k genotypes and 11.5M animals in the pedigree, and 3) a purebred broiler chicken dataset with ~154k genotypes and 1.3M birds in the pedigree. The genetic trends for pigs diverged for the genotyped animals born in 2014 for average daily gain (ADG) and backfat (BF). In beef cattle, the trends started diverging in 2009 for weaning weight (WW) and in 2016 for postweaning gain (PWG), with little divergence for birth weight (BTW). In broiler chickens, the genetic trends estimated by ssGBLUP and BLUP diverged at breeding cycle 6 for two out of the three production traits. The RMS trends for the genotyped pigs diverged for animals born in 2014, more for ADG than for BF. In beef cattle, the RMS trends started diverging in 2009 for WW and in 2016 for PWG, with a trivial trend for BTW. In broiler chickens, the RMS trends from ssGBLUP and BLUP diverged strongly for two production traits at breeding cycle 6, with a slight divergence for another trait.
Divergence of the genetic trends from ssGBLUP and BLUP indicates the onset of genomic selection. The presence of trends for RMS indicates selective genotyping, with or without genomic selection. The onset of genomic selection and the genotyping strategies agree with industry practices across the three species. In summary, the effective start of genomic selection can be detected by the divergence between genetic and RMS trends from BLUP and ssGBLUP.

9.
Genomic prediction has become the new standard for genetic improvement programs, and currently, there is a desire to implement this technology for the evaluation of Angus cattle in Brazil. Thus, the main objective of this study was to assess the feasibility of evaluating young Brazilian Angus (BA) bulls and heifers for 12 routinely recorded traits using single-step genomic BLUP (ssGBLUP) with and without genotypes from American Angus (AA) sires. The second objective was to obtain estimates of effective population size (Ne) and linkage disequilibrium (LD) in the Brazilian Angus population. The dataset contained phenotypic information for up to 277,661 animals belonging to the Promebo breeding program, pedigree for 362,900, of which 1,386 were genotyped for 50k, 77k, and 150k single nucleotide polymorphism (SNP) panels. After imputation and quality control, 61,666 SNPs were available for the analyses. In addition, genotypes from 332 American Angus (AA) sires widely used in Brazil were retrieved from the AA Association database to be used for genomic predictions. Bivariate animal models were used to estimate variance components, traditional EBV, and genomic EBV (GEBV). Validation was carried out with the linear regression method (LR) using young-genotyped animals born between 2013 and 2015 without phenotypes in the reduced dataset and with records in the complete dataset. Validation animals were further split into progeny of BA and AA sires to evaluate if their progenies would benefit by including genotypes from AA sires. The Ne was 254 based on pedigree and 197 based on LD, and the average LD (±SD) and distance between adjacent single nucleotide polymorphisms (SNPs) across all chromosomes were 0.27 (±0.27) and 40743.68 bp, respectively. Prediction accuracies with ssGBLUP outperformed BLUP for all traits, improving accuracies by, on average, 16% for BA young bulls and heifers. 
The GEBV prediction accuracies ranged from 0.37 (total maternal for weaning weight and tick count) to 0.54 (yearling precocity) across all traits, and dispersion (LR coefficients) fluctuated between 0.92 and 1.06. Inclusion of genotyped sires from the AA improved GEBV accuracies by 2%, on average, compared to using only the BA reference population. Our study indicated that genomic information could help us to improve GEBV accuracies and hence genetic progress in the Brazilian Angus population. The inclusion of genotypes from American Angus sires heavily used in Brazil only marginally increased the GEBV accuracies for selection candidates.
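The LR validation statistics mentioned in this abstract compare predictions of the same validation animals from a reduced (partial) and a complete (whole) dataset. The sketch below follows the commonly published definitions of LR bias, dispersion, and accuracy; the function name `lr_statistics`, its argument names, and the toy numbers are hypothetical, and the additive variance and mean inbreeding are assumed to be supplied by the user rather than estimated here.

```python
import numpy as np

def lr_statistics(ebv_partial, ebv_whole, sigma2_u, mean_f=0.0):
    """LR statistics for validation animals.

    ebv_partial: predictions from the reduced dataset (validation records removed)
    ebv_whole:   predictions from the complete dataset
    sigma2_u:    additive genetic variance (assumed known)
    mean_f:      average inbreeding coefficient of the validation animals
    """
    p = np.asarray(ebv_partial, dtype=float)
    w = np.asarray(ebv_whole, dtype=float)
    cov_wp = np.cov(w, p)[0, 1]
    return {
        "bias": p.mean() - w.mean(),                       # expected value 0
        "dispersion": cov_wp / np.var(p, ddof=1),          # regression of whole on partial, expected 1
        "accuracy": np.sqrt(cov_wp / ((1.0 - mean_f) * sigma2_u)),
    }

# Toy usage with made-up numbers.
stats = lr_statistics([0.1, -0.2, 0.4, 0.0, -0.3],
                      [0.15, -0.25, 0.35, 0.05, -0.2],
                      sigma2_u=0.6)
```

A dispersion below 1 indicates inflated partial predictions, and a dispersion above 1 indicates deflated ones, which is how values such as 0.92 to 1.06 are usually read.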

10.
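The aim was to compare the genomic prediction performance of different methods for reproductive traits in Chinese Holstein cattle, and to select the best genomic prediction method and the best combination of weights on the information matrices (τ and ω) for use in practical breeding. This study used reproduction records from Holstein herds on 33 farms in the Beijing area from 1998 to 2020 and analyzed three important reproductive traits: interval from calving to first insemination (ICF), number of services for heifers (NSH), and number of services for cows (NSC), with 98,483 to 197,764 phenotypic records in total. Genotyping array data were also collected for 8,718 cows and 3,477 bulls, and the genotyped animals were divided, according to herd structure, into a bull validation population and a cow validation population. Genomic prediction for the three traits was then carried out with the AIREMLF90 and BLUPF90 modules of the BLUPF90 software using best linear unbiased prediction (BLUP), genomic best linear unbiased prediction (GBLUP), and single-step GBLUP (ssGBLUP); the predictive performance of the methods was evaluated by accuracy and unbiasedness. The results showed that all three reproductive traits had low heritability (0.03 to 0.08). In ssGBLUP, the choice of weights on the information matrices for each trait improved genomic prediction to some extent. The best weights for ICF, NSH, and NSC in the cow validation population were τ=1.3 and ω=0, τ=0.5 and ω=0.4, and τ=0.5 and ω=0, respectively; in the bull validation population the optimal weight combinations were τ=1.5 and ω=0, τ=1.3 and ω=0.8, and τ=0.5 and ω=0, respectively. With the best weights, the accuracy of ssGBLUP was 0.10 to 0.39 higher than that of BLUP and 0.08 to 0.15 higher than that of GBLUP, and its unbiasedness was closest to 1. In conclusion, ssGBLUP with the best weight combination gave genomic predictions with high accuracy and unbiasedness for all traits, and it is recommended as the genomic selection method for reproductive traits in Chinese Holstein cattle.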
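The role of τ and ω can be seen in the usual form of the single-step inverse, H⁻¹ = A⁻¹ + [0 0; 0 τG⁻¹ − ωA₂₂⁻¹], where A₂₂ is the pedigree relationship matrix of the genotyped animals. The following sketch assembles that matrix densely; the function name `h_inverse` and its arguments are hypothetical, any blending of G with A₂₂ is assumed to have been done beforehand, and the actual BLUPF90 implementation (sparse storage, additional scaling options) is not reproduced here.

```python
import numpy as np

def h_inverse(A_inv, G_inv, A22_inv, genotyped, tau=1.0, omega=1.0):
    """Dense single-step H^-1 with scaling factors tau and omega.

    A_inv:     inverse pedigree relationship matrix for all animals
    G_inv:     inverse genomic relationship matrix for genotyped animals
    A22_inv:   inverse of the pedigree relationships among genotyped animals
    genotyped: positions of the genotyped animals in A_inv (same order as G_inv)
    """
    H_inv = np.array(A_inv, dtype=float, copy=True)
    block = np.ix_(genotyped, genotyped)
    H_inv[block] += tau * np.asarray(G_inv) - omega * np.asarray(A22_inv)
    return H_inv
```

With τ = ω = 1 this is the standard ssGBLUP matrix. Setting ω = 0, as in several of the best combinations above, leaves the pedigree information for genotyped animals in place and adds the scaled genomic information on top of it.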

11.
Data of broiler chickens for 2 pure lines across 3 generations were used for genomic evaluation. A complete population (full data set; FDS) consisted of 183,784 and 164,246 broilers for the 2 lines. The genotyped subsets (SUB) consisted of 3,284 and 3,098 broilers with 57,636 SNP. Genotyped animals were preselected based on more than 20 traits with a different index applied to each line. Three traits were analyzed: BW at 6 wk (BW6), ultrasound measurement of breast meat (BM), and leg score (LS) coded 1 = no and 2 = yes for leg defect. Some phenotypes were missing for BM. The training population consisted of the first 2 generations including all animals in FDS or only genotyped animals in SUB. The validation data set contained only genotyped animals in the third generation. Genetic evaluations were performed using 3 approaches: 1) phenotypic BLUP, 2) extending BLUP methodologies to utilize pedigree and genomic information in a single step (ssGBLUP), and 3) Bayes A. Whereas BLUP and ssGBLUP utilized all phenotypic data, Bayes A could use only those of the genotyped subset. Heritabilities were 0.17 to 0.20 for BW6, 0.30 to 0.35 for BM, and 0.09 to 0.11 for LS. The average accuracies of the validation population with BLUP for BW6, BM, and LS were 0.46, 0.30, and <0 with SUB and 0.51, 0.34, and 0.28 with FDS. With ssGBLUP, those accuracies were 0.60, 0.34, and 0.06 with SUB and 0.61, 0.40, and 0.37 with FDS, respectively. With Bayes A, the accuracies were 0.60, 0.36, and 0.09 with SUB. With SUB, Bayes A and ssGBLUP had similar accuracies. For traits of high heritability, the accuracy of Bayes A/SUB and ssGBLUP/FDS were similar, and up to 50% better than BLUP/FDS. However, with low heritability, ssGBLUP/FDS was 4 to 6 times more accurate than Bayes A/SUB and 50% better than BLUP/FDS. An optimal genomic evaluation would be multi-trait and involve all traits and records on which selection is based.

12.
Genetic parameters for wool traits for Columbia, Polypay, Rambouillet, and Targhee breeds of sheep were estimated with single- and multiple-trait analyses using REML with animal models. Traits considered were fleece grade, fleece weight, and staple length. Total number of observations ranged from 11,673 to 34,746 for fleece grade and fleece weight and from 3,500 to 11,641 for staple length for the four breeds. For single-trait analyses, data were divided by age of ewe: young ages (age of 1 yr), middle ages (ages of 2 and 3 yr), and older ages (age greater than 3 yr). Heritability estimates averaged over breeds for fleece grade decreased from .42 at a young age to .37 for older ages. For fleece weight, heritability estimates averaged .52, .57, and .55 within the successively older groups. Heritability estimates for staple length averaged .54 for young and middle age classes. Few older ewes had staple length measurements. After single-trait analyses, new data sets were created for three-trait analyses with traits defined by three age classes when animals were measured. Heritability estimates with three-trait analyses, except for a few cases, were somewhat greater than those from single-trait analyses. For fleece grade, the genetic correlations averaged over breeds were .72 for young with middle, .42 for young with older, and .86 for middle with older age classes. For fleece weight, the average genetic correlations were .81, .83, and .98. For staple length, the average genetic correlation for young with middle age classes was .82. Estimates of genetic correlations across ages varied considerably among breeds. The average estimates of correlations suggest that fleece grade may need to be defined by age, especially for the Columbia and Rambouillet breeds. For fleece weight and staple length, however, the average correlations suggest no need to define those traits by age.

13.
Single-step genomic best linear unbiased prediction with the Algorithm for Proven and Young (APY) is a popular method for large-scale genomic evaluations. With the APY algorithm, animals are designated as core or noncore, and the computing resources to create the inverse of the genomic relationship matrix (GRM) are reduced by inverting only a portion of that matrix for core animals. However, using different core sets of the same size causes fluctuations in genomic estimated breeding values (GEBVs) up to one additive standard deviation without affecting prediction accuracy. About 2% of the variation in the GRM is noise. In the recursion formula for APY, the error term modeling the noise is different for every set of core animals, creating changes in breeding values. While average changes are small, and correlations between breeding values estimated with different core animals are close to 1.0, based on the normal distribution theory, outliers can be several times bigger than the average. Tests included commercial datasets from beef and dairy cattle and from pigs. Beyond a certain number of core animals, the prediction accuracy did not improve, but fluctuations decreased with more animals. Fluctuations were much smaller than the possible changes based on prediction error variance. GEBVs change over time even for animals with no new data because genomic relationships tie all the genotyped animals together, causing reranking of top animals. In contrast, changes in nongenomic models without new data are small. Also, GEBV can change due to details in the model, such as redefinition of contemporary groups or unknown parent groups. In particular, increasing the fraction of blending of the GRM with a pedigree relationship matrix from 5% to 20% caused changes in GEBV up to 0.45 SD, with a correlation of GEBV > 0.99.
Fluctuations in genomic predictions are part of genomic evaluation models and are also present without the APY algorithm when genomic evaluations are computed with updated data. The best approach to reduce the impact of fluctuations in genomic evaluations is to make selection decisions not on individual animals with limited individual accuracy but on groups of animals with high average accuracy.

14.
The objective of this study was to compare and determine the optimal validation method when comparing accuracy from single-step GBLUP (ssGBLUP) to traditional pedigree-based BLUP. Field data included six litter size traits. Simulated data included ten replicates designed to mimic the field data in order to determine the method that was closest to the true accuracy. Data were split into training and validation sets. The methods used were as follows: (i) theoretical accuracy derived from the prediction error variance (PEV) of the direct inverse (iLHS), (ii) approximated accuracies from the accf90(GS) program in the BLUPF90 family of programs (Approx), (iii) correlation between predictions and the single-step GEBVs from the full data set (GEBVFull), (iv) correlation between predictions and the corrected phenotypes of females from the full data set (Yc), (v) correlation from method iv divided by the square root of the heritability (Ych) and (vi) correlation between sire predictions and the average of their daughters' corrected phenotypes (Ycs). Accuracies from iLHS increased from 0.27 to 0.37 (37%) in the Large White. Approximation accuracies were very consistent and close in absolute value (0.41 to 0.43). Both iLHS and Approx were much less variable than the corrected phenotype methods (ranging from 0.04 to 0.27). On average, simulated data showed an increase in accuracy from 0.34 to 0.44 (29%) using ssGBLUP. Both iLHS and Ych approximated the increase well, 0.30 to 0.46 and 0.36 to 0.45, respectively. GEBVFull performed poorly in both data sets and is not recommended. Results suggest that for within-breed selection, theoretical accuracy using PEV was consistent and accurate. When direct inversion is infeasible to get the PEV, correlating predictions to the corrected phenotypes divided by the square root of heritability is adequate given a large enough validation data set.
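Methods (iv) and (v) above reduce to a correlation and a single rescaling, as the following sketch shows. The function name `predictive_accuracy` and the toy numbers are hypothetical; the corrected phenotypes and the heritability are assumed to come from a separate analysis of the full data set.

```python
import numpy as np

def predictive_accuracy(predictions, corrected_phenotypes, h2=None):
    """Correlation between predictions and corrected phenotypes (Yc).

    If h2 is given, the correlation is divided by sqrt(h2) (Ych), which puts it
    on the scale of a correlation with the true breeding value.
    """
    r = np.corrcoef(predictions, corrected_phenotypes)[0, 1]
    return r if h2 is None else r / np.sqrt(h2)

# Toy usage with made-up numbers.
yc = predictive_accuracy([1, 2, 3, 4], [1, 3, 2, 4])
ych = predictive_accuracy([1, 2, 3, 4], [1, 3, 2, 4], h2=0.64)
```

Because the corrected phenotype is a noisy measure of the breeding value, the rescaled statistic can exceed 1 in small validation sets, which fits the abstract's remark that a large enough validation data set is needed.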

15.
Breeding animals can be accurately evaluated using appropriate genomic prediction models, based on marker data and phenotype information. In this study, direct genomic values (DGV) were estimated for 16 traits of the Nordic Total Merit (NTM) Index in the Nordic Red cattle population using three models and two different response variables. The three models were as follows: a linear mixed model (GBLUP), a Bayesian variable selection model similar to BayesA (BayesA*) and a Bayesian least absolute shrinkage and selection operator model (Bayesian Lasso). The response variables were deregressed proofs (DRP) and conventional estimated breeding values (EBV). The reliability of genomic predictions was measured on bulls in the validation data set as the squared correlation between DGV and DRP divided by the reliability of DRP. Using DRP as response variable, the reliabilities of DGV among the 16 traits ranged from 0.151 to 0.569 (average 0.317) for GBLUP, from 0.152 to 0.576 (average 0.318) for BayesA* and from 0.150 to 0.570 (average 0.320) for Bayesian Lasso. Using EBV as response variable, the reliabilities ranged from 0.159 to 0.580 (average 0.322) for GBLUP, from 0.157 to 0.578 (average 0.319) for BayesA* and from 0.159 to 0.582 (average 0.325) for Bayesian Lasso. In summary, Bayesian Lasso performed slightly better than the other two models, and EBV performed slightly better than DRP as response variable, with regard to prediction reliability of DGV. However, these differences were not statistically significant. Moreover, using EBV as response variable would result in problems with the scale of the resulting DGV and potential problems due to double counting.

16.
Reference populations for genomic selection usually involve selected individuals, which may result in biased prediction of estimated genomic breeding values (GEBV). In a simulation study, bias and accuracy of GEBV were explored for various genetic models with individuals selectively genotyped in a typical nucleus breeding program. We compared the performance of three existing methods, that is, Best Linear Unbiased Prediction of breeding values using pedigree-based relationships (PBLUP), genomic relationships for genotyped animals only (GBLUP) and a Single-Step approach (SSGBLUP) using both. For a scenario with no-selection and random mating (RR), prediction was unbiased. However, lower accuracy and bias were observed for scenarios with selection and random mating (SR) or selection and positive assortative mating (SA). As expected, bias disappeared when all individuals were genotyped and used in GBLUP. SSGBLUP showed higher accuracy compared to GBLUP, and bias of prediction was negligible with SR. However, PBLUP and SSGBLUP still showed bias in SA due to high inbreeding. SSGBLUP and PBLUP were unbiased provided that inbreeding was accounted for in the relationship matrices. Selective genotyping based on extreme phenotypic contrasts increased the prediction accuracy, but prediction was biased when using GBLUP. SSGBLUP could correct the biasedness while gaining higher accuracy than GBLUP. In a typical animal breeding program, where it is too expensive to genotype all animals, it would be appropriate to genotype phenotypically contrasting selection candidates and use a Single-Step approach to obtain accurate and unbiased prediction of GEBV.

17.
Genomic selection (GS) is now practiced successfully across many species. However, many questions remain, such as long-term effects, estimations of genomic parameters, robustness of genome-wide association study (GWAS) with small and large datasets, and stability of genomic predictions. This study summarizes presentations from the authors at the 2020 American Society of Animal Science (ASAS) symposium. The focus of many studies until now is on linkage disequilibrium between two loci. Ignoring higher-level equilibrium may lead to phantom dominance and epistasis. The Bulmer effect leads to a reduction of the additive variance; however, the selection for increased recombination rate can release anew genetic variance. With genomic information, estimates of genetic parameters may be biased by genomic preselection, but costs of estimation can increase drastically due to the dense form of the genomic information. To make the computation of estimates feasible, genotypes could be retained only for the most important animals, and methods of estimation should use algorithms that can recognize dense blocks in sparse matrices. GWASs using small genomic datasets frequently find many marker-trait associations, whereas studies using much bigger datasets find only a few. Most of the current tools use very simple models for GWAS, possibly causing artifacts. These models are adequate for large datasets where pseudo-phenotypes such as deregressed proofs indirectly account for important effects for traits of interest. Artifacts arising in GWAS with small datasets can be minimized by using data from all animals (whether genotyped or not), realistic models, and methods that account for population structure. Recent developments permit the computation of P-values from genomic best linear unbiased prediction (GBLUP), where models can be arbitrarily complex but restricted to genotyped animals only, and single-step GBLUP that also uses phenotypes from ungenotyped animals. 
Stability was an important part of nongenomic evaluations, where genetic predictions were stable in the absence of new data even with low prediction accuracies. Unfortunately, genomic evaluations for such animals change because all animals with genotypes are connected. A top-ranked animal can easily drop in the next evaluation, causing a crisis of confidence in genomic evaluations. While correlations between consecutive genomic evaluations are high, outliers can have differences as high as 1 SD. A solution to fluctuating genomic evaluations is to base selection decisions on groups of animals. Although many issues in GS have been solved, many new issues that require additional research continue to surface.

18.
Effect of different genomic relationship matrices on accuracy and scale (cited by: 1; self-citations: 0; citations by others: 1)
Phenotypic data on BW and breast meat area were available on up to 287,614 broilers. A total of 4,113 birds were genotyped for 57,636 SNP. Data were analyzed by a single-step genomic BLUP (ssGBLUP), which accounts for all phenotypic, pedigree, and genomic information. The genomic relationship matrix (G) in ssGBLUP was constructed using either equal (0.5; GEq) or current (GC) allele frequencies, and with all SNP or with SNP with minor allele frequencies (MAF) below multiple thresholds (0.1, 0.2, 0.3, and 0.4) ignored. Additionally, a pedigree-based relationship matrix for genotyped birds (A₂₂) was available. The matrices and their inverses were compared with regard to average diagonal (AvgD) and off-diagonal (AvgOff) elements. In A₂₂, AvgD was 1.004 and AvgOff was 0.014. In GEq, both averages decreased with the increasing thresholds for MAF, with AvgD decreasing from 1.373 to 1.020 and AvgOff decreasing from 0.722 to 0.025. In GC, AvgD was approximately 1.01 and AvgOff was 0 for all MAF. For inverses of the relationship matrices, all AvgOff were close to 0; AvgD was 2.375 in A₂₂, varied from 11.563 to 12.943 for GEq, and increased from 8.675 to 12.859 for GC as the threshold for MAF increased. Predictive ability with all GEq and GC was similar except that at MAF = 0.4, they declined by 0.01 for BW and improved by 0.01 for breast meat area. Compared with BLUP, EBV in the ssGBLUP were, on average, up to 1 additive SD greater with GEq and 2 additive SD less with GC. Genotyped animals were biased upward with GEq and downward with GC. The biases and differences in EBV could be controlled by adding a constant to GC; they were eliminated with a constant of 0.014, which corresponds to AvgOff in A₂₂. Unbiased evaluation in the ssGBLUP may be obtained with GC scaled to be compatible with A₂₂. The reduction of SNP with small MAF has a small effect on the real accuracy, but it may falsely increase the estimated accuracies by inversion.
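The contrast between GEq and GC, and the effect of adding a constant, can be reproduced on toy data. The sketch below is an assumption-laden illustration: it uses the VanRaden-style matrix ZZ'/(2Σp(1−p)) with a chosen allele-frequency base, simulated genotypes, and hypothetical function names; only the constant 0.014 is taken from the abstract, and MAF filtering is not shown.

```python
import numpy as np

def genomic_relationship(M, freqs=None):
    """G = ZZ' / (2 * sum p(1-p)) with Z = M - 2p.

    freqs=None uses current (sample) allele frequencies; freqs=0.5 gives the
    equal-frequency version.
    """
    M = np.asarray(M, dtype=float)
    if freqs is None:
        p = M.mean(axis=0) / 2.0
    else:
        p = np.broadcast_to(freqs, M.shape[1]).astype(float)
    Z = M - 2.0 * p
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def avg_diag_offdiag(K):
    """Average diagonal and average off-diagonal element of a square matrix."""
    n = K.shape[0]
    trace = np.trace(K)
    return trace / n, (K.sum() - trace) / (n * (n - 1))

# Toy genotypes (hypothetical): 50 animals, 500 SNP with frequencies in 0.05 to 0.5.
rng = np.random.default_rng(2)
p_true = rng.uniform(0.05, 0.5, size=500)
M = rng.binomial(2, p_true, size=(50, 500))

G_c = genomic_relationship(M)            # current allele frequencies
G_eq = genomic_relationship(M, 0.5)      # equal allele frequencies
G_c_scaled = G_c + 0.014                 # constant from the abstract (AvgOff of A22)
```

With sample frequencies each column of Z sums to zero, so the elements of GC sum to zero and its average off-diagonal is slightly negative, whereas GEq has a clearly positive average off-diagonal. Adding a constant shifts every element of GC by the same amount, which is the adjustment the abstract describes for making GC compatible with A₂₂.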

19.
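GBLUP and ssGBLUP, the evaluation methods commonly used in genomic selection, both involve inverting the genomic relationship matrix, and inverting large matrices is very time-consuming. The aim of this study was to improve the efficiency of inverting large genomic relationship matrices. Genomic relationship matrices were built from real and simulated data, Intel MKL matrix functions were introduced, and the blockwise iterative inversion algorithm was improved in two ways: by reducing the number of iterations (method 1) and by repeated partitioning (method 2). The algorithms were implemented and their computing times were tested on a desktop computer and a server. The results showed that when method 1 was used to invert a 4,000×4,000 genomic relationship matrix, the speedup relative to the MKL library function was 0.898, whereas for a 16,000×16,000 matrix the computing speed was 1.006 times that of the MKL library function. With method 2, the speed for a 4,000×4,000 matrix was 1.084 times that of the MKL library function, and for the much larger 128,000×128,000 genomic relationship matrix the speedup relative to the MKL direct inversion function was 1.805. Compared with the MKL direct inversion function, both improved methods gave some gain in efficiency.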
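The abstract does not give the details of its two improved algorithms, so the following is only the textbook 2×2 block inversion via the Schur complement, on which blockwise schemes of this kind are generally built. The function name `block_inverse` and the split point `k` are hypothetical, and NumPy's `np.linalg.inv` stands in for the MKL routines mentioned in the abstract.

```python
import numpy as np

def block_inverse(K, k):
    """Invert K by partitioning it into [[A, B], [C, D]] with A of size k x k."""
    A, B = K[:k, :k], K[:k, k:]
    C, D = K[k:, :k], K[k:, k:]
    A_inv = np.linalg.inv(A)
    S_inv = np.linalg.inv(D - C @ A_inv @ B)      # inverse of the Schur complement of A
    top_left = A_inv + A_inv @ B @ S_inv @ C @ A_inv
    top_right = -A_inv @ B @ S_inv
    bottom_left = -S_inv @ C @ A_inv
    return np.block([[top_left, top_right], [bottom_left, S_inv]])
```

Only two smaller inversions are needed, one of size k and one of size n − k, and the same partitioning can be applied again to each of them, which is the general idea behind repeated partitioning.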

20.
Efficient computing techniques allow the estimation of variance components for virtually any traditional dataset. When genomic information is available, variance components can be estimated using genomic REML (GREML). If only a portion of the animals have genotypes, single-step GREML (ssGREML) is the method of choice. The genomic relationship matrix (G) used in both cases is dense, limiting computations depending on the number of genotyped animals. The algorithm for proven and young (APY) can be used to create a sparse inverse of G (G_APY⁻¹) with close to linear memory and computing requirements. In ssGREML, the inverse of the realized relationship matrix (H⁻¹) also includes the inverse of the pedigree relationship matrix, which can be dense with a long pedigree, but sparser with a short one. The main purpose of this study was to investigate whether costs of ssGREML can be reduced using APY with truncated pedigree and phenotypes. We also investigated the impact of truncation on variance components estimation when different numbers of core animals are used in APY. Simulations included 150K animals from 10 generations, with selection. Phenotypes (h² = 0.3) were available for all animals in generations 1–9. A total of 30K animals in generations 8 and 9, and 15K validation animals in generation 10 were genotyped for 52,890 SNP. Average information REML and ssGREML with G⁻¹ and G_APY⁻¹ using 1K, 5K, 9K, and 14K core animals were compared. Variance components are impacted when the core group in APY represents the number of eigenvalues explaining a small fraction of the total variation in G. The most time-consuming operation was the inversion of G, with more than 50% of the total time. Next, numerical factorization consumed nearly 30% of the total computing time. On average, a 7% decrease in the computing time for ordering was observed by removing each generation of data.
APY can be successfully applied to create the inverse of the genomic relationship matrix used in ssGREML for estimating variance components. To ensure reliable variance component estimation, it is important to use a core size that corresponds to the number of largest eigenvalues explaining around 98% of total variation in G. When APY is used, pedigrees can be truncated to increase the sparsity of H and slightly reduce computing time for ordering and symbolic factorization, with no impact on the estimates.
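The recommended core size can be read off the eigenvalue spectrum of G. The sketch below counts the largest eigenvalues needed to reach a given share of the total variation; the function name `core_size_for_variance` is hypothetical, and a full dense eigendecomposition is assumed to be affordable, which will not hold for very large genotyped populations.

```python
import numpy as np

def core_size_for_variance(G, fraction=0.98):
    """Number of largest eigenvalues of G explaining at least `fraction` of its variation."""
    eig = np.linalg.eigvalsh(G)[::-1]             # eigenvalues in descending order
    eig = np.clip(eig, 0.0, None)                 # guard against tiny negative round-off
    cumulative = np.cumsum(eig) / eig.sum()
    count = int(np.searchsorted(cumulative, fraction) + 1)
    return min(count, eig.size)
```

The returned count would then be used as the number of core animals in APY; which animals fill the core is a separate choice that this sketch does not address.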
