Similar literature
 20 similar articles found
1.
Markov Chain Monte Carlo methods have made it possible to estimate parameters of complex random regression test‐day models. Models evolved from single‐trait with one set of random regressions to multiple‐trait applications with several random effects described by regressions. Gibbs sampling has been used for models with linear (with respect to coefficients) regressions and normality assumptions for random effects. Difficulties associated with implementations of Markov Chain Monte Carlo schemes include lack of good practical methods to assess convergence, slow mixing caused by high posterior correlations of parameters and long running time to generate enough posterior samples. Those problems are illustrated through comparison of Gibbs sampling schemes for single‐trait random regression test‐day models with different model parameterizations, different functions used for regressions and posterior chains of different sizes. Orthogonal polynomials showed better convergence and mixing properties in comparison with ‘lactation curve’ functions of the same number of parameters. Increasing the order of polynomials resulted in a smaller number of independent samples for covariance components. Gibbs sampling under hierarchical model parameterization had a lower level of autocorrelation and required less time for computation. Posterior means and standard deviations of genetic parameters were very similar for chains of different size (from 20 000 to 1 000 000) after convergence. Single‐trait random regression models with large data sets can be analysed by Markov Chain Monte Carlo methods in a relatively short time. Multiple‐trait (lactation) models are computationally more demanding and better algorithms are required.

2.
Multiple‐trait and random regression models have multiplied the number of equations needed for the estimation of variance components. To avoid inversion or decomposition of a large coefficient matrix, we propose estimation of variance components by Monte Carlo expectation maximization restricted maximum likelihood (MC EM REML) for multiple‐trait linear mixed models. Implementation is based on full‐model sampling for calculating the prediction error variances required for EM REML. Performance of the analytical and the MC EM REML algorithm was compared using a simulated and a field data set. For field data, results from both algorithms corresponded well even with one MC sample within an MC EM REML round. The magnitude of the standard errors of estimated prediction error variances depended on the formula used to calculate them and on the MC sample size within an MC EM REML round. Sampling variation in MC EM REML did not impair the convergence behaviour of the solutions compared with analytical EM REML analysis. A convergence criterion that takes into account the sampling variation was developed to monitor convergence for the MC EM REML algorithm. For the field data set, MC EM REML proved far superior to analytical EM REML both in computing time and in memory need.

3.
4.
Physiologically based pharmacokinetic (PBPK) models for chemicals in food animals are a useful tool in estimating chemical tissue residues and withdrawal intervals. Physiological parameters such as organ weights and blood flows are an important component of a PBPK model. The objective of this study was to compile PBPK-related physiological parameter data in food animals, including cattle and swine. Comprehensive literature searches were performed in PubMed, Google Scholar, ScienceDirect, and ProQuest. Relevant literature was reviewed and tables of relevant parameters such as relative organ weights (% of body weight) and relative blood flows (% of cardiac output) were compiled for different production classes of cattle and swine. The mean and standard deviation of each parameter were calculated to characterize their variability and uncertainty and to allow investigators to conduct population PBPK analysis via Monte Carlo simulations. Regression equations using weight or age were created for parameters having sufficient data. These compiled data provide a comprehensive physiological parameter database for developing PBPK models of chemicals in cattle and swine to support animal-derived food safety assessment. This work also provides a basis to compile data in other food animal species, including goats, sheep, chickens, and turkeys.

5.
Monte Carlo (MC) methods have been found useful in estimation of variance parameters for large data and complex models with many variance components (VC), with respect to both computer memory and computing time. A disadvantage has been a fluctuation in round‐to‐round values of estimates that makes the assessment of convergence challenging. Furthermore, with Newton‐type algorithms, the approximate Hessian matrix might have sufficient accuracy, but the inaccuracy in the gradient vector exaggerates the round‐to‐round fluctuation to an intolerable level. In this study, the reuse of the same random numbers within each MC sample was used to remove the MC fluctuation. Simulated data with six VC parameters were analysed by four different MC REML methods: expectation‐maximization (EM), Newton–Raphson (NR), average information (AI) and Broyden's method (BM). In addition, field data with 96 VC parameters were analysed by MC EM REML. In all the analyses with reused samples, the MC fluctuations disappeared, but the final estimates by the MC REML methods differed from the analytically calculated values more than expected, especially when the number of MC samples was small. The difference depended on the random numbers generated, and based on repeated MC AI REML analyses, the VC estimates were on average non‐biased. The advantage of reusing MC samples is more apparent in the NR‐type algorithms. Smooth convergence opens the possibility to use the fast converging Newton‐type algorithms. However, a disadvantage from reusing MC samples is a possible “bias” in the estimates. To attain acceptable accuracy, a sufficient number of MC samples needs to be generated.
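
The effect of reusing random numbers can be illustrated on a toy problem that is much simpler than MC REML. The sketch below is only an illustration under stated assumptions: the function names, the seed and the target quantity E[(theta + Z)^2] are hypothetical and are not taken from the paper. When the same seed is reused, the MC estimate becomes a smooth deterministic function of the parameter, so a small change in the parameter yields a small change in the estimate; with fresh random numbers the sampling noise dominates.

```python
import random


def mc_estimate(theta, n_samples, seed):
    """MC estimate of E[(theta + Z)^2] with Z ~ N(0, 1); the exact value is theta^2 + 1."""
    rng = random.Random(seed)  # identical seed -> identical random numbers
    total = 0.0
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)
        total += (theta + z) ** 2
    return total / n_samples


def round_to_round_change(theta_old, theta_new, n_samples, reuse_samples):
    """Change in the MC estimate between two consecutive 'rounds'."""
    seed_old = 12345                               # hypothetical seed
    seed_new = 12345 if reuse_samples else 54321   # fresh numbers when not reusing
    return (mc_estimate(theta_new, n_samples, seed_new)
            - mc_estimate(theta_old, n_samples, seed_old))


# A tiny parameter update, evaluated with only 50 MC samples.
change_reused = round_to_round_change(1.0, 1.001, 50, reuse_samples=True)
change_fresh = round_to_round_change(1.0, 1.001, 50, reuse_samples=False)
print("change with reused samples:", change_reused)
print("change with fresh samples: ", change_fresh)
```

With reused samples the estimate is still off from the exact value by an amount that depends on the random numbers drawn, which mirrors the possible "bias" noted in the abstract.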

6.
Infection prevalence in a population often is estimated from grouped binary data expressed as proportions. The groups can be families, herds, flocks, farms, etc. The observed number of cases generally is assumed to have a Binomial distribution and the estimate of prevalence is then the sample proportion of cases. However, the individual binary observations might not be independent--leading to overdispersion. The goal of this paper was to demonstrate random-effects models for the estimation of infection prevalence from correlated data and, in particular, to illustrate a nonparametric random-effects model for this purpose. The nonparametric approach is a relatively recent addition to the random-effects class of models and does not appear to have been discussed previously in the veterinary epidemiology literature. The assumptions for a logistic-regression model with a nonparametric random effect were outlined. In a demonstration of the method on data relating to Salmonella infection in Irish pig herds, the nonparametric method resulted in the classification of herds into a small number of distinct prevalence groups (i.e. low, medium and high prevalence) and also estimated the relative frequency of each prevalence category in the population. We compared the estimates from a logistic model with a nonparametric distribution for the random effects with four alternative models: a logistic-regression model with no random effects, a marginal model using a generalised estimating equation (GEE) and two methods of fitting a Normally distributed random effect (the GLIMMIX macro and the NLMIXED procedure both in SAS). Parameter estimates from random-effects models are not readily interpretable in terms of prevalences. Therefore, we outlined two methods for calculating population-averaged estimates of prevalence from random-effects models: one using numerical integration and the other using Monte Carlo simulation.
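
The two routes to a population-averaged prevalence can be sketched for the simplest case, a logistic model with a Normally distributed random herd intercept. This is a minimal sketch under that assumption; the function names, the intercept of -2.0 and the standard deviation of 1.5 are hypothetical values chosen for illustration and are not estimates from the paper. The population-averaged prevalence is the integral of expit(intercept + sd * u) against the standard normal density of u.

```python
import math
import random


def expit(x):
    """Numerically stable inverse logit."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def prevalence_by_integration(intercept, sd, n_points=2001, limit=8.0):
    """Trapezoidal rule over the standard normal random effect u in [-limit, limit]."""
    step = 2.0 * limit / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        u = -limit + i * step
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        density = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
        total += weight * expit(intercept + sd * u) * density
    return total * step


def prevalence_by_simulation(intercept, sd, n_draws=100_000, seed=1):
    """Average the herd-level prevalence over simulated random effects."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        total += expit(intercept + sd * rng.gauss(0.0, 1.0))
    return total / n_draws


intercept, sd = -2.0, 1.5  # hypothetical logit-scale estimates
print("prevalence of a 'typical' herd:", expit(intercept))
print("numerical integration:         ", prevalence_by_integration(intercept, sd))
print("Monte Carlo simulation:        ", prevalence_by_simulation(intercept, sd))
```

The first printed value ignores between-herd variation, which is why the random-effects intercept alone cannot be read directly as a prevalence.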

7.
Properties of threshold model predictions   Total citations: 4 (self-citations: 0, citations by others: 4)
Estimation of genetic parameters and accuracy of threshold model genetic predictions were investigated. Data were simulated for different population structures by using Monte Carlo techniques. Variance components were estimated by using threshold models and linear sire models applied to untransformed data, logarithmically transformed data, and data transformed to Snell scores. Effects of number of categories (2, 5, and 10), incidence of categories (extreme, moderate, and normal), heritability in the underlying scale (.04, .20, and .50), and data structure (unbalanced and balanced) on accuracy of genetic prediction were investigated. The real importance of using a threshold model was to estimate genetic parameters. An expected heritability of .20 was estimated to be .22 and .10 by a threshold model and a linear model, respectively. Accuracy increased significantly with a larger number of categories, a more normal distribution of incidences, increased heritability, and more balanced data. Even threshold models were shown to be more efficient with more than two categories (e.g., binomial). Transformation of scale did not accomplish the purpose intended.

8.
The Markov chain Monte Carlo (MCMC) strategy provides remarkable flexibility for fitting complex hierarchical models. However, when parameters are highly correlated in their posterior distributions and their number is large, a particular MCMC algorithm may perform poorly and the resulting inferences may be affected. The objective of this study was to compare the efficiency (in terms of the asymptotic variance of features of posterior distributions of chosen parameters, and in terms of computing cost) of six MCMC strategies to sample parameters using simulated data generated with a reaction norm model with unknown covariates as an example. The six strategies are single-site Gibbs updates (SG), single-site Gibbs sampler for updating transformed (a priori independent) additive genetic values (TSG), pairwise Gibbs updates (PG), blocked (all location parameters are updated jointly) Gibbs updates (BG), Langevin-Hastings (LH) proposals, and finally Langevin-Hastings proposals for updating transformed additive genetic values (TLH). The ranking of the methods in terms of asymptotic variance is affected by the degree of the correlation structure of the data and by the true values of the parameters, and no method comes out as an overall winner across all parameters. TSG and BG show very good performance in terms of asymptotic variance especially when the posterior correlation between genetic effects is high. In terms of computing cost, TSG performs best except for dispersion parameters in the low correlation scenario where SG was the best strategy. The two LH proposals could not compete with any of the Gibbs sampling algorithms. In this study it was not possible to find an MCMC strategy that performs optimally across the range of target distributions and across all possible values of parameters. However, when the posterior correlation between parameters is high, TSG, BG and even PG show better mixing than SG.
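
Why single-site Gibbs updates mix slowly under high posterior correlation can be seen on a toy target. The sketch below is a hypothetical illustration, not the reaction norm model of the study: it assumes a standard bivariate normal target with a chosen correlation, for which the full conditionals are x | y ~ N(rho * y, 1 - rho^2) and y | x ~ N(rho * x, 1 - rho^2). All names are assumptions made for the example.

```python
import math
import random


def single_site_gibbs(correlation, n_iterations, seed=1):
    """Single-site Gibbs sampler for a standard bivariate normal target."""
    rng = random.Random(seed)
    conditional_sd = math.sqrt(1.0 - correlation ** 2)
    x, y = 0.0, 0.0
    draws_x = []
    for _ in range(n_iterations):
        x = rng.gauss(correlation * y, conditional_sd)  # update x given y
        y = rng.gauss(correlation * x, conditional_sd)  # update y given x
        draws_x.append(x)
    return draws_x


def lag1_autocorrelation(chain):
    """Sample lag-1 autocorrelation of a chain."""
    n = len(chain)
    mean = sum(chain) / n
    centred = [value - mean for value in chain]
    denominator = sum(c * c for c in centred)
    numerator = sum(centred[t] * centred[t + 1] for t in range(n - 1))
    return numerator / denominator


weak = lag1_autocorrelation(single_site_gibbs(0.10, 5000))
strong = lag1_autocorrelation(single_site_gibbs(0.99, 5000))
print("lag-1 autocorrelation, correlation 0.10:", weak)
print("lag-1 autocorrelation, correlation 0.99:", strong)
```

For this toy target a blocked update would draw (x, y) jointly and give independent samples, which is consistent with the better mixing the abstract reports for blocked and transformed updates when correlations are high.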

9.
First parity calving difficulty scores from Italian Piemontese cattle were analysed using a threshold mixed effects model. The model included the fixed effects of age of dam and sex of calf and their interaction and the random effects of sire, maternal grandsire, and herd‐year‐season. Covariances between sire and maternal grandsire effects were modelled using a numerator relationship matrix based on male ancestors. Field data consisted of 23 953 records collected between 1989 and 1998 from 4741 herd‐year‐seasons. Variance and covariance components were estimated using two alternative approximate marginal maximum likelihood (MML) methods, one based on expectation‐maximization (EM) and the other based on Laplacian integration. Inferences were compared to those based on three separate runs or sequences of Markov Chain Monte Carlo (MCMC) sampling in order to assess the validity of approximate MML estimates derived from data with similar size and design structure. Point estimates of direct heritability were 0.24, 0.25 and 0.26 for EM, Laplacian and MCMC (posterior mean), respectively, whereas corresponding maternal heritability estimates were 0.10, 0.11 and 0.12, respectively. The covariance between additive direct and maternal effects was found to be not different from zero based on MCMC‐derived confidence sets. The conventional joint modal estimates of sire effects and associated standard errors based on MML estimates of variance and covariance components differed little from the respective posterior means and standard deviations derived from MCMC. Therefore, there may be little need to pursue computation‐intensive MCMC methods for inference on genetic parameters and genetic merits using conventional threshold sire and maternal grandsire models for large datasets on calving ease.

10.
The accessibility of Markov Chain Monte Carlo (MCMC) methods for statistical inference has improved with the advent of general purpose software. This enables researchers with limited statistical skills to perform Bayesian analysis. Using MCMC sampling to do statistical inference requires convergence of the MCMC chain to its stationary distribution. There is no certain way to prove convergence; it is only possible to ascertain when convergence definitely has not been achieved. These methods are rather subjective and not implemented as automatic safeguards in general MCMC software. This paper considers a pragmatic approach towards assessing the convergence of MCMC methods illustrated by a Bayesian analysis of the Hui–Walter model for evaluating diagnostic tests in the absence of a gold standard. The Hui–Walter model has two optimal solutions, a property which causes problems with convergence when the solutions are sufficiently close in the parameter space. Using simulated data we demonstrate tools to assess the convergence and mixing of MCMC chains using examples with and without convergence. Suggestions to remedy the situation when the MCMC sampler fails to converge are given. The epidemiological implications of the two solutions of the Hui–Walter model are discussed.
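
One widely used check that can flag a lack of convergence is the potential scale reduction factor (often written R-hat), which compares between-chain and within-chain variance across several chains started from different values. The abstract does not say which tools the paper used, so the sketch below is an assumption offered for illustration only; the simulated "chains" are independent normal draws rather than output from a Hui–Walter analysis, and all names are hypothetical. Values well above 1 indicate that the chains disagree, as would happen if chains settled near different solutions; a value near 1 does not prove convergence.

```python
import random
import statistics


def potential_scale_reduction(chains):
    """Potential scale reduction factor for equal-length chains of one parameter."""
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    between_over_n = statistics.variance(chain_means)  # B / n
    pooled = (n - 1) / n * within + between_over_n
    return (pooled / within) ** 0.5


def simulate_chain(mean, n_draws, rng):
    """Stand-in for sampler output: independent N(mean, 1) draws."""
    return [rng.gauss(mean, 1.0) for _ in range(n_draws)]


rng = random.Random(2024)
agreeing = [simulate_chain(0.0, 1000, rng) for _ in range(4)]
disagreeing = [simulate_chain(m, 1000, rng) for m in (0.0, 0.0, 5.0, 5.0)]

r_hat_agreeing = potential_scale_reduction(agreeing)
r_hat_disagreeing = potential_scale_reduction(disagreeing)
print("R-hat, chains in agreement:     ", r_hat_agreeing)
print("R-hat, chains at two 'solutions':", r_hat_disagreeing)
```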

11.
Binary repeated measures data are commonly encountered in both experimental and observational veterinary studies. Among the wide range of statistical methods and software applicable to such data one major distinction is between marginal and random effects procedures. The objective of the study was to review and assess the performance of marginal and random effects estimation procedures for the analysis of binary repeated measures data. Two simulation studies were carried out, using relatively small, balanced, two-level (time within subjects) datasets. The first study was based on data generated from a marginal model with first order autocorrelation, the second on a random effects model with autocorrelated random effects within subjects. Three versions of the models were considered in which a dichotomous treatment was modelled additively, either between or within subjects, or modelled by a time interaction. Among the studied statistical procedures were: generalized estimating equations (GEE), Marginal Quasi Likelihood, likelihood based on numerical integration, penalized quasi-likelihood, restricted pseudo likelihood and Bayesian Markov Chain Monte Carlo. Results for data generated by the marginal model showed autoregressive GEE to be highly efficient when treatment was within subjects, even with strongly correlated responses. For treatment between subjects, random effects procedures also performed well in some situations; however, a relatively small number of subjects with a short time series proved a challenge for both marginal and random effects procedures. Results for data generated by the random effects model showed bias in estimates from random effects procedures when autocorrelation was present in the data, while the marginal procedures generally gave estimates close to the marginal parameters.

12.
Between-holding contacts are more common over short distances and this may have implications for the dynamics of disease spread through these contacts. A reliable estimation of how contacts depend on distance is therefore important when modeling livestock diseases. In this study, we have developed a method for analyzing distance-dependent contacts and applied it to animal movement data from Sweden. The data were analyzed with two competing models. The first model assumes that contacts arise from a purely distance-dependent process. The second is a mixture model and assumes that, in addition, some contacts arise independent of distance. Parameters were estimated with a Bayesian Markov Chain Monte Carlo (MCMC) approach and the model probabilities were compared. We also investigated possible between-model differences in predicted contact structures, using a collection of network measures. We found that the mixture model was a much better model for the data analyzed. Also, the network measures showed that the models differed considerably in predictions of contact structures, which is expected to be important for disease spread dynamics. We conclude that a model with contacts being both dependent on, and independent of, distance was preferred for modeling the example animal movement contact data.

13.
A hierarchical model for inferring the parameters of the joint distribution of a trait measured longitudinally and another assessed cross-sectionally, when selection has been applied to the cross-sectional trait, is presented. Distributions and methods for a Bayesian implementation via Markov Chain Monte Carlo procedures are discussed for the case where information about the selection criterion is available for all the individuals, but longitudinal records are available only in the later generations. Alternative specifications of the residual covariance structure are suggested. The procedure is illustrated with an analysis of correlated responses in growth curve parameters in a population of rabbits selected for increased growth rate. Results agree with those obtained in a previous study using both selected and control populations. The high correlation between samples indicates slow mixing, resulting in small effective sample sizes and high Monte Carlo standard errors.
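
The link between correlated samples, effective sample size and Monte Carlo standard error can be sketched as follows. This is one simple estimator among several, given as an assumption rather than as the method used in the study: it sums sample autocorrelations up to the first non-positive lag, and it is demonstrated on a hypothetical first-order autoregressive chain standing in for slowly mixing sampler output.

```python
import math
import random


def effective_sample_size(chain):
    """n / (1 + 2 * sum of autocorrelations), truncated at the first non-positive lag."""
    n = len(chain)
    mean = sum(chain) / n
    centred = [value - mean for value in chain]
    variance_sum = sum(c * c for c in centred)
    autocorrelation_sum = 0.0
    for lag in range(1, n):
        rho = sum(centred[t] * centred[t + lag] for t in range(n - lag)) / variance_sum
        if rho <= 0.0:
            break
        autocorrelation_sum += rho
    return n / (1.0 + 2.0 * autocorrelation_sum)


def monte_carlo_standard_error(chain):
    """Standard error of the chain mean, allowing for autocorrelation."""
    n = len(chain)
    mean = sum(chain) / n
    sample_variance = sum((value - mean) ** 2 for value in chain) / (n - 1)
    return math.sqrt(sample_variance / effective_sample_size(chain))


def ar1_chain(phi, n_draws, seed=1):
    """Hypothetical stand-in for MCMC output with lag-1 autocorrelation phi."""
    rng = random.Random(seed)
    innovation_sd = math.sqrt(1.0 - phi ** 2)
    value, chain = 0.0, []
    for _ in range(n_draws):
        value = phi * value + rng.gauss(0.0, innovation_sd)
        chain.append(value)
    return chain


independent = ar1_chain(0.0, 5000)
sticky = ar1_chain(0.9, 5000)
print("ESS, independent draws:", effective_sample_size(independent))
print("ESS, slow mixing:      ", effective_sample_size(sticky))
print("MCSE, slow mixing:     ", monte_carlo_standard_error(sticky))
```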

14.
Markov chain Monte Carlo (MCMC) enables fitting complex hierarchical models that may adequately reflect the process of data generation. Some of these models may contain more parameters than can be uniquely inferred from the distribution of the data, causing non‐identifiability. The reaction norm model with unknown covariates (RNUC) is a model in which unknown environmental effects can be inferred jointly with the remaining parameters. The problem of identifiability of parameters at the level of the likelihood and the associated behaviour of MCMC chains were discussed using the RNUC as an example. It was shown theoretically that when environmental effects (covariates) are considered as random effects, estimable functions of the fixed effects, (co)variance components and genetic effects are identifiable as well as the environmental effects. When the environmental effects are treated as fixed and there are other fixed factors in the model, the contrasts involving environmental effects, the variance of environmental sensitivities (genetic slopes) and the residual variance are the only identifiable parameters. These different identifiability scenarios were generated by changing the formulation of the model and the structure of the data and the models were then implemented via MCMC. The output of MCMC sampling schemes was interpreted in the light of the theoretical findings. The erratic behaviour of the MCMC chains was shown to be associated with identifiability problems in the likelihood, despite propriety of posterior distributions, achieved by arbitrarily chosen uniform (bounded) priors. In some cases, very long chains were needed before the pattern of behaviour of the chain may signal the existence of problems. The paper serves as a warning concerning the implementation of complex models where identifiability problems can be difficult to detect a priori. We conclude that it would be good practice to experiment with a proposed model and to understand its features before embarking on a full MCMC implementation.

15.
Sampling genotype configurations in a large complex pedigree   Total citations: 1 (self-citations: 1, citations by others: 0)
Many genetic problems can be solved by Monte Carlo methods. This often requires sampling genotype configurations over a pedigree. Currently available samplers are inefficient for large animal pedigrees. A new sampler suitable for large complex pedigrees has been developed and evaluated. The sampler uses simple and iterative peeling algorithms alternately. The sampler was compared to two other samplers on a hypothetical pedigree of 79 individuals and a recessive disease. The behaviour of the sampler was evaluated in four experimental designs on a real bovine pedigree of 907,903 animals. The application of the sampler was also exemplified in an identical-by-descent study.

16.
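Resistance of common pathogenic microorganisms to antimicrobial drugs is steadily increasing. To achieve the best therapeutic outcome, clinical dosing regimens must be adjusted on the basis of pharmacokinetic and pharmacodynamic data. Pharmacokinetics describes the time course of drug concentrations in tissues, body fluids and sites of infection, whereas pharmacodynamics reflects the ability of a drug to kill or inhibit the pathogen. Monte Carlo simulation is a method that uses statistical sampling to obtain approximate solutions to mathematical equations, and its use for real-time simulation is becoming an international focus of research on the pharmacokinetics and pharmacodynamics of antimicrobial drugs. This paper reviews the principles of Monte Carlo simulation, the fitting procedure, and its application to estimating bacterial susceptibility breakpoints and to comparing pharmacokinetic-pharmacodynamic parameters in order to select the optimal drug.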
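
A common way such simulations are set up is to draw pharmacokinetic parameters for many virtual animals and count how often a pharmacokinetic-pharmacodynamic target is reached at a given minimum inhibitory concentration (MIC). The sketch below is a hypothetical illustration only: it assumes a one-parameter model in which the 24-hour area under the curve equals dose divided by clearance, a log-normal clearance, and an AUC/MIC target of 125. The dose, the clearance distribution and the target are invented for the example and come from neither the review nor any particular drug.

```python
import math
import random


def probability_of_target_attainment(dose_mg, mic_mg_per_l, target_auc_mic,
                                     clearance_median_l_per_h, clearance_cv,
                                     n_subjects=20_000, seed=1):
    """Fraction of simulated subjects whose AUC/MIC reaches the target."""
    rng = random.Random(seed)
    # Log-normal clearance parameterised by its median and coefficient of variation.
    sigma = math.sqrt(math.log(1.0 + clearance_cv ** 2))
    mu = math.log(clearance_median_l_per_h)
    attained = 0
    for _ in range(n_subjects):
        clearance = rng.lognormvariate(mu, sigma)  # L/h
        auc = dose_mg / clearance                  # mg*h/L
        if auc / mic_mg_per_l >= target_auc_mic:
            attained += 1
    return attained / n_subjects


# Hypothetical regimen evaluated across a range of MIC values.
for mic in (0.25, 0.5, 1.0, 2.0):
    pta = probability_of_target_attainment(
        dose_mg=500.0, mic_mg_per_l=mic, target_auc_mic=125.0,
        clearance_median_l_per_h=5.0, clearance_cv=0.3)
    print(f"MIC {mic} mg/L: probability of target attainment = {pta:.3f}")
```

Reading off the highest MIC at which the probability stays above a chosen threshold is one way such output is used when discussing susceptibility breakpoints.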

17.
Quantitative risk assessments are now required to support many regulatory decisions involving infectious diseases of animals. Current methods, however, do not consider the relative values of historical and recent data. A Markov-chain model can use specific disease characteristics to estimate the present value of disease information collected in the past. Uncertainty about the disease characteristics and variability among animals and herds can be accounted for with Monte Carlo simulation modeling. This results in a transparent method of valuing historical testing information for use in risk assessments. We constructed such a model to value historical testing information in a more transparent and reproducible manner. Applications for this method include trade, food safety, and domestic animal-health regulations.

18.
Robust threshold models with multivariate Student's t or multivariate Slash link functions were employed to infer genetic parameters of clinical mastitis at different stages of lactation, with each cow defining a cluster of records. The robust fits were compared with that from a multivariate probit model via a pseudo‐Bayes factor and an analysis of residuals. Clinical mastitis records on 36 178 first‐lactation Norwegian Red cows from 5286 herds, daughters of 245 sires, were analysed. The opportunity for infection interval, going from 30 days pre‐calving to 300 days postpartum, was divided into four periods: (i) −30 to 0 days pre‐calving; (ii) 1–30 days; (iii) 31–120 days; and (iv) 121–300 days of lactation. Within each period, absence or presence of clinical mastitis was scored as 0 or 1 respectively. Markov chain Monte Carlo methods were used to draw samples from posterior distributions of interest. Pseudo‐Bayes factors strongly favoured the multivariate Slash and Student's t models over the probit model. The posterior mean of the degrees of freedom parameter for the Slash model was 2.2, indicating heavy tails of the liability distribution. The posterior mean of the degrees of freedom for the Student's t model was 8.5, also pointing away from a normal liability for clinical mastitis. A residual was the observed phenotype (0 or 1) minus the posterior mean of the probability of mastitis. The Slash and Student's t models tended to have smaller residuals than the probit model in cows that contracted mastitis. Heritability of liability to clinical mastitis was 0.13–0.14 before calving, and ranged from 0.05 to 0.08 after calving in the robust models. Genetic correlations were between 0.50 and 0.73, suggesting that clinical mastitis resistance is not the same trait across periods, corroborating earlier findings with probit models.

19.
Efficiency of selection strategies for halothane-negative gene   Total citations: 1 (self-citations: 0, citations by others: 1)
Use of a method to estimate the frequency of the halothane-negative allele in boars is illustrated for different sampling schemes for boar testing programs and for testing within closed breeding populations. This method uses information not only on the individual, but also on all mates and relatives including parents, siblings and offspring. Accuracy of the estimates of the allelic frequency in boars was measured through use of Monte Carlo simulation. The selection differential in real frequency of the halothane-negative allele when boars were selected on estimated allelic frequency was used as the criterion for accuracy. In the progeny testing situation, phenotypes of base boars and one generation of offspring were available. The average selection differentials with 90% selection (i.e., culling 10% of boars on estimated allelic frequency) when 2 and 10 litters of two boars each were tested were .017 and .044 in base boars and .013 and .025 in the offspring. The value of the boar's own phenotype was small. Higher selection differentials were found in the closed herd situation, where data on two generations were available. The selection differential in base boars when 10 litters were tested increased from .046 to .066 when the proportion of boars selected decreased from 90% to 50%. No improvement in selection differential with proportion selected was found in the progeny testing situation. Intense selection is most effective when the number of litters per boar is large and data over several generations are used. The estimation procedure for allelic frequencies in boars should improve current screening and selection programs to reduce halothane sensitivity in pigs.

20.
Sources of variation in measures of reproductive performance in dairy cattle were evaluated using data collected from 3207 lactations in 1570 cows in 50 herds from five geographic regions of Reunion Island (located off the east coast of Madagascar). Three continuously distributed reproductive parameters (intervals from calving-to-conception, calving-to-first-service and first-service-to-conception) were considered, along with one Binomial outcome (first-service-conception risk). Multilevel models which take into account the hierarchical nature of the data were used to fit all models. For the overall measure of calving-to-conception interval, 86% of the variation resided at the lactation level with only 7, 6 and 2% at the cow, herd and regional levels, respectively. The proportions of variance at the herd and cow levels were slightly higher for the calving-to-first-service interval (12 and 9%, respectively) - but for the other two parameters (first-service-conception risk and first-service-to-conception interval), >90% of the variation resided at the lactation level. For the three continuous dependent variables, comparison of results between models based on log-transformed data and Box-Cox-transformed data suggested that minor departures from the assumption of normality did not have a substantial effect on the variance estimates. For the Binomial dependent variable, five different estimation procedures (penalised quasi-likelihood, Markov-Chain Monte Carlo, parametric and non-parametric bootstrap estimates and maximum-likelihood) yielded substantially different results for the estimate of the cow-level variance.
