In this paper, we introduce a novel discrete Gamma Markov random field (MRF) prior for modeling spatial relations among regions in geo-referenced health data. Our proposal is incorporated into a zero-inflated (ZI) generalized linear mixed model framework that accounts for excess zeroes not explained by the usual parametric (Poisson or Negative Binomial) assumptions. The ZI framework categorizes subjects into low-risk and high-risk groups. Zeroes arising from the low-risk group contribute structural zeroes, while high-risk members contribute random zeroes. We aim to identify explanatory covariates that might have a significant effect on (i) the probability that a subject belongs to the low-risk group, and (ii) the intensity of the high-risk group, after controlling for spatial association and subject-specific heterogeneity. Model fitting and parameter estimation are carried out under a Bayesian paradigm through relevant Markov chain Monte Carlo (MCMC) schemes. Simulation studies and an application to real data on hypertensive disorder of pregnancy confirm that our model provides a superior fit over the widely used conditionally auto-regressive proposition.
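
As a point of reference, the standard zero-inflated Poisson mixture that underlies this kind of framework can be written as below. The symbols are assumed here for illustration: \pi_i is the probability that subject i belongs to the low-risk group and \mu_i is the Poisson mean for the high-risk group. The abstract does not state the authors' exact parameterization or link functions.

```latex
\begin{aligned}
P(Y_i = 0) &= \pi_i + (1 - \pi_i)\, e^{-\mu_i}, \\
P(Y_i = y) &= (1 - \pi_i)\, \frac{e^{-\mu_i} \mu_i^{y}}{y!}, \qquad y = 1, 2, \dots
\end{aligned}
```

The first term of P(Y_i = 0) corresponds to structural zeroes and the second to random zeroes.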

We consider a model-based clustering approach to examining abundance trends in a metapopulation. When examining trends for an animal population with management goals in mind, one is often interested in those segments of the population that behave similarly to one another with respect to abundance. Our proposed trend analysis incorporates a clustering method that is an extension of the classic Chinese Restaurant Process, and the associated Dirichlet process prior, which allows for inclusion of distance covariates between sites. This approach has two main benefits: (1) nonparametric spatial association of trends and (2) reduced dimension of the spatio-temporal trend process. We present a transdimensional Gibbs sampler for making Bayesian inference that is efficient in the sense that all but one of the full conditionals can be sampled from directly. To demonstrate the proposed method we examine long-term trends in northern fur seal pup production at 19 rookeries in the Pribilof Islands, Alaska. There was strong evidence that clustering of similar year-to-year deviation from linear trends was associated with whether rookeries were located on the same island. Clustering of local linear trends did not seem to be strongly associated with any of the distance covariates. In the fur seal trends analysis an overwhelming proportion of the MCMC iterations produced a 73–79% reduction in the dimension of the spatio-temporal trend process, depending on the number of cluster groups.
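
For readers unfamiliar with the classic Chinese Restaurant Process that the method extends, the following is a minimal sketch of drawing one partition from the standard process. It does not implement the distance-covariate extension described in the abstract. The function name `sample_crp` and the concentration parameter `alpha` are illustrative assumptions, and the value 19 simply mirrors the number of rookeries.

```python
import random


def sample_crp(n_customers, alpha, rng):
    """Draw one partition of n_customers items from a standard CRP (alpha > 0)."""
    assignments = []   # table (cluster) index for each customer
    table_sizes = []   # number of customers seated at each table
    for _ in range(n_customers):
        # An existing table k has weight table_sizes[k]; a new table has weight alpha.
        weights = table_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_sizes):
            table_sizes.append(1)      # open a new table
        else:
            table_sizes[k] += 1        # join an existing table
        assignments.append(k)
    return assignments, table_sizes


rng = random.Random(0)
assignments, table_sizes = sample_crp(19, alpha=1.0, rng=rng)
```

Larger values of `alpha` tend to produce more clusters, so the number of groups is not fixed in advance.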

This paper develops a Bayesian approach for spatial inference on animal density from line transect survey data. We model the spatial distribution of animals within a geographical area of interest by an inhomogeneous Poisson process whose intensity function incorporates both covariate effects and spatial smoothing of residual variation. Independently thinning the animal locations according to their estimated detection probabilities results in another spatial Poisson process for the sightings (the observations). Prior distributions are elicited for all unknown model parameters. Due to the sparsity of data in the application we consider, eliciting sensible prior distributions is important in order to get meaningful estimation results. A reversible jump Markov chain Monte Carlo (MCMC) algorithm for simulation of the posterior distribution is developed. We present results for simulated data and a real data set of minke whale pods from Antarctic waters. The main advantages of our method compared to design-based analyses are that it can use data arising from sources other than specifically designed surveys and that it can link covariate effects to variation in animal density. The Bayesian paradigm provides a coherent framework for quantifying uncertainty in estimation results.
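
The thinning step can be illustrated with a short sketch: each animal location is kept independently with its detection probability, so a Poisson process with intensity λ(s) yields sightings with intensity λ(s)p(s). Everything named here is an assumption for illustration. In particular, the half-normal detection function and the single transect line at x = 0 are hypothetical, and the number of animals is fixed at 500 rather than drawn from a Poisson distribution.

```python
import math
import random


def thin_points(points, detection_prob, rng):
    """Keep each point independently with probability detection_prob(point)."""
    return [p for p in points if rng.random() < detection_prob(p)]


def half_normal_detection(sigma, line_x=0.0):
    """Hypothetical detection function that decays with distance from the line x = line_x."""
    def prob(point):
        d = abs(point[0] - line_x)   # perpendicular distance to the transect line
        return math.exp(-d * d / (2.0 * sigma * sigma))
    return prob


rng = random.Random(1)
# Simulated animal locations in a 4 x 10 strip around the transect line.
animals = [(rng.uniform(-2.0, 2.0), rng.uniform(0.0, 10.0)) for _ in range(500)]
sightings = thin_points(animals, half_normal_detection(sigma=0.5), rng)
```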

We develop a novel modeling strategy for analyzing data with repeated binary responses over time as well as time-dependent missing covariates. We assume that covariates are missing at random (MAR). We use the generalized linear mixed logistic regression model for the repeated binary responses and then propose a joint model for time-dependent missing covariates using information from different sources. A Monte Carlo EM algorithm is developed for computing the maximum likelihood estimates. We propose an extended version of the AIC to identify the important factors that may explain the binary responses. A real plant dataset is used to motivate and illustrate the proposed methodology.

贺倩, 汪明, 刘凯. 《水土保持研究》 (Research of Soil and Water Conservation), 2022, 29(3): 396-403+410
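Logistic regression (LR) models are widely used in landslide susceptibility assessment, but the uncertainty of their parameters has received little study. The Markov chain Monte Carlo (MCMC) method combines prior information about the parameters to obtain their posterior distribution, which allows the uncertainty of the estimated parameters to be analyzed. To explore the effectiveness of MCMC in building logistic landslide susceptibility models and to quantify the uncertainty of the parameter estimates, we took three earthquakes in southwest China as case studies (the 20 April 2013 Lushan earthquake, the 8 August 2017 Jiuzhaigou earthquake, and the 3 August 2014 Ludian earthquake) and estimated the regression coefficients of the logistic regression model with MCMC. Regional earthquake-induced landslide susceptibility models were built, the uncertainty of the parameter estimates was analyzed, and regional landslide susceptibility maps were produced. The results show that in the Lushan case the uncertainty of all parameter estimates was fairly low; in the Jiuzhaigou case the estimate for the lithology factor had high uncertainty; and in the Ludian case the parameters for lithology, profile curvature, and plan curvature had high uncertainty. Overall, most parameter estimates in the models had low uncertainty. The logistic regression models achieved high predictive accuracy for all three earthquake-induced landslide events, with AUC (area under the ROC curve) values above 0.9, which demonstrates the accuracy of MCMC for estimating logistic model parameters. Across the three events, the factor with the greatest relative importance was elevation, followed by distance to fault and Modified Mercalli intensity. The study offers a new approach to landslide susceptibility assessment with LR models.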
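
The general idea of estimating logistic regression coefficients by MCMC can be sketched with a random-walk Metropolis sampler. This is only an illustration under stated assumptions, not the authors' procedure: the data are synthetic, there is a single hypothetical covariate `x`, the priors are independent N(0, 10²), and the step size and iteration count are arbitrary.

```python
import math
import random


def log_posterior(beta, xs, ys, prior_sd=10.0):
    """Log posterior (up to a constant) with independent N(0, prior_sd^2) priors."""
    lp = -0.5 * sum(b * b for b in beta) / (prior_sd * prior_sd)
    for x, y in zip(xs, ys):
        z = beta[0] + beta[1] * x
        # Bernoulli log-likelihood y*z - log(1 + exp(z)), in a numerically stable form.
        lp += y * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return lp


def metropolis(xs, ys, n_iter, step, rng):
    """Random-walk Metropolis over (intercept, slope), started at the origin."""
    beta = [0.0, 0.0]
    current = log_posterior(beta, xs, ys)
    chain, accepted = [], 0
    for _ in range(n_iter):
        proposal = [b + rng.gauss(0.0, step) for b in beta]
        candidate = log_posterior(proposal, xs, ys)
        # Accept with probability min(1, exp(candidate - current)).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            beta, current = proposal, candidate
            accepted += 1
        chain.append(list(beta))
    return chain, accepted / n_iter


rng = random.Random(42)
# Synthetic data generated with assumed true intercept 0.5 and slope 2.0.
xs = [rng.uniform(-2.0, 2.0) for _ in range(200)]
ys = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(0.5 + 2.0 * x))) else 0 for x in xs]
chain, acceptance_rate = metropolis(xs, ys, n_iter=2000, step=0.2, rng=rng)

# Discard an initial burn-in, then summarize the slope by its posterior mean and spread.
kept = [b[1] for b in chain[500:]]
slope_mean = sum(kept) / len(kept)
slope_sd = math.sqrt(sum((s - slope_mean) ** 2 for s in kept) / (len(kept) - 1))
```

The spread of the retained draws (`slope_sd` here) is the kind of quantity that expresses parameter uncertainty in a posterior analysis.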

We propose a Bayesian model for mixed ordinal and continuous multivariate data to evaluate a latent spatial Gaussian process. Our proposed model can be used in many contexts where mixed continuous and discrete multivariate responses are observed in an effort to quantify an unobservable continuous measurement. In our example, the latent, or unobservable, measurement is wetland condition. While predicted values of the latent wetland condition variable produced by the model at each location do not hold any intrinsic value, the relative magnitudes of the wetland condition values are of interest. In addition, by including point-referenced covariates in the model, we are able to make predictions at new locations for both the latent random variable and the multivariate response. Lastly, the model produces ranks of the multivariate responses in relation to the unobserved latent random field. This is an important result as it allows us to determine which response variables are most closely correlated with the latent variable. Our approach offers an alternative to traditional indices based on best professional judgment that are frequently used in ecology. We apply our model to assess wetland condition in the North Platte and Rio Grande River Basins in Colorado. The model facilitates a comparison of wetland condition at multiple locations and ranks the importance of in-field measurements.

The uncertainty in estimation of spatial animal density from line transect surveys depends on the degree of spatial clustering in the animal population. To quantify the clustering we model line transect data as independent thinnings of spatial shot-noise Cox processes. Likelihood-based inference is implemented using Markov chain Monte Carlo (MCMC) methods to obtain efficient estimates of spatial clustering parameters. Uncertainty is addressed using parametric bootstrap or by consideration of posterior distributions in a Bayesian setting. Maximum likelihood estimation and Bayesian inference are compared in an example concerning minke whales in the northeast Atlantic.

The three most common techniques for interpolating soil properties at a field scale were examined: ordinary kriging (OK), regression kriging with a multiple linear regression drift model (RK + MLR), and regression kriging with a principal component regression drift model (RK + PCR). The results of the study were compiled into an algorithm for choosing the most appropriate soil mapping technique. Relief attributes were used as the auxiliary variables. When spatial dependence of a target variable was strong, the OK method showed more accurate interpolation results, and the inclusion of the auxiliary data resulted in an insignificant improvement in prediction accuracy. According to the algorithm, the RK + PCR method effectively eliminates multicollinearity of explanatory variables. However, if the number of predictors is less than ten, the probability of multicollinearity is reduced, and application of PCR is no longer justified. In that case, multiple linear regression should be used instead.

Association analysis in important crop species has generated heightened interest for its potential in dissecting complex traits by utilizing diverse mapping populations. However, the mixed linear model approach is currently limited to single marker analysis, which is not suitable for studying multiple QTL effects, epistasis and gene by environment interactions. In this paper, we propose the adaptive mixed LASSO method that can incorporate a large number of predictors (genetic markers, epistatic effects, environmental covariates, and gene by environment interactions) while simultaneously accounting for the population structure. We show that the adaptive mixed LASSO estimator possesses the oracle property of adaptive LASSO. Algorithms are developed to iteratively estimate the regression coefficients and variance components. Our results demonstrate that the adaptive mixed LASSO method is very promising in modeling multiple genetic effects when a large number of markers are available and the population structure cannot be ignored. It is expected to be a powerful tool for studying the architecture of complex traits in important plant species. Supplemental materials for this article are available from the journal website.

Geostatistical estimates of a soil property by kriging are equivalent to the best linear unbiased predictions (BLUPs). Universal kriging is BLUP with a fixed‐effect model that is some linear function of spatial co‐ordinates, or more generally a linear function of some other secondary predictor variable when it is called kriging with external drift. A problem in universal kriging is to find a spatial variance model for the random variation, since empirical variograms estimated from the data by method‐of‐moments will be affected by both the random variation and that variation represented by the fixed effects. The geostatistical model of spatial variation is a special case of the linear mixed model where our data are modelled as the additive combination of fixed effects (e.g. the unknown mean, coefficients of a trend model), random effects (the spatially dependent random variation in the geostatistical context) and independent random error (nugget variation in geostatistics). Statisticians use residual maximum likelihood (REML) to estimate variance parameters, i.e. to obtain the variogram in a geostatistical context. REML estimates are consistent (they converge in probability to the parameters that are estimated) with less bias than both maximum likelihood estimates and method‐of‐moment estimates obtained from residuals of a fitted trend. If the estimate of the random effects variance model is inserted into the BLUP we have the empirical BLUP or E‐BLUP. Despite representing the state of the art for prediction from a linear mixed model in statistics, the REML–E‐BLUP has not been widely used in soil science, and in most studies reported in the soils literature the variogram is estimated with methods that are seriously biased if the fixed‐effect structure is more complex than just an unknown constant mean (ordinary kriging). In this paper we describe the REML–E‐BLUP and illustrate the method with some data on soil water content that exhibit a pronounced spatial trend.  
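
The linear mixed model and predictor described in words above can be summarized compactly. The notation is assumed here for illustration and is not taken from the paper: z is the data vector, X the fixed-effects design matrix, \eta the spatially correlated random effect with correlation matrix R(\theta), \varepsilon the nugget term, and c_0 the vector of covariances between the random effect at the target location x_0 and at the data locations.

```latex
\begin{aligned}
\mathbf{z} &= \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\eta} + \boldsymbol{\varepsilon},
\qquad \boldsymbol{\eta} \sim N\!\left(\mathbf{0}, \sigma^{2}\mathbf{R}(\theta)\right),
\quad \boldsymbol{\varepsilon} \sim N\!\left(\mathbf{0}, \tau^{2}\mathbf{I}\right), \\
\mathbf{V} &= \sigma^{2}\mathbf{R}(\theta) + \tau^{2}\mathbf{I}, \\
\hat{\boldsymbol{\beta}} &= \left(\mathbf{X}^{\mathsf T}\mathbf{V}^{-1}\mathbf{X}\right)^{-1}
\mathbf{X}^{\mathsf T}\mathbf{V}^{-1}\mathbf{z}, \\
\tilde{Z}(\mathbf{x}_0) &= \mathbf{x}_0^{\mathsf T}\hat{\boldsymbol{\beta}}
+ \mathbf{c}_0^{\mathsf T}\mathbf{V}^{-1}\left(\mathbf{z} - \mathbf{X}\hat{\boldsymbol{\beta}}\right).
\end{aligned}
```

Evaluating V and c_0 at the REML estimates of \sigma^2, \tau^2 and \theta gives the empirical BLUP (E-BLUP).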

We examine the hypothesis of an increase of humus disintegration by analyzing chemical substances measured in the seepage water of a German forest. Problems arise because of a large percentage of missing observations. We use a regression model with spatial and temporal effects constructed in an exploratory data analysis. Spatial dependencies are modeled by random effects and an autoregressive structure for observations in distinct soil depths resulting in a recursive linear mixed model structure. Temporal dependencies are included by an autoregressive structure of the random effects. For parameter estimation an EM algorithm is deduced assuming the errors to be Gaussian. As a result of the data analysis we specify chemical substances which possibly affect the process of humus disintegration. In particular, we find evidence that the presence of aluminum ions is important, but because of the high correlations among the regressors this might be due to confounding with iron.

Black Carbon (BC) is an important carbon pool due to its relative stability in soil. Thus, it is essential to determine the amount of BC in soil to have a better understanding of the global carbon cycle. The spatial distribution of BC was determined in the central region of France in relation to the main controlling factors. BC was measured for topsoil at 158 sites in the French soil monitoring network on a regular 16 × 16‐km grid. A linear mixed model (LMM) which included fixed effects (linear relationships between BC content and covariates) and spatially correlated random effects was used for mapping BC to aid explanation. Covariates were selected from a set of factors linked to the BC cycle using the Akaike Information Criterion (AIC). The results show high variability in BC content with a minimum of 0.9%, a maximum of 32% and an average of 5.3% for total organic carbon. The fine‐earth fraction and clay content gave the best statistical explanation for the spatial distribution of BC. Data on these covariates were not available in total for the whole study area, and therefore we reselected covariates using the fine‐earth amount and density of fires from burning crop residues.

Clustered data, either as an explicit part of the study design or due to the natural distribution of habitats, populations, and so on, are frequently encountered by biologists. Mixed effect models provide a framework that can handle clustered data by estimating cluster-specific random effects and introducing correlated residual structures. General parametric models have been shown not to suit all biological problems, resulting in an increased popularity for local regression procedures, such as LOESS and splines. To evaluate similar biological problems for clustered data with cluster-specific random effects and potential dependencies between within-cluster residuals, we suggest a local linear mixed model (LLMM). The LLMM approach is a local version of a linear mixed-effect model (LME), and the LLMM approach produces: (1) local shared predictions, (2) local cluster-specific predictions, and (3) estimates of cluster-specific random effects conditioned on the covariates. Thus, in addition to the local estimates of the expected response, we obtain information about how the cluster-specific random variability depends on the values of the covariate. Ovary data are used to illustrate the flexibility and potential of this procedure in biological contexts.

In this paper we consider generalized linear latent variable models that can handle overdispersed counts and continuous but non-negative data. Such data are common in ecological studies when modelling multivariate abundances or biomass. By extending the standard generalized linear modelling framework to include latent variables, we can account for any covariation between species not accounted for by the predictors, notably species interactions and correlations driven by missing covariates. We show how estimation and inference for the considered models can be performed efficiently using the Laplace approximation method and use simulations to study the finite-sample properties of the resulting estimates. In the overdispersed count data case, the Laplace-approximated estimates perform similarly to the estimates based on the variational approximation method, which is another method that provides a closed form approximation of the likelihood. In the biomass data case, we show that ignoring the correlation between taxa affects the regression estimates unfavourably. To illustrate how our methods can be used in unconstrained ordination and in making inference on environmental variables, we apply them to two ecological datasets: abundances of bacterial species in three arctic locations in Europe and abundances of coral reef species in Indonesia. Supplementary materials accompanying this paper appear on-line.

In some finite sampling situations, there is a primary variable that is sampled, and there are measurements on covariates for the entire population. A Bayesian hierarchical model for estimating totals for finite populations is proposed. A nonparametric linear model is assumed to explain the relationship between the dependent variable of interest and covariates. The regression coefficients in the linear model are allowed to vary as a function of a subset of covariates nonparametrically based on B-splines. The generality of this approach makes it robust and applicable to data collected using a variety of sampling techniques, provided the sample is representative of the finite population. A simulation study is carried out to evaluate the performance of the proposed model for the estimation of the population total. Results indicate accurate estimation of population totals using the approach. The modeling approach is used to estimate the total production of avocado for a large group of groves in Mexico.

This article considers logistic regression analysis of binary data that are measured on a spatial lattice and repeatedly over discrete time points. We propose a spatial-temporal autologistic regression model and draw statistical inference via maximum likelihood. Due to an unknown normalizing constant in the likelihood function, we use Monte Carlo to obtain maximum likelihood estimates of the model parameters and predictive distributions at future time points. We also use path sampling to estimate the unknown normalizing constant and approximate an information criterion for model assessment. The methodology is illustrated by the analysis of a dataset of mountain pine beetle outbreaks in western Canada.

Much of animal ecology is devoted to studies of abundance and occurrence of species, based on surveys of spatially referenced sample units. These surveys frequently yield sparse counts that are contaminated by imperfect detection, making direct inference about abundance or occurrence based on observational data infeasible. This article describes a flexible hierarchical modeling framework for estimation and inference about animal abundance and occurrence from survey data that are subject to imperfect detection. Within this framework, we specify models of abundance and detectability of animals at the level of the local populations defined by the sample units. Information at the level of the local population is aggregated by specifying models that describe variation in abundance and detection among sites. We describe likelihood-based and Bayesian methods for estimation and inference under the resulting hierarchical model. We provide two examples of the application of hierarchical models to animal survey data, the first based on removal counts of stream fish and the second based on avian quadrat counts. For both examples, we provide a Bayesian analysis of the models using the software WinBUGS.

Spatial Regression Modeling for Compositional Data With Many Zeros
Compositional data analysis considers vectors of nonnegative-valued variables subject to a unit-sum constraint. Our interest lies in spatial compositional data, in particular, land use/land cover (LULC) data in the northeastern United States. Here, the observations are vectors providing the proportions of LULC types observed in each 3 km × 3 km grid cell, yielding on the order of 10^4 cells. On the same grid cells, we have an additional compositional dataset supplying forest fragmentation proportions. Potentially useful and available covariates include elevation range, road length, population, median household income, and housing levels. We propose a spatial regression model that is also able to capture flexible dependence among the components of the observation vectors at each location as well as spatial dependence across the locations of the simplex-restricted measurements. A key issue is the high incidence of observed zero proportions for the LULC dataset, requiring incorporation of local point masses at 0. We build a hierarchical model prescribing a power scaling first stage and using latent variables at the second stage with spatial structure for these variables supplied through a multivariate CAR specification. Analyses for the LULC and forest fragmentation data illustrate the interpretation of the regression coefficients and the benefit of incorporating spatial smoothing.

Motivated by the need to produce small area estimates for the National Resources Inventory survey, we develop a spatial hierarchical model based on the generalized Dirichlet distribution to construct small area estimators of compositional proportions in several mutually exclusive and exhaustive landcover categories. At the observation level, the standard design-based estimators of the proportions are assumed to follow the generalized Dirichlet distribution. After proper transformation of the design-based estimators, beta regression is applicable. We consider a logit mixed model for the expectation of the beta distribution, which incorporates covariates through fixed effects and spatial effect through a conditionally autoregressive process. In a design-based evaluation study, the proposed model-based estimators are shown to have smaller root-mean-square error and relative root-mean-square error than design-based estimators and multinomial model-based estimators. Supplementary materials accompanying this paper appear online.

We present a unified framework for modeling bird survey data collected at spatially replicated survey sites in the form of repeated counts or detection history counts, through which we model spatial dependence in bird density and variation in detection probabilities due to changes in covariates across the landscape. The models have a complex hierarchical structure that makes them suited to Bayesian analysis using Markov chain Monte Carlo (MCMC) algorithms. For computational efficiency, we use a form of conditional autoregressive model for modeling spatial dependence. We apply the models to survey data for two bird species in the Great Smoky Mountains National Park. The algorithms converge well for the more abundant and easily detected of the two species, but some simplification of the spatial model is required for convergence for the second species. We show how these methods lead to maps of estimated relative density which are an improvement over those that would follow from past approaches that ignored spatial dependence. This work also highlights the importance of good survey design for bird species mapping studies.
