Similar articles
 Found 20 similar articles (search time: 46 ms)
1.
Marginalized zero-inflated count regression models (Long et al. in Stat Med 33(29):5151–5165, 2014) provide direct inference on overall exposure effects. Unlike standard zero-inflated models, marginalized models specify a regression model component for the marginal mean in addition to a component for the probability of an excess zero. This study proposes a score test for testing a marginalized zero-inflated Poisson model against a marginalized zero-inflated negative binomial model for model selection based on an assessment of over-dispersion. The sampling distribution and empirical power of the proposed score test are investigated via a Monte Carlo simulation study, and the procedure is illustrated with data from a horticultural experiment. Supplementary materials accompanying this paper appear online.
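As a rough illustration of the marginal-mean parameterization described above, the sketch below writes a zero-inflated Poisson probability mass function in terms of the marginal mean nu and the excess-zero probability psi, so that the underlying Poisson mean is mu = nu / (1 - psi). The function name and parameter values are illustrative assumptions and are not taken from the paper; the score test itself is not shown.

```python
from math import exp, factorial


def mzip_pmf(y, nu, psi):
    """Zero-inflated Poisson pmf parameterized by the marginal mean.

    nu  : marginal (overall) mean of the count, E[Y]
    psi : probability of an excess zero, 0 <= psi < 1
    """
    # The Poisson component mean implied by the marginal mean.
    mu = nu / (1.0 - psi)
    poisson_part = exp(-mu) * mu ** y / factorial(y)
    excess_zero = psi if y == 0 else 0.0
    return excess_zero + (1.0 - psi) * poisson_part


# Quick check that the pmf has total mass 1 and mean nu (sum truncated at 80).
total = sum(mzip_pmf(y, nu=2.0, psi=0.3) for y in range(80))
mean = sum(y * mzip_pmf(y, nu=2.0, psi=0.3) for y in range(80))
```

Under this parameterization a regression coefficient on log(nu) describes the effect on the overall mean directly, which is the property the abstract highlights.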

2.
Delay differential equations (DDEs) are widely used in ecology, physiology and many other areas of applied science. Although the form of the DDE model is usually proposed based on scientific understanding of the dynamic system, parameters in the DDE model are often unknown. Thus it is of great interest to estimate DDE parameters from noisy data. Since the DDE model does not usually have an analytic solution, and the numeric solution requires knowing the history of the dynamic process, the traditional likelihood method cannot be directly applied. We propose a semiparametric method to estimate DDE parameters. The key feature of the semiparametric method is the use of a flexible nonparametric function to represent the dynamic process. The nonparametric function is estimated by maximizing the DDE-defined penalized likelihood function. Simulation studies show that the semiparametric method gives satisfactory estimates of DDE parameters. The semiparametric method is demonstrated by estimating a DDE model from Nicholson’s blowfly population data.
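To make concrete why a numerical DDE solution needs the process history, here is a minimal forward-Euler sketch for the blowfly equation in its commonly cited form, N'(t) = P N(t - tau) exp(-N(t - tau) / N0) - delta N(t). The function name, the step size and all parameter values are illustrative assumptions; this is a direct simulation, not the semiparametric estimator proposed in the paper.

```python
from math import exp


def simulate_blowfly(P, delta, N0, tau, history, t_end, dt=0.01):
    """Forward-Euler solution of the blowfly DDE on [0, t_end].

    history : function giving N(t) for t in [-tau, 0]; it must be supplied,
              because the right-hand side looks back tau time units.
    """
    lag = round(tau / dt)
    # Seed the path with the known history on the grid over [-tau, 0].
    path = [history(-tau + i * dt) for i in range(lag + 1)]
    for _ in range(round(t_end / dt)):
        n_now = path[-1]
        n_lag = path[-1 - lag]  # value tau time units before the current time
        dn = P * n_lag * exp(-n_lag / N0) - delta * n_now
        path.append(n_now + dt * dn)
    return path[lag:]  # values at t = 0, dt, ..., t_end
```

Starting from a constant history equal to N0 * log(P / delta) keeps the path at that equilibrium, which gives a simple sanity check on the stepping logic.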

3.
The analysis of clustered binary data is a common task in many areas of application. Parametric approaches to the analysis of such data are numerous, but there has been much recent interest in nonparametric and semiparametric approaches. When cluster sizes are unequal, an assumption is often made of compatibility of marginal distributions in order for semiparametric approaches to be developed when there is little replication for different cluster sizes. Here, we use the marginal compatibility assumption to extend flexible semiparametric Bayesian methods able to shrink towards a “parametric backbone” to the situation where there are few replicated observations for distinct cluster sizes and each distinct value of a covariate. A motivating application is the analysis of developmental toxicology data where pregnant laboratory animals are exposed to a dose of some potentially toxic compound and interest lies in describing the distribution, as a function of the dose level, of the number of fetuses exhibiting some characteristic abnormality. Flexible semiparametric methods are required here, as the data typically exhibit overdispersion and complex structure. We also consider a further extension appropriate to the analysis of clustered binary data in the situation where there is little or no replication for distinct covariate values.

4.
Pathway-based analysis has the ability to detect subtle changes in response variables that could be missed when using gene-based analysis. Since genes interact with other covariates such as environmental or clinical variables, so do pathways, which are sets of genes that serve particular cellular or physiological functions. However, since pathways are sets of genes and since environmental or clinical variables do not have parametric relationships with response variables, it is difficult to model unknown interaction terms between high-dimensional variables and low-dimensional variables such as environmental or clinical variables. In this paper, we propose a semiparametric interaction model for two unknown functions to evaluate the interaction between a pathway and an environmental or clinical variable: for the pathway, we use an unknown high-dimensional function, and for the environmental or clinical variable, we use an unknown low-dimensional function. We model the environmental or clinical variable nonparametrically via a natural cubic spline. We model both the pathway effect and the interaction between the pathway and environmental or clinical effect nonparametrically via a kernel machine. Since both interactions among genes within the same pathway and the interaction between the pathway and the environmental or clinical variables are complex, we allow for the possibility that a pathway is interacting with environmental or clinical variables and the genes within the same pathway are interacting with each other. We illustrate our approach using simulated data and genetic pathway data for type II diabetes. Supplementary materials accompanying this paper appear online.

5.
This paper develops a Bayesian approach for spatial inference on animal density from line transect survey data. We model the spatial distribution of animals within a geographical area of interest by an inhomogeneous Poisson process whose intensity function incorporates both covariate effects and spatial smoothing of residual variation. Independently thinning the animal locations according to their estimated detection probabilities results in another spatial Poisson process for the sightings (the observations). Prior distributions are elicited for all unknown model parameters. Due to the sparsity of data in the application we consider, eliciting sensible prior distributions is important in order to get meaningful estimation results. A reversible jump Markov Chain Monte Carlo (MCMC) algorithm for simulation of the posterior distribution is developed. We present results for simulated data and a real data set of minke whale pods from Antarctic waters. The main advantages of our method compared to design-based analyses are that it can use data arising from sources other than specifically designed surveys and its ability to link covariate effects to variation of animal density. The Bayesian paradigm provides a coherent framework for quantifying uncertainty in estimation results.
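The thinning property mentioned above can be illustrated with a small simulation. The sketch below is deliberately simplified: it uses a homogeneous Poisson process on the unit interval rather than an inhomogeneous spatial process, and the detection function, function names and sampler are illustrative assumptions rather than anything from the paper. If animals arrive with rate lam and each is detected independently with probability p(s), the number of sightings should again be Poisson, with mean lam times the integral of p(s).

```python
import random
from math import exp


def sample_poisson(lam, rng):
    """Draw a Poisson(lam) count by Knuth's multiplication method (small lam only)."""
    threshold = exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_sightings(lam, detect_prob, rng):
    """Simulate animal locations on [0, 1], then thin them independently."""
    n_animals = sample_poisson(lam, rng)
    locations = [rng.random() for _ in range(n_animals)]
    # Keep each animal with its own location-dependent detection probability.
    return [s for s in locations if rng.random() < detect_prob(s)]


rng = random.Random(1)
counts = [len(simulate_sightings(10.0, lambda s: 1.0 - s, rng)) for _ in range(20000)]
mean_count = sum(counts) / len(counts)
```

With lam = 10 and p(s) = 1 - s, the integral of p is 0.5, so the sighting counts should have mean and variance both close to 5.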

6.
This paper is concerned with the analysis of clustered data from developmental toxicity studies with mixed responses, i.e., where each member of the cluster has binary and continuous outcomes. A copula-based random effects model is proposed that accounts for associations between binary and/or continuous outcomes within clusters, including the intrinsic association between the mixed outcomes for the same subject. The approach allows the adoption of flexible distributions for the mixed outcomes as well as for the random effects. The model includes the correlated probit model of Gueorguieva and Agresti (2001) and the generalized linear mixed models of Faes et al. (2008), and Faes, Geys, and Catalano (2009) as special cases. Maximum likelihood estimation of our model parameters is implemented using standard software such as PROC NLMIXED in SAS. The proposed methodology is motivated by and illustrated using a developmental toxicity study of ethylene glycol in mice. This article has supplementary material online.

7.
The main goal of this article is to present a flexible statistical modelling framework to deal with multivariate count data along with longitudinal and repeated measures structures. The covariance structure for each response variable is defined in terms of a covariance link function combined with a matrix linear predictor involving known matrices. In order to specify the joint covariance matrix for the multivariate response vector, the generalized Kronecker product is employed. We take into account the count nature of the data by means of the power dispersion function associated with the Poisson–Tweedie distribution. Furthermore, the score information criterion is extended for selecting the components of the matrix linear predictor. We analyse a data set consisting of prey animals (the main hunted species, the blue duiker Philantomba monticola and other taxa) shot or snared for bushmeat by 52 commercial hunters over a 33-month period in Pico Basilé, Bioko Island, Equatorial Guinea. By taking into account the severely unbalanced repeated measures and longitudinal structures induced by the hunters and a set of potential covariates (which in turn affect the mean and covariance structures), our method can be used to indicate whether there was statistical evidence of a decline in blue duikers and other species hunted during the study period. Determining whether observed drops in the number of animals hunted are indeed true is crucial to assess whether species depletion effects are taking place in exploited areas anywhere in the world. We suggest that our method can be used to more accurately understand the trajectories of animals hunted for commercial or subsistence purposes and establish clear policies to ensure sustainable hunting practices. Supplementary materials accompanying this paper appear online.

8.
Aboveground biomass estimation in short-rotation forestry plantations is an essential step in the development of crop management strategies as well as allowing the economic viability of the crop to be determined prior to harvesting. Hence, it is important to develop new methodologies that improve the accuracy of predictions, using only a minimum set of easily obtainable information, i.e., diameter and height. Many existing models base their predictions only on diameter (mainly due to the complexity of including further covariates), or rely on complicated equations to obtain biomass predictions. However, in tree species, it is important to include height when estimating aboveground biomass because this will vary from one genotype to another. This work proposes the use of a more flexible and easy-to-implement model for predicting aboveground biomass (stem, branches and total) as a smooth function of height and diameter using smooth additive mixed models, which preserve the additive property necessary to model the relationship within wood fractions and allow the inclusion of random effects and interaction terms. The model is applied to the analysis of three trials carried out in Spain, where nine clones at three different sites are compared. Also, an analysis of slash pine data is carried out in order to compare with the approach proposed by Parresol (Can J For Res 31:865–878, 2001). Supplementary materials accompanying this paper appear online.

9.
We present a novel method for calculating the opportunity costs to fishers from their displacement by the establishment of marine protected areas (MPAs). We used a fishing community in Kubulau District, Fiji to demonstrate this method. We modelled opportunity costs as a function of food fish abundance and probability of catch, based on gear type and market value of species. Count models (including Poisson, negative binomial and two zero-inflated models) were used to predict spatial abundance of preferred target fish species and were validated against field surveys. A profit model was used to investigate the effect of restricted access to transport on costs to fishers. Spatial distributions of fish within the three most frequently sighted food fish families (Acanthuridae, Lutjanidae, Scaridae) varied, with greatest densities of Lutjanidae and Acanthuridae on barrier forereefs and greatest densities of Scaridae on submerged reefs. Modelled opportunity cost indicated that highest costs to fishers arise from restricting access to the barrier forereefs. We included our opportunity cost model in Marxan, a decision support tool used for MPA design, to examine potential MPA configurations for Kubulau District, Fiji Islands. We identified optimum areas for protection in Kubulau with: (a) the current MPA network locked in place; and (b) a clean-slate approach. Our method of modelling opportunity cost gives an unbiased estimate for multiple gear types in a marine environment and can be applied to other regions using existing species data.

10.
In the search for important determinants of disease, epidemiologists often face the challenging task of retrospectively estimating exposures of interest. Such is the case in modern studies of the lung cancer risk posed by residential radon, a naturally occurring radioactive gas. Assessment of past radon exposures is limited because measurements are not generally available for the locations at which study subjects spent time prior to enrollment. In such settings, there is a need for prediction at unmeasured geographic sites and time periods. We develop a hierarchical Bayesian geostatistical model for predicting unmeasured radon concentrations over space and time. Our work arises from a study of residential radon in Iowa, where measurements were taken as yearly averages and subject to detector measurement error. Much attention has been given lately to geostatistical methods for data that are obtained as integrated averages over geographic regions. We show how these techniques work in the time domain as well. Unlike the numerical approximations that are needed to integrate over geographic regions, we are able to provide closed-form solutions for the integration that must be performed over temporal periods. Our approach is illustrated with radon concentrations measured from 614 different geographic sites and 799 time periods.

11.
The few distance sampling studies that use Bayesian methods typically consider only line transect sampling with a half-normal detection function. We present a Bayesian approach to analyse distance sampling data applicable to line and point transects, exact and interval distance data and any detection function possibly including covariates affecting detection probabilities. We use an integrated likelihood which combines the detection and density models. For the latter, densities are related to covariates in a log-linear mixed effect Poisson model which accommodates correlated counts. We use a Metropolis-Hastings algorithm for updating parameters and a reversible jump algorithm to include model selection for both the detection function and density models. The approach is applied to a large-scale experimental design study of northern bobwhite coveys where the interest was to assess the effect of establishing herbaceous buffers around agricultural fields in several states in the US on bird densities. Results were compared with those from an existing maximum likelihood approach that analyses the detection and density models in two stages. Both methods revealed an increase of covey densities on buffered fields. Our approach gave estimates with higher precision even though it does not condition on a known detection function for the density model.
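For readers unfamiliar with the half-normal detection function mentioned above, the following sketch shows its standard form, g(x) = exp(-x^2 / (2 sigma^2)), together with the closed-form effective strip half-width obtained by integrating g from 0 to a truncation distance w. The function names and parameter values are illustrative assumptions; covariates, point transects and the Bayesian machinery of the paper are not covered.

```python
from math import erf, exp, pi, sqrt


def half_normal_detection(x, sigma):
    """Probability of detecting an object at perpendicular distance x."""
    return exp(-x * x / (2.0 * sigma * sigma))


def effective_strip_half_width(sigma, w):
    """Integral of the half-normal detection function over [0, w]."""
    return sigma * sqrt(pi / 2.0) * erf(w / (sigma * sqrt(2.0)))


# Average detection probability within the truncation distance w.
sigma, w = 30.0, 100.0
p_detect = effective_strip_half_width(sigma, w) / w
```

Dividing the effective strip half-width by w gives the average detection probability inside the surveyed strip, which is the quantity that scales raw counts up to a density estimate in conventional line transect analyses.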

12.

Purpose

We investigated the application of Kohonen Neural Networks (KNNs) in order to estimate sediment yield based on runoff and climatological data in a semiarid region of Brazil. Accurate estimations of sediment yield are essential to improve the management of soil erosion in semiarid areas, where large quantities of sediments tend to be produced only periodically.

Materials and methods

The case study is an erosion plot within the São João do Cariri Experimental Basin, which is located in the semiarid portion of Paraíba State, Brazil. KNNs are unsupervised neural networks capable of reducing a multidimensional data set to a bidimensional matrix of features, which can be used for analysis and prediction purposes. A total of 60 rainfall events, which occurred between 1999 and 2002, were used to calibrate and test the model. The application of a multivariate linear regression (MLR) model was also carried out.

Results and discussion

Statistical indexes were used as criteria for evaluating the performance of the KNN and MLR models for the test data set. The correlation and relative bias of the KNN model estimations with those from observed data were 0.90 and −4.39 %, respectively. A correlation of 0.70 and a relative bias of 15.63 % were found from the comparison of sediment yields obtained by the MLR model with those of the observed data. Analysis of the outcomes indicates that the KNN model, which is capable of detecting and extracting nonlinear trends, produced more reliable results than the regression model.

Conclusions

The KNN model results appear to be superior to those generated by the MLR model and suggest that the developed methodology may be applied to similar case studies.
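The two evaluation indexes reported in the results above can be computed as in the following sketch. The abstract does not state the exact formulas, so the definitions here are assumptions: the Pearson correlation coefficient, and relative bias taken as the percentage difference between total estimated and total observed values. The function names and the toy numbers are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_correlation(observed, estimated):
    """Pearson correlation between observed and estimated values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov / sqrt(var_o * var_e)


def relative_bias_percent(observed, estimated):
    """Assumed definition: 100 * (sum of estimates - sum of observations) / sum of observations."""
    return 100.0 * (sum(estimated) - sum(observed)) / sum(observed)


# Toy values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
estimated = [0.9, 2.1, 2.8, 3.9]
r = pearson_correlation(observed, estimated)
bias = relative_bias_percent(observed, estimated)
```

A negative relative bias under this definition means the model underestimates the total on average, which matches the sign convention of the values quoted for the KNN model.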

13.
In this note, it is shown that the integrated likelihood for the Royle–Nichols model with a Poisson mixing distribution can be expressed as a finite rather than an infinite sum of terms. The advantages which so accrue are discussed and explored by means of two examples. The finite sum formulation of the likelihood is also shown to hold for negative binomial and zero-inflated mixing distributions. Results based on these two mixing distributions proved disappointing however and their use is not recommended unless extensive data are available.
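One way to see how a finite sum can arise is sketched below, under the usual Royle–Nichols setup assumed here: site abundance N is Poisson(lam), and a site with N animals yields y detections in J visits with per-visit probability 1 - (1 - r)^N. Expanding (1 - q^N)^y with the binomial theorem, where q = 1 - r, and summing each term over N with the Poisson generating function gives a sum with y + 1 terms. This derivation and the function names are this sketch's own and may differ from the formulation in the note.

```python
from math import comb, exp


def rn_likelihood_finite(y, J, lam, r):
    """Site likelihood as a finite alternating sum with y + 1 terms."""
    q = 1.0 - r
    return comb(J, y) * sum(
        comb(y, k) * (-1) ** k * exp(-lam * (1.0 - q ** (J - y + k)))
        for k in range(y + 1)
    )


def rn_likelihood_truncated(y, J, lam, r, n_max=200):
    """Same likelihood by summing over abundance N = 0..n_max."""
    q = 1.0 - r
    total = 0.0
    poisson_weight = exp(-lam)  # P(N = 0)
    for n in range(n_max + 1):
        p = 1.0 - q ** n  # per-visit detection probability given N = n
        total += poisson_weight * comb(J, y) * p ** y * (1.0 - p) ** (J - y)
        poisson_weight *= lam / (n + 1)  # advance to P(N = n + 1)
    return total
```

The two functions should agree to within floating-point error for moderate y; for large y the alternating signs in the finite sum may cause cancellation, so the check is only meant as a sanity test on small cases.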

14.
Abundance and standard error estimates in surveys of fishery resources typically employ classical design-based approaches, ignoring the influences of non-design factors such as varying catchability. We developed a Bayesian approach for estimating abundance and associated errors in a fishery survey by incorporating sampling and non-sampling variabilities. First, a zero-inflated spatial model was used to quantify variance components due to non-sampling factors; second, the model was used to calibrate the estimated abundance index and its variance using pseudo empirical likelihood. The approach was applied to a winter dredge survey conducted to estimate the abundance of blue crabs (Callinectes sapidus) in the Chesapeake Bay. We explored the properties of the calibration estimators through a limited simulation study. The variance estimator calibrated on the posterior sample performed well, and the mean estimator had comparable performance to the design-based approach, with slightly higher bias and lower (about 15% reduction) mean squared error. The results suggest that application of this approach can improve estimation of abundance indices using data from design-based fishery surveys.

15.
Modeling complex collective animal movement presents distinct challenges. In particular, modeling the interactions between animals and the nonlinear behaviors associated with these interactions, while accounting for uncertainty in data, model, and parameters, requires a flexible modeling framework. To address these challenges, we propose a general hierarchical framework for modeling collective movement behavior with multiple stages. Each of these stages can be thought of as processes that are flexible enough to model a variety of complex behaviors. For example, self-propelled particle (SPP) models (e.g., Vicsek et al. in Phys Rev Lett 75:1226–1229, 1995) represent collective behavior and are often applied in the physics and biology literature. To date, the study and application of these models has almost exclusively focused on simulation studies, with less attention given to rigorously quantifying the uncertainty. Here, we demonstrate our general framework with a hierarchical version of the SPP model applied to collective animal movement. This structure allows us to make inference on potential covariates (e.g., habitat) that describe the behavior of agents and rigorously quantify uncertainty. Further, this framework allows for the discrete time prediction of animal locations in the presence of missing observations. Due to the computational challenges associated with the proposed model, we develop an approximate Bayesian computation algorithm for estimation. We illustrate the hierarchical SPP methodology with a simulation study and by modeling the movement of guppies. Supplementary materials accompanying this paper appear online.

16.
When analyzing animal movement, it is important to account for interactions between individuals. However, statistical models for incorporating interaction behavior in movement models are limited. We propose an approach that models dependent movement by augmenting a dynamic marginal movement model with a spatial point process interaction function within a weighted distribution framework. The approach is flexible, as marginal movement behavior and interaction behavior can be modeled independently. Inference for model parameters is complicated by intractable normalizing constants. We develop a double Metropolis–Hastings algorithm to perform Bayesian inference. We illustrate our approach through the analysis of movement tracks of guppies (Poecilia reticulata).

17.
Spatial Regression Modeling for Compositional Data With Many Zeros   (Cited by: 1; self-citations: 0; citations by others: 1)
Compositional data analysis considers vectors of nonnegative-valued variables subject to a unit-sum constraint. Our interest lies in spatial compositional data, in particular, land use/land cover (LULC) data in the northeastern United States. Here, the observations are vectors providing the proportions of LULC types observed in each 3 km × 3 km grid cell, yielding on the order of 10⁴ cells. On the same grid cells, we have an additional compositional dataset supplying forest fragmentation proportions. Potentially useful and available covariates include elevation range, road length, population, median household income, and housing levels. We propose a spatial regression model that is also able to capture flexible dependence among the components of the observation vectors at each location as well as spatial dependence across the locations of the simplex-restricted measurements. A key issue is the high incidence of observed zero proportions for the LULC dataset, requiring incorporation of local point masses at 0. We build a hierarchical model prescribing a power scaling first stage and using latent variables at the second stage with spatial structure for these variables supplied through a multivariate CAR specification. Analyses for the LULC and forest fragmentation data illustrate the interpretation of the regression coefficients and the benefit of incorporating spatial smoothing.

18.
19.
In many environmental and agricultural studies, data are collected on both linear and circular random variables, with possible dependence between the variables. Traditionally, the analysis of such data has been carried out in a classical regression framework. We propose a Bayesian hierarchical framework to handle all forms of uncertainty arising in a linear-circular data set. One novelty of our multivariate linear-circular model is that, marginally, the circular component is assumed to be a mixture model with an unknown number of von Mises (or circular normal) distributions. We use the Dirichlet process to introduce variability in the model dimensionality, and develop a simple Gibbs sampling algorithm for simulating the mixture components. Although we illustrate our methodology on von Mises mixtures, it is widely applicable. We thus avoid complicated reversible-jump Markov chain Monte Carlo methods, which are considered ideal for analyzing mixtures of an unknown number of distributions. We illustrate our methodologies with simulated and real data sets. Using pseudo-Bayes factors, we also compare different models associated with both fixed and variable numbers of von Mises distributions. Our findings suggest that models associated with varying numbers of mixture components perform at least as well as those with known numbers of mixture components. We tentatively argue that model averaging associated with a variable number of mixture components improves the model’s predictive power, which compensates for the lack of knowledge of the actual number of mixture components.

20.
Multivariate hierarchical Bayesian models provide a flexible framework for comprehensive study of biological systems with more than one outcome. Recent methodological developments facilitate modeling of heterogeneous associations between outcomes by specifying a linear mixed model on (co)variances at different levels of the data structure. Motivated by previous evidence for heterogeneous correlations in animal agriculture, we apply the proposed hierarchical Bayesian models to study the nature of the correlations between key performance outcomes in dairy cattle production systems, namely milk yield and reproduction. That is, the association between these outcomes might depend upon various fixed and random effect sources of heterogeneity both at the individual cow (residual) level as well as the herd (cluster) level. We thus propose a sequential modeling approach based on the deviance information criterion to select relevant explanatory variables on both types of associations. Furthermore, we extend the proposed methodology to accommodate right-censored outcomes, as common for dairy reproduction data, and use it to analyze field data from the Michigan dairy industry. The nature of the associations between milk production and reproduction in dairy cattle was inferred to be strongly heterogeneous and driven by multiple farm management practices and herd attributes, as well as by random clustering effects, at both cow and herd levels, thereby suggesting potential between-herd and within-herd intervention strategies to optimize performance of dairy production systems. Supplementary materials are available online.
