Similar literature
20 similar documents found (search time: 46 ms)
1.
Pathway-based analysis can detect subtle changes in response variables that could be missed by gene-based analysis. Since genes interact with other covariates such as environmental or clinical variables, so do pathways, which are sets of genes that serve particular cellular or physiological functions. However, since pathways are sets of genes and since environmental or clinical variables do not have parametric relationships with response variables, it is difficult to model unknown interaction terms between high-dimensional variables and low-dimensional variables such as environmental or clinical variables. In this paper, we propose a semiparametric interaction model with two unknown functions to evaluate the interaction between a pathway and an environmental or clinical variable: for the pathway, we use an unknown high-dimensional function, and for the environmental or clinical variable, we use an unknown low-dimensional function. We model the environmental or clinical variable nonparametrically via a natural cubic spline. We model both the pathway effect and the interaction between the pathway and the environmental or clinical effect nonparametrically via a kernel machine. Since both the interactions among genes within the same pathway and the interaction between the pathway and the environmental or clinical variables are complex, we allow for the possibility that a pathway interacts with environmental or clinical variables while the genes within the same pathway interact with each other. We illustrate our approach using simulated data and genetic pathway data for type II diabetes. Supplementary materials accompanying this paper appear online.

2.
The identification of sea regimes from environmental multivariate time series is complicated by the mixed linear–circular support of the data, by the occurrence of missing values, by the skewness of some variables, and by the temporal autocorrelation of the measurements. We address these issues simultaneously by a hidden Markov approach, and segment the data into pairs of toroidal and skew-elliptical clusters by means of the inferred sequence of latent states. Toroidal clusters are defined by a class of bivariate von Mises densities, while skew-elliptical clusters are defined by mixed linear models with positive random effects. The core of the classification procedure is an EM algorithm accounting for missing measurements, unknown cluster membership, and random effects as different sources of incomplete information. Moreover, standard simulation routines allow for the efficient computation of bootstrap standard errors. The proposed procedure is illustrated for a multivariate marine time series, and identifies a number of wintertime regimes in the Adriatic Sea.

3.
When toxicity data are not available for a chemical mixture of concern, U.S. Environmental Protection Agency (EPA) guidelines allow risk assessment to be based on data for a surrogate mixture considered “sufficiently similar” in terms of chemical composition and component proportions. As a supplementary approach, using statistical equivalence testing logic and mixed model theory, we have developed methodology to define sufficient similarity in dose-response for mixtures of many chemicals containing the same components in different ratios. Dose-response data from a mixture of 11 xenoestrogens and the endogenous hormone 17β-estradiol are used to illustrate the method.

4.
When an interaction has been detected among the chemicals in a mixture, it may be of interest to predict the interaction threshold. A method is presented for estimation of an interaction threshold along a mixture ray which allows differences in the shapes of the dose-response curves of the individual components (e.g., mixtures of full and partial agonists with differing response maxima). A point estimate and confidence interval for the interaction threshold may be obtained. The methods are illustrated with data from a study of a mixture of 18 polyhalogenated aromatic hydrocarbons (PHAHs) in rats exposed by oral gavage for four consecutive days. Serum total thyroxine (T4) was the response variable. Previous analysis of these data demonstrated a dose-dependent interaction among the 18 chemicals in the mixture, with additivity suggested in the lower portion of the dose-response curve and synergy (greater than additive response) in the higher portion of the dose-response curve. The present work builds on this analysis by construction of an interaction threshold model along the mixture ray. This interaction threshold model has two components: an implicit additivity region and an explicit region that describes the departure from additivity; the interaction threshold is the boundary between the two regions. Estimation of the interaction threshold within the observed experimental region suggested evidence of additivity in the low dose region. Total doses of the mixture that exceed the upper limit of the confidence interval on the interaction threshold were associated with a greater-than-additive interaction.

5.
Anisotropic models are often used in spatial statistics to analyze spatially referenced data. Within a Bayesian framework we develop default priors for the anisotropic Gaussian random field model with and without a nugget parameter accounting for the effects of microscale variations and measurement errors. We present Jeffreys priors and a reference prior and study their posterior propriety. Moreover, we show that the predictive distributions at ungauged locations have finite variance. We also show that the seemingly uninformative uniform prior for the anisotropy parameters, ratio and angle, yields an improper posterior. Finally, we find that the proposed priors have good frequentist properties and we illustrate our approach by analyzing two data sets for which we discuss model choice as well as predictions and uncertainty estimates.

6.
This article investigates multivariate spatial process models suitable for predicting multiple forest attributes using a multisource forest inventory approach. Such data settings involve several spatially dependent response variables arising in each location. Not only does each variable vary across space, but the variables are also likely to be correlated among themselves. Traditional approaches have attempted to model such data using simplifying assumptions, such as a common rate of decay in the spatial correlation or simplified cross-covariance structures among the response variables. Our current focus is to produce spatially explicit, tree-species-specific predictions of forest biomass per hectare over a region of interest. Modeling such associations presents challenges in terms of validity of probability distributions as well as issues concerning identifiability and estimability of parameters. Our template encompasses several models with different correlation structures. These models represent different hypotheses whose tenability is assessed using formal model comparisons. We adopt a Bayesian hierarchical approach offering a sampling-based inferential framework using efficient Markov chain Monte Carlo methods for estimating model parameters.

7.
We consider the problem of analyzing long-term experiments with panels of nonlinear time-series data in the framework of generalized additive models. Our approach is developed for testing and estimating the (partial) common dynamic structure across treatment groups. We illustrate our approach with a detailed analysis of an ecotoxicological experiment on the effect of sublethal doses of a toxic substance (cadmium) on the long-run dynamic structure of the greenbottle blowfly (Lucilia sericata). The general model for the blowfly experiment is a generalized additive model which is derived from a stage-structured ecological model. We discuss the relationship between the components of the generalized additive model and those of the underlying stage-structured model. In particular, our proposed approach offers new insights into the effect of a toxic diet on the population dynamic structure of the blowfly.

8.
Understanding how species distributions respond as a function of environmental gradients is a key question in ecology, and will benefit from a multi-species approach. Multi-species data are often high dimensional, in that the number of species sampled is often large relative to the number of sites, and are commonly quantified as either presence–absence, counts of individuals, or biomass of each species. In this paper, we propose a novel approach to the analysis of multi-species data when the goal is to understand how each species responds to its environment. We use a finite mixture of regression models, grouping species into “Archetypes” according to their environmental response, thereby significantly reducing the dimension of the regression model. Previous research introduced such Species Archetype Models (SAMs), but only for binary assemblage data. Here, we extend this basic framework with three key innovations: (1) the method is expanded to handle count and biomass data, (2) we propose grouping on the slope coefficients only, whilst the intercept terms and nuisance parameters remain species-specific, and (3) we develop model diagnostic tools for SAMs. By grouping on environmental responses only, the model allows for inter-species variation in terms of overall prevalence and abundance. The application of our expanded SAM framework is illustrated on marine survey data and through simulation. Supplementary materials accompanying this paper appear on-line.

9.
In this paper, we propose a semiparametric regression approach for identifying pathways related to zero-inflated clinical outcomes, where a pathway is a gene set derived from prior biological knowledge. Our approach is developed by using a Bayesian hierarchical framework. We incorporate the pathway effect nonparametrically into a zero-inflated Poisson hierarchical regression model with an unknown link function. The nonparametric pathway effect was estimated via a kernel machine, and the unknown link function was estimated by transforming a mixture of beta cumulative distribution functions. Our approach provides flexible nonparametric settings to describe the complicated association between gene expressions and zero-inflated clinical outcomes. The Metropolis-within-Gibbs sampling algorithm and the Bayes factor were adopted to make statistical inferences. Our simulation results indicate that our semiparametric approach is more accurate and flexible than zero-inflated Poisson regression with the canonical link function, especially when the number of genes is large. The usefulness of our approach is demonstrated through its application to the Canine data set from Enerson et al. (Toxicol Pathol 34:27–32, 2006). Our approach can also be applied to other settings where a large number of highly correlated predictors are present. Supplementary materials accompanying this paper appear on-line.

10.
When API-9 kaolinite and Willalooka illite clays were mixed in various proportions, the pore-size distributions obtained using nitrogen sorption and mercury injection techniques were found to be characteristic of neither of the components but showed a progressive reduction in the pore size as the concentration of illite was increased. The plate separation of the mixtures showed a marked decrease with initial added illite, approaching that of the illite when the mixture contained approximately 40 per cent illite. This is indicative of a homogeneous mixture in which the fine illite particles fill in the pores bounded by the relatively coarse kaolinite particles. The relationship between particle dimensions (derived from crystallographic parameters and specific surface area) and pore size of single clays and clay mixtures is consistent with a model in which slit-shaped pores result from the parallel interleaving of clay crystals for the single clays. Some deviations occur for the mixtures.

11.
Many biological phenomena undergo developmental changes in time and space. Functional mapping, which is aimed at mapping genes that affect developmental patterns, is instrumental for studying the genetic architecture of biological changes. Often biological processes are mediated by a network of developmental and physiological components and, therefore, are better described by multiple phenotypes. In this article, we develop a multivariate model for functional mapping that can detect and characterize quantitative trait loci (QTLs) that simultaneously control multiple dynamic traits. Because the true genotypes of QTLs are unknown, the measurements for the multiple dynamic traits are modeled using a mixture distribution. The functional means of the multiple dynamic traits are estimated using the nonparametric regression method, which avoids any parametric assumption on the functional means. We propose the profile likelihood method to estimate the mixture model. A likelihood ratio test is exploited to test for the existence of pleiotropic effects on distinct but developmentally correlated traits. A simulation study is implemented to illustrate the finite sample performance of our proposed method. We also demonstrate our method by identifying QTLs that simultaneously control three dynamic traits of soybeans. The three dynamic traits are the time-course biomass of the leaf, the stem, and the root of the whole soybean. The genetic linkage map is constructed with 950 microsatellite markers. The new model can aid in our comprehension of the genetic control mechanisms of complex dynamic traits over time.

12.
Spatially nested sampling and the associated nested analysis of variance by spatial scale is a well-established methodology for the exploratory investigation of soil variation over multiple, disparate scales. The variance components that can be estimated this way can be accumulated to approximate the variogram. This allows us to identify the important scales of variation, and the general form of the spatial dependence, in order to plan more detailed sampling by design-based or model-based methods. Implicit in the standard analyses of nested sample data is the assumption of homogeneity in the variance, i.e. that all variations from sub-station means at some scale represent a random variable of uniform variance. If this assumption fails then the comparable assumption of stationarity in the variance, which is an important assumption in geostatistics, will also be implausible. However, data from nested sampling may be analysed with a linear mixed model in which the variance components are parameters which can be estimated by residual maximum likelihood (REML). Within this framework it is possible to propose an alternative variance parameterization in which the variance depends on some auxiliary variable, and so is not generally homogeneous. In this paper we demonstrate this approach, using data from nested sampling of chemical and biogeochemical soil properties across a region in central England, and use land use as our auxiliary variable to model non-homogeneous variance components. We show how the REML analysis allows us to make inferences about the need for a non-homogeneous model. Variances of soil pH and cation exchange capacity at different scales differ between these land uses, but a homogeneous variance model is preferable to such non-homogeneous models for the variance of soil urease activity at standard concentrations of urea.
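
The accumulation step mentioned in this abstract can be illustrated with a minimal sketch. It assumes one estimated variance component per nested sampling scale and sums the components from the shortest separation distance upward to give rough semivariance values. The function name and the numbers in the usage note are hypothetical and are not taken from the paper.

```python
from itertools import accumulate


def accumulate_components(components):
    """Turn per-scale variance components into rough variogram points.

    components: iterable of (separation_distance, variance_component) pairs,
    in any order. Components are summed from the finest scale upward, so the
    value at each distance includes every component at that scale or finer.
    """
    ordered = sorted(components)  # sort by separation distance
    distances = [distance for distance, _ in ordered]
    cumulative = list(accumulate(variance for _, variance in ordered))
    return list(zip(distances, cumulative))
```

For example, calling `accumulate_components([(1000.0, 0.30), (10.0, 0.05), (100.0, 0.10)])` returns points at 10, 100 and 1000 distance units whose values grow as coarser scales are added.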

13.
We attempt to estimate the size of a population of female loggerhead turtles. In traditional capture-recapture experiments to estimate the size of an animal population, individual animals are tagged and the information about which individuals are captured repeatedly is crucial. For these loggerhead turtle data, information about individual turtles is not available. Rather, we observe only the counts of successful and failed nestings at a location over a series of days (in our case, three). We view the turtles’ nesting behavior as an alternating renewal process, model it using parametric distributions, and then derive probability distributions that describe the behavior of the turtles during the three days via a 3-way contingency table. We adopt a Bayesian approach, formulating our model in terms of parameters about which strong prior information is available. We use a Gibbs sampling algorithm to sample from the posterior distribution of our random quantities, the most crucial of which is the number of turtles remaining offshore during the entire sampling period. We illustrate the method using data sets from loggerhead turtle sites along the South Carolina coast. We provide a simulation study which illustrates the quality and robustness of the method and investigates sensitivity to prior parameter specification.

14.
We consider a spatial generalized linear latent variable model with and without a normality assumption on the latent variables. When the latent variables are assumed to be multivariate normal, we apply a Laplace approximation. To relax the assumption of marginal normality in favor of a mixture of normals, we construct a multivariate density with Gaussian spatial dependence and given multivariate margins. We use the pairwise likelihood to estimate the corresponding spatial generalized linear latent variable model. The properties of the resulting estimators are explored by simulations. In the analysis of an air pollution data set, the proposed methodology reveals weather conditions to be a more important source of variability than air pollution in explaining all-cause mortality excluding accidents.

15.
This paper develops a Bayesian approach for spatial inference on animal density from line transect survey data. We model the spatial distribution of animals within a geographical area of interest by an inhomogeneous Poisson process whose intensity function incorporates both covariate effects and spatial smoothing of residual variation. Independently thinning the animal locations according to their estimated detection probabilities results in another spatial Poisson process for the sightings (the observations). Prior distributions are elicited for all unknown model parameters. Due to the sparsity of data in the application we consider, eliciting sensible prior distributions is important in order to get meaningful estimation results. A reversible jump Markov chain Monte Carlo (MCMC) algorithm for simulation of the posterior distribution is developed. We present results for simulated data and a real data set of minke whale pods from Antarctic waters. The main advantages of our method compared to design-based analyses are that it can use data arising from sources other than specifically designed surveys and that it can link covariate effects to variation of animal density. The Bayesian paradigm provides a coherent framework for quantifying uncertainty in estimation results.
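
The thinning idea in this abstract can be illustrated with a small simulation. The sketch below makes several simplifying assumptions that are not from the paper: the animal process is homogeneous rather than inhomogeneous, the survey strip is a rectangle centred on a single transect line at x = 0, and detection follows a half-normal function of perpendicular distance with a hypothetical scale `sigma`.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson count by counting unit-rate exponential arrivals before `mean`."""
    count = 0
    arrival = rng.expovariate(1.0)
    while arrival < mean:
        count += 1
        arrival += rng.expovariate(1.0)
    return count


def simulate_animals(intensity, width, length, rng):
    """Homogeneous Poisson process on a strip of the given width and length."""
    n = sample_poisson(intensity * width * length, rng)
    return [(rng.uniform(-width / 2, width / 2), rng.uniform(0.0, length))
            for _ in range(n)]


def detection_probability(x, sigma):
    """Assumed half-normal detection function of perpendicular distance x."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma))


def thin(points, sigma, rng):
    """Keep each animal independently with its detection probability."""
    return [(x, y) for (x, y) in points
            if rng.random() < detection_probability(x, sigma)]
```

Calling `thin(simulate_animals(50.0, 2.0, 10.0, rng), 0.5, rng)` with `rng = random.Random(0)` returns the simulated sightings, which form a subset of the animal locations concentrated near the transect line.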

16.
In this paper we consider generalized linear latent variable models that can handle overdispersed counts and continuous but non-negative data. Such data are common in ecological studies when modelling multivariate abundances or biomass. By extending the standard generalized linear modelling framework to include latent variables, we can account for any covariation between species not accounted for by the predictors, notably species interactions and correlations driven by missing covariates. We show how estimation and inference for the considered models can be performed efficiently using the Laplace approximation method and use simulations to study the finite-sample properties of the resulting estimates. In the overdispersed count data case, the Laplace-approximated estimates perform similarly to the estimates based on the variational approximation method, which is another method that provides a closed form approximation of the likelihood. In the biomass data case, we show that ignoring the correlation between taxa affects the regression estimates unfavourably. To illustrate how our methods can be used in unconstrained ordination and in making inference on environmental variables, we apply them to two ecological datasets: abundances of bacterial species in three arctic locations in Europe and abundances of coral reef species in Indonesia. Supplementary materials accompanying this paper appear on-line.

17.
This article introduces a hierarchical model for compositional analysis. Our approach models both source and mixture data simultaneously, and accounts for several different types of variation: these include measurement error on both the mixture and source data; variability in the sample from the source distributions; and variability in the mixing proportions themselves, generally of main interest. The method is an improvement on some existing methods in that estimates of mixing proportions (including their interval estimates) are sure to lie in the range [0, 1]; in addition, it is shown that our model can help in situations where identification of appropriate source data is difficult, especially when we extend our model to include a covariate. We first study the likelihood surface of a base model for a simple example, and then include prior distributions to create a Bayesian model that allows analysis of more complex situations via Markov chain Monte Carlo sampling from the likelihood. Application of the model is illustrated with two examples using real data: one concerning chemical markers in plants, and another on water chemistry.

18.
We develop a new Bayesian two-stage space-time mixture model to investigate the effects of air pollution on asthma. The proposed two-stage mixture model allows for the identification of temporal latent structure as well as the estimation of the effects of covariates on health outcomes. In the paper, we also consider spatial misalignment of exposure and health data. A simulation study is conducted to assess the performance of the two-stage mixture model. We apply our statistical framework to a county-level ambulatory care asthma data set in the US state of Georgia for the years 1999–2008.

19.
Techniques for the gas and liquid chromatographic separation of complex mixtures of triglycerides have evolved over the past two decades, as reviewed in detail by Huang et al. (J. Agric. Food Chem. 1995, 43, 1834-1844; J. Agric. Food Chem. 1997, 45, 1770-1778). A novel method for the quantitative partitioning of complex mixtures of triglycerides into functionally related groups is developed and applied to a low-calorie triglyceride mixture [namely, Benefat S or Salatrim plus mid-chain (C(6,8,10,12)) fatty acids]. The method is based on a nonlinear calibration of retention times (RTs) of a suite of standard triglycerides on their acyl carbon numbers [(ACNs), the sum of all the acyl carbon atoms in a given triglyceride] to estimate all of the intermediate ACNs (from 6 to 66). With the calibrated ACN scale and identifications of some components of a complex mixture's composition, ACN-based partitions were established and a Benefat S-triglyceride chromatogram was partitioned into seven functionally related groups. This method is provisional in the sense that it would typically be employed when the identifications of many components of a complex, homologous series were unknown, yet functionally related groups needed to be quantified. This method has proven to be particularly useful in the intercalibration of research laboratories with production facility laboratories during complex (approximately 50-90 compounds) and large-scale (approximately 20 ton) syntheses because of the high reproducibility of the ACN-based partitioning of complex chromatograms. This carbon-number-based, statistical method is generally applicable to other complex mixtures of organic compounds and is readily adaptable to laboratory intercalibration efforts.

20.
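To analyze the aerodynamic characteristics of the stationary cylindrical pins of a pin-disc mill, computational fluid dynamics (CFD) was used to study the flow around a single two-dimensional cylindrical pin in the mill. Before the calculations, the computational model was validated against the problem of flow around a cylinder in a uniform incoming stream, and the independence of the numerical results from both the mesh and the time step was checked with test cases. The results show that, for Reynolds numbers below 200, the flow exhibits three distinct regimes. In the steady separation regime, the maximum length of the attached vortices on the cylindrical pin is approximately equal to the pin diameter, and the attached vortices disappear as the Reynolds number increases. In the unsteady vortex-shedding regime, the flow is periodic only at lower Reynolds numbers, and the periodic lift and drag exerted by the fluid on the cylindrical pin are both small; at higher Reynolds numbers, the dimensionless frequency of the flow increases and is no longer dominated by a single frequency, and a frequency band appears. The dimensionless frequency in the unsteady vortex-shedding regime is one order of magnitude smaller than that for flow around a cylinder in a uniform incoming stream. This paper provides a reference for studying the forces on the cylindrical pins of mills at high Reynolds numbers.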
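
The two dimensionless quantities this abstract relies on can be computed as in the sketch below. The abstract does not state its definitions, so the conventional forms for flow past a cylinder are assumed here: a Reynolds number based on the pin diameter, and a dimensionless frequency in the Strouhal form. All input values used with it are hypothetical.

```python
def reynolds_number(velocity, diameter, kinematic_viscosity):
    """Re = U * D / nu, with U in m/s, D in m and nu in m^2/s."""
    return velocity * diameter / kinematic_viscosity


def dimensionless_frequency(frequency, diameter, velocity):
    """Strouhal-type frequency St = f * D / U (assumed definition)."""
    return frequency * diameter / velocity
```

For instance, a hypothetical 0.02 m pin in a 0.15 m/s stream with a kinematic viscosity of 1.5e-5 m^2/s sits at the upper end of the Reynolds-number range studied, and a shedding frequency of 1.5 Hz would then correspond to a dimensionless frequency of about 0.2.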
