Editors' Choice: Biostatistics

A Better Means of Modeling Data

Science Translational Medicine  25 Aug 2010:
Vol. 2, Issue 46, pp. 46ec133
DOI: 10.1126/scitranslmed.3001602

Linear regression—a model of the relationship between an outcome in a study and one or more measures of interest—is one of the most popular statistical analysis techniques and is applied in a wide range of research studies. For the results of a linear regression analysis to be valid, the data must be independent and the model errors approximately normally distributed. Many studies violate these assumptions by repeatedly collecting data on the same subject or on animals from the same litter. The normality assumption is often violated in studies with yes/no outcomes or when counting the number of events on a subject over a period of time (such as the number of hospitalizations in a year).
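The cost of ignoring non-independence can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the litter count, litter size, and standard deviations are invented, and the function name `naive_and_cluster_se` is hypothetical. It estimates a mean (an intercept-only regression) from simulated litters and compares the standard error that treats every animal as independent with one that treats the litter as the unit of analysis.

```python
import numpy as np


def naive_and_cluster_se(n_litters=20, litter_size=10,
                         litter_sd=2.0, noise_sd=1.0, seed=0):
    """Simulate litter-clustered outcomes and return two standard errors
    for the overall mean: one ignoring litters, one respecting them.
    All parameter values are illustrative assumptions."""
    rng = np.random.default_rng(seed)

    # Each litter shares one random effect, so littermates are correlated.
    litter_effect = rng.normal(0.0, litter_sd, size=n_litters)
    noise = rng.normal(0.0, noise_sd, size=(n_litters, litter_size))
    y = litter_effect[:, None] + noise

    # Naive standard error: pretends all animals are independent.
    naive_se = y.std(ddof=1) / np.sqrt(y.size)

    # Cluster-aware standard error: one summary value per litter.
    litter_means = y.mean(axis=1)
    cluster_se = litter_means.std(ddof=1) / np.sqrt(n_litters)

    return naive_se, cluster_se


naive_se, cluster_se = naive_and_cluster_se()
print(f"naive SE:   {naive_se:.3f}")
print(f"cluster SE: {cluster_se:.3f}")
```

With a strong shared litter effect, the naive standard error comes out noticeably smaller than the cluster-aware one, which is the overconfidence that an analysis assuming independence would carry into its confidence intervals and p-values.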

The generalized linear mixed model (GLMM) framework extends linear regression to address these two shortcomings. However, inference from these models is often unreliable when the sample size is small. Bayesian estimation of these models offers an attractive solution in this situation, but it requires specification of prior distributions (distributions encoding a priori knowledge about the parameters in the GLMM) and is computationally challenging and time-consuming.
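In symbols, one common way to write a GLMM is sketched below. The notation is chosen here for illustration and is an assumption, not taken from the summarized paper: y_ij is observation j on subject (or litter) i, x_ij and z_ij are covariate vectors, beta holds the fixed effects, b_i the subject-level random effects, and g is a link function.

```latex
\begin{align*}
y_{ij} \mid b_i &\sim \text{exponential family with mean } \mu_{ij}, \\
g(\mu_{ij}) &= x_{ij}^{\top}\beta + z_{ij}^{\top} b_i, \\
b_i &\sim \mathcal{N}(0, \Sigma).
\end{align*}
```

The shared random effect b_i induces correlation among observations from the same subject, which handles the independence problem, while the choice of distribution and link (for example, a logit link for yes/no outcomes or a log link for counts) handles the non-normality. A Bayesian analysis additionally places prior distributions on beta and Sigma.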

Recently, Fong et al. showed how to choose a set of prior distributions and how to apply an alternative estimation technique, integrated nested Laplace approximation, to obtain results from GLMMs. The authors showed that their suggested approximation produces results similar to those from standard non-Bayesian estimation techniques (when the results should be similar) and from Markov chain Monte Carlo estimation, but much faster (seconds versus minutes or hours). Thus, this work offers a reasonable alternative to existing estimation approaches and provides much guidance on implementing a Bayesian approach to GLMMs, making Bayesian analysis with this common modeling framework more approachable for the data analyst.
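For readers who want the general shape of the approximation, the integrated nested Laplace approximation is commonly presented along the following lines. This is a sketch in assumed notation, not a reproduction of the derivation in Fong et al.: u collects the latent Gaussian quantities (fixed and random effects), theta the remaining hyperparameters such as variance components, and the subscript G marks a Gaussian approximation evaluated at the mode u*(theta).

```latex
\begin{align*}
\tilde{p}(\theta \mid y) &\propto
  \left. \frac{p(y \mid u, \theta)\, p(u \mid \theta)\, p(\theta)}
              {\tilde{p}_G(u \mid \theta, y)} \right|_{u = u^{*}(\theta)}, \\
\tilde{p}(u_k \mid y) &= \sum_{m} \tilde{p}(u_k \mid \theta_m, y)\,
  \tilde{p}(\theta_m \mid y)\, \Delta_m .
\end{align*}
```

The first line approximates the posterior of the hyperparameters; the second recovers the marginal posterior of each latent quantity by numerical integration over a grid of hyperparameter values theta_m with weights Delta_m. Because this replaces simulation with deterministic optimization and a low-dimensional sum, it is consistent with the large speed advantage over Markov chain Monte Carlo reported above.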

Y. Fong et al., Bayesian inference for generalized linear mixed models. Biostatistics 11, 397–412 (2010).
