Statistical Modelling 6 (2006), 352–372

Bayesian modeling for genetic association in case-control studies: accounting for unknown population substructure

Li Zhang, Bhramar Mukherjee, Malay Ghosh, and Rongling Wu
Department of Statistics
University of Florida
P.O. Box 118545
Gainesville, FL 32611-8545
USA
Email: mukherjee@stat.ufl.edu

Abstract:

A two-stage parametric Bayesian method is proposed to examine the association between a candidate gene and the occurrence of a disease after accounting for population substructure. The procedure, implemented via Markov chain Monte Carlo numerical integration, first estimates the posterior probabilities of the different unknown population substructures and then integrates this information into a disease-gene association model through Bayesian model averaging. The model relaxes certain assumptions of previous analyses and provides a unified computational framework for estimating the log odds ratio parameter corresponding to the genetic factor while allowing allele frequencies to vary across subpopulations. The uncertainty in estimating the population substructure is taken into account when constructing credible intervals for the parameters of the disease-gene association model. Simulations of unmatched case-control studies that mimic an admixed Argentinean population are performed to demonstrate the statistical properties of the model. The method is also applied to a real data set from a genetic association study of obesity.
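
As a rough illustration of the model-averaging step described in the abstract, the sketch below combines per-substructure posterior summaries of a log odds ratio into one averaged mean and variance. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name bma_summary and its inputs (posterior probabilities of each candidate substructure, plus the posterior mean and variance of the log odds ratio under each) are hypothetical, and the paper itself obtains these quantities by Markov chain Monte Carlo.

```python
def bma_summary(weights, means, variances):
    """Average per-model posterior summaries (hypothetical helper).

    weights   -- posterior probabilities of the candidate substructure
                 models (normalized here if they do not sum to 1)
    means     -- posterior mean of the log odds ratio under each model
    variances -- posterior variance of the log odds ratio under each model
    """
    if not (len(weights) == len(means) == len(variances)):
        raise ValueError("inputs must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    w = [x / total for x in weights]

    # Averaged posterior mean: sum_k P(M_k | data) * E[beta | M_k, data]
    mean = sum(wk * mk for wk, mk in zip(w, means))

    # Averaged posterior variance via the mixture second moment; it
    # includes both within-model variance and between-model spread.
    second_moment = sum(
        wk * (vk + mk ** 2) for wk, vk, mk in zip(w, variances, means)
    )
    return mean, second_moment - mean ** 2
```

The variance term shows why the averaged credible intervals reflect uncertainty about the substructure: disagreement between models about the log odds ratio adds to the total variance even when each model's own posterior is tight.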

Keywords:

Bayesian model averaging; gene-disease association; linkage equilibrium; Markov chain Monte Carlo; obesity

Downloads:

Data and software in zipped archive

