Bayesian methodology for genetics of complex diseases

Chen, Carla Chia-Ming (2010) Bayesian methodology for genetics of complex diseases. PhD by Publication, Queensland University of Technology.


Genetic research into complex diseases is a challenging but exciting field. Its early development was limited until the completion of the Human Genome and HapMap projects, together with the falling cost of genotyping, paved the way for understanding the genetic composition of complex diseases. In this thesis, we focus on statistical methods for two aspects of genetic research: phenotype definition for diseases with complex etiology, and methods for identifying potentially associated Single Nucleotide Polymorphisms (SNPs) and SNP-SNP interactions. With regard to phenotype definition, we first investigate the effects of different statistical phenotyping approaches on the subsequent analysis. In light of the findings, and of the difficulty of validating an estimated phenotype, we propose two methods for reconciling the phenotypes obtained from different models, using Bayesian model averaging as a coherent mechanism for accounting for model uncertainty. In the second part of the thesis, the focus turns to methods for identifying associated SNPs and SNP interactions. We review the use of Bayesian logistic regression with variable selection for SNP identification and extend the model to detect interaction effects in population-based case-control studies. In this part of the study, we also develop a machine learning algorithm for large-scale data analysis, namely modified Logic Regression with Gene Expression Programming (MLR-GEP), which is then compared with the Bayesian model, Random Forests and other variants of logic regression.

ID Code: 43357
Item Type: QUT Thesis (PhD by Publication)
Supervisor: Mengersen, Kerrie & Keith, Jonathan
Keywords: Bayesian, statistics, genetics, phenotype analysis, complex diseases, complex etiology, model comparison, latent class analysis, grade of membership, fuzzy clustering, item response theory, migraine, twin study, heritability, genome-wide linkage analysis, deviance information criterion, model averaging, MCMC, genome-wide association studies, epistasis, logistic regression, stochastic search algorithm, case-control studies, Type I diabetes, single nucleotide polymorphism, gene expression programming, logic tree, logicFS, Monte Carlo logic regression, genetic programming for association study, random forest, GENICA
Divisions: Past > QUT Faculties & Divisions > Faculty of Science and Technology
Institution: Queensland University of Technology
Deposited On: 18 Jul 2011 05:18
Last Modified: 18 Jul 2011 05:18
