Epistasis Blog

From the Computational Genetics Laboratory at Dartmouth Medical School (www.epistasis.org)

Monday, November 19, 2007

Random Chemistry

Our paper with Dr. Maggie Eppstein on using Random Chemistry to detect epistasis has been published in Genetic Programming and Evolvable Machines. The paper can be found here. Email me for the PDF if you can't access it from the journal.

Margaret J. Eppstein, Joshua L. Payne, Bill C. White and Jason H. Moore. Genomic mining for complex disease traits with “random chemistry”. Genetic Programming and Evolvable Machines 8:395-411 (2007).

Abstract: Our rapidly growing knowledge regarding genetic variation in the human genome offers great potential for understanding the genetic etiology of disease. This, in turn, could revolutionize detection, treatment, and in some cases prevention of disease. While genes for most of the rare monogenic diseases have already been discovered, most common diseases are complex traits, resulting from multiple gene–gene and gene-environment interactions. Detecting epistatic genetic interactions that predispose for disease is an important, but computationally daunting, task currently facing bioinformaticists. Here, we propose a new evolutionary approach that attempts to hill-climb from large sets of candidate epistatic genetic features to smaller sets, inspired by Kauffman’s “random chemistry” approach to detecting small auto-catalytic sets of molecules from within large sets. Although the algorithm is conceptually straightforward, its success hinges upon the creation of a fitness function able to discriminate large sets that contain subsets of interacting genetic features from those that don’t. Here, we employ an approximate and noisy fitness function based on the ReliefF data mining algorithm. We establish proof-of-concept using synthetic data sets, where individual features have no marginal effects. We show that the resulting algorithm can successfully detect epistatic pairs from up to 1,000 candidate single nucleotide polymorphisms in time that is linear in the size of the initial set, although success rate degrades as heritability declines. Research continues into seeking a more accurate fitness approximator for large sets and other algorithmic improvements that will enable us to extend the approach to larger data sets and to lower heritabilities.
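The shrinking search described in the abstract can be illustrated with a minimal sketch. It is not the implementation from the paper: the `random_chemistry` function, its parameters, and the noise-free `oracle` test are all hypothetical, and the oracle stands in for the approximate, noisy ReliefF-based fitness function the paper actually uses.

```python
import random

def random_chemistry(features, contains_signal, shrink=0.5,
                     target_size=2, max_tries=1000, rng=None):
    """Shrink a large candidate set toward a small set that still passes the test.

    contains_signal is a hypothetical stand-in for the paper's fitness
    function: it returns True if a subset still appears to hold the
    interacting features.
    """
    rng = rng or random.Random(0)
    current = list(features)
    while len(current) > target_size:
        new_size = max(target_size, int(len(current) * shrink))
        for _ in range(max_tries):
            subset = rng.sample(current, new_size)
            if contains_signal(subset):
                current = subset  # keep the smaller set and continue shrinking
                break
        else:
            return None  # no passing subset found within max_tries
    return sorted(current)

# Assumption for illustration: SNPs 17 and 423 form the epistatic pair,
# and the test is a perfect oracle rather than a noisy approximation.
pair = {17, 423}
oracle = lambda subset: pair <= set(subset)

result = random_chemistry(range(1000), oracle)
print(result)
```

With a perfect oracle each halving step succeeds with probability of roughly one in four, so the number of fitness evaluations stays small. With a noisy fitness function, as in the paper, a wrong acceptance can discard the true pair, which is one reason the reported success rate degrades as heritability declines.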

Tuesday, November 13, 2007

2nd GECCO Workshop on Open-Source Software for Applied Genetic and Evolutionary Computation (SoftGEC)

My workshop on Open-Source Software for Applied Genetic and Evolutionary Computation (SoftGEC) will be held for a second year in conjunction with the 2008 Genetic and Evolutionary Computation Conference (GECCO) in Atlanta (July 12-16). The web page for SoftGEC'07 can be found here. I will be posting a SoftGEC'08 web page soon with updated information about invited speakers, etc.

I will be chairing (with Dr. Clare Bates Congdon) the track on Bioinformatics and Computational Biology. We are interested in any papers that use biologically-inspired algorithms or software for biological or biomedical problem-solving. The general GECCO'08 call for papers can be found here. Papers need to be submitted by January 16th. I hope to see some papers on the use of genetic and evolutionary computation for genetic analysis.

In addition, I will be giving a two-hour tutorial on Bioinformatics at GECCO'08. More information can be found here.

This is a great conference with a wide diversity of attendees and presenters who are doing creative work in the computational sciences. I highly recommend it! Let me know if you have any questions.

Saturday, November 03, 2007

EvoBIO 2008 - Deadline Extended to Nov. 11th

We have extended the deadline for paper submissions to EvoBIO until Nov. 11th. For details about the conference please see my post from Oct. 3rd below. All papers using bioinformatics methods to solve biomedical problems are welcome. Let me know if you have any questions.

Friday, November 02, 2007

Bases, Bits and Disease

My short editorial on the use of information theory for the genetic analysis of epistasis has been published in the European Journal of Human Genetics.

Moore JH. Bases, bits and disease: A mathematical theory of human genetics. European Journal of Human Genetics, in press (2007) [PubMed] [PDF]

This editorial comments on a new paper by Dong et al. to appear in the European Journal of Human Genetics.

Dong C, Chu X, Wang Y, Wang Y, Jin L, Shi T, Huang W, Li Y. Exploration of gene-gene interaction effects using entropy-based methods. European Journal of Human Genetics, in press (2007) [PubMed] [PDF]

Abstract: Gene-gene interaction may play important roles in complex disease studies, in which interaction effects coupled with single-gene effects are active. Many interaction models have been proposed since the beginning of the last century. However, the existing approaches including statistical and data mining methods rarely consider genetic interaction models, which make the interaction results lack biological or genetic meaning. In this study, we developed an entropy-based method integrating two-locus genetic models to explore such interaction effects. We performed our method to simulated and real data for evaluation. Simulation results show that this method is effective to detect gene-gene interaction and, furthermore, it is able to identify the best-fit model from various interaction models. Moreover, our method, when applied to malaria data, successfully revealed negative epistatic effect between sickle cell anemia and alpha(+)-thalassemia against malaria.
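As a rough illustration of how entropy can expose an interaction, the sketch below computes a generic information-gain measure: the information two loci carry jointly about disease status, minus what each carries alone. This is a textbook quantity, not the specific method of Dong et al., which also integrates two-locus genetic models. The function names and the four-sample toy data are assumptions made for the example.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy, in bits, of a list of discrete observations."""
    counts = Counter(values)
    n = len(values)
    return -sum(c / n * log2(c / n) for c in counts.values())

def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))

def interaction_gain(a, b, c):
    """I(A,B;C) - I(A;C) - I(B;C): information about C available only jointly."""
    joint = list(zip(a, b))
    return (mutual_information(joint, c)
            - mutual_information(a, c)
            - mutual_information(b, c))

# Toy data (assumed): disease status is the XOR of two binary genotypes,
# so neither locus has a marginal effect on its own.
snp_a = [0, 0, 1, 1]
snp_b = [0, 1, 0, 1]
status = [0, 1, 1, 0]

print(mutual_information(snp_a, status))       # 0.0
print(interaction_gain(snp_a, snp_b, status))  # 1.0
```

Each locus alone tells us nothing about status, yet the pair determines it completely, so the full bit of information is attributed to the interaction.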