Epistasis Blog

From the Computational Genetics Laboratory at Dartmouth Medical School (www.epistasis.org)

Tuesday, July 31, 2012

Top 10 Reasons to Study Bioinformatics

The following list of reasons to do a Ph.D. or postdoc in bioinformatics or computational biology appeared on Casey Bergman's blog.

0. Computing is the key skill set for 21st century biology
1. Computational skills are highly transferable
2. Computing will help improve your core scientific skills
3. You should use your Ph.D./Post-Doc to develop new skills
4. You will develop a more unique skill set in Biology
5. You will publish more papers
6. You will have more flexibility in your research
7. You will have more flexibility in working practices
8. Computational research is cost-effective
9. A successful scientist ends up in an office
[10. You will understand why lists should start with the number zero.]

Thursday, July 26, 2012

Two opposing views of academic life

I personally think being a professor, researcher and educator is a fantastic career. I wouldn't want to do anything else. Here are two opposing views.

Why give up an excellent tenured faculty position for the grind of corporate life?

There's been a lot of talk recently about the pressures of higher education. Actually, says one postgrad student, university life is a joy.

Monday, July 16, 2012

Risk estimation and risk prediction using machine-learning methods

Great new paper on machine learning analysis in human genetics.

Kruppa J, Ziegler A, König IR. Risk estimation and risk prediction using machine-learning methods. Hum Genet. 2012 Jul 3, in press. [PubMed]


After an association between genetic variants and a phenotype has been established, further study goals comprise the classification of patients according to disease risk or the estimation of disease probability. To accomplish this, different statistical methods are required, and specifically machine-learning approaches may offer advantages over classical techniques. In this paper, we describe methods for the construction and evaluation of classification and probability estimation rules. We review the use of machine-learning approaches in this context and explain some of the machine-learning algorithms in detail. Finally, we illustrate the methodology through application to a genome-wide association analysis on rheumatoid arthritis.
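To make the distinction between a classification rule and a probability estimation rule concrete, here is a minimal sketch in plain Python. It is not taken from the paper: the function names, the 0.5 threshold, and the four toy data points are all assumptions made for illustration. It scores probability estimates with the Brier score and AUC, then thresholds them into a classification rule and reports accuracy.

```python
# Toy illustration only: labels and probabilities are made up, not real data.

def brier_score(probs, labels):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)

def classify(probs, threshold=0.5):
    """Turn probability estimates into a 0/1 classification rule (threshold is an assumption)."""
    return [1 if p >= threshold else 0 for p in probs]

def accuracy(preds, labels):
    """Fraction of predictions that match the observed labels."""
    return sum(int(a == b) for a, b in zip(preds, labels)) / len(labels)

def auc(probs, labels):
    """Chance that a random case is ranked above a random control; ties count one half."""
    cases = [p for p, y in zip(probs, labels) if y == 1]
    controls = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

labels = [1, 0, 1, 0]            # 1 = affected, 0 = unaffected (hypothetical)
probs = [0.75, 0.25, 0.5, 0.5]   # hypothetical estimated disease probabilities

print("Brier score:", brier_score(probs, labels))
print("AUC:", auc(probs, labels))
print("Accuracy at 0.5:", accuracy(classify(probs), labels))
```

The Brier score and AUC evaluate the probabilities themselves, while accuracy only evaluates the thresholded rule, so the same model can look different depending on which goal you care about.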

Tuesday, July 03, 2012

Machine Learning that Matters

Machine learning has a very important role to play in human genetics and genetic epidemiology. However, the computer science-based machine learning community has come under fire for writing and publishing papers that lack real-world application and impactful interpretation of results. The following is a must-read for anyone interested in developing machine learning methods.

Machine Learning that Matters [PDF]

Kiri L. Wagstaff
Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, CA 91109 USA

To appear in Proceedings of the 29th International Conference on Machine Learning, Edinburgh, Scotland, UK, 2012.



Much of current machine learning (ML) research has lost its connection to problems of import to the larger world of science and society. From this perspective, there exist glaring limitations in the data sets we investigate, the metrics we employ for evaluation, and the degree to which results are communicated back to their originating domains. What changes are needed to how we conduct research to increase the impact that ML has? We present six Impact Challenges to explicitly focus the field’s energy and attention, and we discuss existing obstacles that must be addressed. We aim to inspire ongoing discussion and focus on ML that matters.

Sunday, July 01, 2012

Biological Basis of Epistasis

We are investing more time in understanding the biological basis of epistasis. The following are two recent papers we participated in that help address this important question.

Zhang X, Cowper-Sal-lari R, Bailey SD, Moore JH, Lupien M. Integrative functional genomics identifies an enhancer looping to the SOX9 gene disrupted by the 17q24.3 prostate cancer risk locus. Genome Res. 2012 Jun 4. [PubMed]

Akhtar-Zaidi B, Cowper-Sal-lari R, Corradin O, Saiakhova A, Bartels CF, Balasubramanian D, Myeroff L, Lutterbaugh J, Jarrar A, Kalady MF, Willis J, Moore JH, Tesar PJ, Laframboise T, Markowitz S, Lupien M, Scacheri PC. Epigenomic enhancer profiling defines a signature of colon cancer. Science. 2012 May 11;336(6082):736-9. [PubMed]