Development of algorithms and software for high-performance
computing in genetic analysis of complex human traits
The goal of the project is to develop a series of linked algorithms and software programs for high-performance
computing in genetic analysis of complex human traits. This software should make semi-automatic discovery of genes involved in complex
diseases possible. The algorithms must take into account evidence coming from different levels of genetic analysis (linkage, association
studies, knowledge of the sequence of the human genome, literature data about disease). We will focus on exploiting highly parallelizable
computation techniques in genetic analysis, building upon our previous joint research. The software will specifically target the analysis
of large pedigrees spanning 5 or more generations, as can be found in isolated human populations and livestock. A parallel computer
system (cluster) will be constructed to support software testing and high-performance computing. The algorithms and software will
be tested and validated using data simulated under various genetic models. In addition, commercial and public-domain software
(where available) will be used as a gold standard for comparison. Finally, the software will be applied to the numerous data
sets that have been obtained in ongoing research projects of Erasmus and IC&G.
Project leaders
Prof. C. M. van Duijn
Erasmus MC Rotterdam
PO Box 1738 3000 DR Rotterdam
The Netherlands
Phone: +31 10 704 3394; Fax: +31 10 704 4657
e-mail: c.vanduijn@erasmusmc.nl
Prof. T. I. Axenovich
Institute of Cytology and Genetics SD RAS
Lavrentjeva ave 10
630090 Novosibirsk
Russian Federation
Phone: +7 383 3332813; Fax: +7 383 3331278
e-mail: aks@bionet.nsc.ru
Periodic reports
Publications
Journals
- Kirichenko AV. An algorithm of step-by-step pedigree drawing. Genetika. 2004 Oct;40(10):1425-8. PubMed // Russian Journal of Genetics. 2004; 40(10): 1176-1178. Abstract
- Aulchenko YS, Bertoli-Avella AM, van Duijn CM. A method for pooling alleles from different genotyping experiments. Ann Hum Genet. 2005 Mar;69(Pt 2):233-8. PubMed
- Axenovich TI. Invited Review: Genetic mapping of common human diseases. Russian Journal of Clinical Genetics (in press)
- Axenovich TI, Zykovich AS. Power estimation for in silico mapping. Genetika. 2006 Jun;42(6):850-7. PubMed // Russian Journal of Genetics. 2006; 42(6): 696-702. Abstract
- Aulchenko YS, Axenovich TI, Mackay I, van Duijn CM. MiLD and booLD programs for calculation and analysis of corrected linkage disequilibrium. Ann Hum Genet 2003; 67: 372-375. PubMed
- Aulchenko YS, Axenovich TI. Mapping genes for complex human disease: problems and perspectives. Vestnik VOGiS [in Russian] 2006; 10: 189-202.
- Axenovich TI, Aulchenko YS. Solution for underflow problem in linkage and segregation analysis. Comput Biol Chem 2006; 30:382-385. PubMed
- Liu F, Elefante S, van Duijn CM, Aulchenko YS. Ignoring Distant Genealogic Loops Leads to False-positives in Homozygosity Mapping. Ann Hum Genet 2006; 70: 965-970 PubMed
- Marie Josee E. van Rijn, Anna F.C. Schut, Yurii S. Aulchenko, Jaap Deinum, Fakhredin A. Sayed-Tabatabaei, Mojgan Yazdanpanah, Aaron Isaacs, Tatiana I. Axenovich, Irina V. Zorkoltseva, M. Carola Zillikens, Huib A.P. Pols, Jacqueline C.M. Witteman, Ben A. Oostra and Cornelia M. van Duijn. Heritability of blood pressure traits and the genetic contribution to blood pressure variance explained by four blood-pressure-related genes. Journal of Hypertension. 2007; 25(3): 565-570. Full text
- Axenovich TI, Zorkoltseva IV, Liu F, Kirichenko AV, Aulchenko YS. Breaking loops in large complex pedigrees. Hum Hered. 2008, 65:57-65 PubMed
- Liu F, Kirichenko A, Axenovich TI, van Duijn CM, Aulchenko YS. An approach for cutting large and complex pedigrees for linkage analysis. Eur J Hum Genet. 2008, 16: 854-860 PubMed
- Liu F, Pardo LM, Schuur M, Sanchez-Juan P, Isaacs A, Sleegers K, de Koning I, Zorkoltseva IV, Axenovich TI, Witteman JC, Janssens AC, van Swieten JC, Aulchenko YS, Oostra BA, van Duijn CM. The apolipoprotein E gene and its age-specific effects on cognitive function. Neurobiol Aging. 2010 Oct;31(10):1831-3. PubMed
- Johansson A, Marroni F, Hayward C, Franklin CS, Kirichenko AV, Jonasson I, Hicks AA, Vitart V, Isaacs A, Axenovich T, Campbell S, Dunlop MG, Floyd J, Hastie N, Hofman A, Knott S, Kolcic I, Pichler I, Polasek O, Rivadeneira F, Tenesa A, Uitterlinden AG, Wild SH, Zorkoltseva IV, Meitinger T, Wilson JF, Rudan I, Campbell H, Pattaro C, Pramstaller P, Oostra BA, Wright AF, van Duijn CM, Aulchenko YS, Gyllensten U. Common variants in the JAZF1 gene associated with height identified by linkage and genome-wide association analysis. Hum Mol Genet. 2009, 18: 373-380 PubMed
- Hicks AA, Pramstaller PP, Johansson A, Vitart V, Rudan I, Ugocsai P, Aulchenko Y, Franklin CS, Liebisch G, Erdmann J, Jonasson I, Zorkoltseva IV, Pattaro C, Hayward C, Isaacs A, Hengstenberg C, Campbell S, Gnewuch C, Janssens AC, Kirichenko AV, König IR, Marroni F, Polasek O, Demirkan A, Kolcic I, Schwienbacher C, Igl W, Biloglav Z, Witteman JC, Pichler I, Zaboli G, Axenovich TI, Peters A, Schreiber S, Wichmann HE, Schunkert H, Hastie N, Oostra BA, Wild SH, Meitinger T, Gyllensten U, van Duijn CM, Wilson JF, Wright A, Schmitz G, Campbell H. Genetic determinants of circulating sphingolipid concentrations in European populations. PLoS Genet. 2009 Oct;5(10):e1000672. PubMed
- Kirichenko AV, Belonogova NM, Aulchenko YS, Axenovich TI. PedStr software for cutting large pedigrees for haplotyping, IBD computation and multipoint linkage analysis. Ann Hum Genet. 2009, 73:527-31. PubMed
- Axenovich TI, Zorkoltseva IV, Belonogova NM, Struchalin MV, Kirichenko AV, Kayser M, Oostra BA, van Duijn CM, Aulchenko YS. Linkage analysis of adult height in a large pedigree from a Dutch genetically isolated population. Hum Genet. 2009, 126:457-71. PubMed
- Aulchenko YS, Struchalin MV, Belonogova NM, Axenovich TI, Weedon MN, Hofman A, Uitterlinden AG, Kayser M, Oostra BA, van Duijn CM, Janssens AC, Borodin PM. Predicting human height by Victorian and genomic methods. Eur J Hum Genet. 2009, 17:1070-5. PubMed
- Svischeva GR, Axenovich TI. Analytical estimation of linkage power in large human pedigrees. Russian Journal of Genetics, 2010, 46:105-112. PubMed
- Belonogova NM, Axenovich TI, Aulchenko YS. A powerful genome-wide feasible approach to detect parent-of-origin effects in studies of quantitative traits. Eur J Hum Genet. 2010, 18:379-84. PubMed
- Axenovich TI, Aulchenko YS. MQScore_SNP software for multipoint parametric linkage analysis of quantitative traits in large pedigrees. Ann Hum Genet. 2010, 74:286-9. PubMed
- Schol-Gelok S, Janssens AC, Tiemeier H, Liu F, Lopez-Leon S, Zorkoltseva IV, Axenovich TI, van Swieten JC, Uitterlinden AG, Hofman A, Aulchenko YS, Oostra BA, van Duijn CM. A genome-wide screen for depression in two independent Dutch populations. Biol Psychiatry. 2010, 68:187-96. PubMed
- Johansson A, Marroni F, Hayward C, Franklin CS, Kirichenko AV, Jonasson I, Hicks AA, Vitart V, Isaacs A, Axenovich T, Campbell S, Floyd J, Hastie N, Knott S, Lauc G, Pichler I, Rotim K, Wild SH, Zorkoltseva IV, Wilson JF, Rudan I, Campbell H, Pattaro C, Pramstaller P, Oostra BA, Wright AF, van Duijn CM, Aulchenko YS, Gyllensten U; EUROSPAN Consortium. Linkage and genome-wide association analysis of obesity-related phenotypes: association of weight with the MGAT1 gene. Obesity (Silver Spring). 2010, 18:803-8.
- Henneman P, Aulchenko YS, Frants RR, Zorkoltseva IV, Zillikens MC, Frolich M, Oostra BA, van Dijk KW, van Duijn CM. Genetic architecture of plasma adiponectin overlaps with the genetics of metabolic syndrome-related traits. Diabetes Care. 2010, 33:908-13. PubMed
Technical reports
Software
A number of programs were developed for data quality control, data management, and descriptive analysis. Program RECODE_PED checks the pedigree structure for errors and converts the data to Linkage format.
Program RECODE_SNP recodes alphanumerically coded SNP to numbered alleles.
Program AFFY2MEGA converts SNP data from Affymetrix to Mega2 and Merlin formats.
Program PHENO_QC tests the consistency of phenotype records, performs descriptive statistical analysis, and converts alphanumeric binary and qualitative data to numeric format.
Program PRE_PEDCHECK prepares pedigree and genotype data for the PEDCHECK program.
Program GENOT_QC is an interface to the standard genotype quality control program PEDCHECK.
Program GENOT_QC_X tests for Mendelian errors in X chromosome genotypes.
Program POOL_STR pools Short Tandem Repeat (microsatellite) data coming from different genotyping experiments.
Program FCN can be used to describe complex pedigree structures.
Program PEDPEEL prepares pedigree data for calculation of the Elston-Stewart likelihood function. It finds an optimal peeling order.
Program PEDCUT cuts deep pedigrees in which patients are distantly related into computable sub-pedigrees based on a user-specified MaxBit size.
Program PED_STR cuts complex pedigrees with a large number of closely related patients into computable sub-pedigrees based on a user-specified MaxBit size.
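The kind of check that GENOT_QC_X performs can be illustrated with a minimal sketch. This is not the GENOT_QC_X code. It assumes genotypes stored as pairs of allele codes, with "0" marking a missing allele, males coded as homozygous on the X chromosome, and sex coded as "M" or "F"; all of these conventions, and the function names, are assumptions made for the example. The rule itself is standard: a son receives his X chromosome from his mother, while a daughter receives one X from each parent.

```python
from typing import Optional, Tuple

Genotype = Tuple[str, str]  # two allele codes; "0" marks a missing allele (assumed coding)
MISSING = "0"


def is_missing(gt: Optional[Genotype]) -> bool:
    """A genotype is unusable if it is absent or has any missing allele."""
    return gt is None or MISSING in gt


def allele_possible(allele: str, parent: Optional[Genotype]) -> bool:
    """True if the parent could have transmitted the allele (or is untyped)."""
    return is_missing(parent) or allele in parent


def x_trio_consistent(child_sex: str, child: Optional[Genotype],
                      father: Optional[Genotype],
                      mother: Optional[Genotype]) -> bool:
    """Return False only when the X genotypes are provably incompatible."""
    if is_missing(child):
        return True
    a, b = child
    if child_sex == "M":
        # A male carries one X, so a heterozygous call is itself an error.
        if a != b:
            return False
        # A son's X comes from his mother; the father's genotype is irrelevant.
        return allele_possible(a, mother)
    # A daughter needs one allele from the father and the other from the mother.
    return ((allele_possible(a, father) and allele_possible(b, mother)) or
            (allele_possible(b, father) and allele_possible(a, mother)))
```

For example, `x_trio_consistent("M", ("2", "2"), None, ("1", "1"))` flags an error because the mother cannot have transmitted allele 2, whereas an untyped parent never causes an error on its own.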
A set of programs has been developed for breaking loops in pedigrees of arbitrary structure with multiple loops. These programs achieve high performance through parallel computation using LAM/MPI.
Package LOOP_EDGE uses the classical Kruskal algorithm.
Package LOOP_PED uses an algorithm that breaks loops step by step. At every step, the breaker is chosen according to the size of the looped part of the pedigree that remains after this breaker is removed.
Package LOOP_STAR uses the algorithm described by Vitezica et al., Hum Hered 2004; 57: 1-9.
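To illustrate how Kruskal's algorithm relates to loop breaking, here is a minimal serial sketch. It is not the LOOP_EDGE implementation: the graph representation (individuals and marriage nodes joined by edges), the edge weights, and all names below are assumptions for the example, and no MPI parallelism is shown. Kruskal's algorithm builds a minimum spanning forest; every edge it leaves out closes a loop, so those edges mark the places where the pedigree would have to be broken.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Edge = Tuple[float, Hashable, Hashable]  # (weight, node_a, node_b)


def find(parent: Dict[Hashable, Hashable], x: Hashable) -> Hashable:
    """Return the representative of x's component, compressing the path."""
    root = x
    while parent[root] != root:
        root = parent[root]
    while parent[x] != root:
        nxt = parent[x]
        parent[x] = root
        x = nxt
    return root


def loop_closing_edges(edges: Iterable[Edge]) -> List[Edge]:
    """Run Kruskal's algorithm and return the edges left out of the forest."""
    ordered = sorted(edges, key=lambda e: e[0])  # stable sort by weight only
    parent: Dict[Hashable, Hashable] = {}
    for _, a, b in ordered:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
    left_out: List[Edge] = []
    for edge in ordered:
        _, a, b = edge
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a == root_b:
            left_out.append(edge)   # endpoints already connected: closes a loop
        else:
            parent[root_a] = root_b  # edge joins two components: keep it
    return left_out


# Hypothetical pedigree with one first-cousin marriage (M4 between F and H).
# M1..M4 are marriage nodes; all weights are equal here for simplicity.
COUSIN_PEDIGREE: List[Edge] = [
    (1.0, "A", "M1"), (1.0, "B", "M1"), (1.0, "M1", "C"), (1.0, "M1", "D"),
    (1.0, "C", "M2"), (1.0, "E", "M2"), (1.0, "M2", "F"),
    (1.0, "D", "M3"), (1.0, "G", "M3"), (1.0, "M3", "H"),
    (1.0, "F", "M4"), (1.0, "H", "M4"), (1.0, "M4", "I"),
]
```

Calling `loop_closing_edges(COUSIN_PEDIGREE)` reports a single edge, matching the single consanguinity loop in this pedigree. With unequal weights, the heavier edges are the ones left out, so the choice of weights determines which individuals become loop breakers.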
MAN_H_PG is a program for complex segregation analysis of quantitative traits on large pedigrees without loops.
MQscore_SNP is a program for multipoint parametric linkage analysis of quantitative traits and SNPs on large pedigrees without loops.
Ped_Outlier is a program for automatic identification of within-family outliers.
PedigreeQuery is a program for drawing pedigrees step-by-step.
GenABEL is an R library for genome-wide association (GWA) analysis.
ProbABEL is an R library for GWA analysis of imputed SNPs.
MetABEL is an R library for GWA meta-analysis.
Program DSEC_STA computes basic descriptive statistics for samples from a normal distribution.
Program TASK_MANAGER runs several tasks on a multiprocessor platform under Linux with openMosix.
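The general idea of running several independent tasks side by side can be sketched as follows. This is not TASK_MANAGER and does not use openMosix; it is a single-machine illustration using only the Python standard library, and the function names and the `max_parallel` parameter are assumptions made for the example.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def run_task(command: Sequence[str]) -> int:
    """Run one external command and return its exit code."""
    return subprocess.run(command, check=False).returncode


def run_tasks(commands: List[Sequence[str]], max_parallel: int) -> List[int]:
    """Run the commands with at most max_parallel active at once.

    Exit codes are returned in the same order as the input commands.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_task, commands))


if __name__ == "__main__":
    # Hypothetical jobs: short Python one-liners standing in for analysis runs.
    jobs = [[sys.executable, "-c", f"print({i} * {i})"] for i in range(3)]
    print(run_tasks(jobs, max_parallel=2))
```

Threads are sufficient here because each worker only waits for an external process, so the operating system is free to place the child processes on different processors.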