TY - JOUR
T1 - Automated workflow-based exploitation of pathway databases provides new insights into genetic associations of metabolite profiles
AU - Dharuri, Harish
AU - Henneman, Peter
AU - Demirkan, Ayse
AU - van Klinken, Jan B.
AU - Mook-Kanamori, Dennis O.
AU - Wang-Sattler, Rui
AU - Gieger, Christian
AU - Adamski, Jerzy
AU - Hettne, Kristina
AU - Roos, Marco
AU - Suhre, Karsten
AU - Van Duijn, Cornelia M.
AU - van Dijk, Ko W.
AU - 't Hoen, Peter A.C.
AU - Ugocsai, Peter
AU - Isaacs, Aaron
AU - Pramstaller, Peter P.
AU - Liebisch, Gerhard
AU - Wilson, James F.
AU - Johansson, Åsa
AU - Rudan, Igor
AU - Aulchenko, Yurii S.
AU - Kirichenko, Anatoly V.
AU - Janssens, A. Cecile J.W.
AU - Jansen, Ritsert C.
AU - Gnewuch, Carsten
AU - Domingues, Francisco S.
AU - Pattaro, Cristian
AU - Wild, Sarah H.
AU - Jonasson, Inger
AU - Polasek, Ozren
AU - Zorkoltseva, Irina V.
AU - Hofman, Albert
AU - Karssen, Lennart
AU - Struchalin, Maksim
AU - Floyd, James
AU - Igl, Wilmar
AU - Biloglav, Zrinka
AU - Broer, Linda
AU - Pfeufer, Arne
AU - Pichler, Irene
AU - Campbell, Susan
AU - Zaboli, Ghazal
AU - Kolcic, Ivana
AU - Rivadeneira, Fernando
AU - Huffman, Jennifer
AU - Hastie, Nicholas D.
AU - Uitterlinden, Andre
AU - Franke, Lude
AU - Franklin, Christopher S.
AU - Vitart, Veronique
AU - Witteman, Jacqueline C.M.
AU - Axenovich, Tatiana
AU - Oostra, Ben A.
AU - Meitinger, Thomas
AU - Hicks, Andrew A.
AU - Hayward, Caroline
AU - Wright, Alan F.
AU - Gyllensten, Ulf
AU - Campbell, Harry
AU - Schmitz, Gerd
N1 - Funding Information:
This study is funded by the European Community’s Seventh Framework Programme (FP7/2007-2013) ENGAGE, the Centre for Medical Systems Biology (CMSB) and Netherlands Consortium for Systems Biology (NCSB), both within the framework of the Netherlands Genomics Initiative (NGI)/Netherlands Organisation for Scientific Research (NWO) and the European Commission Seventh Framework Programme Wf4Ever (Digital Libraries and Digital Preservation area ICT-2009.4.1 project reference 270192). EUROSPAN consortium members are: Ayşe Demirkan, Cornelia M van Duijn, Peter Ugocsai, Aaron Isaacs, Peter P Pramstaller, Gerhard Liebisch, James F Wilson, Åsa Johansson, Igor Rudan, Yurii S Aulchenko, Anatoly V Kirichenko, A Cecile JW Janssens, Ritsert C Jansen, Carsten Gnewuch, Francisco S Domingues, Cristian Pattaro, Sarah H Wild, Inger Jonasson, Ozren Polasek, Irina V Zorkoltseva, Albert Hofman, Lennart Karssen, Maksim Struchalin, James Floyd, Wilmar Igl, Zrinka Biloglav, Linda Broer, Arne Pfeufer, Irene Pichler, Susan Campbell, Ghazal Zaboli, Ivana Kolcic, Fernando Rivadeneira, Jennifer Huffman, Nicholas D Hastie, Andre Uitterlinden, Lude Franke, Christopher S Franklin, Veronique Vitart, Jacqueline CM Witteman, Tatiana Axenovich, Ben A Oostra, Thomas Meitinger, Andrew A Hicks, Caroline Hayward, Alan F Wright, Ulf Gyllensten, Harry Campbell, Gerd Schmitz.
PY - 2013/12/9
Y1 - 2013/12/9
N2 - Background: Genome-wide association studies (GWAS) have identified many common single nucleotide polymorphisms (SNPs) that associate with clinical phenotypes, but these SNPs usually explain just a small part of the heritability and have relatively modest effect sizes. In contrast, SNPs that associate with metabolite levels generally explain a higher percentage of the genetic variation and demonstrate larger effect sizes. Still, the discovery of SNPs associated with metabolite levels is challenging since testing all metabolites measured in typical metabolomics studies with all SNPs comes with a severe multiple testing penalty. We have developed an automated workflow approach that utilizes prior knowledge of biochemical pathways present in databases like KEGG and BioCyc to generate a smaller SNP set relevant to the metabolite. This paper explores the opportunities and challenges in the analysis of GWAS of metabolomic phenotypes and provides novel insights into the genetic basis of metabolic variation through the re-analysis of published GWAS datasets. Results: Re-analysis of the published GWAS dataset from Illig et al. (Nature Genetics, 2010) using a pathway-based workflow (http://www.myexperiment.org/packs/319.html) confirmed previously identified hits and identified a new locus of human metabolic individuality, associating aldehyde dehydrogenase 1 family member L1 (ALDH1L1) with serine/glycine ratios in blood. Replication in an independent GWAS dataset of phospholipids (Demirkan et al., PLoS Genetics, 2012) identified two novel loci supported by additional literature evidence: GPAM (glycerol-3-phosphate acyltransferase) and CBS (cystathionine beta-synthase). In addition, the workflow approach provided novel insight into the affected pathways and relevance of some of these gene-metabolite pairs in disease development and progression.
Conclusions: We demonstrate the utility of automated exploitation of background knowledge present in pathway databases for the analysis of GWAS datasets of metabolomic phenotypes. We report novel loci and potential biochemical mechanisms that contribute to our understanding of the genetic basis of metabolic variation and its relationship to disease development and progression.
KW - Bioinformatics
KW - Genome-wide association
KW - Genotype-phenotype prioritization
KW - Metabolite
KW - Pathway databases
UR - http://www.scopus.com/inward/record.url?scp=84889858142&partnerID=8YFLogxK
U2 - 10.1186/1471-2164-14-865
DO - 10.1186/1471-2164-14-865
M3 - Article
C2 - 24320595
AN - SCOPUS:84889858142
SN - 1471-2164
VL - 14
JO - BMC Genomics
JF - BMC Genomics
IS - 1
M1 - 865
ER -