TY - JOUR
T1 - Accurate distinction of pathogenic from benign CNVs in mental retardation
AU - Hehir-Kwa, Jayne Y.
AU - Wieskamp, Nienke
AU - Webber, Caleb
AU - Pfundt, Rolph
AU - Brunner, Han G.
AU - Gilissen, Christian
AU - de Vries, Bert B.A.
AU - Ponting, Chris P.
AU - Veltman, Joris A.
PY - 2010/4
Y1 - 2010/4
AB - Copy number variants (CNVs) have recently been recognized as a common form of genomic variation in humans. Hundreds of CNVs can be detected in any individual genome using genomic microarrays or whole genome sequencing technology, but their phenotypic consequences are still poorly understood. Rare CNVs have been reported as a frequent cause of neurological disorders such as mental retardation (MR), schizophrenia and autism, prompting widespread implementation of CNV screening in diagnostics. In previous studies we have shown that, in contrast to benign CNVs, MR-associated CNVs are significantly enriched in genes whose mouse orthologues, when disrupted, result in a nervous system phenotype. In this study we developed and validated a novel computational method for differentiating between benign and MR-associated CNVs using structural and functional genomic features to annotate each CNV. In total 13 genomic features were included in the final version of a Naïve Bayesian Tree classifier, with LINE density and mouse knock-out phenotypes contributing most to the classifier's accuracy. After demonstrating that our method (called GECCO) perfectly classifies CNVs causing known MR-associated syndromes, we show that it achieves high accuracy (94%) and negative predictive value (99%) on a blinded test set of more than 1,200 CNVs from a large cohort of individuals with MR. These results indicate that this classification method will be of value for objectively prioritizing CNVs in clinical research and diagnostics.
UR - http://www.scopus.com/inward/record.url?scp=77954575178&partnerID=8YFLogxK
U2 - 10.1371/journal.pcbi.1000752
DO - 10.1371/journal.pcbi.1000752
M3 - Article
C2 - 20421931
AN - SCOPUS:77954575178
SN - 1553-734X
VL - 6
JO - PLoS Computational Biology
JF - PLoS Computational Biology
IS - 4
ER -