Detecting dispersed duplications in high-throughput sequencing data using a database-free approach

M. Kroon, E. W. Lameijer, N. Lakenberg, J. Y. Hehir-Kwa, D. T. Thung, P. E. Slagboom, J. N. Kok, K. Ye

Research output: Contribution to journal, Article, peer-reviewed

14 Citations (Scopus)

Abstract

Motivation: Dispersed duplications (DDs) such as transposon element insertions and copy number variations are ubiquitous in the human genome. They have attracted the interest of biologists as well as medical researchers due to their role in both evolution and disease. The efforts of discovering DDs in high-throughput sequencing data are currently dominated by database-oriented approaches that require pre-existing knowledge of the DD elements to be detected.

Results: We present dd-detection, a database-free approach to finding DD events in high-throughput sequencing data. dd-detection is able to detect DDs purely from paired-end read alignments. We show in a comparative study that this method is able to compete with database-oriented approaches in recovering validated transposon insertion events. We also experimentally validate the predictions of dd-detection on a human DNA sample, showing that it can find not only duplicated elements present in common databases but also DDs of novel type.

Availability and implementation: The software presented in this article is open source and available from https://bitbucket.org/mkroon/dd-detection

Supplementary information: Supplementary data are available at Bioinformatics online.

Original language: English
Pages (from-to): 505-510
Number of pages: 6
Journal: Bioinformatics
Volume: 32
Issue number: 4
Status: Published - 15 Feb 2016
Externally published: Yes
