TY - JOUR
T1 - The construction of genome-based transcriptional units
AU - Van Hooff, Sander R.
AU - Koster, Jan
AU - Hulsen, Tim
AU - Van Schaik, Barbera D.C.
AU - Roos, Marco
AU - Van Batenburg, Marcel F.
AU - Versteeg, Rogier
AU - Van Kampen, Antoine H.C.
PY - 2009/4/1
Y1 - 2009/4/1
N2 - Gene-oriented sequence clusters (transcriptional units) have found many applications in genomics research, including the construction of transcriptome maps and the identification of splice variants. We developed a new method to construct transcriptional units that uses the genomic sequence as a template. We present and discuss our method in detail, together with an evaluation of the transcriptional units for human. We constructed 33,007 and 27,792 transcriptional units for human and mouse, respectively. The sensitivity (81%) and specificity (90%) of our method compare favorably to other established methods. We evaluated the representation of experimentally validated and predicted intergenic spliced transcripts in humans and show that we correctly represent a large fraction of these cases by single transcriptional units. Our method performs well, but the evaluation of the final set of transcriptional units shows that improvements to the algorithm are still possible. However, because the precise number and types of errors are difficult to track, it is not obvious how to significantly improve the algorithm. We believe that ongoing research efforts are necessary to further improve current methods. This should include detailed documentation, comparison, and evaluation of current methods.
AB - Gene-oriented sequence clusters (transcriptional units) have found many applications in genomics research, including the construction of transcriptome maps and the identification of splice variants. We developed a new method to construct transcriptional units that uses the genomic sequence as a template. We present and discuss our method in detail, together with an evaluation of the transcriptional units for human. We constructed 33,007 and 27,792 transcriptional units for human and mouse, respectively. The sensitivity (81%) and specificity (90%) of our method compare favorably to other established methods. We evaluated the representation of experimentally validated and predicted intergenic spliced transcripts in humans and show that we correctly represent a large fraction of these cases by single transcriptional units. Our method performs well, but the evaluation of the final set of transcriptional units shows that improvements to the algorithm are still possible. However, because the precise number and types of errors are difficult to track, it is not obvious how to significantly improve the algorithm. We believe that ongoing research efforts are necessary to further improve current methods. This should include detailed documentation, comparison, and evaluation of current methods.
UR - http://www.scopus.com/inward/record.url?scp=64549130865&partnerID=8YFLogxK
U2 - 10.1089/omi.2008.0036
DO - 10.1089/omi.2008.0036
M3 - Article
C2 - 19320556
AN - SCOPUS:64549130865
SN - 1536-2310
VL - 13
SP - 105
EP - 114
JO - OMICS A Journal of Integrative Biology
JF - OMICS A Journal of Integrative Biology
IS - 2
ER -