TY - JOUR
T1 - Trecode: A FAIR Eco-System for the Analysis and Archiving of Omics Data in a Combined Diagnostic and Research Setting
AU - Kerstens, Hinri
AU - Hehir - Kwa, Jayne
AU - van de Geer, Ellen
AU - van Run, Chris
AU - Badloe, Shashi
AU - Janse, Alex
AU - Baker-Hernandez, John
AU - de Vos, Sam
AU - van der Leest, Douwe F.M.
AU - Verwiel, Eugene T. P.
AU - Tops, Bastiaan
AU - Kemmeren, Patrick
PY - 2022/12/23
Y1 - 2022/12/23
N2 - The increase in speed, reliability, and cost-effectiveness of high-throughput sequencing has led to the widespread clinical application of genome (WGS), exome (WXS), and transcriptome analysis. WXS and RNA sequencing is now being implemented as the standard of care for patients and for patients included in clinical studies. To keep track of sample relationships and analyses, a platform is needed that can unify metadata for diverse sequencing strategies with sample metadata whilst supporting automated and reproducible analyses, in essence ensuring that analyses are conducted consistently and data are Findable, Accessible, Interoperable, and Reusable (FAIR).We present “Trecode”, a framework that records both clinical and research sample (meta) data and manages computational genome analysis workflows executed for both settings, thereby achieving tight integration between analysis results and sample metadata. With complete, consistent, and FAIR (meta) data management in a single platform, stacked bioinformatic analyses are performed automatically and tracked by the database, ensuring data provenance, reproducibility, and reusability, which is key in worldwide collaborative translational research. The Trecode data model, codebooks, NGS workflows, and client programs are publicly available. In addition, the complete software stack is coded in an Ansible playbook to facilitate automated deployment and adoption of Trecode by other users.
AB - The increase in speed, reliability, and cost-effectiveness of high-throughput sequencing has led to the widespread clinical application of genome (WGS), exome (WXS), and transcriptome analysis. WXS and RNA sequencing is now being implemented as the standard of care for patients and for patients included in clinical studies. To keep track of sample relationships and analyses, a platform is needed that can unify metadata for diverse sequencing strategies with sample metadata whilst supporting automated and reproducible analyses, in essence ensuring that analyses are conducted consistently and data are Findable, Accessible, Interoperable, and Reusable (FAIR).We present “Trecode”, a framework that records both clinical and research sample (meta) data and manages computational genome analysis workflows executed for both settings, thereby achieving tight integration between analysis results and sample metadata. With complete, consistent, and FAIR (meta) data management in a single platform, stacked bioinformatic analyses are performed automatically and tracked by the database, ensuring data provenance, reproducibility, and reusability, which is key in worldwide collaborative translational research. The Trecode data model, codebooks, NGS workflows, and client programs are publicly available. In addition, the complete software stack is coded in an Ansible playbook to facilitate automated deployment and adoption of Trecode by other users.
U2 - 10.3390/biomedinformatics3010001
DO - 10.3390/biomedinformatics3010001
M3 - Article
SN - 2673-7426
SP - 1
EP - 16
JO - BioMedInformatics
JF - BioMedInformatics
M1 - 1
ER -