A computational framework to explore large-scale biosynthetic diversity

Jorge C. Navarro-Muñoz, Nelly Selem-Mojica, Michael W. Mullowney, Satria A. Kautsar, James H. Tryon, Elizabeth I. Parkinson, Emmanuel L.C. De Los Santos, Marley Yeong, Pablo Cruz-Morales, Sahar Abubucker, Arne Roeters, Wouter Lokhorst, Antonio Fernandez-Guerra, Luciana Teresa Dias Cappelini, Anthony W. Goering, Regan J. Thomson, William W. Metcalf, Neil L. Kelleher, Francisco Barona-Gomez, Marnix H. Medema

Research output: Contribution to journal › Article › peer-review

503 Citations (Scopus)

Abstract

Genome mining has become a key technology to exploit natural product diversity. Although initially performed on a single-genome basis, the process is now being scaled up to mine entire genera, strain collections and microbiomes. However, no bioinformatic framework is currently available for effectively analyzing datasets of this size and complexity. In the present study, a streamlined computational workflow is provided, consisting of two new software tools: the ‘biosynthetic gene similarity clustering and prospecting engine’ (BiG-SCAPE), which facilitates fast and interactive sequence similarity network analysis of biosynthetic gene clusters and gene cluster families; and the ‘core analysis of syntenic orthologues to prioritize natural product gene clusters’ (CORASON), which elucidates phylogenetic relationships within and across these families. BiG-SCAPE is validated by correlating its output to metabolomic data across 363 actinobacterial strains and the discovery potential of CORASON is demonstrated by comprehensively mapping biosynthetic diversity across a range of detoxin/rimosamide-related gene cluster families, culminating in the characterization of seven detoxin analogues.

Original language: English
Pages (from-to): 60-68
Number of pages: 9
Journal: Nature Chemical Biology
Volume: 16
Issue number: 1
Publication status: Published - 1 Jan 2020
Externally published: Yes

Keywords

  • Actinobacteria/genetics
  • Algorithms
  • Biological Products
  • Biosynthetic Pathways/genetics
  • Cluster Analysis
  • Computational Biology/methods
  • Data Mining/methods
  • Genome, Bacterial
  • Genomics
  • Metabolomics
  • Microbiota
  • Multigene Family
  • Phylogeny
  • Reproducibility of Results
  • Software
