Metadata modeling in phylogenetic software to increase data reuse and reproducibility

Stöver BC, Brech P, Wiechers S, Müller KF

Poster

Abstract

Phylogenetics, phylogenomics and related fields have become data-intensive due to increasingly cheap high-throughput sequencing technologies, the digitization of large biological collections, and data contributions from citizen science. A growing number of computationally accessible analysis methods that produce derived data, such as phylogenetic trees, further contributes to the production of large quantities of potentially reusable data. This opens up new opportunities for data-intensive studies, but also creates new challenges for cyberinfrastructure and method development, such as meaningful and ideally machine-interpretable annotation of published data to allow easy reuse and automated large-scale data collection.

Here we present new functionality of our phylogenetic tree editor, TreeGraph 2, to store, edit and visualize any type of metadata attached to phylogenetic trees or their nodes and branches. TreeGraph 2 has become widely used since its first release in 2008 and offers versatile editing and formatting features in a user-friendly graphical user interface. It now supports the metadata model of NeXML, which is based on the Resource Description Framework (RDF) and allows the relation of metadata to phylogenetic data to be described unambiguously. The new functionality simplifies the metadata annotation of phylogenetic trees that is necessary to allow their optimal reuse, to document the workflow used to infer them, and to link the respective raw data.

A similar extension of the metadata model of our multiple sequence alignment editor PhyDE is currently underway. In combination, PhyDE and TreeGraph 2 provide the necessary annotation functionality for all major data types of phylogenetics. The new functionality is based on our software libraries JPhyloIO and LibrAlign, which expose their functionality to third-party applications.

Publication details

Release year: 2018
Language in which the publication is written: English