Jutta Buschbom
Statistical Genetics, February 26-28, 2020
Poster presented at the 4th Annual Meeting in Conservation Genetics – from Genomes to Application, Frankfurt a.M., Germany, Feb. 26-28, 2020.
Abstract
Many conservation genetic tasks require reliable identification of the geographic origin or within-species genetic lineage of individuals. In evolutionary systems shaped by a multitude of processes and by noise, advanced population-level inference approaches are becoming ever better at identifying, and extracting from large genomic datasets, the genetic signal that is specific to a given question and its scale.
Assembling the global reference datasets of genetic diversity from natural populations that these approaches require is an extensive cooperative effort involving many logistic and analytical steps. These steps cover the whole process: project and sample design, sample collection and management, lab work and genomics, statistical analyses and reporting, and finally the conclusions that form the basis for further research and applied action.
Integrated into a single, user-friendly chain-of-custody of interoperable modules, these many steps can be managed even by smaller working groups focusing on non-model organisms from across the Tree of Life. Quality control and assessment are backed by standards and certified procedures, which together form a data infrastructure and work environment. Throughout the chain, built-in functionality for efficient data cleaning and exploration, together with the reproducibility of analyses, allows for effective error detection and removal. Likewise, workflows for evaluating model sensitivity and checking model usefulness provide the basis for validating data and models.
Such a chain-of-custody provides the foundation for robust, reliable and scalable conservation genomic tools. These enable communities, governments and conservation activists to protect, manage and sustainably use biodiversity.