Publication 8894

Title:	Inferring whole-genome histories in large population datasets
Journal:	Nature Genetics
Published:	2 Sep 2019
Pubmed:	https://pubmed.ncbi.nlm.nih.gov/31477934/
DOI:	https://doi.org/10.1038/s41588-019-0483-y
Citations:	221 (85 in last 2 years) as of 8 Aug 2024

Abstract

Inferring the full genealogical history of a set of DNA sequences is a core problem in evolutionary biology, because this history encodes information about the events and forces that have influenced a species. However, current methods are limited, and the most accurate techniques are able to process no more than a hundred samples. As datasets that consist of millions of genomes are now being collected, there is a need for scalable and efficient inference methods to fully utilize these resources. Here we introduce an algorithm that is able to not only infer whole-genome histories with comparable accuracy to the state-of-the-art but also process four orders of magnitude more sequences. The approach also provides an 'evolutionary encoding' of the data, enabling efficient calculation of relevant statistics. We apply the method to human data from the 1000 Genomes Project, Simons Genome Diversity Project and UK Biobank, showing that the inferred genealogies are rich in biological signal and efficient to process.

14 Keywords

Algorithms
Computer Simulation
Datasets as Topic
Evolution, Molecular
Genetics, Population
Genome, Human
Haplotypes
Humans
Models, Genetic
Mutation
Pedigree
Polymorphism, Single Nucleotide
Population Density
Selection, Genetic

6 Authors

Jerome Kelleher
Yan Wong
Anthony W. Wohns
Chaimaa Fadil
Patrick K. Albers
Gil McVean

1 Application

Application ID	Title
12788	Assessing the history and health consequences of rare variants