Publication 8482

Title:	Will Big Data Close the Missing Heritability Gap?
Journal:	Genetics
Published:	11 Sep 2017
Pubmed:	https://pubmed.ncbi.nlm.nih.gov/28893854/
DOI:	https://doi.org/10.1534/genetics.117.300271
URL:	https://europepmc.org/articles/pmc5676235?pdf=render
Citations:	71 (10 in last 2 years) as of 8 Aug 2024

Abstract

Despite the important discoveries reported by genome-wide association (GWA) studies, for most traits and diseases the prediction R-squared (R-sq.) achieved with genetic scores remains considerably lower than the trait heritability. Modern biobanks will soon deliver unprecedentedly large biomedical data sets: Will the advent of big data close the gap between the trait heritability and the proportion of variance that can be explained by a genomic predictor? We addressed this question using Bayesian methods and a data analysis approach that produces a surface response relating prediction R-sq. with sample size and model complexity (e.g., number of SNPs). We applied the methodology to data from the interim release of the UK Biobank. Focusing on human height as a model trait and using 80,000 records for model training, we achieved a prediction R-sq. in testing (n = 22,221) of 0.24 (95% C.I.: 0.23-0.25). Our estimates show that prediction R-sq. increases with sample size, reaching an estimated plateau at values that ranged from 0.1 to 0.37 for models using 500 and 50,000 (GWA-selected) SNPs, respectively. Soon much larger data sets will become available. Using the estimated surface response, we forecast that larger sample sizes will lead to further improvements in prediction R-sq. We conclude that big data will lead to a substantial reduction of the gap between trait heritability and the proportion of interindividual differences that can be explained with a genomic predictor. However, even with the power of big data, for complex traits we anticipate that the gap between prediction R-sq. and trait heritability will not be fully closed.

Application ID	Title
15326	Compressed Sensing and high-dimensional statistical methods in complex trait genomics
