Title: | Imputed Genotypes Versus Sequenced Genotypes for the Association Analysis of Rare Variants |
Journal: | Russian Journal of Genetics |
Published: | 25 Nov 2024 |
DOI: | https://doi.org/10.1134/s1022795424701126 |
Exome-sequenced genotypes provide the most informative material for the analysis of rare genetic variants. However, their widespread use is currently limited by the relatively small number of sequenced samples compared with imputed samples and by the lack of free access to personal genotypes. The latter drawback is not critical for imputed data, which combine genotypes collected on microarray platforms with missing genotypes reconstructed using reference haplotype panels: the results of genome-wide association studies (GWAS) of imputed genotypes are freely available for thousands of traits and millions of genetic variants. These data can be used for gene-based association analysis, which is the primary tool for studying rare variants. However, imputed genotypes have disadvantages compared with sequenced genotypes: both their number and their quality are lower. We aimed to test how these disadvantages affect the results of rare variant analysis. We considered 188 236 participants in the UK Biobank project who had both imputed and sequenced genotypes. The results of the single-variant association analysis showed a high quality of imputation. Inflation factors for 47 traits were around 1, and p-values were very close to those obtained for sequenced genotypes (r² = 0.994). We then performed the gene-based association analysis using imputed and sequenced genotypes. The number of association signals identified using imputed data was approximately half that for sequenced data. It is expected that if the sample of imputed genotypes is twice as large as the sample of sequenced data, the power of the imputed data analysis should be equivalent to that of the sequenced data for the protein-coding variants.
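The abstract reports inflation factors close to 1 but does not say how they were computed. As an illustration only, the sketch below uses the common genomic-control definition: the median of the 1-df chi-square statistics implied by the p-values, divided by the median of the chi-square distribution with 1 degree of freedom. The function name `inflation_factor` and the assumption of two-sided p-values are ours, not the authors'.

```python
from statistics import NormalDist, median

# Median of the chi-square distribution with 1 degree of freedom.
# If Z is standard normal, P(Z**2 <= m) = 0.5 when Phi(sqrt(m)) = 0.75.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2  # about 0.4549


def inflation_factor(p_values):
    """Genomic inflation factor (lambda) from two-sided p-values.

    Assumes every p-value lies in (0, 1]; a p-value of exactly 0
    makes inv_cdf raise statistics.StatisticsError.
    """
    nd = NormalDist()
    # Convert each two-sided p-value back to a 1-df chi-square statistic.
    # Using the lower tail (p / 2) avoids precision loss for tiny p-values.
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

Under the null hypothesis the p-values are uniformly distributed and the factor is close to 1. Values well above 1 usually indicate confounding or poorly calibrated test statistics, which is why a value near 1 is read here as evidence of good imputation quality.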
Application ID | Title |
---|---|
59345 | The impact of rare genetic variants on cardiovascular diseases and their risk factors |