Abstract
Large-scale sequencing and genotyping data provide an opportunity to integrate external samples as controls to improve the power of association tests. However, because of systematic differences between samples genotyped in different studies, naively aggregating the controls can inflate Type I error rates. Recent efforts to integrate external controls while adjusting for batch effects include the integrating External Controls into Association Test (iECAT) and its score-based single-variant tests. Building on the original iECAT framework, we propose an iECAT-Score region-based test that increases the power of rare-variant tests when integrating external controls. The method assesses the systematic batch effect between internal and external samples at each variant and constructs compound shrinkage score statistics to test for the joint genetic effect within a gene or region, while adjusting for covariates and population stratification. Through simulation studies, we demonstrate that the proposed method controls Type I error rates and improves power in rare-variant tests. Applying the proposed method to association studies of age-related macular degeneration (AMD) from the International AMD Genomics Consortium and UK Biobank revealed novel rare-variant associations in the gene DXO. By incorporating external controls, the iECAT methods offer a powerful suite of tools for identifying disease-associated genetic variants and point to future directions for investigating the roles of rare variants in human diseases.
1 Application
Application ID | Title |
45227 | Scalable and Robust methods for biobank data analysis |