About
Measuring a trait in a cohort of genotyped individuals makes it possible to identify genomic loci statistically associated with it. Depending on the nature of the phenotype, this is the basis of genome-wide association studies (GWAS) and quantitative trait loci (QTL) mapping analyses. However, while genotypes are well-defined biological entities, phenotypes are usually defined more subjectively and may be related to a wide variety of biological processes. Indeed, multi-trait phenotypes are widespread in biology: levels of blood lipids (LDL, HDL, triglycerides), cellular composition of a tissue, traits that define a given neurological disorder, body measures (height and weight), expression of the genes in the same pathway, abundances of the splicing isoforms of a gene, etc.
Despite the multivariate nature of many biological phenotypes, GWAS and QTL analyses are generally performed one phenotype at a time. This approach ignores the correlation structure of the studied traits, which often translates into a lack of power to detect true associations. In addition, although some of the currently available multivariate methods have occasionally been applied to GWAS analyses, they present several limitations (increased complexity, lack of interpretability, strong model assumptions, heavy computational requirements, etc.) that hinder their broad adoption by the community.
In this scenario, we have developed a fast, non-parametric method for multivariate distance matrix regression, extending the statistical framework originally proposed by Anderson (DOI: 10.1111/j.1442-9993.2001.01070.pp.x and 10.1111/1467-842X.00285). It assesses the significance of the association between a quantitative multivariate response and a set of explanatory variables using the asymptotic null distribution of the test statistic.
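As an illustration only, the kind of test involved in distance matrix regression can be sketched in Python. This is not our implementation: it assumes Euclidean distances between individuals, a commonly used Gower-centered formulation of the pseudo-F statistic, and a permutation-based p-value in place of the asymptotic null distribution. All function names and the simulated data are hypothetical.

```python
import numpy as np


def gower_center(dist):
    """Turn an n x n distance matrix into a Gower-centered inner-product matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    return centering @ (-0.5 * dist ** 2) @ centering


def pseudo_f(gower, hat, df_model, df_resid):
    """Explained over residual variation, each scaled by its degrees of freedom."""
    resid = np.eye(hat.shape[0]) - hat
    explained = np.trace(hat @ gower @ hat)
    unexplained = np.trace(resid @ gower @ resid)
    return (explained / df_model) / (unexplained / df_resid)


def permutation_test(traits, predictors, n_perm=999, seed=0):
    """Return the observed pseudo-F and a permutation p-value.

    traits:     (n, q) array, one row per individual, one column per trait.
    predictors: (n,) or (n, p) array, e.g. genotype dosages (0, 1, 2).
    """
    rng = np.random.default_rng(seed)
    n = traits.shape[0]

    # Pairwise Euclidean distances between individuals (an assumption of
    # this sketch; other distance measures could be substituted here).
    diff = traits[:, None, :] - traits[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    gower = gower_center(dist)

    # Hat (projection) matrix of the design, including an intercept column.
    design = np.column_stack([np.ones(n), predictors])
    rank = np.linalg.matrix_rank(design)
    hat = design @ np.linalg.pinv(design)

    observed = pseudo_f(gower, hat, rank - 1, n - rank)

    # Permuting individuals in the response while keeping the design fixed
    # breaks any association between traits and predictors.
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        permuted = gower[np.ix_(perm, perm)]
        if pseudo_f(permuted, hat, rank - 1, n - rank) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Simulated data: one variant affecting two of three correlated traits.
    rng = np.random.default_rng(42)
    genotype = rng.integers(0, 3, size=100).astype(float)
    traits = rng.normal(size=(100, 3))
    traits[:, 0] += 0.5 * genotype
    traits[:, 1] -= 0.5 * genotype
    f_stat, p_value = permutation_test(traits, genotype)
    print(f"pseudo-F = {f_stat:.2f}, permutation p = {p_value:.3f}")
```

Running many permutations for every variant in a GWAS is computationally expensive, which is one reason to rely on an asymptotic null distribution instead.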
Although we are potentially interested in all kinds of multivariate phenotypes, we have chosen neuroimaging phenotypes (e.g. volumes of brain areas, connectivity, presence of lesions such as white matter hyperintensities), which are intrinsically multivariate, for a proof-of-concept GWAS analysis to evaluate our approach. Over the course of the project, planned to last up to 3 years, our goals are to assess the performance of our approach, to compare it with other univariate and multivariate strategies, and to identify genetic variants that alter human brain structures, which may reveal new biological mechanisms underlying cognition and neuropsychiatric disorders.