Whole genome sequencing data are publicly available for 200,000 participants, with release of data for the full cohort anticipated in late 2023.

Between 2018 and 2021, 500,000 participants were sequenced using Illumina NovaSeq technology. Sequencing was completed in two phases, the Vanguard Phase and the Main Phase.

The whole genome sequencing data were quality controlled. Direct output files from some quality confirmation steps are available in Category 180, and specific metrics in tabular format are available in Category 187.

Joint variant calling was performed by deCODE Genetics on the first 150,000 Main Phase samples, and subsequently repeated to include the 50,000 samples sequenced in the Vanguard Phase. Details of the joint variant calling and of the Main Phase of the sequencing project are available in Halldorsson et al., Nature 607, 732-740 (2022).