Data providers and dates of data availability

This page gives details of the various data providers for linked death, hospital inpatient and cancer records, including the data codings used and the period for which data is available.

The censoring date for each data provider is the date up to which UK Biobank estimates that the data received from that provider is complete. The censoring dates below have been calculated using the following rule:

The censoring date is the last day of the latest month for which the number of records is greater than 90% of the mean number of records for the previous three months, except where the data for that month is known to be incomplete, in which case the censoring date is the last day of the previous month.

Note that the final part of this rule has been added in response to the COVID-19 pandemic causing an increase in deaths which would, in the absence of this proviso, place the censoring date after the last data point.
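As an illustration, the rule above could be sketched in Python as follows. This is a minimal sketch of one reading of the rule, not UK Biobank's own implementation: the function name, the input format (a mapping from (year, month) to record count, with no missing months) and the known_incomplete argument are all assumptions made for the example.

```python
from calendar import monthrange
from datetime import date


def last_day(year, month):
    """Return the last calendar day of the given month."""
    return date(year, month, monthrange(year, month)[1])


def censoring_date(monthly_counts, known_incomplete=frozenset()):
    """Apply the 90% rule to monthly record counts (hypothetical helper).

    monthly_counts: dict mapping (year, month) -> number of records,
        assumed to cover consecutive months with no gaps.
    known_incomplete: set of (year, month) known to be incomplete.
    Returns a date, or None if no month satisfies the rule.
    """
    months = sorted(monthly_counts)
    # Walk backwards from the most recent month; a month needs three
    # preceding months to be tested, so stop at index 3.
    for i in range(len(months) - 1, 2, -1):
        previous = [monthly_counts[m] for m in months[i - 3:i]]
        if monthly_counts[months[i]] > 0.9 * sum(previous) / 3:
            # Proviso: fall back one month if this month is known
            # to be incomplete.
            chosen = months[i - 1] if months[i] in known_incomplete else months[i]
            return last_day(*chosen)
    return None
```

With made-up counts of 100, 100, 100, 130 and 120 records for January to May 2020 and May flagged as known to be incomplete, the sketch gives 30 April 2020; without the flag it would give 31 May 2020.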

UK Biobank does not apply the censoring dates to the data made available to researchers. That data always contains the latest records regardless of censoring dates, and may include incomplete data after the dates below. These dates are intended for guidance only. Once researchers have received their data, they should censor outcomes according to their own research protocol.
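A researcher might apply provider-specific cut-offs along the following lines. This is only a sketch: the event format (pairs of provider label and event date), the provider labels and the function name are assumptions for illustration, and the cut-off dates are copied from the hospital inpatient table further down this page and should be replaced by whatever the study protocol requires.

```python
from datetime import date

# Guidance dates from the hospital inpatient table on this page;
# a research protocol may call for different cut-offs.
CENSORING_DATES = {
    "HES": date(2020, 3, 31),   # England
    "SMR": date(2016, 10, 31),  # Scotland
    "PEDW": date(2016, 2, 29),  # Wales
}


def censor_events(events, cutoffs=CENSORING_DATES):
    """Keep events dated on or before their provider's cut-off.

    events: iterable of (provider, event_date) pairs (hypothetical format).
    """
    return [(provider, when) for provider, when in events
            if when <= cutoffs[provider]]
```

An event dated exactly on the cut-off is kept, since the censoring date is the last day for which data is treated as complete.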

Note that, for the death and hospital inpatient data, the dates below reflect the data available for download through the Data Portal, which will in general be more up to date than the data available in a main dataset. For example, English inpatient data is currently available up to March 2020 on the Data Portal but only up to March 2017 in a main dataset.

Death data

ICD = International Classification of Diseases. A dash means no data in that coding.

| Region | Data provider | ICD9 | ICD10 | Period of data currently available | Censoring date |
|---|---|---|---|---|---|
| England & Wales | NHS Digital | - | 2006 onwards | April 2006 onwards | 30 April 2020 * |
| Scotland | NHS Central Register, National Records of Scotland | - | 2006 onwards | April 2006 onwards | 30 April 2020 * |


* Note that in both cases the data continues well into May 2020 but is not complete for that month.

Hospital inpatient data

ICD = International Classification of Diseases; OPCS = Classification of Interventions and Procedures. A dash means no data in that coding.

| Dataset | Data provider | ICD9 | ICD10 | OPCS3 | OPCS4 | Period of data currently available | Censoring date |
|---|---|---|---|---|---|---|---|
| Hospital Episode Statistics for England (HES) | NHS Digital | - | 1996 onwards | - | 1996 onwards | 1996 onwards | 31 March 2020 * |
| Scottish Morbidity Record (SMR) | Information and Statistics Division (ISD), Scotland | 1981 - 1996 | 1996 onwards | 1977 - 1988 | 1989 onwards | 1981 onwards | 31 October 2016 ** |
| Patient Episode Database for Wales (PEDW) | Secure Anonymised Information Linkage (SAIL), Wales | - | 1999 onwards | - | 1999 onwards | 1998 onwards | 29 February 2016 |


* Note that we have overridden the normal rule to arrive at this censoring date, since the number of records for March 2020 is only just over 87% of the average for December 2019 to February 2020. However, we believe this drop is likely to be due to the cancellation of routine admissions during the pandemic.

Note also that we have held back a very small proportion of inpatient data for April 2017 to March 2020 (approximately 0.25%, or around 600 episodes for each year) due to an incomplete linkage match. Some of these records may be released at a future date, once they have been scrutinised further.

** The Scottish hospital inpatient data does not currently include psychiatric or maternity admissions.


Cancer data

ICD = International Classification of Diseases.

| Region | Data provider | ICD9 | ICD10 | Period of data currently available | Censoring date |
|---|---|---|---|---|---|
| England & Wales | NHS Digital | 1979 - 1994 | 1995 onwards | 1971 onwards | 31 March 2016 |
| Scotland | National Records of Scotland, NHS Central Register | 1980 - 1996 | 1997 onwards | 1957 onwards | 31 October 2015 |


