COVID-19 test results data


COVID-19 test results data are being provided to UK Biobank by Public Health England (PHE).

Researchers who have registered for this data can access it via the Data Portal in the table covid19_result. See COVID-19 data for registration and access details and Resource 1758 for further information.

The structure of the covid19_result table is shown below, along with links to the relevant data codings on Data Showcase.


| Column name | Description | Data type | Data coding |
| --- | --- | --- | --- |
| eid | Encoded participant ID | Integer | |
| specdate | Date the specimen was taken | Date | |
| spectype | Specimen type as recorded on the laboratory request form (e.g. nasal, nose and throat, sputum) | Integer | Data-Coding 1853 |
| laboratory | Laboratory that processed the sample | Integer | Data-Coding 1856 |
| origin | A derived field that attempts to indicate whether the patient was an inpatient when the sample was taken. Public Health England (PHE) Colindale via NHS microbiological labs (Second Generation Surveillance System or SGSS) have attempted to identify whether a SARS-CoV-2 test originated from a hospital or elsewhere. The method used is detailed in the section "Construction of the inpatient indicator" below. We recommend that researchers verify inpatient status with hospital episode statistics (HES) data. | Integer | Data-Coding 1855 |
| result | Whether the sample was reported as positive or negative for SARS-CoV-2 | Integer | Data-Coding 1854 |
| reqorg | The requesting organisation description. This is used in the construction of the 'origin' field. | Integer | Data-Coding 3311 |
| acute | Set true if the requesting organisation is known to provide acute (emergency) care, otherwise false. This is used in the construction of the 'origin' field. | Integer | Data-Coding 12 |
| hosaq | Whether the sample is recorded as being hospital acquired. This is used in the construction of the 'origin' field. | Integer | Data-Coding 21 |

Please note that the structure of the table might change in the future.

Any test result records that are exact duplicates of other records (arising from results reaching PHE via different routes) will not be made available.

Please also note that not all laboratories are reporting negative results, and those that do report negative results after a certain date do not necessarily update previously submitted results to include negative tests.
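Because a participant can have several rows in covid19_result, and because a missing negative result does not mean that no test was taken, a per-participant summary of positive results is often a useful first step. The following is a minimal Python sketch, not an official UK Biobank method. It assumes the rows have already been read into (eid, specdate, result) tuples and that result is coded 1 for positive and 0 for negative; please check Data-Coding 1854 before relying on this. The function name first_positive_dates and the example rows are hypothetical.

```python
from datetime import date


def first_positive_dates(rows):
    """Map each eid with at least one positive result to its earliest positive specdate.

    rows: iterable of (eid, specdate, result) tuples.
    Assumption: result == 1 means positive (verify against Data-Coding 1854).
    """
    earliest = {}
    for eid, specdate, result in rows:
        if result != 1:
            # Skip non-positive rows; negatives are not reported by all laboratories.
            continue
        if eid not in earliest or specdate < earliest[eid]:
            earliest[eid] = specdate
    return earliest


# Made-up example rows, not real participant data.
example_rows = [
    (1000001, date(2020, 4, 2), 0),
    (1000001, date(2020, 4, 9), 1),
    (1000001, date(2020, 4, 15), 1),
    (1000002, date(2020, 5, 1), 0),
]
summary = first_positive_dates(example_rows)
```

Participants who do not appear in the returned mapping should be read as "no positive result on record", not as confirmed negative.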

As of the update on 27 May 2020, this data now includes 'pillar 2' positives. Pillar 2 is the 'superlab' testing carried out by commercial partners, used for staff screening and for care homes. For further details, please see the GOV.UK information page.

Guidance for use of the COVID-19 data

We are releasing COVID-19 test results from 16 March 2020 onwards. Please note that the nature of the dataset evolves as testing scales up in line with the national testing strategy. UK testing was initially largely restricted to those with symptoms in hospital, so a positive result was a reasonable proxy for severe disease. Testing capacity has since increased to include more community testing under pillar 2 of the national strategy, and as of 27 April 2020, NHS England has directed hospitals to test all non-elective patients admitted overnight, including asymptomatic patients. These data should therefore be analysed within the context of changing testing capacity. To fully ascertain cases and to evaluate disease severity, data from linked medical records (i.e. primary care, hospital inpatient records and death records) should also be used. See the section below on identifying inpatients for more information.

More information on UK testing capacity, including time-series data, can be found in government guidance.

It takes about 4 days from a sample being taken from the patient, through its transport to a laboratory, testing, reporting and transfer into the PHE data system which supplies data to UK Biobank. UK Biobank updates its data approximately once every two weeks, so researchers should take this time lag into consideration.
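One way to allow for this lag is to treat specimens taken in the last few days before a data extract as potentially incomplete. The sketch below is illustrative only: the 4-day figure is the approximation quoted above, and the function name likely_complete_through and the extract date are hypothetical.

```python
from datetime import date, timedelta

# "About 4 days" from sample to the PHE data system, per the text above (approximate).
REPORTING_LAG_DAYS = 4


def likely_complete_through(extract_date, lag_days=REPORTING_LAG_DAYS):
    """Return the latest specimen date for which results have probably reached the extract."""
    return extract_date - timedelta(days=lag_days)


# Made-up extract date for illustration.
cutoff = likely_complete_through(date(2020, 6, 10))
```

Specimens dated after the cutoff may still be missing from the extract, so analyses of recent trends may wish to exclude them or wait for the next update.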

The vast majority of samples tested for COVID-19 disease are combined nose/throat swabs, which are transported in a medium suitable for viruses (a balanced salt solution) so that PCR can be performed. In intensive care settings, lower respiratory samples may also be analysed.

These data should not be used to estimate prevalence rates or to model projected rates of COVID-19 infection, as UK Biobank is not a nationally representative sample.

A proportion of UK Biobank participants will have had biological samples collected from various sites (upper or lower respiratory tract or other) and at various times in relation to the course of their SARS-CoV-2 infection. The timeline of SARS-CoV-2 test positivity may vary across biological samples.

Construction of the inpatient indicator

The construction of the "origin" field is based on information provided on the specimen request form. If the specimen was marked as being from an acute (emergency) care provider, an A&E department, an inpatient location, or resulted from health care associated infection, it is recorded by PHE as an inpatient sample. Tests marked as being from "Healthcare Worker Testing" are never recorded as inpatient samples, though some may also carry an acute flag.

The aim of designating inpatient status for the SARS-CoV-2 test was to indicate severity of COVID-19 infection. SARS-CoV-2 tests taken in hospital can be undertaken for several reasons including symptomatic patients requiring hospital admission, or general inpatient screening, which includes asymptomatic patients. Furthermore, the algorithm used to flag inpatient status may not necessarily indicate inpatient care in all cases. For example, some tests flagged as coming from "acute" trusts will likely not be inpatients, since these trusts may also perform tests on behalf of GPs and others, and tests requested by A&E may be for patients who are not then admitted.

The flow chart below illustrates the logic used by PHE to generate an indicator of whether a test result was obtained from a hospital inpatient or not (depicted as the "origin" field in the covid19_result table). The fields used to construct the "origin" field have also been released to enable researchers to replicate it and construct their own alternatives if desired.
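For researchers who wish to experiment with their own variants, the rule as described in prose above can be sketched in Python. This is a simplified reading of the text, not PHE's actual algorithm, and it may differ from the flow chart in its details. It assumes reqorg has been decoded to a text label (the real field is an integer under Data-Coding 3311), that acute and hosaq have been converted to booleans, and that 1 stands for "inpatient" and 0 for "not inpatient" (check Data-Coding 1855). The label strings, the SampleRecord class and the derive_origin function are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical decoded reqorg labels; the real codes are listed in Data-Coding 3311.
INPATIENT_REQORGS = {"A&E department", "Inpatient location"}
HEALTHCARE_WORKER = "Healthcare Worker Testing"


@dataclass
class SampleRecord:
    reqorg: str   # decoded requesting organisation description (assumed label)
    acute: bool   # requesting organisation known to provide acute care
    hosaq: bool   # sample recorded as hospital acquired


def derive_origin(rec):
    """Return 1 if the sample would be treated as an inpatient sample, else 0."""
    # Healthcare worker tests are never inpatient samples, even with an acute flag.
    if rec.reqorg == HEALTHCARE_WORKER:
        return 0
    # Acute provider, A&E, inpatient location, or health care associated infection.
    if rec.acute or rec.hosaq or rec.reqorg in INPATIENT_REQORGS:
        return 1
    return 0


# Made-up records for illustration.
staff_test = SampleRecord(reqorg=HEALTHCARE_WORKER, acute=True, hosaq=False)
ae_test = SampleRecord(reqorg="A&E department", acute=False, hosaq=False)
gp_test = SampleRecord(reqorg="GP", acute=False, hosaq=False)
origins = [derive_origin(r) for r in (staff_test, ae_test, gp_test)]
```

Checking the healthcare worker label first reflects the statement that such tests are never recorded as inpatient samples, even when they carry an acute flag.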

PHE's designation of inpatient status can be compared to the hospital episode statistics (HES) dates of admission and discharge recorded by NHS trusts. SARS-CoV-2 positive inpatient status can also be compared with inpatient diagnosis codes for COVID-19 (ICD-10 codes U071 or U072) from HES. For further details, see here.
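The date comparison can be sketched as a simple interval check. The Python example below is a minimal illustration under stated assumptions: admission and discharge dates are assumed to have been extracted from HES already as (admission date, discharge date) pairs for one participant, a missing discharge date is treated as an ongoing stay, and the function name specimen_during_admission is hypothetical.

```python
from datetime import date


def specimen_during_admission(specdate, episodes):
    """Return True if specdate falls within any admission, inclusive of both ends.

    episodes: iterable of (admission_date, discharge_date) pairs for one participant.
    Assumption: a discharge_date of None means the participant is still admitted.
    """
    for admitted, discharged in episodes:
        if admitted <= specdate and (discharged is None or specdate <= discharged):
            return True
    return False


# Made-up admission history for illustration.
episodes = [
    (date(2020, 3, 20), date(2020, 3, 28)),
    (date(2020, 5, 2), None),
]
in_first_stay = specimen_during_admission(date(2020, 3, 25), episodes)
between_stays = specimen_during_admission(date(2020, 4, 10), episodes)
```

A test flagged as inpatient by the origin field but falling outside every admission interval is a candidate for the misclassification cases described above, such as A&E tests for patients who were not then admitted.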

Flowchart for construction of inpatient indicator

