Detailed Statistics from the 'Get Data Out' programme > Data
In order to fulfil its duty as a public health agency responsible for cancer prevention and control in England, the National Cancer Registration and Analysis Service of Public Health England is expected to produce evidence about cancer incidence, diagnosis, treatment and survival. We fulfil this critical function with a range of outputs, including official statistics, reports, and support for public health research on cancer.
As part of this broader mission, the Get Data Out (GDO) programme produces key cancer statistics for small groups of patients. These outputs are intended for patients, the public and any other general user, and anonymisation is built in from the outset by aggregating the data.
Get Data Out (GDO) tables are currently produced for four statistical areas (incidence, treatment, survival and routes to diagnosis) and for several tumour groups; grouping documents for these are listed further down this page.
Download Get Data Out (GDO) tables
|File|Description|
|---|---|
|GDO_data_wide.csv|A collated .csv file of all the latest statistics in wide format (one row for each group of patients, with all statistics for that group as columns)|
|GDO_data_thin.csv|A collated .csv file of all the latest statistics in thin format (one row for each statistic, with each group of patients having many rows of data)|
|[coming soon]|A collated .json file of all the latest statistics|
|GDO_metadata.csv|A .csv file with the metadata for each statistic, including full descriptions and units|
|GDO_releases.csv|A .csv file listing the releases that make up the Get Data Out table, including release dates and lists of documentation|
|[coming soon]|A .json file listing the releases that make up the Get Data Out table, including release dates and lists of documentation|
|GDO_structure.csv|A .csv file defining the tree structure of the partition|
|GDO_units.csv|A .csv file with the metadata for each unit|
|GDO_missing.csv|A .csv file containing the lookups for the missing data codes|
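The relationship between the thin and wide formats can be illustrated with a short sketch. This is only an illustration: the column names (`group_id`, `statistic`, `value`) and the sample values are invented for the example and are not the real schema of GDO_data_thin.csv, which is described in GDO_metadata.csv and GDO_structure.csv.

```python
import csv
import io
from collections import defaultdict

# Hypothetical thin-format extract. The column names and values are invented
# for illustration and are NOT the real GDO_data_thin.csv schema.
THIN_CSV = """group_id,statistic,value
G1,incidence_count,120
G1,incidence_rate,14.2
G2,incidence_count,105
G2,incidence_rate,9.8
"""


def thin_to_wide(thin_rows):
    """Turn one-row-per-statistic records into one-row-per-group records."""
    wide = defaultdict(dict)
    for row in thin_rows:
        # Each statistic becomes a column on its group's row
        wide[row["group_id"]][row["statistic"]] = row["value"]
    # One flat dict per group, in the order groups were first seen
    return [{"group_id": gid, **stats} for gid, stats in wide.items()]


thin_rows = list(csv.DictReader(io.StringIO(THIN_CSV)))
wide_rows = thin_to_wide(thin_rows)
for row in wide_rows:
    print(row)
```

Values are kept as strings here, as `csv.DictReader` returns them. A real conversion would also need to handle the missing data codes defined in GDO_missing.csv.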
The Get Data Out tables were last updated on 2019-12-18. A list of all data releases is available at the bottom of the page.
Using the Data
The data can be downloaded from the links above. Alternatively, tools and webpages can be pointed directly at the data at our static URLs. More detail for developers about the data structures and accessing the data is available separately.
The data is signed off as non-disclosive and is released under an Open Government Licence. You are free to copy, publish, distribute and transmit the information, and to adapt it and include it in your own products. The attribution statement that must be included with any reuse of the data is:
Data for this [study/ project/ report/tool] is based on patient-level information collected by the NHS, as part of the care and support of cancer patients. The data is collated, maintained and quality assured by the National Cancer Registration and Analysis Service, which is part of Public Health England (PHE). The data is taken from the Get Data Out tables.
Understanding the Get Data Out groupings
The Get Data Out programme partitions diagnoses of cancer into many small groups, where each group contains approximately 100 people with the same characteristics.
The grouping process can be imagined as a rooted branching tree, where the first node is the group 'all tumours', and each branch point divides by a dimension of interest (e.g. age, region, sex). If a node contains too few tumours, it cannot be divided further without making groups of fewer than 100, and so the tree terminates there. If the node has enough tumours, it branches again by the next dimension of interest. We have visualised the trees for each tumour type, and you can view them on the pages dedicated to each tumour type.
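The branching process described above can be sketched as a small recursive function. This is a minimal sketch, not the programme's actual implementation: the stopping rule (a node is left unsplit if any child would hold fewer than 100 tumours), the fixed ordering of dimensions, and the synthetic records are all assumptions made for the example.

```python
from collections import defaultdict

MIN_GROUP_SIZE = 100  # threshold taken from the description above


def partition(tumours, dimensions, path=()):
    """Recursively split `tumours` by each dimension in turn.

    Assumed rule: a node is split only if every resulting child would hold
    at least MIN_GROUP_SIZE tumours; otherwise the node becomes a leaf.
    Returns a list of (path, count) pairs, one per leaf group.
    """
    if not dimensions:
        return [(path, len(tumours))]
    dim, rest = dimensions[0], dimensions[1:]
    children = defaultdict(list)
    for tumour in tumours:
        children[tumour[dim]].append(tumour)
    if any(len(child) < MIN_GROUP_SIZE for child in children.values()):
        # Splitting would create a group that is too small, so stop here
        return [(path, len(tumours))]
    leaves = []
    for value in sorted(children):
        leaves.extend(partition(children[value], rest, path + ((dim, value),)))
    return leaves


# Synthetic records, invented purely for illustration
tumours = (
    [{"sex": "F", "age": "under60"}] * 150
    + [{"sex": "F", "age": "60plus"}] * 120
    + [{"sex": "M", "age": "under60"}] * 130
    + [{"sex": "M", "age": "60plus"}] * 40
)

leaves = partition(tumours, ["sex", "age"])
for path, count in leaves:
    print(path, count)
```

In this example the female branch splits again by age, while the male branch stops at sex, because splitting it by age would create a group of only 40 tumours.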
The Get Data Out tumour groupings are explained in more detail in the documents below:
- Brain, meningeal and other primary CNS tumours grouping
- Head and neck cancer grouping
- Ovarian, fallopian tube and primary peritoneal carcinomas grouping
- Pancreatic grouping
- Prostate grouping
- Testicular tumours including post-pubertal teratomas grouping
Statistics available in the Get Data Out table
There are currently four statistical releases available in the Get Data Out table.
Incidence. Statistics are provided on the number of new tumours diagnosed in each group and the incidence rate of cancer in this group with upper and lower confidence intervals.
Treatment. Statistics are provided on the number of tumours treated with surgery, chemotherapy, radiotherapy and all combinations of these treatments in each group, the percentage of tumours treated, and the upper and lower confidence intervals around the percentage.
Survival. Statistics are provided on the number of tumours included in the survival calculation and the net and crude survival rates in each group at 3, 6, 9, 12, 24, 36 and 48 months after diagnosis, with upper and lower confidence intervals.
Routes to Diagnosis. Statistics are provided on the number of tumours diagnosed by each 'route to diagnosis' and the percentage of tumours diagnosed by each route, with upper and lower confidence intervals. The eight standard diagnostic routes (two week wait, GP referral, screening, other outpatient, inpatient elective, emergency presentation, death certificate only, and unknown) are provided, along with a 'not classified' group. To find out more, please visit http://ncin.org.uk/publications/routes_to_diagnosis.
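For the percentage statistics described above (treatment and routes to diagnosis), the sketch below shows one common way to compute a percentage with upper and lower confidence limits from a count and a group total. This page does not state which interval method the Get Data Out tables use, so the Wilson score interval and the 95% level here are assumptions. The function name `percentage_with_ci` and the example counts are hypothetical.

```python
import math


def percentage_with_ci(count, total, z=1.96):
    """Return (percentage, lower, upper) using the Wilson score interval.

    Assumption: the Wilson method and z=1.96 (roughly 95%) are illustrative
    choices, not necessarily the method used in the published tables.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = count / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return 100 * p, 100 * (centre - half_width), 100 * (centre + half_width)


# Invented example: 45 of 120 tumours in a group received a given treatment
pct, lower, upper = percentage_with_ci(45, 120)
print(f"{pct:.1f}% (95% CI {lower:.1f}% to {upper:.1f}%)")
```

The Wilson interval is used in this sketch because it stays within 0% to 100% and behaves sensibly for groups of around 100 tumours, where a simple normal approximation can be unreliable.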
If you have feedback on this pilot, or any other queries about the Get Data Out tables, please email us. Mentioning 'Get Data Out' in your email will help us get your query to the right people.