Abstract
Background and research aim Lung cancer is a research priority in the UK. Early diagnosis of lung cancer can improve patients’ survival outcomes. The DART-QResearch project is part of a larger academic-industrial collaborative initiative that uses big data and artificial intelligence to improve outcomes for patients with thoracic diseases. The DART-QResearch project has two general research aims: (1) to understand the natural history of lung cancer, and (2) to develop, validate, and evaluate risk prediction models to select patients at high risk for lung cancer screening.
Methods This population-based cohort study uses the QResearch® database (version 45) and includes patients aged 25 to 84 years without a diagnosis of lung cancer at cohort entry (study period: 1 January 2005 to 31 December 2020). The team conducted a literature review (with additional clinical input) to inform the selection of variables for data extraction from the QResearch database. Statistical techniques will vary by research objective and include descriptive statistics, multi-level modelling, multiple imputation for missing data, fractional polynomials to explore non-linear relationships between continuous variables and the outcome, and Cox regression for the prediction model. We will update our QCancer (lung, 10-year risk) algorithm and compare it with two other mainstream models (LLP and PLCOM2012) for lung cancer screening using the same dataset. We will evaluate the discrimination, calibration, and clinical usefulness of the prediction models, and recommend the best one for lung cancer screening in the English primary care population.
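As an illustration of what evaluating discrimination and calibration can involve, the following is a minimal sketch in Python and not the project's actual analysis code. It assumes Harrell's C-index as the discrimination measure and a crude observed-to-expected event ratio as a calibration summary; the protocol does not specify either choice here. The function names (`harrell_c`, `observed_expected_ratio`) and the toy data are hypothetical, tied follow-up times are simply skipped, and the calibration ratio ignores censoring, which a real analysis would need to handle.

```python
from itertools import combinations


def harrell_c(times, events, risks):
    """Harrell's C-index for right-censored data (simplified sketch).

    times  : follow-up time for each patient
    events : 1 if lung cancer was diagnosed at that time, 0 if censored
    risks  : predicted risk from the model (higher = more likely to have the event)
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is usable only if the earlier patient had an observed event.
        # Assumption: pairs with tied times are skipped for simplicity.
        if times[a] == times[b] or not events[a]:
            continue
        usable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher predicted risk had the earlier event
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    return concordant / usable


def observed_expected_ratio(events, risks):
    """Crude calibration summary: observed events divided by the sum of predicted risks.

    Assumption: every patient is followed for the full prediction horizon,
    so censoring is ignored. Values near 1 suggest good overall calibration.
    """
    return sum(events) / sum(risks)


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    times = [2, 4, 6, 8]
    events = [1, 1, 0, 1]
    risks = [0.9, 0.6, 0.7, 0.1]
    print("C-index:", harrell_c(times, events, risks))
    print("O/E ratio:", observed_expected_ratio(events, risks))
```

A C-index of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so comparing this value across models on the same dataset is one way to compare their discrimination.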
Discussion The DART-QResearch project focuses on both symptomatic and asymptomatic patients in the lung cancer care pathway. A better understanding of the patterns, trajectories, and phenotypes of symptomatic presentation may help GPs consider lung cancer earlier. Screening asymptomatic patients at high risk is another route to earlier diagnosis of lung cancer. The strengths of this study include the use of large-scale, representative, population-based clinical data, robust methodology, and a transparent research process. This project has great potential to contribute to the national cancer strategic plan and to yield substantial public and societal benefits through earlier diagnosis of lung cancer.
Competing Interest Statement
JHC is an unpaid director of QResearch, a not-for-profit organisation in a partnership between the University of Oxford and EMIS Health, who supply the QResearch database for this work. JHC is a founder and shareholder of ClinRisk Ltd and was its medical director until 31 May 2019. ClinRisk Ltd produces open and closed source software to implement clinical risk algorithms into clinical computer systems including the original QCancer algorithms referred to above. CC was a statistical consultant for ClinRisk Ltd. Other authors have no interests to declare for this submitted work.
Funding Statement
The DART project is funded by Innovate UK (UK Research and Innovation, grant reference: 40255). QResearch received funding from the NIHR Biomedical Research Centre, Oxford, and grants from the John Fell Oxford University Press Research Fund, Cancer Research UK (Grant number C5255/A18085) through the Cancer Research UK Oxford Centre, and the Oxford Wellcome Institutional Strategic Support Fund (204826/Z/16/Z).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The DART-QResearch project obtained approval from the QResearch Scientific Committee on 8 March 2021. QResearch is a Research Ethics Approved Research Database, confirmed by the East Midlands Derby Research Ethics Committee (Research ethics reference: 18/EM/0400, project reference: OX37 DART).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
Due to the sensitive nature of anonymised patient-level data (electronic health records), the study data are accessible only to the named researchers approved by the ethics committee.