Abstract
Background Primary care data in the UK are widely used for cancer research, but the reliability of recording key events such as diagnoses remains uncertain. Data linkage can mitigate these uncertainties; however, researchers may avoid linkage due to high costs, tight timelines, and sample size limitations. Hence, this study aimed to assess the quality of prostate cancer (PCa) diagnoses in primary care. We utilised Clinical Practice Research Datalink (CPRD) primary care data linked to National Cancer Registration and Analysis Service (NCRAS) and Hospital Episode Statistics (HES) in England. We compared accuracy, completeness, and timing of diagnosis recording between sources to facilitate decision-making regarding data source selection for future research.
Methods We examined incident PCa diagnoses (2000-2016) in males aged ≥46 years that were recorded in at least one study data source. The accuracy of a data source was estimated as the proportion of diagnoses recorded in that source that were also confirmed by any linked source. Completeness was estimated as the proportion of all diagnoses in the linked sources that had a matching diagnosis in that source.
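The two measures defined in the Methods can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names and the toy patient identifiers are assumptions made for illustration, and the matching step is reduced to set intersection on patient ID, ignoring any date or coding criteria the study may have applied.

```python
def accuracy(source: set, linked: set) -> float:
    """Share of diagnoses in `source` that are also found in `linked`."""
    if not source:
        raise ValueError("source contains no diagnoses")
    return len(source & linked) / len(source)


def completeness(source: set, linked: set) -> float:
    """Share of diagnoses in `linked` that have a match in `source`."""
    if not linked:
        raise ValueError("linked sources contain no diagnoses")
    return len(source & linked) / len(linked)


# Hypothetical patient IDs with a PCa diagnosis in each source (toy data).
cprd = {1, 2, 3, 4}
ncras = {2, 3, 4, 5, 6}
hes = {3, 4, 6, 7}

# "Any linked source" is the union of the linked datasets.
any_linked = ncras | hes

cprd_accuracy = accuracy(cprd, any_linked)          # 3 of 4 CPRD cases confirmed
cprd_completeness = completeness(cprd, any_linked)  # 3 of 6 linked cases captured
cprd_vs_ncras = completeness(cprd, ncras)           # 3 of 5 NCRAS cases captured

print(f"accuracy={cprd_accuracy:.2f}, completeness={cprd_completeness:.2f}")
```

In this framing, accuracy behaves like a positive predictive value and completeness like a sensitivity, each computed with the linked sources as the reference.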
Results The study included 51,487 patients with a PCa diagnosis recorded in at least one data source. CPRD demonstrated 86.9% accuracy and 68.2% completeness against NCRAS, and 75.1% accuracy and 61.1% completeness against HES. Overall, CPRD showed the highest accuracy (93%) but the lowest completeness (60.7%). Diagnosis dates in CPRD were more concordant with NCRAS (90.6% within 6 months) than with HES (61.2%). Accuracy and completeness improved over time, especially after 2004. Diagnosis dates in CPRD lagged those in NCRAS by a median of 2 weeks and those in HES by a median of 1 week. CPRD Aurum showed better data quality than CPRD GOLD.
Conclusions While the accuracy of PCa diagnoses in CPRD compared to linked sources was high, completeness was low. Therefore, linking to HES or NCRAS should be considered for improved case capture, acknowledging their inherent limitations.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This research was funded by the University of Surrey, UK, as part of a doctoral studentship awarded to Gayasha.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The study was approved by the Medicines and Healthcare Products Regulatory Agency (MHRA) Independent Scientific Advisory Committee (protocol number 19_050R).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
The data used in this study were obtained from the CPRD, facilitated by the UK MHRA. The authors' licence for these data does not permit sharing raw data with third parties. For information on access to CPRD data, see the "Research applications" page on the CPRD website.