ABSTRACT
Background Primary care electronic health records provide a rich source of information for inequalities research. However, the reliability and validity of research derived from these records depend on the completeness and resolution of the codelists used to identify marginalised populations.
Aim The aim of this project was to develop comprehensive codelists for identifying ethnic minorities, people with learning disabilities (LD), people with severe mental illness (SMI) and people who are transgender.
Design and setting This study was a codelist development project, conducted using primary care data from the United Kingdom.
Method Groups of interest were defined a priori. Relevant clinical codes were identified by searching Clinical Practice Research Datalink (CPRD) publications, codelist repositories and the CPRD code browser. Relevant codelists were downloaded and merged according to marginalised group. Duplicates were removed and the remaining codes were reviewed by two general practitioners. Comprehensiveness was assessed in a representative CPRD population of 10,966,759 people by comparing the number of individuals identified using the curated codelists with the number identified using commonly used alternatives.
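The merge, de-duplication and comparison steps described in the Method can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the (code, description) and (patient_id, code) tuple layouts, and the toy codes such as "C001" are all hypothetical, and the GP review step is not modelled.

```python
def merge_codelists(codelists):
    """Merge several codelists into one, keeping the first occurrence of each code.

    Each codelist is an iterable of (code, description) pairs (assumed layout).
    """
    merged = {}
    for codelist in codelists:
        for code, description in codelist:
            key = code.strip()
            if key not in merged:  # drop duplicates across source lists
                merged[key] = description
    return merged


def patients_identified(records, codes):
    """Return the set of patient ids with at least one record matching `codes`.

    `records` is an iterable of (patient_id, code) pairs (assumed layout).
    """
    return {patient_id for patient_id, code in records if code in codes}


def additional_patients(records, curated, comparator):
    """Patients found by the curated codelist but missed by the comparator."""
    return patients_identified(records, curated) - patients_identified(records, comparator)


# Toy data with made-up codes, for illustration only.
list_a = [("C001", "Learning disability"), ("C002", "Mild learning disability")]
list_b = [("C002", "Mild learning disability"), ("C003", "On learning disability register")]

curated = merge_codelists([list_a, list_b])
comparator = {"C001": "Learning disability"}

records = [(1, "C001"), (2, "C002"), (2, "C003"), (3, "C003"), (4, "X999")]
extra = additional_patients(records, curated, comparator)

print(len(curated), sorted(extra))
```

Counting distinct patients rather than matching records means that an individual with several relevant codes is counted only once, which is the unit the comparison in the Method refers to.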
Results A total of 52 codelists were identified. After removal of duplicates and GP review, 1,420 unique codes were selected. Compared with the comparator codelists, the curated codelists identified an additional 48,017 (76.6%), 52,953 (68.9%) and 508 (36.9%) people with an LD, SMI or transgender code, respectively. The frequencies identified for ethnicity were consistent with expectations for the UK population.
Conclusion The codelists curated through this project will improve inequalities research by raising the standard of identification of marginalised groups in primary care data.
HOW THIS FITS IN
The reliability and validity of primary care data for inequalities research depend on the comprehensiveness of the codes used to identify people from marginalised groups.
This study set out to develop comprehensive codelists for the identification of four key groups known to experience health inequalities.
Using a systematic approach, we developed comprehensive codelists for identifying people from ethnic minorities, people with learning disabilities, people with severe mental illness and people who are transgender.
The codelists were validated by two general practitioners, assessed in a representative sample, and can now be used in primary care practice and research, both nationally and internationally.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This work is funded by the National Institute for Health Research (NIHR) Policy Research Programme, conducted through the Policy Research Unit in Cancer Awareness, Screening and Early Diagnosis, NIHR206132. The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care. This work was supported by Breast Cancer Now [Grant Ref no: 2023FebIFS1615].
Footnotes
Ethical approval. This project did not involve the collection of primary data. As such, ethical approval was not required.
Data availability. The codelists developed through this work are available from Open Science Framework: https://osf.io/8skze/
Competing interests. The authors have no conflicts of interest to declare.