Clinical coding of long COVID in English primary care: a federated analysis of 58 million patient records in situ using OpenSAFELY ================================================================================================================================== * The OpenSAFELY Collaborative * Alex J Walker * Brian MacKenna * Peter Inglesby * Christopher T Rentsch * Helen J Curtis * Caroline E Morton * Jessica Morley * Amir Mehrkar * Seb Bacon * George Hickman * Chris Bates * Richard Croker * David Evans * Tom Ward * Jonathan Cockburn * Simon Davy * Krishnan Bhaskaran * Anna Schultze * Elizabeth J Williamson * William J Hulme * Helen I McDonald * Laurie Tomlinson * Rohini Mathur * Rosalind M Eggo * Kevin Wing * Angel YS Wong * Harriet Forbes * John Tazare * John Parry * Frank Hester * Sam Harper * Shaun O’Hanlon * Alex Eavis * Richard Jarvis * Dima Avramov * Paul Griffiths * Aaron Fowles * Nasreen Parkes * Ian J Douglas * Stephen JW Evans * Liam Smeeth * Ben Goldacre ## Abstract **Background** Long COVID is a term to describe new or persistent symptoms at least four weeks after onset of acute COVID-19. Clinical codes to describe this phenomenon were released in November 2020 in the UK, but it is not known how these codes have been used in practice. **Methods** Working on behalf of NHS England, we used OpenSAFELY data encompassing 96% of the English population. We measured the proportion of people with a recorded code for long COVID, overall and by demographic factors, electronic health record software system, and week. We also measured variation in recording amongst practices. **Results** Long COVID was recorded for 23,273 people. Coding was unevenly distributed amongst practices, with 26.7% of practices having not used the codes at all. Regional variation was high, ranging between 20.3 per 100,000 people for East of England (95% confidence interval 19.3-21.4) and 55.6 in London (95% CI 54.1-57.1). The rate was higher amongst women (52.1, 95% CI 51.3-52.9) compared to men (28.1, 95% CI 27.5-28.7), and higher amongst practices using EMIS software (53.7, 95% CI 52.9-54.4) compared to TPP software (20.9, 95% CI 20.3-21.4). **Conclusions** Long COVID coding in primary care is low compared with early reports of long COVID prevalence. This may reflect under-coding, sub-optimal communication of clinical terms, under-diagnosis, a true low prevalence of long COVID diagnosed by clinicians, or a combination of factors. We recommend increased awareness of diagnostic codes, to facilitate research and planning of services; and surveys of clinicians’ experiences, to complement ongoing patient surveys. ## Background Long COVID has been broadly defined as new or persistent symptoms of COVID-19 beyond the acute phase of SARS-CoV-2 infection. The National Institute for Health and Care Excellence (NICE) have produced guidance on managing the long-term effects of COVID-19 as these symptoms can have a significant effect on a person’s quality of life.1NICE recognise that as long COVID is such a new condition the exact clinical definition and treatments are evolving. NICE developed their definitions and clinical guidelines using a “living” approach based on early data. This means that the guidelines will be continuously reviewed and updated and it is therefore critical to continue studying the long-term effects of COVID-19 as data accrue, and refine the guidelines appropriately. To support this need, long COVID SNOMED-CT codes (the “diagnostic codes” in Table 3) were developed and released in the UK in November 2020. To support clinical care and implementation of NICE guidance, distinct SNOMED-CT codes are available which distinguish between the length of ongoing symptoms. Symptoms between 4-12 weeks are defined as “ongoing symptomatic COVID-19”, and symptoms continuing beyond 12 weeks as “post-COVID-19 syndrome”.2There are also 3 assessment codes and 10 referral codes relating to long COVID. We set out to describe the use of long COVID codes in English primary care since their introduction, in a cohort of covering approximately 96% of the English population - those covered by the two largest electronic health record providers, EMIS and TPP. We describe the distribution of use amongst general practices, demographics and over time. ## Methods ### Study design and data sources We conducted a study calculating the period prevalence of long COVID recording in electronic health record (EHR) data from primary care practices using TPP or EMIS software, linked to Secondary Uses Service (SUS) data (containing hospital records) through OpenSAFELY. Primary care records managed by the GP software providers EMIS and TPP were accessed through OpenSAFELY, an open source data analytics platform created by our team on behalf of NHS England to address urgent COVID-19 research questions ([https://opensafely.org](https://opensafely.org)). OpenSAFELY provides a secure software interface allowing a federated analysis of pseudonymized primary care patient records from England in near real-time within the EMIS and TPP highly secure data environments. Non-disclosive, aggregated results are exported to GitHub where further data processing and analysis takes place. This avoids the need for large volumes of potentially disclosive pseudonymised patient data to be transferred off-site. This, in addition to other technical and organisational controls, minimizes any risk of re-identification. The dataset available to the platform includes pseudonymised data such as coded diagnoses, medications and physiological parameters. No free text data are included. All activity on the platform is publicly logged and all analytic code and supporting clinical coding lists are automatically published. In addition, the framework provides assurance that the analysis is reproducible and reusable. Further details on our information governance and platform can be found in the Appendix under information governance and ethics. ### Population We included all people registered with a general practice on the 1st November 2020. ### Outcome The outcome was any record of long COVID in the primary care record. This was defined using a list of 15 UK SNOMED codes, which are listed in Table 3 and categorised as diagnostic (2 codes), referral (3) and assessment (10).3The outcome was measured between the study start date (2020-02-01) and the end date (2021-04-25). Though the start date is before the codes were created, it’s possible to backdate diagnostic codes in a GP system. Timing of outcomes was determined by the first record of a SNOMED code for each person, as determined by the date recorded by the clinician. ### Stratifiers Demographic variables were extracted including age (in categories), sex, geographic region, Index of Multiple Deprivation (IMD, divided into quintiles), ethnicity and previous COVID/SARS-CoV-2 (categorised into hospitalised and tested positive for SARS-CoV-2 but not hospitalised). Counts and rates of recorded events were stratified by each demographic variable. Recording of each SNOMED code was assessed individually, in this case counting every recorded code including repeated codes, rather than one per patient. ### Statistical methods We calculated proportions of patients with long COVID codes over the whole study period per 100,000 patients, 95% confidence intervals of those proportions, and the distribution of codes by each stratification variable. ### Software and Reproducibility Data management and analysis was performed using the OpenSAFELY software libraries and Jupyter notebooks, both implemented using Python 3. More details are available in the Supplementary Materials. This is an analysis delivered using federated analysis through the OpenSAFELY platform: codelists and code for data management and data analysis were specified once using the OpenSAFELY tools; then transmitted securely from the OpenSAFELY jobs server to the OpenSAFELY-TPP platform within TPP’s secure environment, and separately to the OpenSAFELY-EMIS platform within EMIS’s secure environment, where they were each executed separately against local patient data; summary results were then reviewed for disclosiveness, released, and combined for the final outputs. All code for the OpenSAFELY platform for data management, analysis and secure code execution is shared for review and re-use under open licenses at GitHub.com/OpenSAFELY. All code for data management and analysis for this paper is shared for scientific review and re-use under open licenses on GitHub [https://github.com/opensafely/long-covid](https://github.com/opensafely/long-covid). ## Results ### Cohort characteristics and overall rate of recording There were 58.5m people in the combined cohort in total, 24.8m in the TPP cohort, and 33.7m in the EMIS cohort. Demographics of the cohort are described in Table 1. Up to XXth March 2021, there were 23,273 patients with a recorded code indicative of long COVID diagnosis. A higher proportion of these recorded diagnoses were in EMIS, with 18,262, compared to 5,011 in TPP. Taking into account the larger total number of patients in EMIS practices, the rate over the whole study period was 53.7 per 100,000 people (95% CI 52.9-54.4) in EMIS and 20.9 (95% CI 20.3-21.4) in TPP. View this table: [Table 1:](http://medrxiv.org/content/early/2021/05/13/2021.05.06.21256755/T1) Table 1: Characteristics of the cohort ### Rate of coding stratified by demographics Counts and rates of long COVID coding stratified by demographic factors are presented in Table 2. For age, the incidence of long COVID recording rises to a peak in the 45-54 group, before declining again in older age groups. Women had a higher rate of recording than men (52.1 (95% CI 51.3-52.9) vs 28.1 (95% CI 27.5-28.7) per 100,000 people). Counts of long COVID recording by IMD and ethnicity are reported in supplementary table S1. View this table: [Table 2:](http://medrxiv.org/content/early/2021/05/13/2021.05.06.21256755/T2) Table 2: Counts and rates of long covid coding stratified by demographic variable View this table: [Table 3:](http://medrxiv.org/content/early/2021/05/13/2021.05.06.21256755/T3) Table 3: Total use of each individual long COVID related code. This is distinct from table 1 in that it counts all coded events, including where patients have been coded more than once. ### Geographic and practice distribution of coding The rate of coding varies substantially between regions (Table 2), from a minimum rate of 20.3 per 100,000 people in the East of England (95%CI 19.3-21.4) to 55.6 in London (95% CI 54.1-57.1), though these region specific data were only available in TPP practices at the time of data extraction. Over a quarter (26.7%) of practices have not used the codes at all. This proportion is much higher in practices using TPP software (44.2%) than those using EMIS (15.1%). The distribution is described more fully in Figure 1. The highest number of codes in a single practice was 150. ![Figure 1:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/05/13/2021.05.06.21256755/F1.medium.gif) [Figure 1:](http://medrxiv.org/content/early/2021/05/13/2021.05.06.21256755/F1) Figure 1: Volume of code use in individual practices, stratified by the electronic health record provider of the practice. ### Rate of coding over time The number of recorded events was relatively low until the end of January 2021, after which there was an increase in coding (Figure 2). This increase was more marked in EMIS practices, which before that time had recorded fewer long COVID codes overall than TPP practices. ![Figure 2:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/05/13/2021.05.06.21256755/F2.medium.gif) [Figure 2:](http://medrxiv.org/content/early/2021/05/13/2021.05.06.21256755/F2) Figure 2: Use of long COVID codes over time, stratified by the electronic health record provider of the practice. Reporting lag may affect recent dates ### Coding of individual SNOMED codes The diagnostic codes were the most commonly used codes, particularly the “Post-COVID-19 syndrome” code, which accounted for 64.3% of all recorded codes. However there are differences in the distribution of codes between TPP and EMIS practices (Table 3). Codes relating to assessment of long COVID accounted for just 2.4% of long COVID codes used to date. ## Discussion ### Summary As of late April 2021, 23,273 people had a record of at least one long COVID code in their primary care record. Use between different general practices varied greatly, and a large proportion (26.7%) have never used any long COVID code. We found substantially higher recording in practices that use EMIS software compared to those who use TPP software. Amongst those people who did have a recorded long COVID code, rates were highest in the working age population and more common in women. ### Strengths and weaknesses The key strength of this study is its unprecedented scale: we include over 58 million people, 95% of the population in England. In contrast with many studies that use electronic health record data, we were also able to compare long COVID diagnostic codes between practices that use different software systems, and find a striking disparity: this has important implications for understanding whether clinicians are using the codes appropriately. A key weakness of this data for estimating true prevalence of long COVID in primary care, and factors associated with the condition, is that it relies on clinicians formally entering a diagnostic or referral code into the patient’s record: we note that this is a limitation of all electronic health record research for all clinical conditions and activity; however the emergence of a new diagnosis and recent launch of a new set of diagnostic codes may present new challenges in this regard. ### Research in Context To our knowledge there are no other studies on prevalence of long COVID using clinicians’ diagnoses or electronic health records data. There are numerous studies using self-reported data from patients on the prevalence of continued symptoms following COVID-19, with estimates varying between 4.5% and 89%, largely due to highly variable case definitions4; individual symptoms characteristing long COVID have been reported as fatigue, headache, dyspnea and anosmia5. The ONS COVID Infection Survey estimates prevalence of self-diagnosed long COVID at 13.7%6. Separately numerous cohort studies have reported an increased risk of serious cardiovascular and metabolic outcomes following hospitalisation with COVID7,8, and there are various prospective studies such as PHOSP following up hospitalised patients over the next year9. ### Interpretation and implications The prevalence of long COVID codes in primary care is extremely low when compared with early reports of long COVID prevalence. The large variation in apparent rate of long COVID between different geographic regions, practices and electronic health record systems strongly suggests that clinicians’ coding practice is inconsistent at present. This may reflect variation in awareness of new diagnostic codes that were only launched in November 2020, and only available in EMIS at the end of January 2021. In addition, the codes for long COVID and associated synonyms do not currently contain the term “long COVID”: this was an active choice by NICE after detailed consideration of terminology in their guidance, and by NHS Digital who manage SNOMED-UK codes[ref]. This has resulted in a mismatch between formal clinical terminology and popular parlance among clinicians and patients. In our view those managing SNOMED terminology for England should either update the long COVID codes to include the phrase “long COVID”, ideally in advance of the upcoming new SNOMED international release; or energetically disseminate their preferred new phrasing to all frontline clinicians, to ensure more appropriate use of these codes. Similarly NICE and other authoritative bodies giving guidance on long COVID should energetically communicate to clinicians the importance of correctly coding long COVID in patient records. It is a high national priority to estimate the prevalence of long COVID, identify its causes and consequences, and plan services appropriately. This cannot be done when clinical terminology is ambiguously and inconsistently used. The variation in recording between users of different electronic health record software is also striking. After speaking with clinicians and both software vendors, the reasons for this disparity remain unclear, but are likely attributable to differences in user interface, which has previously been shown to influence clinicians’ treatment choices10,11. This should be addressed by interviewing GPs about their experiences with diagnosing and treating people with long COVID in each system. Despite these issues around correct recording of clinicians’ diagnoses, there also remains a strong possibility that clinicians are not currently diagnosing their patients as having long COVID. The prevalence of long COVID in self-report in patient surveys is substantially higher than we have found here. To our mind this disparity can only be resolved by conducting prospective surveys with clinicians themselves, evaluating how many patients they have seen with a condition they would understand to be diagnosable as long COVID, perhaps complemented with qualitative research on the topic. If we accept that the different rates of long COVID usage in each sub-group reflects the true comparative risk for each demographic then there are two key findings: firstly, the lower rate in older patients, despite their higher prevalence of severe acute COVID outcomes12, which may be affected by the competing risk of death in COVID-19 patients; and secondly, the higher rate of long COVID in women, despite the higher prevalence of severe acute COVID outcomes in men, which may be explained in part by differences in routine consultation rates between men and women. 13 ## Conclusions Current recording of long COVID in primary care is variable, and low. This may reflect under-coding, sub-optimal design and communication of clinical terms, under-diagnosis, a true low prevalence of long COVID diagnosed by clinicians, or a combination of factors. We will update this analysis regularly with extended follow-up time. ## Data Availability All data were linked, stored, and analysed securely within the OpenSAFELY platform. Detailed pseudonymised patient data are potentially re-identifiable and therefore not shared. We rapidly delivered the OpenSAFELY data analysis platform without previous funding to deliver timely analyses of urgent research questions in the context of the global COVID-19 health emergency: now that the platform is established, we have established a process for external users to request access to the platform [https://www.opensafely.org/onboarding-new-users/](https://www.opensafely.org/onboarding-new-users/). ## Conflicts of Interest All authors have completed the ICMJE uniform disclosure form at [www.icmje.org/coi\_disclosure.pdf](http://www.icmje.org/coi_disclosure.pdf) and declare the following: over the past five years BG has received research funding from the Laura and John Arnold Foundation, the NHS National Institute for Health Research (NIHR), the NIHR School of Primary Care Research, the NIHR Oxford Biomedical Research Centre, the Mohn-Westlake Foundation, NIHR Applied Research Collaboration Oxford and Thames Valley, the Wellcome Trust, the Good Thinking Foundation, Health Data Research UK (HDRUK), the Health Foundation, and the World Health Organisation; he also receives personal income from speaking and writing for lay audiences on the misuse of science. KB holds a Sir Henry Dale fellowship jointly funded by Wellcome and the Royal Society (107731/Z/15/Z). HIM is funded by the NIHR Health Protection Research Unit in Immunisation, a partnership between Public Health England and London School of Hygiene & Tropical Medicine. AYSW holds a fellowship from the British Heart Foundation. EJW holds grants from MRC. RG holds grants from NIHR and MRC. RM holds a Sir Henry Wellcome Fellowship funded by the Wellcome Trust (201375/Z/16/Z). HF holds a UKRI fellowship. IJD has received unrestricted research grants and holds shares in GlaxoSmithKline (GSK). ## Funding This work was jointly funded by UKRI, NIHR and Asthma UK-BLF [COV0076; MR/V015737/] and the Longitudinal Health and Wellbeing strand of the National Core Studies programme. EMIS and TPP provided technical expertise and infrastructure within their data environments *pro bono* in the context of a national emergency. The OpenSAFELY software platform is supported by a Wellcome Discretionary Award. BG’s work on clinical informatics is supported by the NIHR Oxford Biomedical Research Centre and the NIHR Applied Research Collaboration Oxford and Thames Valley. Funders had no role in the study design, collection, analysis, and interpretation of data; in the writing of the report; and in the decision to submit the article for publication. The views expressed are those of the authors and not necessarily those of the NIHR, NHS England, Public Health England or the Department of Health and Social Care. ## Information governance and ethical approval NHS England is the data controller; EMIS and TPP are the data processors; and the key researchers on OpenSAFELY are acting on behalf of NHS England. This implementation of OpenSAFELY is hosted within the EMIS and TPP environments which are accredited to the ISO 27001 information security standard and are NHS IG Toolkit compliant;[40,41] patient data has been pseudonymised for analysis and linkage using industry standard cryptographic hashing techniques; all pseudonymised datasets transmitted for linkage onto OpenSAFELY are encrypted; access to the platform is via a virtual private network (VPN) connection, restricted to a small group of researchers; the researchers hold contracts with NHS England and only access the platform to initiate database queries and statistical models; all database activity is logged; only aggregate statistical outputs leave the platform environment following best practice for anonymisation of results such as statistical disclosure control for low cell counts.[42] The OpenSAFELY research platform adheres to the obligations of the UK General Data Protection Regulation (GDPR) and the Data Protection Act 2018. In March 2020, the Secretary of State for Health and Social Care used powers under the UK Health Service (Control of Patient Information) Regulations 2002 (COPI) to require organisations to process confidential patient information for the purposes of protecting public health, providing healthcare services to the public and monitoring and managing the COVID-19 outbreak and incidents of exposure; this sets aside the requirement for patient consent.[43] Taken together, these provide the legal bases to link patient datasets on the OpenSAFELY platform. GP practices, from which the primary care data are obtained, are required to share relevant health information to support the public health response to the pandemic, and have been informed of the OpenSAFELY analytics platform. This study was approved by the Health Research Authority (REC reference 20/LO/0651) and by the LSHTM Ethics Board (reference 21863). ## Guarantor BG/LS are guarantors of the OpenSAFELY project. ## Acknowledgements We are very grateful for all the support received from the EMIS and TPP Technical Operations team throughout this work, and for generous assistance from the information governance and database teams at NHS England / NHSX. ## Appendix ### Information governance and ethics NHS England is the data controller; TPP is the data processor; and the key researchers on OpenSAFELY are acting on behalf of NHS England. OpenSAFELY is hosted within the TPP environment which is accredited to the ISO 27001 information security standard and is NHS IG Toolkit compliant; 14,15patient data are pseudonymised for analysis and linkage using industry standard cryptographic hashing techniques; all pseudonymised datasets transmitted for linkage onto OpenSAFELY are encrypted; access to the platform is via a virtual private network (VPN) connection, restricted to a small group of researchers who hold contracts with NHS England and only access the platform to initiate database queries and statistical models. Pseudonymised structured data include demographics, medications prescribed from primary care, diagnoses, and laboratory measures. No free text data are included. All database activity is logged; only aggregate statistical outputs leave the platform environment following best practice for anonymisation of results such as statistical disclosure control for low cell counts16The OpenSAFELY research platform adheres to the obligations of the UK General Data Protection Regulation (GDPR) and the Data Protection Act 2018. In March 2020, the Secretary of State for Health and Social Care used powers under the UK Health Service (Control of Patient Information) Regulations 2002 (COPI) to require organisations to process confidential patient information for the purposes of protecting public health, providing healthcare services to the public and monitoring and managing the COVID-19 outbreak and incidents of exposure; this sets aside the requirement for patient consent. 17Taken together, these provide the legal bases to link patient datasets on the OpenSAFELY platform. GP practices, from which the primary care data are obtained, are required to share relevant health information to support the public health response to the pandemic, and have been informed of the OpenSAFELY analytics platform. This study was approved by the Health Research Authority (REC reference 20/LO/0651) and by the LSHTM Ethics Board (ref 21863). * Received May 6, 2021. * Revision received May 6, 2021. * Accepted May 13, 2021. * © 2021, Posted by Cold Spring Harbor Laboratory This pre-print is available under a Creative Commons License (Attribution 4.0 International), CC BY 4.0, as described at [http://creativecommons.org/licenses/by/4.0/](http://creativecommons.org/licenses/by/4.0/) ## References 1. 1.Overview |COVID-19 rapid guideline: managing the long-term effects of COVID-19 |Guidance | NICE. 2. 2.1 Identifying people with ongoing symptomatic COVID-19 or post-COVID-19 syndrome | COVID-19 rapid guideline: managing the long-term effects of COVID-19 |Guidance |NICE. 3. 3.NHS Digital. COVID-19 Information Standards: COVID-19 SNOMED CT codes by groups 20201221v1.0. (2020). 4. 4.Living with Covid19 – Second review. [https://evidence.nihr.ac.uk/themedreview/living-with-covid19-second-review/](https://evidence.nihr.ac.uk/themedreview/living-with-covid19-second-review/) (2021) doi: 10.3310/themedreview_45225. 5. 5.Sudre, C. H. et al. Attributes and predictors of long COVID. Nat. Med. (2021) doi: 10.1038/s41591-021-01292-y. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41591-021-01292-y&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=33692530&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F05%2F13%2F2021.05.06.21256755.atom) 6. 6.Ayoubkhani, D. Prevalence of ongoing symptoms following coronavirus (COVID-19) infection in the UK - Office for National Statistics. [https://www.ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/conditionsanddiseases/bulletins/prevalenceofongoingsymptomsfollowingcoronaviruscovid19infectionintheuk/1april2021](https://www.ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/conditionsanddiseases/bulletins/prevalenceofongoingsymptomsfollowingcoronaviruscovid19infectionintheuk/1april2021) (2021). 7. 7.Ayoubkhani, D. et al. Post-covid syndrome in individuals admitted to hospital with covid-19: retrospective cohort study. BMJ 372, n693 (2021). 8. 8.The OpenSAFELY Collaborative et al. Rates of serious clinical outcomes in survivors of hospitalisation with COVID-19: a descriptive cohort study within the OpenSAFELY platform. medRxiv (2021) doi: 10.1101/2021.01.22.21250304. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NzoibWVkcnhpdiI7czo1OiJyZXNpZCI7czoyMToiMjAyMS4wMS4yMi4yMTI1MDMwNHYzIjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjEvMDUvMTMvMjAyMS4wNS4wNi4yMTI1Njc1NS5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 9. 9.PHOSP-COVID Collaborative Group et al. Physical, cognitive and mental health impacts of COVID-19 following hospitalisation – a multi-centre prospective cohort study. bioRxiv (2021) doi: 10.1101/2021.03.22.21254057. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1101/2021.03.22.21254057&link_type=DOI) 10. 10.MacKenna, B. et al. Impact of Electronic Health Record Interface Design on Unsafe Prescribing of Ciclosporin, Tacrolimus, and Diltiazem: Cohort Study in English National Health Service Primary Care. J. Med. Internet Res. 22, e17003 (2020). 11. 11.MacKenna, B. et al. Suboptimal prescribing behaviour associated with clinical software design features: a retrospective cohort study in English NHS primary care. Br. J. Gen. Pract. 70, e636–e643 (2020). [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NDoiYmpncCI7czo1OiJyZXNpZCI7czoxMToiNzAvNjk4L2U2MzYiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMS8wNS8xMy8yMDIxLjA1LjA2LjIxMjU2NzU1LmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 12. 12.Williamson, E. J. et al. OpenSAFELY: factors associated with COVID-19 death in 17 million patients. Nature 1–11 (2020). 13. 13.Wang, Y., Hunt, K., Nazareth, I., Freemantle, N. & Petersen, I. Do men consult less than women? An analysis of routinely collected UK general practice data. BMJ Open 3, e003320 (2013). [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NzoiYm1qb3BlbiI7czo1OiJyZXNpZCI7czoxMToiMy84L2UwMDMzMjAiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMS8wNS8xMy8yMDIxLjA1LjA2LjIxMjU2NzU1LmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 14. 14.NHS Digital. Data Security and Protection Toolkit. 2020. [https://digital.nhs.uk/data-and-information/looking-after-information/data-security-and-information-governance/data-security-and-protection-toolkit](https://digital.nhs.uk/data-and-information/looking-after-information/data-security-and-information-governance/data-security-and-protection-toolkit). 15. 15.NHS Digital. BETA - Data Security Standards. 2020. [https://digital.nhs.uk/about-nhs-digital/our-work/nhs-digital-data-and-technology-standards/framework/beta\---|data-security-standards](https://digital.nhs.uk/about-nhs-digital/our-work/nhs-digital-data-and-technology-standards/framework/beta\---|data-security-standards). 16. 16.NHS Digital. ISB1523: Anonymisation Standard for Publishing Health and Social Care Data. 2020. [https://digital.nhs.uk/data-and-information/information-standards/information-standards-and-data-collections-including-extractions/publications-and-notifications/standards-and-collections/isb1523-anonymisation-standard-for-publishing-health-and-social-care-data](https://digital.nhs.uk/data-and-information/information-standards/information-standards-and-data-collections-including-extractions/publications-and-notifications/standards-and-collections/isb1523-anonymisation-standard-for-publishing-health-and-social-care-data). 17. 17.Secretary of State for Health-UK Government. Coronavirus (COVID-19): notification to organisations to share information. 2020. [https://www.gov.uk/government/publications/coronavirus-covid-19-notification-of-data-controllers-to-share-information](https://www.gov.uk/government/publications/coronavirus-covid-19-notification-of-data-controllers-to-share-information).