Abstract
Objectives Following detection of the first virologically-confirmed cases of COVID-19 in Great Britain, an enhanced surveillance study was initiated by Public Health England to describe the clinical presentation, course of disease and underlying health conditions associated with infection of the first few hundred cases.
Methods Information was collected on the first COVID-19 cases according to the First Few X WHO protocol. Case-control analyses of the sensitivity, specificity and predictive value of symptoms and underlying health conditions associated with infection were conducted. Point prevalences of underlying health conditions among the UK general population were presented.
Findings The majority of FF100 cases were imported (51.4%), of which the majority had recent travel to Italy (71.4%). 24.7% were secondary cases acquired mainly through household contact (40.4%). Children had lower odds of COVID-19 infection compared with the general population.
The clinical presentation of cases was dominated by cough, fever and fatigue. Non-linear relationships with age were observed for fever, and sensitivity and specificity of symptoms varied by age.
Conditions associated with higher odds of COVID-19 infection (after adjusting for age and sex) were chronic heart disease, immunosuppression and multimorbidity.
Conclusion This study presents the first epidemiological and clinical summary of COVID-19 cases in Great Britain. The FFX study design enabled systematic data collection. The study characterized underlying health conditions associated with infection and set relative risks in context with population prevalence estimates. It also provides important evidence for generating case definitions to support public health risk assessment, clinical triage and diagnostic algorithms.
Introduction
In December 2019, the World Health Organisation (WHO) was notified of a cluster of unexplained pneumonia cases detected in Wuhan city, China. This was subsequently identified and designated as COVID-19, caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), a strain previously unseen in humans. The UK was one of the first countries affected in Europe, with its first two confirmed cases of COVID-19 detected on 31 January 2020 (1).
The epidemiology and clinical features of early COVID-19 cases identified in China and elsewhere have been reported (2-7). In a pre-print article, Ma et al., present a pooled analysis of 1,155 cases from seven countries with estimates of key epidemiological parameters (8) and the first identified cases in the WHO European Region have been described (1). The most commonly reported symptoms have been fever, fatigue, dry cough, myalgia and dyspnoea (1-5, 8, 9). These studies did not report on the sensitivity, specificity or positive predictive values of symptoms.
Reported estimates of proportions of cases with underlying health conditions vary between 23% and 51% (2-5, 9). A range of underlying health conditions have been found to be associated with poor outcomes from COVID-19 (10-12). However, there are few controlled studies of underlying health conditions as risk factors for infection. In a pre-print from the United States, a test-negative design study of 3789 Veterans tested for COVID-19 found that the only comorbidity associated with testing positive after multivariable adjustment was chronic obstructive pulmonary disease, which was associated with decreased risk of testing positive for COVID-19(13). Studies in other settings and with general population controls (not selected by testing) are needed to understand the role of underlying health conditions in infection itself, rather than severity of illness.
The First Few X (FFX) is an established enhanced surveillance system designed to investigate the clinical and epidemiological characteristics of at least the first 100 confirmed cases of an emerging infectious disease and their close contacts (14). The design has previously been used in the 2009 influenza H1N1 pandemic (15) and the WHO has recommended that countries adopt this protocol for investigation of COVID-19 (14). Following the detection of the first GB cases at the end of January 2020, the FF100 surveillance system for COVID-19 was initiated by Public Health England (PHE). The primary objectives were to describe the clinical presentation and course of disease and underlying health conditions associated with for infection.
This paper describes the epidemiological and clinical characteristics of the first few hundred cases of COVID-19 identified in GB, including estimates of sensitivity and specificity of selected symptoms, associations of underlying health conditions with infection and estimates of the prevalence of these conditions in the UK population.
Methods
After detection of COVID-19 in China, the FF100 protocol, data collection questionnaires and electronic data capture system were modified for the COVID-19 outbreak.
▪ Case ascertainment
Initially enhanced surveillance forms were collected for all virologically-confirmed cases of COVID-19 in GB. Due to the large predominance of imported cases during February 2020, this was later restricted to sporadic cases only. Case definitions for testing and the time periods that they applied are outlined in Table 1.
Imported cases were defined as cases with travel to countries with known COVID-19 circulation at the time or with contact with a confirmed case whilst abroad within a maximum incubation period of their onset of symptoms. Secondary cases were defined as cases that had contact with a confirmed or probable/suspected case in the UK and did not fit the definition of an imported case. Sporadic cases were defined as cases with no travel history to countries with known COVID-19 circulation, and no known contact with a confirmed case.
▪ Data sources
○ Case questionnaires
Data collection was coordinated by the PHE Surveillance Cell, National Infection Service.
On identification of a positive case, the local Health Protection Teams, Field Service Teams and the equivalents in the Devolved Administrations of Great Britain, were asked to complete an initial enhanced surveillance questionnaire (Form 1A) providing information on demographic details, medical history and travel history. This information was collected as soon as possible after a positive laboratory result was reported and was collected through interview with the cases and/or healthcare workers or family members. Cases were followed up after 14 days from initial report (Form 1B). Follow-up information on cases was collected to determine the occurrence of any medical complications and clinical outcome. To improve completeness of the initial questionnaires and to achieve a high rate of follow-up, a team of PHE health protection practitioners, nurses, doctors and field epidemiologists proactively followed up the FF100 cases in England using telephone interviews.
Data from completed forms were entered into a dedicated FF100 web-tool. Data were extracted, cleaned and quality checked.
○ Control data
▪ Possible cases who tested negative
During the early stages of the epidemic more limited data on patient demographics, presenting illness, clinical course/complications after onset and exposures in the 14 days before onset of first symptoms, were collected on all possible cases using a Minimum Data Set (MDS) form. Those suspected cases testing negative were used as a control group for the analyses of symptoms.
▪ General population electronic health record data
General population data were obtained from an anonymised dataset of primary care electronic health records, Clinical Practice Research Datalink (CPRD) GOLD. In the UK, almost 99% of the population is registered at a general practice and healthcare is free at the point of delivery. CPRD is broadly representative of the population in terms of age and sex (16). Primary care prescriptions and laboratory test results are automatically recorded electronically. Diagnoses are recorded using Read codes, and generally have good positive predictive value (17). Approximately 75% of CPRD practices in England have consented to data linkage, and for patients at these practices anonymised secondary care data (Hospital Episode Statistics (HES)) is linked to primary care records.
▪ Data analysis
Descriptive analyses of the FF100 cases were undertaken in relation to patient characteristics, clinical symptoms and complications, healthcare interactions and outcomes. For the analysis of symptoms, missing data was assumed to indicate absence of that symptom. Ethnicity was assigned to cases by linking, using NHS number and Date of Birth, to the latest recording of ethnicity in the Outpatient HES database. If no record of ethnicity was found in the Outpatient record, then the HES Admitted Patient Care recording of ethnicity was used.
Underlying health conditions among the UK general population were ascertained from diagnoses, prescriptions and test results in CPRD primary care records, with hospital diagnoses supplementing ascertainment for a secondary analysis limited to England (due to availability of HES-linkage). Detailed definitions of risk groups were tailored to respiratory infection risk using established methods described more fully in Supplementary Table A (18-22). Multimorbidity was defined as ≥2 conditions in different domains, combining asthma with chronic respiratory disease and organ transplant with other immunosuppression.
Analysis of associations of underlying health conditions with infection used a case-control design, with controls from the UK general population in CPRD (details in Supplementary Table A). Cases with any data recorded for any underlying health condition (whether ‘Yes’ or ‘No’) or any medical history recorded in free text were included, with missing data assumed to indicate absence of that comorbidity. Cases for whom there was no evidence of any data collection for underlying health conditions were excluded (n=23). Cancer in controls was defined as any history of malignant cancer, as questionnaire free-text suggested this would be most comparable to cases.
Single variable and multivariable logistic regression were used to calculate odds ratios of each underlying health condition in COVID-19 cases compared to the UK general population, adjusted for sex and age (in categories <18, 18-35, 36-49, 50-69 and ≥70 years) as a priori confounders. Consideration of other potential confounders (such as comorbid health conditions) was limited by data sparsity. Age (over and under 70 years) and infection source (imported or sporadic/secondary) were considered as potential effect modifiers.
Point prevalence estimates for underlying health conditions were calculated per 100,000 for the UK general population on 5 March 2019. Prevalence of cancer was estimated for malignant cancer incident within the previous year rather than ever, for greater clinical relevance.
Estimates of the sensitivity and specificity of symptoms were obtained from the proportion of COVID-19 cases, and those testing negative respectively. Predictive values were estimated for the observed prevalence of positive patients. The functional relationships between the presence of a symptom and age were explored using locally weighted scatter plot smoothing. Fractional polynomial logistic regression models were used to obtain parametric functions of these relationships with age. Interaction terms between the fractional polynomial age terms and case type (imported, sporadic or secondary) were used to assess if there was evidence of different age relationships. Logistic regression analyses were performed to assess which symptoms were independently associated with COVID-19, accounting for sex and age. Multinomial regression models were used with case type as the outcome variable to assess whether the associations with symptoms differed for each case type.
Analyses were undertaken in Microsoft Excel 2010, R and Stata 16 MP.
▪ Ethical considerations
This was an observational surveillance system carried out under the permissions granted under Regulation 3 of The Health Service (Control of Patient Information) Regulations 2020, and without explicit patient permission under Section 251 of the NHS Act 2006.
Analysis involving CPRD data was approved by the Independent Scientific Advisory Group of the CPRD (ISAC, reference 20_062A) and the London School of Hygiene and Tropical Medicine Ethics Committee (reference 21815). The ISAC protocol was made available to reviewers.
Results
381 confirmed cases of COVID-19 in GB were included in this analysis (up to 09 April 2020) (359 from England, 19 from Scotland and three from Wales). 16.8% of cases were resident in London. Figure 1 shows the distribution of cases included in the FF100 study by date of symptom onset and COVID-19 case types. Approximately half of the FF100 cases were imported (51.4%) with the remainder being secondary (24.7%) or sporadic (23.9%) cases. Of the FF100 cases, 38.3% reported travel to Italy in the 14 days prior to symptom onset (71.4% of imported cases) and hence Europe was the most commonly visited continent by the imported cases (Figure 2). Where occupation was recorded (n=357), 42 cases were healthcare workers (11.8%), although the majority of these were imported cases (n=26). Of the secondary cases, almost all reported close contact with a confirmed case (93/94, 98.9%), of these 39.8% had close contact within a household setting, 10.8% in a healthcare setting, 47.3% in other settings (for example work setting, social gatherings) and 3.2% unknown setting.
▪ Demographic characteristics of cases
There were slightly more male cases (56.7%) compared with female cases (43.3%). The FF100 cases ranged between 1 year and 94 years of age with a mean age of 47.7 years (standard deviation (SD) 17.4) (Figure 3). Country of birth was available on 68.2% of the cases, of which the majority were born in the UK (73.5%), followed by Italy (5.8%), Iran (1.9%) and the United States (1.9%). The ethnicity breakdown of the FF100 cases (available for 63.0% of cases) was comparable to the general population (England and Wales, ONS 2011 Census) (Table 2).
Logistic regression analysis of associations of age and sex with COVID-19 included 358 cases with data on underlying health conditions (to allow adjustment for immunosuppression), and 2,705,963 UK general population controls.
When stratified by infection source and sex, women had lower odds of imported COVID-19 than men (AOR 0.61, 95% CI 0.46-0.83), but there was no evidence of an association of sex with secondary/sporadic cases (Table 3).
When stratified by infection source and age, children aged under 18 years had considerably lower odds of imported and secondary/sporadic COVID-19 than the general population (Table 2). Adults aged 50-69 years had higher odds of imported COVID-19 than adults aged 18-35 years, but otherwise there was no evidence of any association between infection and age for adults aged under 70 years. Adults aged 70 years or over had lower odds of imported COVID-19 than younger adults, but this association was not observed among secondary/sporadic cases (Table 3).
○ Underlying health conditions
Analysis of associations of underlying health conditions with COVID-19 infection included 358 cases with comorbidity data, and 2,705,963 UK general population controls (Table 4). Approximately one third of the cases had an underlying health condition (115/358, 32.1%), and 11.7% of cases had multimorbidity. The most frequent conditions were asthma, diabetes, and chronic heart disease.
After adjustment for age and sex, cases had higher odds of chronic heart disease (adjusted odds ratio (AOR) 1. 95, 95% CI 1.34–2.84), immunosuppression (AOR 4.46, 2.65–7.50), and multimorbidity (AOR 1.68, 1.17–2.40), and lower odds of chronic kidney disease (AOR 0.39, 0.21–0.73) compared to the general population. There was weak evidence suggesting that cases had higher odds of chronic liver disease (AOR 2.66, 0.99–7.14) and lower odds of asthma (AOR 0.73, 0.53–1.02) than the general population.
When stratified by infection source, associations between underlying health conditions and COVID-19 infection appeared generally smaller for imported cases than for sporadic or secondary cases, but confidence intervals overlapped for all conditions other than multimorbidity.
There was no evidence of any association of COVID-19 infection with diabetes, chronic respiratory disease (excluding asthma), chronic neurological disease, organ transplant or a history of malignant cancer, although confidence intervals were particularly wide for organ transplant.
In England, age-and sex-adjusted odds of liver disease (AOR 3.79, 95% CI 1.41–10.18) and organ transplant (2.64, 9.37–18.80) were higher among cases than the general population, but no evidence of these associations remained in analysis using linked hospital data among patients eligible for data linkage (Supplementary Table B). When secondary care data were used to enhance ascertainment of underlying health conditions in England, the increased ascertainment of underlying health conditions among controls resulted in a general decrease in odds ratios for associations of those conditions with COVID-19 infection – this was most pronounced for chronic liver disease, immunosuppression and chronic heart disease. Nevertheless, odds of chronic heart disease and immunosuppression remained consistently higher among cases than the general population in all analyses.
○ Prevalence of underlying health conditions in the general population
The prevalence of any underlying health condition was 26.7% in England, 28.8% in Scotland, 30.4% in Northern Ireland, and 30.8% in Wales (Supplementary Table C). Among those conditions associated with COVID-19, point prevalence of chronic heart disease ranged from 3.7% (England) to 5.0% (Scotland), immunosuppression from 0.7% (Wales) to 1.0% (Scotland), chronic liver disease from 0.3% (England) to 0.5% (Scotland) and multimorbidity from 6.6% (England) to 8.4% (Wales and Northern Ireland). Prevalence of any underlying health condition increased to 30.0% in England when ascertainment was supplemented with secondary care data; the greatest increases in prevalence estimates were for chronic liver disease and chronic heart disease.
▪ Clinical features
○ Symptoms over the course of illness
The symptoms during illness reported by all FF100 cases are summarised in Figure 4. The most frequent symptoms were cough (77.7%), fatigue (70.9%), fever (60.1%), headache (56.7%), muscle ache (50.9%), appetite loss (44.1%). Just over three quarters of the cases experiencing a cough reported whether they experienced a dry or productive cough: the majority of whom reported a dry cough (78.1%). Anosmia was added to the follow-up questionnaire part way through the FF100 study. 111 of the 229 cases who were asked this question reported anosmia during their illness (48.5%). One case reported anosmia as their only symptom.
Cough was the most common presenting symptom for all age groups. A lower proportion of cases in the 70+ year old age group reported headache, sore throat, runny nose and sneezing compared with other age groups (Supplementary Table D).
Symptoms by sex were relatively consistent, although a higher proportion of females reported headache (61.6% compared to 51.6% of males), sore throat (43.9% compared to 33.2% of males), joint ache (43.9% compared to 33.2% of males), diarrhoea (32.3% compared to 22.1% of males) and nausea (30.5% compared to 16.6% in males) (Supplementary Table D). Table 5 shows common groups (pairs/trios) of presenting symptoms.
▪ Association of symptoms with COVID-19
There were four symptoms; fever, cough, shortness of breath and sore throat, that were collected on 380 symptomatic COVID-19 cases in the FF100, and 752 contemporaneous confirmed non-COVID-19 respiratory infections.
The relationship between the presence of a symptom and age for those with COVID-19 and those with non-COVID-19 respiratory illness is shown in Figures 5 and 6. These show non-linear relationships with age. For fever, there is an increase in occurrence with increasing age for COVID-19, while for those with other respiratory infections the occurrence decreases with increasing age.
For cough, a similar relationship is observed with COVID-19 and other respiratory infections. Shortness of breath exhibits an increasing occurrence with increasing age for both, although in those with COVID-19 there is a higher proportion of elderly cases with this symptom compared to those with other respiratory infections. The age relationship for sore throat was similar for both COVID-19 and other respiratory infections.
The shape of the age relationship with each symptom were similar in each of the different types of COVID-19 cases (imported, sporadic and secondary).
○ Sensitivity and specificity
Estimates of sensitivity and specificity for symptoms are presented in Table 6. Broad age categories; <30, 30-59, 60+, have been used to provide estimates within these age strata. Fever had both good sensitivity and specificity (64.0% and 63.9%), cough had high sensitivity but poor specificity (79.6% and 15.5%), shortness of breath was the most specific symptom but had low sensitivity, while sore throat had relatively low sensitivity and specificity.
○ Predictive values
Estimates of the positive predictive value (PPV) and negative predictive value (NPV) for the observed proportions of cases are presented in Table 7. The PPV and NPV were similar for fever and shortness of breath. These were 48.9% and 76.7%, respectively, for fever and 48.7% and 70.2% for shortness of breath, respectively. The estimated PPV increased with increasing age groups, with the NPV decreasing.
○ Multivariable analysis of presenting symptoms and diagnosis of COVID-19
The association between COVID-19 and symptoms has been assessed using logistic regression modelling. The estimated odds ratios (OR) and adjusted odds ratios (AOR) results are presented in Table 8.
After adjusting for the other symptoms, age and sex; fever and shortness of breath were significantly associated with a diagnosis of COVID-19 (AOR 4.15 (95% CI 2.95-5.82) and (AOR 2.27 (95% CI 1.56-3.29), respectively. Cough and sore throat exhibited no association with COVID-19. Age was positively associated which may reflect how the early cases were identified rather than what may have be occurring in the general population.
As the occurrence of symptoms change with age, interaction terms between a symptom and the broad age groups used previously have been explored. This provided strong evidence that the association with fever differed in these age groups (interaction term p value 0.0009). In the under 30-year olds the AOR was 2.67 (95% CI 1.41-5.07), in the 30 to 59-year old the AOR was 4.08 (95% CI 2.60-6.41), while in those 60 years of age or older the AOR was 17.15 (95% CI 5.60-52.55). No other symptom exhibited any strong evidence of an interaction with age.
Finally, a multinomial regression model was used to assess associations with the different COVID-19 case types. As with the logistic regression models, there was strong evidence of an interaction with age group and fever (Table 9). Sporadic cases exhibit a stronger association with both fever and shortness of breath, and have a higher proportion aged 60 or older. For all case types the association with fever is strongest in those aged 60 or older. Cough and sore throat are not strongly associated with any case type.
▪ Healthcare interactions, clinical course and complications
Overall follow-up information was obtained on 88.7% of FF100 cases (338/381) (94.2% of the FF100 cases in England (338/359)).
Duration of illness, in those cases which had sufficient information recorded (n=154), ranged from 2 to 36 days (median 11 days, IQR 7-15 days, mean 12.1 days).
The use of NHS 111, a telephone and online service was the most common healthcare interaction amongst the FF100 cases in England with just over three quarters of cases accessing NHS 111 at least once (Table 9). Smaller proportions of cases visited their GP (13.1%) or Accident and Emergency (A&E) (28.7%). 42.9% of cases were hospitalised of which 35 were admitted to ICU (21.5% of those hospitalised) and 25 required mechanical ventilation (15.3% of those hospitalised) (Table 10). Clinical outcomes were ascertained for 302 of these cases. At the time of follow-up 72.8% of cases had recovered from their COVID-19 illness (220/302), 18.9% were still ill (self-reported illness) (57/302) and 8.3% had died (25/302).
Discussion
This study presents an early assessment of the epidemiological and clinical characteristics of COVID-19 patients in GB. The COVID-19 FF100 system was rapidly established with data collected and analysed from the first virologically-confirmed cases. In this early phase, patients were only eligible for testing if they fitted the case definition at the time, which was based on a combination of clinical symptoms, travel history or contact with a confirmed case and later admission to hospital. This is the first epidemiological and clinical summary of severe and non-severe confirmed COVID-19 cases in GB and is also the first controlled analysis that we are aware of outside the United States that investigates associations of underlying health conditions with COVID-19 infection, rather than severe outcomes. The study is also the first to estimate sensitivity and specificity of different symptoms for COVID-19 in the GB population.
Just over half of the cases included in our study were imported with the majority imported from Italy, highlighting the importance of the Italian outbreak in facilitating spread to other European countries (23). Sporadic cases were detected within a month of the first confirmed case in GB.
Where known exposure to a confirmed case was documented, the most common setting for exposure was the household, confirming the household as a high risk setting for transmission. Only a small proportion of the cases were healthcare workers and most of these acquired their infection abroad suggesting that, in the containment phase of the epidemic, healthcare acquired infection did not play a significant role.
There were slightly more males amongst the FF100 cases than females; a gender distribution that has been described elsewhere (2-5). However, there was no evidence of an association of sex with secondary/sporadic cases. The median age was 47.7 years, again similar to other case series (2-5, 8, 9). Children had lower odds of infection compared to the general population for all case types.
The proportion of cases with any underlying health condition was 32.1%, slightly lower than the 37.6% of 7,162 cases reported to CDC in the US (24). Chronic heart disease, immunosuppression and multimorbidity were associated with higher odds of COVID-19 infection, for sporadic and imported cases. There was also weak evidence suggesting that chronic liver disease might be associated with higher odds of COVID-19 infection. These conditions may be risk factors for infection, although the associations could alternatively be explained by a greater likelihood of symptomatic or severe presentations in these patients which in turn increased the probability of testing for COVID-19.
The prevalence of any specified underlying health condition among the general population on 5 March 2019 ranged from 27.7% in England to 32.1% in Wales, similar to the proportion seen in the FF100 cases (32.1%). Chronic kidney disease and a history of asthma were associated with lower odds of COVID-19 infection for imported and sporadic cases, suggesting this is not explained by reduced likelihood of travel. This could be explained if ascertainment of these conditions is higher using electronic primary care records for general population controls than case questionnaires. For asthma this is plausible since the case questionnaires only asked about asthma requiring medication – although free text responses include mild or childhood asthma. It is also possible that asthma is not a risk factor for COVID-19 infection: asthma has been under-represented in case series of hospitalised patients, and there are mixed findings on the association of asthma with severe outcomes of COVID-19 (5, 11-13, 24-27).
The clinical presentation in FF100 cases was dominated by cough, fatigue and fever, consistent with other studies (2-5, 7-9). The proportion of FF100 cases reporting fever was slightly lower than case series descriptions from China (2-5), although similar to the first cases of COVID-19 described in Europe (1). This may be due to the inclusion of non-hospitalised cases in this study as well as the European study. Almost half of cases reported anosmia during the course of their illness, an unexpected symptom that has been observed by others, with some suggestion that it might precede the onset of more common symptoms or be present in those with mild or asymptomatic infection (28-31). Further work is required to understand anosmia as a distinctive clinical feature of COVID-19 (30, 31).
Symptoms were relatively consistent by sex. We noted differences in symptom presentation by age, with a lower proportion of cases in the youngest (<20 years) and oldest age groups (70+ years) reporting symptoms when compared to the other age groups. In particular, a non-linear relationship with age was observed for fever which increased with age for the COVID-19 cases. Although this study included only a small number of children, our findings are congruent with other studies suggesting that children may experience milder illness with different symptoms (32-34). The different sensitivity and specificity of symptoms by age highlights the need to consider this when setting up case definitions to support public health risk assessment, clinical triage and diagnostic algorithms.
Fever is clearly an important symptom, exhibiting good sensitivity and specificity. It was also significantly associated with a diagnosis of COVID-19, as was shortness of breath, and there was evidence of an interaction between fever and age, including for the different case types. In particular, sporadic cases exhibited a stronger association with fever and shortness of breath. The model used broad age groups, so differences due to age may have been concealed by this.
As expected, the non-urgent medical telephone and online service, NHS 111, was the most common healthcare interaction of the FF100 cases, in keeping with key government messaging during the study period which emphasised that those experiencing COVID-19 symptoms should stay home and contact NHS 111. This highlights the importance of public messaging about using online and telephone services to avoid propagating transmission of COVID-19 in healthcare settings. Forty-three percent of cases included in our study were hospitalised however this is certainly an overestimate of the case hospitalisation rate since at the beginning of the UK COVID-19 incident response all confirmed cases were hospitalised for isolation rather than clinical management purposes, and from 13 March 2020, testing was restricted to hospitalised cases only. The fact that only a small proportion of those hospitalised required mechanical ventilation in comparison with other studies supports this (2, 4).
One of the key strengths of this study is the detailed epidemiological and clinical data it was able to obtain. Use of the FFX study design allowed for systematic data collection on cases using ready to use infrastructure. The study achieved high rates of follow up on the FF100 cases and also benefits from having followed the majority of cases up past the intended 14 days. This has enabled capture of totality of symptoms over the course of illness which in many cases has been longer than 14 days.
This study included detailed ascertainment of underlying health conditions in controls from the general population to describe age and sex adjusted associations of health conditions with infection.
There were a number of limitations to this study. Firstly, the FF100 cases are a small subset of cases that were identified at the beginning of the COVID-19 epidemic in the Great Britain. They are unlikely to be representative of all GB cases of COVID-19 or of the general population, with different exposure and risk profiles. As such, the case-control study is vulnerable to selection bias. The FF100 cases are likely to be biased towards more severe cases who presented to healthcare, therefore they will underrepresent those with mild illness in the population, which may overestimate the associations between underlying health conditions and infection. The clinical presentation of FF100 cases may also differ to that of all GB cases since the imported cases (accounting for more than half of all cases) are more likely to have been of working age and may have been ‘healthier’ than the general population. This could conversely bias our estimates of associations between health conditions and infection downwards. The severity estimates are likely to be overestimates due to policy changes over the course of the study period, with hospitalisation of cases for isolation purposes initially and latterly restricting testing to hospitalised cases only.
Despite having a high follow-up rate, clinical outcome was not available on some cases due to sensitivities and difficulties around ascertaining the required information, particularly for the most severe cases. Also, for some cases the timing of follow-up, during which the initial questionnaire was revisited, may have been some time after onset of illness leading to possible recall bias.
The power and precision of estimates of associations between underlying health conditions and COVID-19 is limited by the number of cases in the FF100. This also limits the scope to adjust for potential confounders other than age and sex in the study. It is important to note that the predictive values do not necessarily reflect the prevalence of COVID-19 in the population at any particular time.
Future pandemic planning, should note the importance of maintaining FFX studies into the community transmission phase, as we have shown differences between imported and GB-acquired cases. Furthermore, achieving high quality and complete data capture and follow-up of cases, depends on the ability to rapidly mobilise a cadre of trained public health professionals. This can pose a challenge when capacity is already overwhelmed by the incident response. Consideration should also be given to case ascertainment through FFX investigations and how this may differ as a pandemic progresses due to changing contact tracing and testing policies over time.
In summary, this paper presents the first epidemiological and clinical summary of COVID-19 cases in Great Britain. This study has informed modelling of likely disease burden and healthcare demand, as well as priority groups for preventive measures such as physical distancing. This study also provides important evidence for generating case definitions to support public health risk assessment, clinical triage and diagnostic algorithms.
Data Availability
Applications for relevant anonymised data should be submitted to the Public Health England Office for Data Release: https://www.gov.uk/government/publications/accessing-public-health-england-data/about-the-phe-odr-and-accessing-data
Funding
The analysis using CPRD data was funded by the National Institute for Health Research (NIHR) Health Protection Research Unit (HPRU) in Immunisation at the London School of Hygiene and Tropical Medicine in partnership with Public Health England (PHE). The views expressed are those of the authors and not necessarily those of the NHS, the NIHR, the Department of Health and Social Care, or PHE.
Declaration of interests
All authors have completed the ICMJE Conflicts of Interest form and declare: no support from any organisation for the submitted work; JMcM reports grant funding from IMOVE COVID-19, no other relationships or activities that could appear to have influenced the submitted work.
Contributor information
JLB is the corresponding author and guarantor. NLB, AC, HIMcD, VS and JLB drafted the manuscript.
For the FF100 case description and analysis of symptoms: CB, LC, TV, RW, MS, and NP were involved in database management, data validation and supervision of data collection process; NLB, AC, JS and EB were involved in data analysis; LL, PMacD, RV, OE, SC, JMcM, VS, GD, JLB were involved in supervision of the investigation/data collection process.
For the analysis of underlying health conditions: HIMcD, JLW, HS, and ID designed the study, JLW, DJG, HS, KEM, CTR, CM, RM and MP conducted data analysis, HIMcD, JLW, DJG, HS, KEM, CTR, CM, ID, RM and MP interpreted the results, and HIMcD is the guarantor.
MR, MZ, JLB, SE, NLB and AC were involved in the conceptualization of the overall study. All authors reviewed and revised the manuscript and approved the final version.
Data sharing
Applications for relevant anonymised data should be submitted to the Public Health England Office for Data Release: https://www.gov.uk/government/publications/accessing-public-health-england-data/about-the-phe-odr-and-accessing-data. The data used for this study were obtained from the Clinical Practice Research Datalink (CPRD). All CPRD data are available via an application to the Independent Scientific Advisory Committee (see https://www.cprd.com/Data-access). Data acquisition is associated with a fee.
Acknowledgements
We also acknowledge the contributions of Richard Pebody, Katie Owens, Hongxin Zhao and Chinelo Obi from the PHE Immunisation Department for their work on developing the protocols, study design, data collection and data entry, and the PHE Software Development Unit for their work on developing the FF100 web-platform.
We would like to acknowledge the contribution of a number of teams to this study including the PHE Health Protection Teams, PHE Field Services, PHE Centres, the PHE Centres and Regions Operational Cell, and the team of PHE health protection and immunisation specialist nurses and public health registrars who assisted with the case follow ups. Also the contribution of the teams in Health Protection Scotland, and Malorie Perry, Clare Sawyer and the staff of Public Health Communicable Disease Surveillance Centre and COVID-19 response cells in Public Health Wales.
We would also like to acknowledge the contributions of Nick Andrews, Anthony Scott, Julia Stowe, Liam Smeeth from the NIHR Health Protection Research Unit in Immunisation at PHE and LSHTM, and Angel Wong from LSHTM, towards study design, data extraction and cleaning, and interpretation of results for the analysis of underlying health conditions in cases and CPRD controls.