ABSTRACT
Background Several community-based studies have assessed the ability of different symptoms to identify COVID-19 infections, but few have compared symptoms over time (reflecting SARS-CoV-2 variants) and by vaccination status.
Methods Using data and samples collected by the COVID-19 Infection Survey at regular visits to representative households across the UK, we compared symptoms in new PCR-positives and comparator test-negative controls.
Results From 26/4/2020-7/8/2021, 27,869 SARS-CoV-2 PCR-positive episodes occurred in 27,692 participants (median 42 years (IQR 22-58)); 13,427 (48%) self-reported symptoms (“symptomatic positive episodes”). The comparator group comprised 3,806,692 test-negative visits (457,215 participants); 130,612 (3%) self-reported symptoms (“symptomatic negative visit”). Reporting of any symptoms in positive episodes varied over calendar time, reflecting changes in prevalence of variants, incidental changes (e.g. seasonal pathogens, schools re-opening) and vaccination roll-out. There was a small increase in sore throat reporting in symptomatic positive episodes and negative visits from April-2021. After May-2021 when Delta emerged there were substantial increases in headache and fever in positives, but not in negatives. Although specific symptom reporting in symptomatic positive episodes vs. negative visits varied by age, sex, and ethnicity, only small improvements in symptom-based infection detection were obtained; e.g. adding fatigue/weakness or all eight symptoms to the classic four symptoms (cough, fever, loss of taste/smell) increased sensitivity from 74% to 81% to 90% but tests per positive from 4.6 to 5.3 to 8.7.
Conclusions Whilst SARS-CoV-2-associated symptoms vary by variant, vaccination status and demographics, differences are modest and do not warrant large-scale changes to targeted testing approaches given resource implications.
Summary Within the COVID-19 Infection Survey, recruiting representative households across the UK general population, SARS-CoV-2-associated symptoms varied by viral variant, vaccination status and demographics. However, differences are modest and do not currently warrant large-scale changes to targeted testing approaches.
INTRODUCTION
Whilst a substantial proportion of individuals infected with SARS-CoV-2 remain asymptomatic[1,2], symptoms are associated with higher viral loads[3], and higher viral loads with infectivity and transmission[4]. Resource constraints generally prevent universal testing being deployed, so testing and isolation strategies are usually targeted to those with symptoms most predictive of infection and/or contacts of known positive individuals. Currently, four “classic” symptoms trigger PCR-based community testing in the UK, namely: loss/change of smell/taste, fever, and/or a new, continuous cough. In the US, the Centers for Disease Control and Prevention (CDC) advises testing for any of fever or chills, cough, shortness of breath/difficulty breathing, new loss of taste/smell, fatigue, muscle or body aches, headache, sore throat, congestion/runny nose, nausea/vomiting, or diarrhoea.
As national testing policies depend on symptoms, understanding their predictive value in the context of seasonality, changing prevalence of different variants[3], and widespread vaccination[5] is essential. Most studies to date have restricted to those hospitalised or seeking healthcare, who do not represent most infections[6]. Three recent UK community-based studies suggested that sensitivity could be increased by 10-20% by extending the “classic” symptoms (REACT[7], adding combinations of headache, muscle aches, chills and appetite loss depending on age; ZOE[8], adding different symptoms depending on age, sex, BMI and working in healthcare; VirusWatch[9], adding feeling feverish, headache, muscle aches, loss of appetite or chills) but at a cost of increasing numbers eligible for testing 2-3 fold and tests per positive identified up to 7-fold. However, these studies were mainly before widespread vaccination, and whilst Alpha dominated. ZOE found no evidence of difference in reported symptoms between wild type and the Alpha variant[10]. With Delta, ZOE has recently identified headache, sore throat and runny nose/sneezing as non-classic symptoms most commonly occurring in fully, partially and unvaccinated positives, respectively[11].
We therefore used a large community-based survey representative of the UK general population to investigate symptoms over time in PCR-positive episodes and negative controls, also evaluating the impact of age, ethnicity, cycle threshold (Ct) value, vaccination status and PCR gene profile (as a proxy for variant).
METHODS
The Office for National Statistics (ONS) COVID-19 Infection Survey[12] (ISRCTN21086382, https://www.ndm.ox.ac.uk/covid-19/covid-19-infection-survey/protocol-and-information-sheets) continuously randomly selects private households from address lists and previous surveys. Having obtained verbal agreement, each household is visited by a study worker, and written informed consent obtained for individuals aged ≥2y (from parents/carers for those 2-15y, with those 10-15y also providing written assent). At the first visit, participants are asked for consent for optional follow-up visits every week for the next month, then monthly thereafter. At each visit, participants provide a nose and throat self-swab following instructions from the study worker and answer questions about behaviours, work, vaccination uptake and symptoms in the last 7 days (https://www.ndm.ox.ac.uk/covid-19/covid-19-infection-survey/case-record-forms). Twelve specific symptoms are elicited, namely: loss of taste, loss of smell, fever, cough, headache, tiredness/weakness (denoted fatigue/weakness), muscle ache (denoted muscle ache/myalgia), abdominal pain, diarrhoea, nausea or vomiting, shortness of breath and sore throat; plus a general question about presence of symptoms participants considered COVID-19-related (denoted any evidence of symptoms).
Swabs are tested at national testing laboratories using the Thermo Fisher TaqPath PCR assay (3 targets: ORF1ab, nucleocapsid (N), and spike protein (S)). If N and/or ORF1ab genes are detected, samples are called positive; the S-gene can accompany other genes, but does not count as positive alone (27/34,494 (0.08%) only S-gene positives counted as negative).
We included the first study positive test in each positive “episode”, defining re-infections (arbitrarily) as occurring ≥120 days after an index positive with a preceding negative test, or after 4 consecutive negative tests[5]. Each episode was classified by the minimum Ct value across positive tests in it and as wild-type/Delta-compatible if the S-gene was ever detected within it (by definition, with N/ORF1ab/both), otherwise Alpha-compatible if positive at least once for ORF1ab+N, otherwise “other” (N-only/ORF1ab-only) (Fig.S1). Symptom presence included reports at any visit (test-positive/negative/failed) within [0,+35] days of the first positive per episode (i.e. spanning [-7,+35] days given the question timeframe).
The comparator was visits with negative PCR tests, excluding visits with symptoms related to ongoing COVID-19 (and long COVID), with high probability of undetected COVID-19, and where symptoms were likely driven by recent vaccination (details in Supplementary Methods).
Statistical analyses
Primary analyses restricted to positive episodes and negative visits with evidence of any symptoms, because this is the population targeted by public health messages to test and self-isolate, denoted “symptomatic positive episodes” and “symptomatic negative visits” respectively. We considered all positive episodes, as well as subgroups defined by Ct (reflecting viral load), gene positivity pattern (reflecting variant compatibility), vaccination status, age and calendar time (reflecting background incidental symptoms) (details in Supplementary Methods).
Initially, hierarchical clustering with complete linkage (Jaccard distance matrix) assessed congruence of self-reported symptoms internally. To assess predictors of reporting any symptom, and each symptom in symptomatic episodes/visits, we fitted generalised additive models (binomial distribution with complementary log-log link, mgcv (v.1.8-31) package), adjusting simultaneously for calendar time (smoothing spline), age (smoothing spline), sex, ethnicity (white vs non-white). We tested whether effects varied by positive/negative episodes/visits using interaction tests. In positives only, separate models also adjusted for Ct value (smoothing spline) and gene-positivity, or vaccination status (details in Supplementary Methods).
We assessed the impact on performance of adding each of the other eight symptoms to the four classic symptoms currently prompting testing in the UK, and of every combination of 1-8 symptoms, and any of the 12 elicited symptoms, using standard metrics (details in Supplementary Methods, epiR (v.1.0-15), pROC (v.1.16.2) packages) plus tests per case (TPC)=1/positive predictive value (PPV) and the inflation factor=episodes/visits with any included symptom/episodes/visits with classic symptoms. Finally, for each positive episode, we compared symptoms reported at the first vs subsequent visits within the episode (both absent, both present, absent then present, present then absent), and the associated Ct distributions.
RESULTS
Between 26/4/2020-7/8/2021, 5,130,318 study PCR results were available from 484,317 participants; 34,494 (0.67%) were SARS-CoV-2-positive. In total, 27,869 positive episodes occurred in 27,692 participants (median age 42 years (IQR 22-58)); 13,427 (48%) self-reported symptoms (“symptomatic positive episodes”). The comparator group comprised 3,806,692 negative visits (457,215 participants, median age 52 years (IQR 32-66)); 130,612 (3%) self-reported symptoms (“symptomatic negative visits”) (exclusions in Tables S1&S2; characteristics in Table S3).
Specific symptoms are associated with SARS-CoV-2 and variants
Fatigue/weakness, cough, and headache were the most frequently reported symptoms in positive episodes (54%, 54%, 52% of symptomatic positive episodes; Fig.1). However, headache and cough were most frequently reported amongst negative visits too, together with sore throat (23%, 22%, 22%; Fig.1). Loss of taste/smell were the most specific symptoms for SARS-CoV-2 positivity (33%/33% positive episodes, only 2%/2% negative visits). In positive episodes, loss of taste or smell were commonly co-reported, as were gastrointestinal symptoms, and headache/myalgia/fatigue. Symptom co-reporting in symptomatic positive episodes was generally similar regardless of Ct, variant, vaccination status or age (Fig.S2), although myalgia was less strongly co-reported with headache/fatigue in those 6-15y, with Delta and in those ≥14d post-second vaccination. Patterns were broadly similar in negatives, except cough and sore throat were more commonly co-reported.
Symptomatology varied significantly by variant (Fig.1); with a smaller percentage of symptomatic positive episodes reporting loss of taste/smell for Alpha-compatible (31%/28%) than wild-type (38%/36%) or Delta-compatible (38%/39%) symptomatic positive episodes (p<0.0001). Fever/headache/sore throat had the largest difference between symptoms reported for Alpha-compatible (37%/56%/38%) and Delta-compatible (46%/62%/50%) symptomatic positive episodes (p<0.0001). Cough and fatigue/weakness had the largest differences between wild-type (50%/50%) vs Alpha-compatible (61%/61%) or Delta-compatible (64%/60%) symptomatic positive episodes (p<0.0001). In general, specific symptoms were reported slightly more in symptomatic positive episodes ≥14d from second vaccination versus those unvaccinated or ≥21d from first vaccination.
Symptomatology over calendar time
Adjusting for age, sex and ethnicity, the probability of symptoms being reported amongst positive episodes was reasonably stable after August-2020 given changes in incidence and sample size (Fig.2, top panels), with fluctuations likely reflecting school return in September-2020 and March-2021, plus the emergence of Alpha and Delta in November-2020 and May-2021. Smaller fluctuations in reporting any symptoms in negative visits (Fig.2, bottom panels) mirrored those in positive episodes. The percentage of symptomatic positive episodes reporting each specific symptom generally increased over time, consistent with increasing awareness. Reporting of most specific symptoms except loss of taste/smell temporarily peaked in January-2021, consistent with the peak in Alpha, then remained approximately constant through to May-2021, before increasing again, markedly so for headache, cough and fever after Delta became dominant. Increases in cough and sore throat in symptomatic negative visits occurred after school return in September-2020, in January-2021 when schools were shut, and from early April-2021, consistent with other respiratory viruses. The winter months saw particular increases in gastrointestinal symptoms, fatigue/weakness, myalgia and headache in symptomatic negative visits, consistent with the presence of other common seasonal pathogens.
Symptomatology by age, sex and ethnicity
All symptoms showed marked variation across age in both positive episodes and negative visits, mostly being reported less in children and elderly adults (Fig.3). Loss of taste/smell were most frequently reported in symptomatic positive episodes in those aged ∼20y, decreasing gradually at older ages; both were rarely reported amongst symptomatic negative visits, consistent with their high specificity for SARS-CoV-2, although slightly more in the elderly. Sore throat and headache also peaked in symptomatic positive episodes and negative visits in late adolescence, but were common regardless of positivity. High proportions reported cough, relatively similarly amongst symptomatic positive episodes and negative visits to ∼10y, therefore without discriminating; however, above 20y the proportion reporting cough was more than double in symptomatic positive episodes, and increased to ∼60y, as did fatigue/weakness, shortness of breath, and diarrhoea.
Adjusting for calendar time, age and ethnicity, women were generally more likely than men to report most symptoms (Fig.4). Increased reporting in women was significantly greater in symptomatic positive episodes than negative visits for loss of smell and taste, diarrhoea and shortness of breath, and significantly smaller for headache and sore throat (all heterogeneity p<0.01). Whilst there was no evidence of differential reporting of fever between male and female symptomatic negative visits, female symptomatic positive episodes were significantly less likely to report fever (p<0.001).
After adjusting for calendar time, age and sex, in both symptomatic positive episodes and negative visits, non-white ethnic groups were more likely to report fever than white ethnic groups, and less likely to report headache, nausea/vomiting and shortness of breath (Fig.4). Those from non-white ethnic groups were less likely to report loss of taste or smell and shortness of breath to a significantly greater degree in symptomatic positive episodes versus negative visits (all p<0.01).
Symptoms by vaccination status
Both positive episodes and negative visits had lower odds of reporting any evidence of symptoms compared to those pre-vaccination with no evidence of difference (p>0.44, Fig.5). However, positives episodes ≥21d from first vaccination had significantly lower odds of reporting 10/12 symptoms compared to negative-visits (p<0.01).
Symptoms by Ct value in PCR-positive episodes
At low Ct values (≤20) (a proxy for high viral load), the most commonly reported symptoms were cough, fatigue/weakness, headache and muscle ache/myalgia, all occurring in >50% of episodes (Fig.6; adjusted for gene positivity pattern in Fig.S3). Above Ct>27.5 (reflecting lower viral loads), all symptoms declined in prevalence with a trajectory that tracked Ct; between 20-27.5, most symptoms showed little variation. Interestingly, prevalence of reported loss of taste/smell increased substantially from ∼30% to ∼45% between Ct 15-27.5, with smaller increases for shortness of breath.
Symptom combinations predicting symptomatic PCR-positive episodes
Over the whole study, PPV of any evidence of symptoms for identifying PCR-positive episodes was 9%. Including any of the 12 elicited symptoms by definition maximised sensitivity (90% of symptomatic positive episodes reported at least one specific symptom; remainder reported any other symptom considered COVID-19 related only), but substantially increased TPC (8.7) and number of tests (2.3-fold) compared with using the four classic (cough, fever, loss of taste/smell) symptoms (74%, 4.6, 1-fold). For a fixed number of 1-8 symptoms, the choice of whether to maximise sensitivity or area under the receiver operating characteristic (AUROC) curve, or to minimise TPC or the inflation factor, led to different optimal combinations (Table S4). However, these frequently included one or more of the classic four symptoms. Sensitivity was generally higher for combinations including fatigue/weakness and/or headache, but at a cost of higher TPC, particularly for headache. Including gastrointestinal symptoms had the lowest TPC and numbers to be tested, but lowest sensitivity. In those ≥14 days post second dose, sore throat had similar effects on sensitivity and TPC as headache, and in children diarrhoea had a greater benefit for sensitivity (Table S4).
Balancing different performance metrics, adding fatigue/weakness to the classic four symptoms improved sensitivity from 74% to 81%, while dropping the AUROC by <0.01 (0.734 to 0.727) (Fig.7). However, TPC increased from 4.6 to 5.3, and 1.3 times more people would need testing. This combination generally performed well across subgroups (Table S5). Adding other symptoms to the classic four symptoms generally led to lower AUROCs, and at best similar sensitivity (Table S5, Fig.7), excepting children/adolescents in whom adding headache achieved highest sensitivities when considering adding only one extra symptom, and also highest AUROC for those aged under 10y (Fig.S4-S8).
Symptomatology at different stages of SARS-CoV-2 infection
Considering symptoms over time in positive episodes with ≥2 visits within 35 days (Table S6), the most common symptoms presenting after the index positive were fatigue/weakness (8%), headache (7%), cough (6%), loss of taste (6%), loss of smell (5%) or muscle ache/myalgia (5%) (Table S7). For most symptoms, Ct values were highest in those never reporting the symptom, lowest in those reporting it initially and subsequently, and intermediate where symptoms were reported at either the initial or subsequent visits only (Fig.S9). The main contrast was loss of taste and loss of smell, where Ct values were lowest in those reporting loss of taste or smell at subsequent visits only (p<0.0001).
DISCUSSION
In a large, randomly selected community-based survey, we demonstrate that overall reporting of any symptoms in new SARS-CoV-2 positives varied substantially over calendar time (40-70%), reflecting changing dominance of specific variants (Alpha, Delta) and positivity rates (higher viral burden (low Ct)/symptomatic infections being identified more frequently at (mostly) monthly visits when rates are increasing[5]), but also background incidental changes (e.g. public awareness of SARS-CoV-2-associated symptoms, seasonal pathogens, schools re-opening). Additional important differences in the relative frequency of symptom reporting in positive episodes vs negative visits were also observed by age, sex, ethnicity and vaccination status. Broadly however, of the 12 symptoms evaluated, the four classic symptoms (fever, cough, loss/change of smell/taste) gave close to optimal symptom-based screening, for testing given limited capacity; where additional testing capacity is available adding fatigue/weakness had the greatest improvement in sensitivity (7%) whilst inflating TPC by only 15%.
Whilst the CDC approach of using a broad range of symptoms to prompt testing maximises sensitivity, this approach is associated with substantially higher TPC (8.7 vs 4.6 for classic symptoms) and total tests needed (2.3-fold) with associated costs and capacity requirements. The UK approach of focussing on four classic symptoms has lower sensitivity (74% vs 90%), but higher accuracy to detect SARS-CoV-2 symptomatic infection overall (AUROC 0.734 vs 0.593). Increases in sensitivity from adding symptoms to the four classic symptoms typically either reduced overall accuracy and/or increased TPC and tests needed, highlighting the importance of evaluating several test metrics. Although advantages from including any additional symptoms were limited, those that best improved sensitivity across multiple subgroups included either fatigue/weakness or muscle ache/myalgia, or, in children/adolescents, headache. The REACT study[7] evaluating symptom constellations during the Alpha wave (December-2020/January-2021) suggested adding headache, muscle aches, chills and appetite loss to the classic symptoms. Our survey did not specifically elicit chills or appetite loss, which may partially overlap with fever and nausea/vomiting/abdominal pain; however, we found headache to have poorer specificity (i.e. to be commonly reported in test-negatives), particularly in adults, leading to substantially increased TPC despite improving sensitivity. REACT also suggested having different symptom sets for adults and children in order to optimise sensitivity, which would require careful public health messaging. ZOE[8] suggested an algorithm also including working in healthcare; whilst this could theoretically be programmed into an online test system, such complexity risks gaming the system if individuals cannot otherwise get a test.
The main limitation is that the survey collected information on only 12 specific COVID-19 symptoms, plus one generic question, to minimise participant burden. We therefore could not evaluate some symptoms more recently proposed for inclusion in expanded case definitions, such as coryza[13,14]. Parents/carers reported symptoms for children; symptom reporting may be affected by other cultural differences we could not adjust for as well as public awareness (e.g. increased reporting of loss of taste/smell once this became a “recognised” symptom). Power was limited within some subgroups, e.g. children and specific non-white ethnic groups. Our survey does not include those in care homes and with severe disease admitted to hospital who may have different symptom profiles. Testing was predominantly monthly; although individuals were followed longitudinally, we had limited resolution to assess the short-term evolution of symptoms during an infection.
The main study strengths are its size and population representativeness, particularly capturing episodes of mild infection in the community. We took a stringent approach to defining our ‘test-negative’ comparator to limit possible contamination from undetected positives/ongoing COVID-19. We report over periods that include different dominant viral variants. We took a pragmatic approach to comparing test performance, taking into account trade-offs between overall accuracy, sensitivity and TPC over different background prevalences, reflecting practical concerns regarding testing capacity, rather than optimising individual criteria.
Overall, we did not find any major shift away from the importance of the classic four symptoms in positive episodes with the emergence of the Delta variant and vaccine roll-out in the UK. Given their concurrent changes in test-negatives, recent reports of associations with sore throat may reflect background increases in other respiratory infections/hayfever, potentially even with SARS-CoV-2 isolated incidentally given that one-third of cases are estimated to be asymptomatic[2]. Currently, we therefore have limited evidence for expanding the case definition beyond the classic four symptoms where universal testing is not practical/affordable, with fatigue/weakness the most promising candidate. However, this requires ongoing monitoring as other respiratory viruses increasingly circulate following lifting of restrictions with vaccine roll-out[15–18], potentially altering the specificity of symptoms in determining SARS-CoV-2 vs other community-acquired infections.
Data Availability
Data are still being collected for the COVID-19 Infection Survey. De-identified study data are available for access by accredited researchers in the ONS Secure Research Service (SRS) for accredited research purposes under part 5, chapter 5 of the Digital Economy Act 2017. For further information about accreditation, contact Research.Support@ons.gov.uk or visit the SRS website.
Contributors
This specific analysis was designed by ASW, KBP, PCM, NS, DWE, TH, DC, TEAP, K-DV. K-DV conducted the statistical analysis of the survey data. K-DV, NS, PCM, ASW drafted the manuscript. All authors contributed to interpretation of the study results, and revised and approved the manuscript for intellectual content. K-DV is the guarantor and accepts full responsibility for the work and conduct of the study, had access to the data, and controlled the decision to publish. The corresponding author (K-DV) attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.
Funding
This study is funded by the Department of Health and Social Care with in-kind support from the Welsh Government, the Department of Health on behalf of the Northern Ireland Government and the Scottish Government. K-DV, KBP, ASW, TEAP, NS, DE are supported by the National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Healthcare Associated Infections and Antimicrobial Resistance at the University of Oxford in partnership with Public Health England (PHE) (NIHR200915). ASW and TEAP are also supported by the NIHR Oxford Biomedical Research Centre. KBP is also supported by the Huo Family Foundation. ASW is also supported by core support from the Medical Research Council UK to the MRC Clinical Trials Unit [MC_UU_12023/22] and is an NIHR Senior Investigator. PCM is funded by Wellcome (intermediate fellowship, grant ref 110110/Z/15/Z) and holds an NIHR Oxford BRC Senior Fellowship award. DWE is supported by a Robertson Fellowship and an NIHR Oxford BRC Senior Fellowship. NS is an Oxford Martin Fellow and an NIHR Oxford BRC Senior Fellow. The views expressed are those of the authors and not necessarily those of the National Health Service, NIHR, Department of Health, or PHE. The funder/sponsor did not have any role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication. All authors had full access to all data analysis outputs (reports and tables) and take responsibility for their integrity and accuracy.
PCM received funding from the Wellcome Trust [110110/Z/15/Z]. For the purpose of Open Access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.
Competing interests
All authors have completed the ICMJE uniform disclosure from at www.icmje.org/coi_disclore.pdf. DWE declares lecture fees from Gilead outside the submitted work. No other author has a conflict of interest to declare.
Ethical approval
The study received ethical approval from the South Central Berkshire B Research Ethics Committee (20/SC/0195).
Data sharing
Data are still being collected for the COVID-19 Infection Survey. De-identified study data are available for access by accredited researchers in the ONS Secure Research Service (SRS) for accredited research purposes under part 5, chapter 5 of the Digital Economy Act 2017. For further information about accreditation, contact Research.Support{at}ons.gov.uk or visit the SRS website.
Transparency
The lead authors affirm that the manuscript is an honest, accurate, and transparent account of the study design being reported, no important aspects of the study have been omitted, and any discrepancies from the study as originally planned (and, if relevant, registered) have been explained. Dissemination to participants and related patient and public communities: Results of individual tests were communicated to the participants. Overall study results were disseminated through the preprint of the study. Findings were disseminated in lay language in the national and local press.
Supplementary Material
Supplementary Methods
The presence of three SARS-CoV-2 genes (ORF1ab, N, S) was identified using real-time polymerase chain reaction (RT-PCR) with the TaqPath RT-PCR COVID-19 kit (Thermo Fisher Scientific, Waltham, MA, USA), analysed using UgenTec Fast Finder 3.300.5 (TaqMan 2019-nCoV assay kit V2 UK NHS ABI 7500 v2.1; UgenTec, Hasselt, Belgium).
Choice of negative visits in the comparator group
As a comparator group, we initially included all visits where PCR tests were negative, and then excluded visits where symptoms could plausibly be related to ongoing effects of COVID-19 or long COVID, where there was a high pre-test probability that the participant actually had a new COVID-19 infection that had not been detected in the survey, or where symptoms were likely driven by recent vaccination. Specifically, we excluded all negative visits (numbers in Table S1):
From -90 days before the first S-antibody positive blood test in the study prior to vaccination, where such antibody results are likely to represent previous undetected infection (these results were available only in a random subset of the population);
From -35 days before the first swab positive onwards from individuals who ever tested PCR positive in the study or positive on either PCR or LFD in the linked English testing programme (to avoid ongoing long COVID symptoms,2 and COVID-related symptoms occurring shortly before the positive test);
From -35 days before any self-reported positive swab test result onwards (for the same reason; reflecting the fact that individuals may have obtained tests elsewhere)
From a small number of individuals who reported either loss of taste or loss of smell at their first study visit and had no national testing programme result within [-21,+21] days (all before 1 July 2020), given the high specificity of this symptom for COVID-19 infection, the fact that it would have been impossible for these individuals to get an external test at the time and the potential for subsequent symptoms to represent long COVID;
Where participants reported self-isolating OR contact with definite positives in the preceding 28 days (since these individuals have much higher risk of SARS-CoV-2 infection which may not have been detected) and the previous and the next visit (because of higher risk of unidentified positivity, and because they may have been contact traced through the national training programme they may be more likely to report symptoms through recall bias, regardless of status);
Occurring within [-7,+14 days] of either first or second vaccination date3, to avoid the inclusion of common symptoms caused by vaccination in the test-negative comparator group and to reflect the possibility of small inaccuracies in reported date of vaccination for some participants.
Time windows were arbitrary but aligned with other analyses or windows for considering symptoms associated with positive episodes.
Subgroups
In order to assess the impact of various changes over the course of the epidemic, we considered symptoms overall in all positive episodes, and in specific subgroups, as follows:
S-gene present before 17/Nov/2020 (wild-type) vs S-gene absent from 17/Nov/2020 to 17/May/2020 (Alpha-compatible) vs S-gene present from 17/May/2020 onwards (Delta-compatible) (Fig.S1); all restricted to Ct<30 to increase the chance that the survey positive test was closer to the start of the infection
Ct<30 or ≥30 as a proxy for higher viral load (where symptoms may be more completely ascertained if the infection is identified close to onset) vs lower viral load
Up to 0 days before 1st vaccination date or unvaccinated, from 21 days post 1st vaccination date to 13 days post 2nd vaccination date inclusive, from 14 days post 2nd vaccination date onwards. Episodes/tests 0-20 days after first vaccination were excluded as symptoms may be due to side-effects.
Age groups 2-5, 6-10, 11-15, 16-44, 45-64, 65+ years
All positive episodes split between before 1/Sep/2020, 1/Sep/2020-17/Nov/2020, 17/Nov/2020-1/Mar/2021, 1/Mar/2021-17/May/2021, 17/May/2021-17/Jul/2021 on the basis of background incidental symptoms in negative visits (different background rates could be driven either by epidemic dynamics and/or other infections being more prevalent during certain periods)
Cycle threshold (Ct) values
Each positive test has a Ct value for each positive gene, leading to 1-3 individual Ct values per positive result. As the Spearman correlation between Ct values for each pair of genes (when present together) was very high (>0.98), we first took the arithmetic mean of all Ct values for detected genes for each positive as the single Ct value per positive test. We then took the minimum of these Ct values across all positive tests in an episode as the Ct value for each positive episode.
Generalised additive models
In regression models for reporting any evidence of symptoms and specific symptoms in those with evidence of symptoms in positive episodes and negative visits, we truncated age at 85y and Ct at the 5th and 95th percentiles in order to avoid undue influence of outliers. Age was modelled as smoothing spline. Due to small numbers (Table S3) we were only able to investigate differences by self-reported ethnicity as white vs non-white.
bam(cbind(n_withsymptom, n_withoutsymptom) ∼
s(study_day, bs=“bs”, k=15, by=Sars_COV_2_positivity) +
s(age_at_visit, bs=“bs”, k=15, by=Sars_COV_2_positivity) +
sex:Sars_COV_2_positivity + ethnicity_wo:Sars_COV_2_positivity + sex + ethnicity_wo +
Sars_COV_2_positivity, family=binomial(link=“cloglog”), method = “fREML”, data = data,
discrete=TRUE, nthreads =12)
bam(cbind(n_withsymptom, n_withoutsymptom) ∼
vaccinated:Sars_COV_2_positivity + vaccinated +
s(study_day, bs=“bs”, k=15, by=Sars_COV_2_positivity) +
s(age_at_visit, bs=“bs”, k=15, by=Sars_COV_2_positivity) +
sex:Sars_COV_2_positivity + ethnicity_wo:Sars_COV_2_positivity +
sex + ethnicity_wo + Sars_COV_2_positivity, family=binomial(link=“cloglog”), method =
“fREML”, data = data, discrete=TRUE, nthreads =12)
In positive episodes only:
bam(cbind(n_withsymptom, n_withoutsymptom) ∼
s(ct _mean, bs=“bs”, k=5) +
s(study_day, bs=“bs”, k=15) +
s(age_at_visit, bs=“bs”, k=15) +
sex + ethnicity_wo, family=binomial(link=“cloglog”), method = “fREML”, data = data,
discrete=TRUE, nthreads =12)
Also adjusting for S-gene positivity (present/absent/unknown):
bam(cbind(n_withsymptom, n_withoutsymptom) ∼
s(ct _mean, bs=”bs”, k=5) +
S_gene_positivity_pattern +
s(study_day, bs=”bs”, k=15) +
s(age_at_visit, bs=”bs”, k=15) +
sex + ethnicity_wo, family=binomial(link=”cloglog”), method = “fREML”, data = data,
discrete=TRUE, nthreads =12)
Performance metrics
For each combination of symptoms considered, we calculated the number of true positives (≥1 symptom from the combination being considered reported for a positive episode; TP), false positives (≥1 symptom reported for a negative visit; FP), true negatives (symptoms in the combination not reported for a negative visit; TN), false negatives (symptoms in the combination not reported in a positive episode; FN). We compared sensitivity=TP/(TP+FN); specificity=TN/(TN+FP); positive predictive value=TP/(TP+FP) (PPV); negative predictive value=TN/(TN+FN) (NPV); area under the receiver operating characteristic curve (AUROC), the area under the curve of specificity and 1–sensitivity, which, in this case where we have a binary outcome and a binary exposure, is 0.5*(sensitivity+specificity); tests per case identified=1/PPV (TPC); and the inflation factor=episodes/visits with ≥1 symptom/episodes/visits with classic symptoms.
Results
Defining the negative visit comparator group
Table S1 shows the total number of test-negative visits and the number of participants in which they occur, in all PCR negative visits and restricting to those visits with evidence of symptoms only, for each step of the hierarchical restriction of test-negative visits to form the comparator. This was done in order to exclude potential contamination from undetected positives, ongoing symptoms after detected positives, and symptoms due to vaccination. Overall, 75% of test-negative visits in 93% participants with any test-negative visit were retained in analyses, and 62% of test-negative visits where any symptoms were reported in 68% of participants with any test-negative visit where any symptoms were reported. Most test-negative visits were excluded because of self-isolation or contact with definite positives. The larger exclusion of test-negative visits where any symptoms were reported, despite the fact that symptoms were not a reason for exclusion (with the exception of visits from a small number of individuals reporting loss of taste/smell in 2020 before tests were available) illustrate the likely contamination of all test-negative visits with both undetected positives and ongoing symptoms post COVID-19 infection (Table S2).
Comparing the probability of reporting specific symptoms in negative visits with evidence of symptoms over time before (Fig.S10) and after (Fig.2) the final exclusion of negative visits that occurred within [-7,+14 days] of either first or second vaccination date, the substantially higher probabilities of reporting fever, headache, fatigue weakness, muscle ache myalgia and nausea-vomiting in the first quarter of 2021 are plausibly driven by side effects of vaccination, as these probabilities markedly reduce after the exclusion.
Acknowledgements
We are grateful for the support of all COVID-19 Infection Survey participants. Office for National Statistics: Sir Ian Diamond, Emma Rourke, Ruth Studley, Tina Thomas. Office for National Statistics COVID-19 Infection Survey Analysis and Operations teams, in particular: Daniel Ayoubkhani, Russell Black, Antonio Felton, Megan Crees, Joel Jones, Lina Lloyd, Esther Sunderland. University of Oxford, Nuffield Department of Medicine: Ann Sarah Walker, Derrick Crook, Philippa C Matthews, Tim Peto, Emma Pritchard, Nicole Stoesser, Karina-Doris Vihta, Jia Wei, Alison Howarth, George Doherty, James Kavanagh, Kevin K Chau, Sarah Cameron, Phoebe Tamblin-Hopper, Magda Wolna, Rachael Brown, Stephanie B Hatch, Daniel Ebner, Lucas Martins Ferreira, Thomas Christott, Brian D Marsden, Wanwisa Dejnirattisai, Juthathip Mongkolsapaya, Sarah Hoosdally, Richard Cornall, David I Stuart, E Yvonne Jones, Gavin Screaton. University of Oxford, Nuffield Department of Population Health: Koen Pouwels. University of Oxford, Big Data Institute: David W Eyre, Katrina Lythgoe, David Bonsall, Tanya Golubchik, Helen Fryer. University of Oxford, Radcliffe Department of Medicine: John Bell. University of Manchester: Thomas House. Public Health England: John Newton, Julie Robotham, Paul Birrell. IQVIA: Helena Jordan, Tim Sheppard, Graham Athey, Dan Moody, Leigh Curry, Pamela Brereton. Glasgow Lighthouse Laboratory: Jodie Hay, Harper VanSteenhouse. National Biocentre: Anna Godsmark, George Morris, Bobby Mallick, Phil Eeles. Oxford University Hospitals NHS Foundation Trust: Stuart Cox, Kevin Paddon, Tim James, Sarah Cameron, Phoebe Tamblin-Hopper, Magda Wolna, Rachael Brown. Department of Health: Jessica Lee.
Footnotes
See Acknowledgements for the Coronavirus Infection Survey team