Abstract
Background Cohort studies of people with a history of COVID-19 infection and controls will be essential to understand the epidemiology of long-term effects. However, clinical diagnosis requires resources that are frequently restricted to the severely ill. Cohort studies may have to rely on surrogate indicators of COVID-19 illness. We describe the prevalence and overlap of five potential indicators: self-reported suspicion, self-reported core symptoms, symptom algorithm, self-reported routine test results, and home antibody testing.
Methods An occupational cohort of staff and postgraduate students at a large London university who participated in surveys and antibody testing. Self-report items cover March to June 2020, and antibody test results are from ‘lateral flow’ IgG/IgM antibody test cassettes sent to participants in June 2020.
Results Valid antibody test results were returned for 1882 participants. Of the COVID-19 indicators, the highest prevalence was core symptoms (770 participants positive, 41%), followed by participant suspicion of infection (n=509, 27%), a symptom algorithm (n=297, 16%), study antibody positive test (n=124, 6.6%) and self-report of a positive external test (n=39, 2.1%). Study antibody positive result was rare in people who had no suspicion they had experienced COVID-19 (n=4, 0.7%) or did not experience core symptoms (n=18, 1.6%). When study antibody test results were compared with earlier external antibody results in those who had reported them, the study antibody results agreed in 88% of cases (kappa=0.636), with a lower proportion testing positive on this occasion (proportion with antibodies detected 15% in study test vs 24% in external testing).
Discussion Our results demonstrate that there is some agreement between different COVID indicators, but that they tell a more complete story when used together. Antibody testing may provide greater certainty and be one of the only ways to detect asymptomatic cases, but is likely to under-ascertain due to weak antibody responses to mild infection, which wane over time. Cohort studies will need to review how they deal with different and sometimes conflicting indicators of COVID-19 illness in order to study the long-term outcomes of COVID-19 infection and related impacts.
What is already known on this subject? Research into the effects of COVID-19 in the community is needed to respond to the pandemic. Objective testing has not been widely available and accuracy may not be high when carried out in retrospect. Many cohort studies are considering how best to measure COVID-19 infection status.
What this study adds? Antibody testing is feasible, but it is possible that sensitivity may be poor. Each indicator included added different aspects to the ascertainment of COVID-19 exposure. Using combinations of self-reported and objectively measured variables, it may be possible to tailor COVID-19 indicators to the situation.
1. Introduction
The majority of those affected by COVID-19 are community cases not requiring hospitalisation,[1] whilst most research has focussed on hospital admissions. Non-hospitalised presentations are important, not just because they are numerous, but also because there have been widespread concerns about the medium and long-term outcomes,[2-4] particularly so-called “long COVID” – i.e. chronic symptoms attributed to the disease which persist after the acute infection.[5] Whilst there is no “gold-standard” for diagnosing COVID-19,[6] in hospital cohorts, combined clinical assessment, antigen testing by polymerase chain reaction (PCR) and lung imaging provide a strong basis for diagnosis. By contrast, in community settings, particularly at the start of the pandemic when antigen testing was not widespread, such information is not available, and since tests such as antigen/PCR are time sensitive, many participants will have missed the window when such diagnostics would have been helpful. Since there are no “gold standard” methods by which researchers can distinguish between cases and controls in most community studies, researchers have to rely on proxy indicator measures of COVID-19.
Potential indicators of past COVID-19 include self-report and objective measures. Self-report includes (a) whether the participant thinks they had been infected, (b) the report of specific symptoms, and (c) self-report of testing, which is dependent not only on participant recall but also on the availability of testing at the time when the participant was ill. Objective measures include detection of antibodies produced in response to SARS-CoV-2 (the virus that causes COVID-19), either using enzyme-linked immunosorbent assay (ELISA) in the laboratory or “point of care” kits that can be used in clinics or at home.[7] While there is guidance on the clinical use of tests for COVID-19,[8, 9] there is little guidance for cohort studies for choosing and interpreting these self-report and measured indicators. To appraise these methods, we therefore explore five potential indicators in the context of a cohort study of staff and postgraduate research students in London, UK, that has been running since April 2020 and undertook antibody testing in June 2020.[10, 11] Our aim was to provide a descriptive analysis of the overlapping indicators of COVID-19 measured during the first four months of the pandemic.
2. Methods
2.1 Study
The King’s College London Coronavirus Health and Experiences of Colleagues at King’s (KCL-CHECK) study explores the short-, medium-, and long-term health and wellbeing outcomes of the COVID-19 pandemic (both the illness and the societal response) on KCL staff and postgraduate research (PGR) students. A protocol detailing study design and procedures is available.[10] Briefly, eligible participants were current staff or postgraduate research (PGR) students residing in the UK (for antibody testing). All KCL staff and PGR students were invited to participate via email on April 16th 2020, with reminder emails and advertisements on internal social media over the following two weeks. The baseline survey was open for two weeks (participants who had started but not yet completed were reminded and given a further week to complete). Most (90%) of the participants opted into follow-up consisting of short surveys every two weeks and longer surveys every two months. There were a small number of participants (1%) who opted out of two-weekly surveys but agreed to two-monthly surveys.
2.1.1 Ethics and reporting
Ethical approval has been gained from King’s College London Psychiatry, Nursing and Midwifery Research Ethics Committee (HR-19/20-18247). Participants provided informed consent to take part. Reporting conforms to The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines,[12] and a checklist can be found in supplementary materials.
2.2 Measures
Table 1 shows the schedule for follow-ups, with the first follow-up survey referred to as Period 1 (P1). Many questions in the baseline survey and longer follow-up surveys asked about experiences in the last two months, while the questions in the two-weekly follow-ups refer to experiences in the last two weeks. The present study used COVID-19 indicators from surveys at P0 (baseline) to P5 to measure exposures before the testing in late June. For questions asked in both the short and long surveys, participant reports in P4 (‘In the last two months’) overlap with the reports in P1-3 (‘In the last two weeks’).
2.2.1 Self-reported suspicion of COVID-19 illness
At the baseline (P0) survey participants were asked “Do you think that you have had COVID-19 (coronavirus) at any time?”. At two-weekly follow-ups (P1, P2, P3, and P5) participants were asked “Do you think that you have had COVID-19 (coronavirus) in the last two weeks?” At the two-monthly follow-up (P4) participants were asked “Do you think that you have had COVID-19 (coronavirus) in the last two months?”. At all periods, participants could answer ‘Definitely’, ‘Probably’, ‘Unsure’ or ‘No’. These responses were summarised as the highest degree of suspicion ever reported for each participant (across P0 to P5). For some analyses report of ‘Definitely’ and ‘Probably’ were combined over surveys P0-P5, and if present, indicated positive suspicion of COVID-19.
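The summary of suspicion described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's analysis code (which was written in R): the function names, the dictionary layout and the handling of missed surveys (ignored here) are assumptions made for the example.

```python
# Response options from the survey, ordered from lowest to highest suspicion.
SUSPICION_ORDER = ["No", "Unsure", "Probably", "Definitely"]


def highest_suspicion(responses):
    """Return the highest suspicion level reported across surveys.

    `responses` maps a survey period (e.g. "P0") to the answer given, or to
    None when that survey was missed (assumption: missed surveys are ignored).
    Returns None if no survey was answered.
    """
    answered = [r for r in responses.values() if r is not None]
    if not answered:
        return None
    return max(answered, key=SUSPICION_ORDER.index)


def positive_suspicion(responses):
    """Binary indicator: 'Definitely' or 'Probably' at any period P0 to P5."""
    return highest_suspicion(responses) in ("Probably", "Definitely")


# Hypothetical participant, for illustration only.
example = {"P0": "Unsure", "P1": "No", "P2": None,
           "P3": "Probably", "P4": "Unsure", "P5": "No"}
print(highest_suspicion(example))   # Probably
print(positive_suspicion(example))  # True
```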
2.2.2 Self-report of COVID symptoms
We adapted the symptom list used by the ZOE coronavirus daily reporting app [13] to cover two-month periods (P0 and P4) or two-week periods (P1, P2, P3 and P5), with a screening question “In the last two months[weeks], how have you felt physically?” and if they reported they had not been feeling well, they were taken to the checklist. If they reported shortness of breath or fatigue, they were asked about severity. We scored symptoms according to two definitions for comparison, representing examples of wider and narrower definitions: (a) the presence of core symptoms of one or more of fever, cough and loss of smell/taste at any time; (b) scoring positively on a symptom algorithm described by the ZOE team using symptoms of loss of smell/taste, persistent cough, severe fatigue and skipped meals, participant age and gender [13]. Symptoms were summarised over all available survey periods as a binary indicator of whether participants had (i) ever vs. never scored positively on the ZOE algorithm, (ii) ever vs. never reported a core symptom, (iii) ever vs. never reported feeling not right physically (‘any symptom’). If participants missed a survey period they were considered to have not reported symptoms at that period.
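As an illustration of the 'ever vs. never' summary for core symptoms, including the rule that a missed survey counts as no symptoms reported, a minimal Python sketch follows. The symptom labels and data layout are hypothetical, and the ZOE algorithm itself is not reproduced here because its coefficients are given in the cited source [13] rather than in this paper.

```python
# Core symptoms as defined in the text; the string labels are hypothetical.
CORE_SYMPTOMS = {"fever", "cough", "loss of smell/taste"}


def ever_core_symptom(symptoms_by_period):
    """True if any core symptom was reported at any completed survey.

    `symptoms_by_period` maps a period label to a collection of reported
    symptoms, or to None for a missed survey. A missed survey is treated
    as no symptoms reported at that period.
    """
    for reported in symptoms_by_period.values():
        if reported is None:  # missed survey: counts as no report
            continue
        if CORE_SYMPTOMS & set(reported):
            return True
    return False


# Hypothetical participants, for illustration only.
with_core = {"P0": {"fatigue"}, "P1": None, "P2": {"cough", "headache"}}
without_core = {"P0": {"fatigue", "headache"}, "P1": None, "P4": set()}
print(ever_core_symptom(with_core))     # True
print(ever_core_symptom(without_core))  # False
```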
2.2.3 Self-report of COVID-19 testing
We had no access to routine data or data collected for other studies on COVID-19 testing, so we asked “Have you had a test for COVID-19 (coronavirus)?” and “What was the result?” at baseline, and the same for the preceding two weeks at P1, P2, P3, and P5, and two months at P4. After the KCL-CHECK antibody tests were issued, we added the instructions to answer excluding the antibody kit we had sent them. At P8, we also asked participants to report separately about the type of test, asking about ever receiving tests involving a swab of the throat and/or nose to look for infection and, separately, about ever receiving blood and blood spot tests to look for evidence of past infection. These were used to split reported tests in P0 to P5 into “antigen/PCR tests” and “external antibody tests”. External testing was summarised over P0 to P5 as having ‘ever’ vs ‘never’ reported antigen/PCR or external antibody tests.
2.2.4 Home antibody tests
A Rapid Immunoassay Test Cassette was used to measure evidence of antibodies to the ‘spike’ protein of SARS-CoV-2. The SureScreen Diagnostics COVID-19 IgG/IgM Rapid Test Cassette performance against antigen/PCR positive cases was shown to be good: for instance, in laboratory conditions using samples from 110 hospitalised COVID-19 patients and 50 negative historical samples it had 89% sensitivity and 100% specificity, having >90% agreement with the ELISA result.[14] An internal pilot demonstrated that the test cassette could be used by participants without specific training, following which we developed the procedure described in an earlier paper.[11] Briefly, the test cassette was sent by post to the participants’ nominated address in late June, along with a lancet for providing a blood spot, the buffer solution and detailed illustrated instructions (available from the authors). Participants were asked to add blood then buffer to the cassette and wait ten minutes for results to appear. A printed rectangle labelled with the participant study ID number was provided, within which the test cassette could be photographed; participants uploaded the photograph to a secure server. The team then interpreted the photographs. Participants were asked to contact the team if they had difficulties; the team answered within two working days and could arrange for a replacement kit to be sent if necessary (replacement kits were sent in early July).[11]
2.2.5 Participant Characteristics
All characteristics were taken from the baseline survey (P0). Ethnicity was asked using recommended wording from the Office for National Statistics with 18 groups [15], and is reported grouped into five categories due to small cell sizes. Role within the university was based on self-reported main role and grouped by likely seniority and exposure categories: academic, specialist and management; research, clerical and technical; teaching, facilities and clinical; and PGR student. Participants were also asked to indicate if they were in any of the government-defined roles that made them a “key worker”.
2.3 Analysis
Datasets from each period and antibody testing were merged in R 4.0.0 and associated packages [16-19], using a ‘longitudinal ID’ that was allocated at baseline. First, we summarised participation and missing data. Second, we explored the overlap of indicators through descriptive analysis and charts. 95% confidence intervals around proportions were calculated using Wilson’s method. We refer to sensitivity and specificity of the self-report indicators for predicting those participants who were positive on the KCL-CHECK antibody test, as this was the objective measure available in this study. We do not necessarily regard this as sensitivity or specificity for past COVID-19 illness or SARS-CoV-2 infection since we do not have a “gold-standard” diagnosis for comparison.
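For readers unfamiliar with Wilson’s method, the interval can be computed directly from a count and a denominator. The sketch below is a Python illustration only (the study analysis was carried out in R); the function name is an assumption, and the worked figures are the antibody-positive count and denominator reported in the Results.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z=1.96 gives 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# 124 antibody-positive results among 1882 valid tests (from the Results).
low, high = wilson_interval(124, 1882)
print(f"{124 / 1882:.1%} (95% CI {low:.1%} to {high:.1%})")
```

With these inputs the interval rounds to 5.6% to 7.8%, in line with the confidence interval reported for the antibody-positive proportion.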
3. Results
3.1 Cohort and Missing data
The baseline study included 2807 staff and PGR students, representing a response rate of 23% (Figure 1). Supplementary tables ST1 and ST2 compare the composition of KCL staff and PGR students with the final cohort, and characteristics of staff and PGR students combined at each point in the survey are shown in Table 2. In total, 1882 staff and PGR students completed the baseline survey, consented to follow-up, and completed the antibody testing by 13th July with a valid result: they were 88% white, 71% female, 17% PGR students and 13% keyworkers, with a median age of 37 years.
After the baseline survey, there were five opportunities to complete follow-up surveys before antibody test completion (see Table 1). 98% completed at least one survey, and 68% completed all five. Analyses were performed on the full cohort of 1882 participants, with a secondary analysis limited to the 1687 participants (90%) who had completed the P4 survey which, because it asked about the previous two months, provides a continuous record of COVID indicators covering the time from baseline until around two weeks before the antibody tests were sent to participants. Prevalence and overlap of COVID indicators were similar in these analyses, so the larger cohort is reported.
3.2 COVID indicators
Table 3 shows the prevalence of COVID indicators in our sample. Of 1882 with valid antibody testing results, 124 tested positive (6.6%, 95% confidence interval 5.6-7.8). This compares with 770 (41%, 95%CI 39-43) who had experienced at least one core symptom, 509 (27%, 95%CI 25-29) who suspected they had experienced COVID-19 and 297 (16%, 95%CI 14-18) who were positive in the symptom algorithm. Around nine out of ten of these participants reported these indicators at baseline (90% with core symptoms, 91% suspected, 88% symptom algorithm). Only 323 people reported receiving a COVID-19 test (17%, 95%CI 16-19), not including the test from this study, with 235 receiving a probable antigen/PCR test, and 138 receiving a probable external antibody test, including 50 reporting both. Ten reported a positive antigen/PCR test and 33 a positive external antibody test, meaning 2.1% (95%CI 1.5-2.8) of participants (12% of those tested) reported a positive external test.
Table 3 also shows the pattern of agreement between pairs of indicators. Universally, endorsing one indicator increased the likelihood participants met criteria for other indicators. Of people who endorsed at least one core symptom of COVID-19, 56% thought they had had COVID-19, but fewer met the symptom algorithm (39%) and fewer still were positive for antibodies (14%) or a positive external test (4%). When positive external test results were reported, all other criteria were met at least half the time. Of those who had tested positive on the KCL-CHECK antibody tests, 85% had experienced COVID-19 symptoms, 81% thought they had had COVID-19 and 67% met the symptom algorithm. In the KCL-CHECK antibody positive group, 19% reported a positive external test result: a further 9% reported having had external tests but not a positive test.
Taking into account agreement in both positive and negative outcomes, supplementary table ST3 shows overall agreement ranged between 60% and 94%, with the most likely to be concordant being KCL-CHECK antibody test and external test, followed by KCL-CHECK antibody test and symptom algorithm; least likely to be concordant were core symptoms and external test. In the subset of 138 people who reported results of external antibody tests (supplementary table ST4), 24% were positive on external tests, while 15% were positive on the KCL-CHECK test. Overall agreement was 88% (Cohen’s kappa=0.636).
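Cohen’s kappa corrects raw agreement for the agreement expected by chance given each test’s proportion positive. The Python sketch below shows the calculation for a 2x2 table. The cell counts are an assumption: they were inferred for this illustration from the figures reported above (138 participants, 24% and 15% positive, 88% agreement) and are not copied from supplementary table ST4.

```python
def cohens_kappa(both_pos, first_only, second_only, both_neg):
    """Return (observed agreement, Cohen's kappa) for two binary ratings."""
    n = both_pos + first_only + second_only + both_neg
    observed = (both_pos + both_neg) / n
    first_pos = (both_pos + first_only) / n
    second_pos = (both_pos + second_only) / n
    # Chance agreement: both positive or both negative if ratings were independent.
    expected = first_pos * second_pos + (1 - first_pos) * (1 - second_pos)
    return observed, (observed - expected) / (1 - expected)


# Inferred (not published) cell counts: "first" = external antibody test,
# "second" = KCL-CHECK antibody test.
observed, kappa = cohens_kappa(both_pos=19, first_only=14,
                               second_only=2, both_neg=103)
print(f"agreement {observed:.0%}, kappa {kappa:.3f}")
```

Under these inferred counts the calculation returns 88% agreement and a kappa of 0.636, matching the reported values; other cell counts with the same rounded percentages would give a different kappa.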
Figure 2A and supplementary table ST5 show that if a participant thought they had not experienced COVID-19 they were very unlikely to get a positive antibody test result, with only 4 (0.7%) testing positive. The proportion testing positive steeply increased as suspicion increased, such that those who were definite were nearly 60 times more likely to test positive (39% positive), with probable or definite suspicion of COVID-19 infection having 81% sensitivity (101/124) and 77% specificity (1350/1758) for KCL-CHECK positive antibody test. Figure 2B and supplementary table ST6 show that the majority of people who tested positive on the KCL-CHECK antibody test were positive on the symptom algorithm in at least one survey. The algorithm had 67% sensitivity (83/124) and 88% specificity (1544/1758) against the KCL-CHECK antibody test. Core symptoms (including all who were algorithm positive) had 85% sensitivity (106/124) and 62% specificity (1094/1758) against the KCL-CHECK antibody test. Comparing those who were positive on the symptom algorithm with those who had only core symptoms, algorithm positive participants were around six times as likely to test positive (28%:5%). Among participants not reporting any core symptoms, the proportion positive on the KCL-CHECK test was no different between those who reported having potentially atypical symptoms and those who reported no symptoms at all (1.7% non-core/atypical symptoms:1.6% no symptoms).
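The sensitivity and specificity figures above follow directly from the reported counts, with the KCL-CHECK antibody test as the reference. A short Python sketch of the arithmetic is given below; the function and variable names are illustrative assumptions, and the counts are those quoted in the text.

```python
def sensitivity_specificity(true_pos, ref_pos, true_neg, ref_neg):
    """Sensitivity and specificity of an indicator against a reference test.

    true_pos: indicator-positive among reference-positive participants
    true_neg: indicator-negative among reference-negative participants
    """
    return true_pos / ref_pos, true_neg / ref_neg


ANTIBODY_POSITIVE, ANTIBODY_NEGATIVE = 124, 1758  # KCL-CHECK antibody test

# (true positives, true negatives) as reported in the text.
INDICATORS = {
    "suspicion (probable/definite)": (101, 1350),
    "symptom algorithm": (83, 1544),
    "core symptoms": (106, 1094),
}

results = {}
for name, (tp, tn) in INDICATORS.items():
    sens, spec = sensitivity_specificity(tp, ANTIBODY_POSITIVE,
                                         tn, ANTIBODY_NEGATIVE)
    results[name] = (sens, spec)
    print(f"{name}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```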
Participant suspicion and self-reported symptoms were highly related. Table ST8 shows the proportion testing positive on KCL-CHECK antibody test for the intersections of suspicion and symptom with at least ten people. For each level of symptoms (algorithm, core, non-core, none) increasing suspicion increased the proportion having tested positive on the KCL-CHECK antibody test, such that suspicion seems to add information beyond symptom report. Those with the highest proportion of positive antibody results are those with algorithm positive and definite suspicion, at just 49% antibody positive.
4. Discussion
Ascertaining who has and who has not had COVID-19 is not an easy undertaking, given that symptoms are variable and common and that availability of testing has been poor.[1] However, cohort studies in the community must tackle this to answer some of the uncertainties and guide responses to this evolving pandemic.[4, 5] There are no “gold-standard” diagnostic criteria,[6] and for people with symptoms in the community who have not had the time-sensitive tests, there is now no opportunity to get diagnostic certainty, and therefore no “ground truth” of who has and who has not had COVID-19. For example, in the KCL-CHECK study, the vast majority (91%) of the people who thought they had experienced COVID-19 by June reported it in the baseline survey in April 2020, too late for any real-time ascertainment. Retrospective ascertainment using self-report and home testing kits has been used by KCL-CHECK to reduce some of the uncertainty, but none of these methods can recover the “ground truth”. This paper reports the results of each measure against the others to look at the merits of each approach and what can be learned for future studies.
The KCL-CHECK cohort was made up of staff and PGR students at the university. COVID indicators suggest a prevalence ranging from 2%-41% by June 2020. The 2% refers to positive external tests – but given that antigen testing was unavailable at the peak of infections, and performance depends on timing and swab technique, this will under-estimate COVID-19.[20] The 41% refers to reporting one or more core COVID-19 symptoms of fever, persistent cough and loss of smell/taste, the first two of which overlap with many other illnesses, so this will over-estimate symptomatic COVID-19. Report of little or no suspicion of COVID-19 was very predictive of antibody testing being negative, but false-positive report may be frequent due to the very high profile of the illness. However, we found that for each level of symptoms, higher self-reported suspicion adds to the likelihood of testing positive. Presumably participants are able to factor in both the symptom unusualness for them and external factors such as contact with someone with a history of COVID-19 illness.
The ZOE algorithm improves on using core symptoms to identify cases by including markers of severity. In developing the algorithm we used, the ZOE team found that a single positive outcome on the algorithm had a sensitivity of 65% and specificity of 78% for antigen test outcome.[13] In a separate cohort (TwinsUK), ever being algorithm positive during daily use of the app in March and April had a sensitivity of 37% and specificity of 95% for laboratory antibody testing.[21] In our study, algorithm positivity in any of up to six surveys had a sensitivity of 67% and specificity of 88% for home antibody test outcome. The higher sensitivity of the algorithm in our study reflects the fact that we found few asymptomatic people to be positive for antibodies (9% no symptoms, 15% no core symptoms), whereas they have been a substantial minority in many other studies, including TwinsUK (19% no symptoms, 27% no core symptoms), another UK study REACT2 (32% no symptoms, 39% no core symptoms)[22] and a study from Spain (22% no symptoms, 36% no core symptoms)[23]. Thus, while the algorithm may be good at identifying those with a classical COVID-19 illness, recalled symptoms alone will fail to identify all of those who have had the infection, for which testing may improve identification.
Antibody testing in KCL-CHECK used a “lateral flow” IgG/IgM test kit that could be sent to participants, was simple to use and had been shown under ideal settings to have high sensitivity (89% for serum from 110 hospital patients).[14] Antibody tests can be fallible though, giving false positives through cross-reactivity with antibodies unrelated to SARS-CoV-2 (possibly seasonal coronaviruses).[9] Our device reacted to none of the 50 control sera in Pickering et al., but most lateral flow devices tested on more controls give false positive results in approximately 2 per 1,000 samples,[9, 24] which could be problematic in large studies with low prevalence. For the present study sensitivity is a concern, since it is known that small numbers of people do not produce anti-spike antibodies,[24, 25] they are detected more inconsistently in mild (i.e. non-hospitalised) COVID-19,[14, 26] and the concentration has been shown to decrease over time.[27, 28] Testing in KCL-CHECK occurred at least three months after most people reported symptoms: few studies have yet tracked antibody levels over this length of time, but in some cases antibodies may cease to be detectable, especially on the qualitative lateral flow devices.[29, 30] In the subset of KCL-CHECK participants who had reported previous antibody testing, 15% were positive on the KCL-CHECK antibody test, compared with 24% in their prior reported test, which could show time-dependent loss of reactivity, although there were likely to be differences in test specifications too.
Further rounds of testing may help elucidate the role of timing for antibody detection using lateral flow devices. There may be the possibility of augmenting antibody testing with testing for T cell response to SARS-CoV-2 to better track long-term immunity in the future.[31] For the present, it makes sense for studies to collect results from external testing from participants, which are likely to be accurate but under-estimate the proportion infected. Collecting enough symptom reports to calculate the ZOE algorithm will assist in finding those who have had a COVID-19-like illness, but both sensitivity and specificity may be added by also taking into account participants’ own appraisal of whether they have had COVID-19. Testing with a high specificity antibody test will identify past cases that were asymptomatic or atypically symptomatic and add more certainty where a positive test accords with COVID-19 symptoms. However, timing may lead to poor sensitivity when testing is done in isolation.
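The concern about false positives in large, low-prevalence studies can be illustrated with a short positive predictive value calculation. The Python sketch below is illustrative only: it takes the laboratory sensitivity of 89% and a false-positive rate of 2 per 1,000 (specificity 99.8%) from the figures quoted above, and the prevalence values are hypothetical inputs chosen for the example, not estimates from this study.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a positive test result reflects true past infection."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


SENSITIVITY = 0.89   # laboratory figure quoted from [14]
SPECIFICITY = 0.998  # 2 false positives per 1,000, as quoted for lateral flow devices

# Hypothetical prevalence values, for illustration only.
for prevalence in (0.001, 0.01, 0.05):
    ppv = positive_predictive_value(prevalence, SENSITIVITY, SPECIFICITY)
    print(f"prevalence {prevalence:.1%}: PPV {ppv:.1%}")
```

The proportion of positive results that are true positives falls as prevalence falls, even when specificity is very high, which is why a small false-positive rate matters more in low-prevalence settings.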
4.1 Strengths and weaknesses
The strengths of this study include the survey repeating every fortnight, which we hope struck a balance between minimising recall bias for symptoms and participant burden. We incorporated a symptom checklist that has been evaluated elsewhere, alongside self-report suspicion of infection and external testing, but also gave participants the opportunity to participate in antibody testing. The antibody test kit was chosen as being highly specific for SARS-CoV-2, and therefore suited to minimise false positives in population screening. While our conclusions could have been strengthened by the presence of a hospital standard diagnosis against which to compare other outcomes, the paper aimed to show what results can be gathered in the community and so compares these against each other.
Offering our participants home testing very likely improved uptake of the test at a time when people may have been put off attending a hospital due to infection concerns. The lateral flow cassette is designed for use by a trained person, but from our pilot and the high proportion of people returning valid results, we believe that with the procedures we had in place (such as illustrated instructions and a responsive email enquiry address) it was possible for participants to perform the test.[11] Nevertheless, there is the potential that participant errors and inconsistencies may have increased the number of invalid test results and potentially reduced sensitivity.
Our cohort was unusual for a population study because it included only staff and PGR students from a single university; we found that women and people of white ethnicity were more likely to volunteer, as well as those in management and research roles, leading to a lack of numbers and representativeness in some of the non-white ethnic groups and lower socio-economic groups. We included all who had completed the baseline survey and antibody testing, regardless of missing intermediate surveys. Our sensitivity analysis showed this made little difference, as would be expected given that COVID-19 infections peaked at the end of March and estimated rates of infection were below 5 per 100,000 throughout May and June,[32, 33] so relatively few positives would occur at timepoints in May and June compared to the baseline.
4.2 Conclusions and implications
This paper shows a variety of potential COVID-19 indicators that may be available for community studies, and their prevalence and overlap in a single cohort. All of the indicators are related, and different combinations of self-reported and objective tests need to be considered in order to overcome the challenges of this pandemic, such as the time-course of detectable antigen and antibody, poor access to routine testing, symptoms that overlap with many other illnesses, and the high profile of the illness. Where false positives can be tolerated, it seems reasonable to take participant suspicion, whereas adding the symptom algorithm may increase specificity and an antibody test may add asymptomatic cases. With our present knowledge of COVID-19, it will be a case of optimising the algorithm for COVID-19 ascertainment, rather than relying on one measure to give real certainty.
Competing interests
None declared.
Funding
This study was funded by King’s College London.
Data availability
Researchers may access pseudonymised data by application to the Principal Investigators (Professor Matthew Hotopf, Professor Reza Razavi and Dr Sharon Stevelink, email: check@kcl.ac.uk) subject to conditions set out in the protocol https://www.medrxiv.org/content/10.1101/2020.06.16.20132456v2.
Supplementary Tables
Appendix 1
STROBE Statement—Checklist of items that should be included in reports of cohort studies
Acknowledgements
The research team would like to thank the participants of KCL-CHECK and the team that has provided logistic and advisory support, in particular: Jonathan Edgeworth, Liam Jones, Lisa Sanderson, Jana Kim, Laila Danesh, Charlotte Williamson, Laurence Blight, Rupa Bhundia, Lucy O’Neill, Candice Middleton. We also thank Mark Zuckerman for comments on an earlier draft of this paper. This paper represents independent research supported by the National Institute for Health Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. MHM is a Wellcome Trust Investigator.
Footnotes
+ joint last authors