Abstract
Contact tracing is an important tool for controlling the spread of infectious diseases, including COVID-19. Here, we investigate the spread of COVID-19 and the effectiveness of contact tracing in a university population, using a data-driven ego-centric network model constructed with social contact data collected during 2020 and similar data collected in 2010. We find that during 2020, university staff and students consistently reported fewer social contacts than in 2010, however those contacts occurred more frequently and were of longer duration. We find that contact tracing in the presence of social distancing is less impactful than without social distancing. By combining multiple data sources, we show that University-aged populations are likely to develop asymptomatic COVID-19 infections. We find that asymptomatic index cases cannot be reliably back-traced through contact tracing and consequently transmission in their social network is not significantly reduced through contact tracing. In summary, social distancing restrictions had a large impact on limiting COVID-19 outbreaks in universities; to reduce transmission further contact tracing should be used in conjunction with alternative interventions.
Introduction
Contact tracing is an important tool for controlling the spread of infectious diseases, including COVID-19. Contact tracing involves identifying individuals at high risk of infection due to their contact with a known case, and then testing, treating, or isolating them to prevent the onward spread of infection. It is commonly used for sexually transmitted infections [1, 37], where the definition of a contact is clearly established. Contact tracing is also used to control novel viral infections, particularly in the early containment phase of an epidemic [17, 23], as during the 2020/1 COVID-19 pandemic [18, 22].
The success of contact tracing depends on the proportion of transmission that occurs before symptom onset [15]; in the case of COVID-19, individuals are infectious for at least a day before developing symptoms and so forward contact tracing, which aims to identify contacts before they are infected, is likely to lag behind transmission. In contrast, backward contact tracing, which aims to identify missed ancestral cases and then use forward tracing to identify multiple branches of transmission, has been found to be more successful for COVID-19, in part due to the presence of super-spreaders [13].
Individual-based models are useful for capturing the heterogeneous nature of disease transmission and network models lend themselves to contact tracing analysis as they can explicitly define connections between individuals. We use an ego-centric network model framework, whereby a central completely connected node has alters that are satellites to the ego — a formulation that resembles an individual and their social contacts and can be constructed from available social contact data [11, 28].
The Social Contact Survey (SCS) collected data on social contact patterns for the general British population in 2010 and has been used for mathematical modelling of infectious disease [3, 10, 11]. However, university populations are different to the general population [12]. Students tend to be young adults, with higher than average numbers of social contacts and often live in communal residences. Also, during the COVID-19 pandemic, social restrictions have changed the way people socialise. To capture any new contact patterns in university staff and students during the pandemic, the Coronavirus Questionnaire (CON-QUEST) [28] survey was launched in June 2020.
With these two surveys we are able to investigate how social behaviours of a university population changed under social restrictions and then the effectiveness of contact tracing in this setting. The effectiveness of contact tracing is assessed by its ability to reduce the extent an individual spreads the disease and the schemes ability to back-trace and discover asymptomatic index cases.
Methods
Social Contact Data
The Social Contact Survey (SCS)
The SCS surveyed 5,388 individuals from across Great Britain in 2010; see [10, 11] for full details of the methodology and results. We extracted 635 individuals from the SCS whose occupation contained the text “Student”, “Lecturer” or “Researcher” and who were aged 18 or older. We assumed that these respondents were affiliated with a university.
The University of Bristol Coronavirus Questionnaire (CON-QUEST)
We used data from a longitudinal online survey of University of Bristol staff and students, launched on 23 June 2020. Here, we used data collected between 23 June 2020 and 8 February 2021. In that time, 744 staff and 887 students responded at least once and we used the first response from each participant. The CON-QUEST questionnaire was designed to mirror the SCS, so many of the questions are similar, but the surveys are not identical. In particular, SCS did not collect the age of the contact and CON-QUEST did not collect who-knows-whom data. For full details of CON-QUEST, see [28].
The data we extracted from CONQUEST was the age of the respondent, the age and number of individual contacts, the size of group contacts and the ages of their members, and the duration and frequency of any interaction. Note that ages of individual and group contacts were recorded within intervals, not as exact ages, so the ages of contacts are estimated by interpolating the intervals with a uniform distribution.
The duration and frequency distributions of social contacts reported in the two surveys are compared with a Chi-square test, to investigate whether social restrictions implemented during the COIVD-19 pandemic have changed contact patterns in the university.
Modelling COVID-19 transmission and contact tracing on ego-centric networks
We model the spread of COVID-19 over ego-centric social networks, constructed from the social contact survey data described above. An ego-centric network consists of an ego, which is a central node that is connect to all other nodes, with alters that are connected to the ego but not necessarily to each other. This formulation lends itself to the available data when the respondent is modelled as the ego and their contacts are the alters. The edges represent social connections and are weighted with the duration (T) and frequency (F) of interaction. The nodes are weighted by the number of people involved in the interactions, so will only be greater than one 1 when representing a group.
Groups are assumed to be well-mixed and individuals group members are explicitly defined only during an interaction. Transmission within group is modelled on a virtual, complete network. The virtual network exists when a node weighted greater than 1 is interacted with. When the ego interacts with a group contact of size s (represented as a node with weight s) transmission is modelled on a complete network with s + 1 nodes and s(s + 1)/2 edges. How the respondent interacted with each group member is unknown, so we assume that the duration weighting of all edges in the group network is a fraction of the reported duration of the group interaction, such that the s edges of all nodes will be weighted with T/s, where T is the reported duration. Frequency weighting in the virtual network is unnecessary because the group interaction will already have been triggered. The edge that connects the ego with the weighted node will have a frequency weighting equal to the reported frequency.
We model contact tracing as originating from a symptomatic infectious node. Upon developing symptoms, symptomatic cases may be tested. All contacts they met in the past 5 days are traced. The symptomatic originating case and their traced contacts isolate. During isolation it is assumed that individuals contact no one, including household contacts. The only contacts that do not isolate are ones reported in CON-QUEST to be ‘met for the first time this day’, these contacts are considered strangers and unidentifiable.
COVID-19 natural history parameters
We use a compartmental approach for capturing progression from susceptible to infection, to latent infection (not infectious), and infectious (either symptomatic or asymptomatic) [8, 31, 34].
We model the probability of transmission P (T) during an interaction to be dependent on the length of the interaction, t, as the risk of infection has been found to increase the longer a person is exposed to an infected individual [36]. We attempt to capture this time dependence of transmission with a stochastic transmission process that hinges on an adaptation of the fixed infectious period model reported in [19]. We consider asymptomatic individuals to be less infectious by a factor A; the probability of transmission during an interaction, P (T), is then where α is the rate of transmission and t is the duration of interaction in minutes. We used A = 0.35 [7], and values of α that are: 1.6 × 10−4, 3.1 × 10−4, 5.6 × 10−4and 7.9 × 10−4 per minute. These values are based on data from the SCS, where the median number of unique contacts was 9, the median frequency of interaction is 6 days in two weeks and mean duration was 124 minutes. Taken together with an infectious period of 12.2 days, this corresponds to an average number of secondary cases between 1 and 4.
Upon infection, all individuals become latently infected with a mean latent period of 3.3 days [38]. Subsequently, the infectious period is modelled as lasting for 10 days after symptom onset, to reflect the UK track and trace guidelines which state that a person who has developed symptoms of COVID-19 must isolate for 10 days [27].
Infectious individuals are either symptomatic or asymptomatic. Age is a key factor in the probability of symptoms, with younger individuals less likely to report typical COVID-19-like symptoms [24, 35], so we seek to quantify an age-dependent probability of being asymptomatic. Measuring the age dependence of being asymptomatic directly is challenging, therefore we calculate the conditional probability using Bayes’ Theorem, where P (Age | Asymptomatic) is the age distribution of asymptomatic cases, estimated using a meta-analysis [21], P (Asymptomatic) is the mean percentage of asymptomatic cases in infected individuals [29] and P (Age) is the probability of an individual belonging to a given an age group, calculated from data on the UK age demographic [14]. The banding of age groups as well as their conditional probability of being asymptomatic are shown in Table 1.
The Simulation Procedure
Event-based simulations with a time step of one day are used to model the local epidemic. The simulations are initiated with an infected ego and susceptible alters, and run for 30 days. On any given day, two nodes that are connected by an edge have a probability of interacting equal to F/14, where F is the frequency weighting (the expected number of interactions in 14 days) of the connecting edge. The probability of transmission during this interaction is given by equation 1. All newly infected individuals transition through the stages of a COVID-19 infection described above.
Alters in the network develop asymptomatic infections with a probability dependent on their age (Table 1). There are three scenarios for the egos symptomatic status: (1) they are all symptomatic, (2) they are all asymptomatic, they are asymptomatic with a probability dependent of their age (Table 1). Once an individual develops symptoms it is assumed they take a test that same day and receive the results within 3 days. Individuals will not isolate until they receive their result, and it at this point that all identifiable contacts met in the past 5 days also isolate. Isolation is implemented by setting the probability that an individual meets their contacts to zero. Isolation lasts for 10 days for the individual who tests positive and 14 days for their contacts. An individual is considered susceptible once their isolation ends. Hospitalisations and deaths are not included in this model as we are focusing on a primarily young population.
Results
Comparing characteristics between the two surveys
The median number of daily contacts reported in the CON-QUEST survey was 2, compared to 9 from the university-affiliated subset of the SCS used here. No one in the SCS subset reported having zero contacts on the previous day, but this was reported by 18% of CON-QUEST participants. The proportion of respondents with low numbers of daily contacts is greater in the CON-QUEST data than in the SCS data. The proportion of respondents reporting one, two or three daily contacts is an order of magnitude greater in the CON-QUEST survey than in the SCS subset (Figure 1d). The maximum number of daily contacts was greater in the SCS subset (k = 3011) than in CON-QUEST (k = 531) and there was a higher overall proportion of respondents with high numbers of contacts in the SCS subset (the proportion of respondents reporting 12–24 contacts was one order of magnitude greater).
The duration of social contacts reported during 2020 increased compared to contacts reported in 2010 (χ2 = 367.8, df = 2, p 0.001). Respondents in the CON-QUEST survey had longer interactions with their contacts than those in the SCS, with 58% and 43% of respondents having interactions that lasted longer than one hour respectively. A higher percentage of university-affiliated respondents in the SCS reported short interactions lasting of less than 10 minutes (33%) compared to the CON-QUEST survey (24%). In the SCS, nearly 60% of all reported interactions lasted less than one hour, compared to 42% in CON-QUEST (Figure 1a).
In addition to longer duration contacts, social contacts reported in CON-QUEST occurred more frequently than in the SCS (χ2 = 1518, df = 4, p 0.001). A greater proportion of contacts reported to be met ‘8 or more times in 2 weeks’ in the CON-QUEST survey than in the SCS (57.2% and 29.2% respectively). However, there was little difference between the number of contacts that were ‘Met for the first time this day’ (SCS-13.8%, CON-QUEST-12.5%) (Figure 1b).
For respondents with ages 18–24 (n = 637) and 25–44 (n = 647) the majority of their contacts (77% and 58% respectively) were within their own age interval. Respondents aged 45–64 (n = 314) had the greatest proportion of their contacts in their own age group (36%) but not a majority, while respondents 65–80 (n = 16) had similar proportions of contacts aged 25–44, 45–64 and 65–80 (28%, 23%, 25% respectively). However, respondents aged 81+ (n = 1), reported no contacts in their own age category. The greatest proportion of contacts for respondents age 81+ was 0–4 years old.
Secondary COVID-19 cases
To assess how observed behavioural changes have impacted transmission and highlight the importance of contextually accurate data in modelling, the model is run with both university affiliated data from the SCS and the CON-QUEST data. In these simulations it is assumed the ego is symptomatic and there is no contact tracing. As can be seen in Table 2, when the ego-centric networks are constructed from the pre-COVID data recorded in the SCS there is a greater average number of secondary cases for all transmission rates. The average number of secondary cases is at least 1 when the SCS data is used. However, only with the highest transmission rate (fitted to produce a reproductive number of 4 in a simplified pre-COVID social network) does the average number of secondary cases reach 1 when the CON-QUEST data is used.
The effect of contact tracing
The spread of COVID-19 in a university population is found to be dependent on the transmission rate, the infectiousness of the index case, and whether contact tracing is implemented. Figure ?? shows that an increase in transmission rate always results in an increase in the average number of secondary cases. However, it is only in the scenario where all index cases are symptomatic and the transmission rate is 7.9 × 10−4 per minute that the average number of secondary cases exceeds 1 (Figure ??a).
In the absence of contact tracing, symptomatic index cases generate a greater number of secondary cases compared to asymptomatic index cases due to their increased infectiousness. However, the spread of COVID-19 from symptomatic index cases is greatly reduced with the implementation of contact tracing. In comparison, the reduction due to contact tracing for asymptomatic index cases is minimal. When the index case is symptomatic, contact tracing has the potential to reduce the mean number of secondary cases by up to 56%, with the biggest impact for lower transmission rates. When the index case is asymptomatic, contact tracing reduces the mean number of secondary cases by less than 3%. In the asymptomatic case, the greater impact is seen with higher transmission probabilities. When the index case has a probability of being asymptomatic dependent on their age we see the expected behaviour of COVID-19 in a university population, shown in the bottom panel of Figure ??. In this realistic scenario, contact tracing has the potential to reduce the mean number of secondary cases by 26.6%, 25.2%, 22.8% and 20.9% for α1-4 respectively.
The inability for contact tracing to curb asymptomatic transmission can be explained by the low percentage of index cases that are back-traced from their symptomatic contacts. As the transmission rate is increased in the model from α1 to α4 the percentage of index cases isolated also increases, from 4% to 12%, but remains low. This is inline with the increasing relative reduction of average cases seen in the asymptomatic scenario.
Discussion
Analysis comparing the two sets of survey data suggests university populations have become more clustered during the pandemic. The general pattern of the CON-QUEST data is that university populations have few select contacts that they meet frequently for long periods of time. This pattern is in line with expected lockdown behaviour. The increase in clustering within university populations seen in the 2020 data meaningfully reduces the average number of secondary cases produced by an infectious individual. This reduction is seen when model simulations are run with the pre-COVID SCS data and the CON-Q UEST data. The comparison suggests that social restrictions can prevent COVID-19 from becoming endemic in a university for all but the most transmissible variants. However, there may be other explanations for the differences seen between the SCS and CON-QUEST surveys. For example, a survey in 2017/8, which collected information on contact patterns of UK citizens [20], found a real reduction in the number of contacts made by 15-19 year-olds from data collected 2005/6, that could be explained by the digitisation of young people’s social interactions [32]. Therefore, the differences seen in the social contact behaviour in students captured in the 2010 SCS compared to the behaviour captured by CONQUEST in 2020 may not be completely attributable to the COVID-19 pandemic.
When contact tracing is used in combination with social restrictions, they amount to an effective control. For all transmission rates when contact tracing is used the average number of secondary cases is below one. However, the extent to which contact tracing reduces transmission is limited when index cases are asymptomatic, suggesting that asymptomatic transmission cannot be curbed by contact tracing when asymptomatic cases are left undetected. Considering that universities have a young population and younger people have a higher probability of being asymptomatic, contact tracing may not sufficiently reduce transmission without mass testing. With mass testing the expected university behaviour will more closely resemble the scenario where all index cases are symptomatic.
The inability of contact tracing to control asymptomatic transmission partly stems from the fact that young members of the university interact most with other young members. Within these cliques an asymptomatic invective is likely to transmit to a person who will also develop an asymptomatic infection. When transmission is low, the majority of index cases will only transmit to people who develop asymptomatic infections. Hence, contact tracing cannot be relied upon to back-trace and isolate an asymptomatic index case.
These findings are consistent with other modelling approaches in identifying social distancing and mass testing as necessary measures for control in a university. Modelling conducted with anonymised university accommodation data found that asymptomatic cases produce more secondary cases even if they have a lower transmission rate than symptomatic cases [2]. A non-data driven model for COVID-19 transmission on US campuses also found that mass testing is important but does conclude that when prevalence is low, high rates of testing can lead to high proportions of isolated individuals with false-positives [30]. These high proportions of unnecessarily isolated people were generated without the inclusion of contact tracing; only those that were tested were isolated. With the low levels of transmission found in this model, if contacts identified through mass testing isolated then the number of unnecessary isolations would be a justifiable criticism of the control measure.
Although CON-QUEST is a longitudinal survey, we only analyse the first responses from the 23rd June 2020 to 8th February 2021. In this period, there was variation in the social restrictions measures and we do not capture such time varying behaviour here, nor do we consider significant social events such as mass travel for University vacations.
Our model does not account for vaccination or natural immunity and the results are representative of early stages in the pandemic, which is true also for the contact data used in constructing the networks. Other models have found that vaccination can reduce COVID-19 incidence in a population to near zero [25], even without other mitigating controls [4]. Due to the low levels of transmission observed in the simulations, we would expect few reinfections in the majority of networks. However, due to heterogeneity in the networks, where multiple transmissions do occur it is likely they will be to a select few individuals and here reinfection is a possibility. Consequently, the inclusion of natural immunity would likely lead to a reduction in the average number of secondary cases.
Further limitations of the model are that alters are not connected, and the assumption is made that social networks are isolated from each other, where in reality population level connections do exist. More extensive data collection such as that done by the SCS, where respondents reported which of their contacts had met each other, would provide information on transitive links. In the absence of these data, algorithms such as preferential attachment, an observed social phenomenon [5, 26], could be used. With regards to population-level networks, more targeted data collection could be conducted on schools or departments within universities [6, 9, 33], collecting information on whole cohort interactions. Data collection on this scale can be expensive and at times impractical, so synthetic social networks can be construct via algorithms. Previous research [16] has modelled COVID-19 transmission assuming a scale-free degree distribution exists at a population level, and investigated how different population-level social restrictions can reduce the critical levels for herd immunity. However, both the SCS and CON-QUEST data suggest that the degree distribution of a university population is power-law distributed only in the tail. Better population-level networks could be constructed through data-driven algorithms, such as exponential random graph models.
Conclusion
In this paper we compare the results of two social surveys conducted before and during the pandemic, by looking at the frequency, duration and number of contacts that were reported. From the data we construct ego-centric networks to model COVID-19 transmission from an index cases to their contacts. We then introduce contact tracing and investigate three scenarios where the index case is either, always symptomatic, always asymptomatic, or asymptomatic with a probability dependent on their age.
During the period of active social restrictions the contact patterns of university populations changed, people generally met fewer contacts for longer periods and with a greater frequency. This behavioural change reduces the average number of secondary infections. Further reduction can be achieved with contact tracing, when cases are identifiable. In a university population where there is expected to be a large asymptomatic population, mass testing will be required to identify asymptomatic individuals and exercise contact tracing to its full potential. However, our model suggests that under social restrictions population level out-breaks are unlikely in a university.
Data Availability
Data use in the present study are available request at: http://data.bris.ac.uk/data/dataset/2jxe5mx7gzbku2dekvlmbcwwhk or an anonymised version is available from: https://github.com/DStocks42/Uni-Contact-Tracing-Code. Other data used is openly accessible from: http://wrap.warwick.ac.uk/54273/
http://wrap.warwick.ac.uk/54273/
http://data.bris.ac.uk/data/dataset/2jxe5mx7gzbku2dekvlmbcwwhk
Data availability
The CONQUEST data used in this study is available from Nixon, E. J. (Creator), Trickey, A. J. W. (Creator) & Brooks Pollock, E. (Creator), University of Bristol, 18 May 2021
DOI: 10.5523/bris.2jxe5mx7gzbku2dekvlmbcwwhk, http://data.bris.ac.uk/data/dataset/2jxe5mx7gzbku2dekvlmbcwwhk
The SCS data used in this study is available from Danon, L. (Creator), Read, J. M. (Creator), Keeling, M. J. (Creator), House, T. A. (Creator) and Vernon, M. C. (Creator), University of Warwick, 2009. http://wrap.warwick.ac.uk/54273/
Acknowledgements
EBP would like to acknowledge support from the National Institute for Health Research (NIHR) Health Protection Research Unit (HPRU) in Behavioural Science and Evaluation at the University of Bristol. The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care. EBP, EN and DS are supported by UKRI through the JUNIPER consortium (Grant Number MR/V038613/1). EBP is further supported by MRC (Grant Number MC/PC/19067). AT is funded by the Wellcome Trust through his Sir Henry Wellcome Fellowship.