1 Abstract
Aim The objective of this observational study was to investigate the association between SARS-CoV-2 transmission risk, RT-PCR Cycle threshold (Ct) values, and age of infected cases in Danish households.
Background The Covid-19 pandemic is one of the most serious global public health threats in recent times. Understanding transmission of SARS-CoV-2 is of utmost importance to be able to respond to outbreaks and take action against the spread of the disease. Viral load is generally thought to correlate with transmission risk.
Methods We used comprehensive administrative register data from Denmark, comprising the full population and all SARS-CoV-2 tests (August 25, 2020 to February 10, 2021), to estimate household transmission risk.
Results We found that the transmission risk was negatively and approximately linearly associated with the Ct values of the tested primary cases. Also, we found that even for relatively high Ct values, the risk of transmission was not negligible; e.g., for primary cases with a Ct value of 38, we found a transmission risk of 8%. This implies that there is no obvious cut-off for Ct values for risk of transmission. We estimated the transmission risk according to age and found that it increased almost linearly with the age of the primary cases for adults (≥20 years) and decreased with age for children (<20 years). Age had a higher impact than Ct value on the risk of transmission.
Conclusions Lower Ct values (indicating higher viral load) are associated with higher risk of SARS-CoV-2 transmission. However, even at high Ct values, transmission occurs. In addition, we found a strong association between age and transmission risk, and this dominated the Ct value association.
2 Introduction
The world is in the midst of a pandemic caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). It is essential to understand the SARS-CoV-2 transmission dynamics and associated factors in order to design interventions to control the spread of the disease, such as contact tracing efforts.
Real-time reverse transcription polymerase chain reaction (RT-PCR) is used worldwide to detect SARS-CoV-2 in swabs from nasopharynx and oropharynx (Corman et al., 2020). These tests are useful for fast and reliable detection of infected individuals who should then isolate themselves to prevent further spread of the disease (Cheng et al., 2020). As for other viruses, detection of SARS-CoV-2 by RT-PCR depends on the viral load, which is inversely correlated with the Cycle threshold (Ct) value of a test (Singanayagam et al., 2020). The viral load changes during the course of disease and has been shown to be highest around the time of symptom onset (He et al., 2020).
Recently, an epidemiological study of 282 primary cases in Spain showed that higher viral load was associated with increased transmission risk of SARS-CoV-2 (Marks et al., 2021). The results were confirmed in a study of 219,722 primary cases in the UK (Lee et al., 2021). Thus, it has been argued that contact tracing should prioritize cases with low Ct values (high viral load), as they have a higher risk of generating secondary cases.
The aim of this observational study was to investigate the association between the transmission risk of SARS-CoV-2, RT-PCR Ct values, and age of infected cases in Danish households. In Denmark, all residents have access to tax-paid universal health insurance, and a test for SARS-CoV-2 is free of charge. The testing capacity is high, and testing is widespread. Denmark also has comprehensive social insurance, and SARS-CoV-2 sick leave is fully reimbursed by the state. Thus, neither financial reasons nor access to tests were a major obstacle to obtaining a test during our study.
3 Data and Methods
During the study period, RT-PCR tests for SARS-CoV-2 could be obtained from either community testing facilities at TestCenter Denmark (TCDK) or from hospitals. Statens Serum Institut (SSI) analyses all tests from TCDK. Information on Ct values was only available for samples that tested positive at TCDK.
3.1 Register Data
In this study we used Danish administrative register data comprising the full population. All residents in Denmark have a unique personal identification number that allows a complete linkage of information across different registers at the individual level. All laboratory results from departments of clinical microbiology in Denmark are registered in the Danish Microbiology Database (MiBA), from which we obtained individual level data on all national tests for SARS-CoV-2 for the period August 25, 2020 (which was the first date with accessible Ct values) to February 10, 2021. Primary cases were only included until January 25, 2021 in order to allow for secondary cases to present within the following 14 days. Moreover, we only included primary cases identified by TCDK, as we only had Ct values on those case samples. For potential secondary cases, we included all tests, regardless of whether they were taken at TCDK or at hospitals.
Information on the reason for being tested (e.g., symptoms, potential contact with infected persons etc.) was not available. From the Danish Civil Registry System, we obtained information about the sex, age, and home address for all individuals living in Denmark. People who tested positive for SARS-CoV-2 using antigen tests were not included, as these test results were not transferred to MiBA at the time of the study.
3.2 Data Linkage
We constructed households by linking all individuals living at the same address, and we only considered households with two to six members, in order to exclude, e.g., institutions. Thus, six single apartments in the same building counted as six independent households. Person-level data, which included information on the test results, dates and times of sampling as well as the times the results were available, were linked to individuals within households. For each household, we identified the first positive test for SARS-CoV-2, called the primary case throughout this paper. We considered all subsequent tests from other members in the same household as tests taken in response to the primary case. We defined secondary cases as those who had a positive test with sampling dates 1 to 14 days after the primary case tested positive. Consequently, we defined cases that tested positive on the same day as the primary case as co-primary cases, thereby excluding them as potential secondary cases. For a thorough discussion on co-primary cases, see Lyngse et al. (2020). In addition, we assumed that all identified secondary cases were infected by the primary case within the same household, because all other household members should isolate themselves from society once a primary case within a household is confirmed. Thus, the primary case is the most likely source of infection for any additional household members.
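The linkage rules above can be illustrated with a minimal Python sketch. It is not the authors' code: the `Person` record, its field names, and the role labels are hypothetical, and the sketch does not attempt to say which of several same-day positives is the primary case, since the text only states that same-day positives are co-primary.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Person:
    person_id: str                         # hypothetical identifier
    address_id: str                        # hypothetical key for the home address
    first_positive: Optional[date] = None  # sampling date of first positive test


def classify_households(people, min_size=2, max_size=6, window_days=14):
    """Return {address_id: {person_id: role}} for households of 2-6 members."""
    households = defaultdict(list)
    for person in people:
        households[person.address_id].append(person)

    roles = {}
    for address, members in households.items():
        if not (min_size <= len(members) <= max_size):
            continue  # drops single-person homes and, e.g., institutions
        positives = [m.first_positive for m in members if m.first_positive is not None]
        if not positives:
            continue  # no primary case in this household
        index_date = min(positives)  # date of the first positive test
        household_roles = {}
        for m in members:
            if m.first_positive is None:
                household_roles[m.person_id] = "not_positive"
                continue
            lag = (m.first_positive - index_date).days
            if lag == 0:
                household_roles[m.person_id] = "primary_or_co_primary"
            elif 1 <= lag <= window_days:
                household_roles[m.person_id] = "secondary"
            else:
                household_roles[m.person_id] = "outside_window"
        roles[address] = household_roles
    return roles
```

In this sketch, a household member whose first positive sample is taken 19 days after the first positive test in the household is labelled `outside_window` and would not count as a secondary case.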
3.3 Laboratory Analyses
Tests at TCDK are analysed using a set of primers that target the E-gene of SARS-CoV-2 (Corman et al., 2020), which is recommended by the WHO and the ECDC, and has a high sensitivity and specificity (Vogels et al., 2020). TCDK has used the same methodology and primers throughout the epidemic, making the Ct values comparable across the study period. An RT-PCR test for SARS-CoV-2 is defined as positive if the Ct value is ≤38 (SSI, 2021).
3.4 Statistical Analyses
The transmission risk of the primary case ($p$) is the outcome variable $y_p$, defined as the proportion of potential secondary cases that tested positive.
To estimate the association between Ct values and transmission risk, we estimated the non-parametric regression equation:

$$y_p = \beta \, Ct_{p,1} + \varepsilon_p$$

where $Ct_{p,1}$ is the Ct value (rounded to the nearest integer) of the primary case. $\beta$ measures the transmission risk for each Ct value. $\varepsilon_p$ denotes the error term, clustered on the household (event) level.
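As a rough illustration of this step, the sketch below computes the transmission risk of each primary case and averages it within each integer Ct value. A regression of the outcome on a full set of Ct indicators without an intercept returns exactly these group means, so the sketch reproduces the point estimates but not the clustered standard errors. The function names, the input layout, and the half-up rounding rule are assumptions made for illustration, not details taken from the study.

```python
import math
from collections import defaultdict


def transmission_risk(n_secondary, n_potential):
    """Share of potential secondary cases that tested positive."""
    if n_potential <= 0:
        raise ValueError("a primary case needs at least one potential secondary case")
    return n_secondary / n_potential


def mean_risk_by_ct(primary_cases):
    """primary_cases: iterable of (ct_value, n_secondary, n_potential).

    Returns {integer Ct value: unweighted mean transmission risk}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ct, n_secondary, n_potential in primary_cases:
        ct_bin = math.floor(ct + 0.5)  # assumption: ties are rounded upwards
        sums[ct_bin] += transmission_risk(n_secondary, n_potential)
        counts[ct_bin] += 1
    return {ct_bin: sums[ct_bin] / counts[ct_bin] for ct_bin in sorted(sums)}
```

The mean is unweighted across primary cases, which matches an outcome defined per primary case; weighting by household size would be a different estimator.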
To estimate the association between age and transmission risk, stratified by the median Ct value (28), we estimated the non-parametric regression equation:

$$y_p = \beta \, Age_{p,5} + \varepsilon_p$$

where $Age_{p,5}$ is the age (in five-year groups) of the primary case. $\beta$ measures the transmission risk for each five-year age group of the primary cases. $\varepsilon_p$ denotes the error term, clustered on the household (event) level.
To estimate the association between Ct value, age, and transmission risk, we estimated the non-parametric regression equation:

$$y_p = \beta \, (Ct_{p,2} \times Age_{p,10}) + \alpha \, Age_s + \varepsilon_p$$

where $Ct_{p,2}$ is the Ct value (in bi-value groups) and $Age_{p,10}$ is the age (in ten-year groups) of the primary case. $\beta$ measures the transmission risk of the interaction between Ct value and age of the primary case. $Age_s$ is the age of the potential secondary cases ($s$) and $\alpha$ measures the linear association with age of the potential secondary cases (Lyngse et al., 2020). $\varepsilon_p$ denotes the error term, clustered on the household (event) level.
To quantify the increased transmission risk across different observable characteristics, we estimated a univariable and a multivariable logistic regression. In particular, to estimate the odds ratio, we estimated the logistic regression equation:

$$\log\left(\frac{y_p}{1 - y_p}\right) = \beta \, Ct_{p,2} + \gamma \, Age_{p,5} + \phi \, Sex_p + \delta \, HouseholdSize_p + \varepsilon_p$$

where $\beta$ measures the non-parametric association with Ct values, $\gamma$ measures the non-parametric association with age of the primary case, $\phi$ measures the association with sex, and $\delta$ measures the association with the size of the household. $\varepsilon_p$ denotes the error term, clustered on the household (event) level.
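For reading the logistic estimates as odds ratios, the standard relation for a logit model with group indicators may be helpful. The subscripts $c$ and $r$ are notation introduced here for illustration: $\beta_c$ is the coefficient on the indicator for Ct group $c$ and $\beta_r$ the coefficient for a reference group $r$, with all other covariates held fixed.

```latex
\mathrm{OR}_{c \text{ vs } r}
  = \frac{\mathrm{odds}(\text{transmission} \mid Ct \in c)}
         {\mathrm{odds}(\text{transmission} \mid Ct \in r)}
  = \exp(\beta_c - \beta_r)
```

When the reference group's coefficient is normalised to zero, the odds ratio reduces to $\exp(\beta_c)$. An odds ratio is close to a ratio of risks only when the underlying risks are small.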
3.4.1 Sensitivity analyses
To investigate whether age had an impact on the result, we estimated the association between Ct value and transmission risk, controlling for age of the primary case. Furthermore, we estimated the age structured transmission risk stratified by sex to see whether there were different patterns across men and women. We then estimated the age structured transmission risk stratified by Ct value quartiles to see whether the pattern was independent of the median Ct value cut-off.
As Ct values were only available for the primary cases that were identified by being tested in TCDK, only these primary cases were included in the analyses. To address the potential bias from not including primary cases that were identified at hospitals, we performed sensitivity analyses by estimating transmission risk stratified by TCDK and hospitals.
We also estimated the transmission risk of the interaction between Ct value and age of the primary case without controlling for age of the potential secondary case to see whether our results were driven by the age of the potential secondary cases. Because people normally live with a partner around their own age and parents with their children, susceptibility correlation with age could drive an age structured transmission risk. To address this potential bias, we stratified our sample by Ct value quartiles and estimated the transmission risk with the interaction of the age of the primary and potential secondary case.
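The stratification by Ct value quartiles used in these sensitivity analyses could be sketched as follows. The default cut points are the quartile boundaries reported in the descriptive statistics (Ct ≤25, ≤28, ≤32). The function names, the record layout, and the treatment of boundary values are assumptions for illustration only.

```python
from bisect import bisect_left
from collections import defaultdict

# Quartile boundaries of the primary-case Ct distribution reported in the paper
REPORTED_CT_QUARTILE_CUTS = (25, 28, 32)


def ct_quartile(ct, cuts=REPORTED_CT_QUARTILE_CUTS):
    """Return quartile group 1-4; a value equal to a cut point falls in the lower group."""
    return bisect_left(cuts, ct) + 1


def mean_risk_by_quartile_and_age(records, age_width=5):
    """records: iterable of (ct_value, age_in_years, transmission_risk).

    Returns {(ct_quartile, lower bound of age group): mean transmission risk}.
    """
    cells = defaultdict(list)
    for ct, age, risk in records:
        age_lower = (age // age_width) * age_width
        cells[(ct_quartile(ct), age_lower)].append(risk)
    return {key: sum(values) / len(values) for key, values in sorted(cells.items())}
```

Comparing the age profile of the mean risk across the four quartile groups is one way to check whether the age pattern depends on where the Ct distribution is split.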
3.5 Ethical statement
This study was conducted on administrative register data. According to Danish law, ethics approval is not needed for such research. All data management and analyses were carried out on the Danish Health Data Authority’s restricted research servers with project number FSEID-00004942. The project was notified to the Danish Data Protection Agency.
4 Results
4.1 Descriptive Statistics
From August 25, 2020 to February 10, 2021, TCDK analyzed 9,416,298 samples for SARS-CoV-2 (74% of all tests in Denmark) and identified 66,602 primary cases up to January 25 (73% of all primary cases) (Table S1). Of these primary cases, a Ct value was available for 99.6%.
In this study, we had 66,311 primary cases living with 213,576 potential secondary cases, of which 103,389 (48%) tested positive for SARS-CoV-2 and were thus actual secondary cases (Table S2). Approximately half were men and half women. 25% of primary cases had a Ct value ≤25, 50% had a Ct value ≤28, and 75% had a Ct value ≤32 (Figure S1). The distribution of Ct values was relatively similar across age groups, suggesting that differences in test strategy across age were not driving our results (Figure S2 and Table S4).
4.2 Associations with Transmission Risk
Figure 1 shows the association between Ct values and transmission risk. There is an approximately linear decreasing relationship between Ct values and transmission risk.
Figure 2 shows the cumulative distribution of secondary cases by Ct values; e.g., primary cases with Ct values ≥30 account for 39% of secondary cases.
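A cumulative distribution of this kind could be computed along the following lines. This is a sketch under assumed inputs, not the authors' procedure: each record pairs the Ct value of a primary case with the number of secondary cases attributed to it, and both function names are hypothetical.

```python
from collections import Counter


def cumulative_share_by_ct(records):
    """records: list of (ct_of_primary_case, n_secondary_cases).

    Returns [(ct, share of all secondary cases whose primary case has Ct <= ct)].
    """
    totals = Counter()
    for ct, n_secondary in records:
        totals[ct] += n_secondary
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no secondary cases in the input")
    running = 0
    curve = []
    for ct in sorted(totals):
        running += totals[ct]
        curve.append((ct, running / grand_total))
    return curve


def share_at_or_above(records, threshold):
    """Share of secondary cases whose primary case has Ct >= threshold."""
    grand_total = sum(n for _, n in records)
    if grand_total == 0:
        raise ValueError("no secondary cases in the input")
    return sum(n for ct, n in records if ct >= threshold) / grand_total
```

Calling `share_at_or_above(records, 30)` on the study data would correspond to the share of secondary cases quoted for primary cases with Ct values ≥30.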
Figure 3 shows the age structured transmission risk stratified by the median Ct value. There is an overall positive association between age and transmission risk for adults (≥20 years) and a negative association for children (<20 years), i.e., the transmission risk increased with younger age for children. Across all age groups, we found that primary cases with a lower Ct value (<28, red) had a significantly higher transmission risk compared to primary cases with a higher Ct value (≥28, blue).
Figure 4 shows the association between age, Ct value and transmission risk. The transmission risk generally increases with higher age and lower Ct value.
Table 1 provides univariable and multivariable regression estimates of the odds ratio for the transmission risk across different observational characteristics. The transmission risk increases with a lower Ct value, i.e., with a higher viral load. For example, a primary case with a Ct value of 18-20 has a transmission risk 1.89 times higher than a primary case with a Ct value of 36-38. The transmission risk increases with age for adults, such that a primary case aged 75-80 years has a 1.98 times greater transmission risk than a primary case aged 30-35 years.
4.3 Sensitivity analyses
We found an approximately linear negative association between Ct value and transmission risk that was also present when controlling for age of the primary case (Figure S3). We found similar transmission risks for men and women (Figure S4). The age structured transmission risk stratified by Ct value quartiles showed the same picture as when stratifying by the median Ct value (Figure S5).
We also found that primary cases tested at hospital test facilities had a slightly higher transmission risk across five-year age groups (Figure S6).
When excluding the age of the potential secondary case as a control variable, we found a slightly reduced association between the transmission risk and the interaction between the Ct value and age of the primary case (Figure S7). Furthermore, when stratifying by Ct value quartiles and estimating the transmission risk by the interaction of the age of the primary case and the age of potential secondary cases, we still found the same overall pattern (Figure S8).
5 Discussion and Conclusion
To our knowledge, this is the first nationwide study investigating the association between transmission risk, age and Ct values. We here exploited the detailed Danish register data comprising the full population and all RT-PCR tests for SARS-CoV-2.
We found an approximately linear association between Ct value and transmission risk, implying that cases with a higher viral load are more infectious than cases with a lower viral load (Figure 1). This association was expected, because a high viral load implies that there are more virus particles in the sample, and hence the possibility of a larger inoculum.
However, this result also highlights that there is no obvious cut-off of Ct values to eliminate transmission risk. Importantly, we found a considerable transmission risk for cases with high Ct values: Primary cases with a Ct value of 38 had a transmission risk of 8% within the household. This result contradicts previous studies that have argued that cases with a Ct value above a certain cut-off are not contagious. A Ct value cut-off of 30, or even lower, has been suggested (La Scola et al., 2020; Bullard et al., 2020; Brown et al., 2020; Prince-Guerra et al., 2021). These Ct values correspond to the limit of detection of virus cultures in Vero cells and antigen tests. However, choosing a cut-off of Ct ≤30 for infectiousness instead of Ct ≤38 would have missed transmission to 39% of the secondary cases in this study of household transmission (Figure 2, Table S3).
Rapid antigen tests are proposed for fast detection of cases as they provide a test result on the spot. Due to the lower sensitivity, they can only identify cases with a relatively high viral load, which implies that they can find the most infectious cases and allow for fast contact tracing. However, as they do not detect cases with a Ct value >30 (Jaafar et al., 2020; Bullard et al., 2020), they may miss a considerable share of infectious cases that would be detected with an RT-PCR test—42% of the primary cases in this study. This example assumes that only one test is performed—at least for a period of time. Nevertheless, rapid antigen tests could potentially increase detection of cases with repeated testing, conditional on the cases developing a viral load exceeding the detection threshold.
During an infection, the viral load is low shortly after exposure, increases over the infection, peaks around the onset of symptoms, and decreases later on (He et al., 2020). Hence, the Ct value is sensitive to the timing of the test, e.g., when a case is pre-symptomatic compared to later being symptomatic. Furthermore, there is a large variation in both severity of symptoms and infectivity across persons. Some studies found an indication of lower viral loads in asymptomatic persons (Zhou et al., 2020), while others found no differences across symptomatic and asymptomatic cases (Long et al., 2020). Additionally, the test results depend on the quality of the sampling as well as the assay, i.e., the chosen primers, probe and other reagents, which determines the accuracy of the test, making comparisons across laboratories difficult. In this study, the same methodology, set of primers, and probe were used throughout the whole study period, making the Ct values comparable. Despite the variations mentioned above, we found an association between Ct values and transmission risk, emphasizing the importance of this association.
Two other studies have also investigated the association between SARS-CoV-2 transmission risk and Ct values (Marks et al., 2021; Lee et al., 2021). Both also found a negative association. Lee et al. (2021) estimated the odds ratio of transmission to be 0.93 (95%-CI: 0.92-0.93), which is close to our estimate of 0.97 (95%-CI: 0.97-0.97) (Table S5). Our estimates here benefit from a large sample size and objective selection of potential secondary cases. We included all household members as potential secondary cases (regardless of whether they were contacted by the official contact tracing system), of which 88% were tested within 1-14 days of the primary case. 48% of the potential secondary cases tested positive, indicating a high degree of transmission in the household domain. This is in contrast to Lee et al. (2021), who found that 40% of contacts were tested and 6% tested positive within 2-7 days.
Another main result from this study is that age was strongly associated with transmission risk, even when controlling for viral load (Figure 3 and S5). This result was not driven by the age of the potential secondary cases, as we found roughly the same pattern when adjusting for this (Figure S7). We found an overall positive association between age and transmission risk for adults (≥20 years), whereas the transmission risk decreased with age for children (<20 years) (Figure 3). This pattern was found independently of the Ct value of the primary case, but overall the transmission risk increased with lower Ct values (Figure S8).
It is noteworthy that age dominated viral load in predicting transmission risk. For instance, the transmission risk doubled when the Ct value decreased from 36-38 to 18-20; similarly, the risk doubled when the age of the primary case increased from 20-25 years to 65-70 years, and tripled when the primary case was 80-85 years old (Table 1). This pattern could be driven by the susceptibility of the potential secondary cases, as people tend to live with their partner, who is around their own age, and parents live with their children (Lyngse et al., 2020; Madewell et al., 2020). We investigated this and did find that primary cases tend to infect other persons around the same age within the household (Figure S8). However, when we controlled for age of the potential secondary cases, we still found an association between age and transmission risk (Figure 4).
Possible explanations for this finding could be that age is associated with increased viral exhalation (Edwards et al., 2021) or that the immunological response is associated with age (Long et al., 2020). The findings in children could be associated with younger children having closer contact with their parents than adolescents. Further clinical research is needed to clarify this.
Heald-Sargent et al. (2020) studied the viral load in 46 cases aged 0-5 years, 51 cases aged 5-17 years, and 48 cases aged 18-65 years. They found indications of high viral loads in children and speculated that they could be a main driver of the epidemic. Our results contradict this, as we did not find any association between the distribution of Ct values and age (Table S4). To calculate the distribution of Ct values across age groups, it is necessary to include a relatively large sample comprising all age groups.
As Ct values were only available for the primary cases identified by being tested in TCDK, only these primary cases were included in the analyses. We addressed the potential bias from not including primary cases that had been identified at hospitals (27%) by estimating the transmission risk stratified by primary cases identified at TCDK and hospitals (Figure S6). Primary cases identified in hospitals generally had a higher transmission risk, possibly because they were symptomatic. However, the trend in the age structured transmission risk was approximately the same across TCDK and hospitals, indicating that sample selection did not affect the general results.
We defined the primary case as the first positive test within a household and all other persons living in the same household as potential secondary cases. We defined secondary cases as those testing positive 1-14 days after the primary case. However, some of these co-primary and secondary cases may be misclassified, e.g., if they were infected earlier but not diagnosed, because they were pre- or asymptomatic. Including secondary cases found >14 days after the primary case could result in misclassification, as these could be tertiary cases or cases infected from a source outside the household.
Optimal contact tracing naturally has to prioritize the order of new cases and their contacts. Our results suggest that contact tracing should prioritize cases according to Ct values, but more so, according to age.
In conclusion, we found that lower Ct values (indicating higher viral load) are associated with higher risk of SARS-CoV-2 transmission. However, even at high Ct values, transmission occurs. In addition, we found a strong association between age and transmission risk that dominated the Ct value association.
Data Availability
Data access can be granted through application to the Danish Health Data Authority.
6 Appendix A: Summary statistics
7 Appendix B: Sensitivity Analyses
Footnotes
* We thank Statens Serum Institut and The Danish Health Data Authority for data access and helpful institutional knowledge. We also thank the rest of the Expert Group for Mathematical Modelling of COVID-19 at Statens Serum Institut.