Abstract
Background Recent research points towards age- and sex-specific transmission of COVID-19 infections and their outcomes. The effect of sex, however, has been overlooked in past modelling approaches of COVID-19 infections.
Aim The aim of our study is to develop an age- and sex-specific model of COVID-19 transmission and to explore how changes in contacts affect COVID-19 infection and death rates.
Method We consider a compartment model to establish forecasts of the COVID-19 epidemic, in which the compartments are subdivided into different age groups and genders. Estimated contact patterns, based on other studies, are incorporated to account for age- and sex-specific social behaviour. The model is fitted to real data and used for assessing hypothetical scenarios with regard to lockdown measures.
Results Under current mitigation measures as of mid-August, active COVID-19 cases will double by the end of October 2020. Infection rates will be highest among the young and working ages, but will also rise among the old. Sex ratios reveal higher infection risks among women than men at working ages; the opposite holds true at old age. Death rates in all age groups are twice as high among men as women. Small changes in contact rates at working and young ages may have a considerable effect on infections and mortality at old age, with elderly men being always at higher risk of infection and mortality.
Discussion Our results underline the high importance of non-pharmaceutical mitigation measures in the current phase of the pandemic in preventing an increase in contact rates from leading to higher mortality among the elderly. Gender differences in contact rates, in addition to biological mechanisms related to the immune system, may contribute to sex-specific infection rates and their mortality outcome. To further explore possible pathways, more data on COVID-19 transmission that include socio-demographic information are needed.
1 Introduction
Right from the start of the COVID-19 pandemic, the importance of age on COVID-19 contraction and fatality has been recognised (among others, Esteve et al. (2020), Dudel et al. (2020), Kulu and Dorey (2020), Wu and McGoogan (2020), Karagiannidis et al. (2020)), as well as of coresidence patterns (Esteve et al. (2020)). Compartment and agent-based models aiming at projecting the spread of the disease have incorporated age as an important variable of transmission (e.g. Davies et al. (2020), Deforche (2020), Colombo et al. (2020), Blyuss and Kyrychko (2020), Balabdaoui and Mohr (2020)), in addition to other characteristics such as space (Colombo et al. (2020)) or contact patterns (Zhang et al. (2020)). An important determinant, which appeared to be largely overlooked in modelling exercises, is sex. In the following, we will refer to sex when discussing technical details and biological factors, and gender, when referring to social factors. While studies generally notice that infection and in particular fatality rates were higher among elderly men than women, the reverse appears to be true for infections at working ages (Sobotka et al. (2020)). In Germany, during the first wave of the pandemic through mid-May, infection rates were higher among women than men at working ages (Figure 1), while they were higher for men thereafter.
One reason for this difference, in addition to biological factors (see discussion below), may lie in gender-specific contact rates. Estimates of contact rates (van de Kassteele et al. (2017)) based on the POLYMOD study (Mossong et al. (2008)) showed that household, workplace and school structures strongly shape the age- and gender-specific contacts made by individuals. Using the contact matrices from the latter study and calculating the ratio of the age-specific number of contacts for men and women (contacts men/contacts women), a clear pattern emerges (Figure 2): at ages 20–39, contacts are 13%–26% higher among women, while at ages 50–69, they are 9%–14% higher among men. At the highest ages, the pattern reverses again, with women having slightly more contacts.
The aim of our study is to model COVID-19 transmission taking into account the two crucial demographic factors age and sex. We develop an SEIRD model that incorporates age- and sex-specific contacts, which shape transmission rates. The model may be used for short- and long-term projections; our example explores the short-term effects, over up to two and a half months, of hypothetical changes in contact rates. The model can be used to develop scenarios which address the effects of age- and gender-specific changes in contacts due to the closing of schools, kindergartens and shops, or working from home, as well as to explore the effect of lifting these measures. While we are not able to address these effects separately, we translate them into hypothetical changes in age- and sex-specific contact rates by developing three scenarios. The first scenario reflects a continuation of the situation of mid-August 2020; the second assumes a lifting of measures mainly at working ages, and the third extends this to children, adolescents, and young adults. The manuscript is structured as follows: First, we introduce the basic SEIRD model and discuss how age- and sex-specific contact modelling was incorporated. We present the numerical implementation of the model, model fitting and the development of uncertainty intervals. Then we introduce our scenarios and present the projection results in terms of the number of active infections (prevalence) and the cumulative number of deaths by 31 October 2020. We also explore how increasing contacts affect sex ratios in infections and deaths. We close with a discussion of the results, the strengths and limitations of our model, as well as policy implications.
2 Methods
The core of the epidemiological model is an SEIRD compartment model (see Hethcote (2000)) consisting of the epidemiological states S (susceptible, i.e. not yet exposed to the virus), E (exposed, but not infectious), I (infectious), R (recovered), and D (dead). The compartments represent individual states with respect to contagious diseases, i.e. COVID-19 in this case, and the transitions between them are considered on a population level (see Figure 3). In this sense, the compartment model is used to describe a population process, but is not intended to model individual processes with respect to COVID-19.
The following essential rate and fraction parameters are involved in the model:
β (contact rate): the average number of individual contacts per specified timespan that are potentially sufficient to transmit the virus (see below for detailed specification)
ρ (manifestation index, fraction): the fraction of people who become infectious at some time after being exposed to the virus
θ (incubation rate): the mean rate of exposed people to become infectious; 1/θ is the average incubation time
γ (recovery rate): the mean rate of exiting the infectious state, either to recovery or death; 1/γ is the average duration of the disease
τ (infection fatality rate): the fraction of people who die due to COVID-19
2.1 Contact modeling
The contact model is considered for a population of N individuals, which is decomposed into A disjoint groups. For each group a = 1, …, A, the proportion of individuals with regard to the whole population is Na/N, where Na denotes the number of individuals in group a. For any a ∈ {1, .., A} and b ∈ {1, …, A}, let λab be the average number of contacts of an arbitrary individual from group a with individuals in group b during a fixed base time unit δ, e.g. 24 hours.
More specifically, define ηab(t1, t2) as the random number of contacts of an individual in group a with any individual from group b over the timespan [t1, t2], and $\eta_{a*}(t_1, t_2) = \sum_{b=1}^{A} \eta_{ab}(t_1, t_2)$ as the (random) overall number of contacts of an individual from group a. It is assumed that ηab(t1, t2) is Poisson distributed as $\eta_{ab}(t_1, t_2) \sim \mathrm{Poi}\left(\int_{t_1}^{t_2} \mu_{ab}(t)\,dt\right)$ via the contact intensity µab(t). By assuming independence of contacts to different groups, it follows that ηa*(t1, t2) is also Poisson distributed, having intensity $\mu_{a*}(t) = \sum_{b=1}^{A} \mu_{ab}(t)$. The average rate of contact of any individual from group a with group b is then obtained as $\lambda_{ab} = \int_{0}^{\delta} \mu_{ab}(t)\,dt$, where for the sake of simplicity we assume that µab(t) is periodic in the sense that µab(t + δ) = µab(t) for all t ⩾ 0. Deviations from these assumptions can be incorporated by appropriate modifications to the contact model and parameter set. In the compartment modeling approach, individuals within each group are generally assumed to be homogeneous with respect to contact behaviour and no individual effects are considered.
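As a minimal illustration of how a contact intensity translates into an average contact number, the sketch below integrates a δ-periodic intensity over one base time unit; for a Poisson contact process, the expected number of contacts over a timespan is the integral of the intensity over that timespan. The intensity function `mu_ab`, its numerical values and the helper `expected_contacts` are hypothetical and serve illustration only.

```python
import math

DELTA = 1.0  # base time unit delta, here one day (assumption)

def mu_ab(t):
    """Hypothetical delta-periodic contact intensity (contacts per day)."""
    return 8.0 + 6.0 * math.sin(2.0 * math.pi * t / DELTA)

def expected_contacts(mu, t1, t2, steps=10_000):
    """Expected number of contacts over [t1, t2]: integral of mu (midpoint rule)."""
    width = (t2 - t1) / steps
    return sum(mu(t1 + (k + 0.5) * width) for k in range(steps)) * width

# Average number of contacts during one base time unit
lambda_ab = expected_contacts(mu_ab, 0.0, DELTA)
# Because mu_ab is delta-periodic, any later base time unit gives the same value
lambda_ab_shifted = expected_contacts(mu_ab, 3.0 * DELTA, 4.0 * DELTA)
```

Under the periodicity assumption, the choice of the starting point of the base time unit does not matter, which is why a single matrix of average contact numbers suffices in the compartment model.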
2.2 Group-specific system of ODEs
In order to address the potential impact of the implementation and easing of lockdown measures, we expand the model structure to group-specific compartments. Below, we define groups according to sex and age group, but the following reasoning is valid for any specification of disjoint groups, given that the resulting groups are sufficiently large. Specifically, for given groups a = 1, …, A and any time t, set Sa(t) as the number of susceptible people in group a at time t, Ea(t) as the number of exposed people in group a at time t, and so on. The group-specific compartment model is characterised by the ODE system for all groups a = 1, …, A, which is a direct extension of the ODE system of the basic compartment model for the special case A = 1. We define $\beta_{ab} = w\,(1 - m_{ab})(1 - r)(1 - h_b)\,\lambda_{ab}$ as the effective contact rate between groups a and b, where w is the secondary attack rate, mab is the specific mitigation effect of lockdown measures with regard to contacts between groups a and b, r is a general factor that accounts for compliance with distancing, isolation and quarantine orders, hb is the proportion of infectious people in group b in need of hospitalisation and λab is the basic contact rate between groups a and b when no lockdown measures are in place. As we are primarily interested in short-term prediction, we do not model biological aging, i.e. transitions between demographic groups. Therefore, for any time t, compartment-specific additivity is assumed, i.e. S(t) = Σa Sa(t), E(t) = Σa Ea(t), I(t) = Σa Ia(t), R(t) = Σa Ra(t) and D(t) = Σa Da(t), and N = S(t) + E(t) + I(t) + R(t) + D(t). The system is closed, meaning that the sum of all ODEs is 0 at each time t.
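For orientation, one plausible way to write out a group-specific SEIRD system, assembled from the parameter definitions in Section 2, is sketched below. The normalisation of the infection term by N_b, the routing of the fraction 1 − ρ of exposed people directly to R, and the group index on τ are assumptions of this sketch and not a statement of the exact model equations.

```latex
\begin{aligned}
\frac{dS_a(t)}{dt} &= -S_a(t) \sum_{b=1}^{A} \beta_{ab}\,\frac{I_b(t)}{N_b}, \\
\frac{dE_a(t)}{dt} &= S_a(t) \sum_{b=1}^{A} \beta_{ab}\,\frac{I_b(t)}{N_b} - \theta\,E_a(t), \\
\frac{dI_a(t)}{dt} &= \rho\,\theta\,E_a(t) - \gamma\,I_a(t), \\
\frac{dR_a(t)}{dt} &= (1 - \rho)\,\theta\,E_a(t) + (1 - \tau_a)\,\gamma\,I_a(t), \\
\frac{dD_a(t)}{dt} &= \tau_a\,\gamma\,I_a(t).
\end{aligned}
```

In this form the five right-hand sides sum to zero for every group a, which matches the requirement that the system is closed; with a common fatality fraction one would simply set τ_a = τ.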
In the absence of any lockdown measures, the general contact patterns are characterised by the basic contact rates λab, which represent how intensively or how often group a has contact with group b that is sufficient for potential virus transmission. In the POLYMOD study (Mossong et al. (2008)), 7,290 participants from 8 countries including Germany reported the number and extent of their social contacts during a randomly assigned 24-hour period, using a written diary. The age and gender of the contacted persons were recorded, among other information. Overall, the study contains information on 97,904 contacts, distributed across the 8 participating countries. The overall contact pattern for Germany is displayed in Figure 4.
The behaviour of the epidemiological model is primarily governed by the effective contact rates βab, which result from the basic contact rates λab by accounting for the secondary attack rate and lockdown measures. It is implicitly assumed here that hospitalised cases are effectively isolated from the remaining population and cannot spread the disease. Note that the product (1 − mab)(1 − r)(1 − hb) represents the proportion of potential virus transmissions that are not prevented.
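The construction of the effective contact rates can be sketched as follows, reading the description above as the product of the secondary attack rate w, the non-prevented proportion (1 − mab)(1 − r)(1 − hb) and the basic contact rate λab. The function name `effective_contact_rates` and all numerical inputs are hypothetical.

```python
def effective_contact_rates(lam, m, h, w, r):
    """beta[a][b] = w * (1 - m[a][b]) * (1 - r) * (1 - h[b]) * lam[a][b]."""
    groups = range(len(lam))
    return [[w * (1.0 - m[a][b]) * (1.0 - r) * (1.0 - h[b]) * lam[a][b]
             for b in groups] for a in groups]

# Toy inputs for two groups (all values hypothetical)
lam = [[10.0, 4.0], [4.0, 6.0]]   # basic contact rates without lockdown
m = [[0.8, 0.8], [0.8, 0.8]]      # 80% of contacts prevented by measures
h = [0.05, 0.20]                  # proportion of infectious people hospitalised
beta = effective_contact_rates(lam, m, h, w=0.13, r=0.0)
```

Note that hb is indexed by the contacted group b, since it removes hospitalised (and thus isolated) infectious individuals of that group from the pool of potential transmitters.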
2.3 Demand for hospital beds and intensive care units
Based on the compartment model, derivative states such as the demand for hospitalisation and the demand for intensive care units can be modeled separately by imposing estimated proportions on the compartment I. More precisely, Ha(t) and Ca(t), i.e. the number of hospitalised persons and patients in intensive care in group a at time t, are calculated as $H_a(t) = h_a\,I_a(t)$ and $C_a(t) = c_a\,H_a(t)$, where ha is the age-specific proportion of infectious people in need of hospitalisation and ca is the age-specific proportion of hospitalised cases that need intensive care. For these parameters, estimates are available from Imperial College COVID-19 Response Team (2020) and Verity et al. (2020); see Table 1.
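A minimal sketch of this derived calculation is given below. It assumes that hospital demand is the proportion ha of the infectious compartment and that intensive care demand is the proportion ca of the hospitalised cases; the function name and the numerical values are hypothetical and are not the estimates of Table 1.

```python
def hospital_demand(infectious, h, c):
    """Return (H, C) per group: H_a = h_a * I_a and C_a = c_a * H_a."""
    hospitalised = [h_a * i_a for h_a, i_a in zip(h, infectious)]
    intensive_care = [c_a * hosp_a for c_a, hosp_a in zip(c, hospitalised)]
    return hospitalised, intensive_care

# Hypothetical values for three age groups
infectious = [1000.0, 800.0, 200.0]   # I_a(t) at some time t
h = [0.02, 0.10, 0.25]                # proportion needing hospitalisation
c = [0.05, 0.12, 0.40]                # proportion of hospitalised needing ICU
hospitalised, intensive_care = hospital_demand(infectious, h, c)
```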
3 Numerical implementation
We have implemented the suggested model in R using a discrete approximation of the ODE system via the Forward Euler Method (see Butcher (2016)). The step size Δt is chosen as a quarter of one day. Accordingly, the transition rates between the compartments need to be adjusted, whereas the fraction parameters remain unchanged. For instance, if the average incubation time is 5 days and Δt = 1/4 (days), the transition parameter is ϵ = θ · Δt = 1/5 · 1/4 = 1/20, whereas the manifestation index ρ, as the relative proportion of exposed people developing symptoms, is the same for any Δt. The time-discrete approximation of the system of ODEs is therefore described as follows.
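The following Python sketch illustrates one Forward Euler step for a group-specific SEIRD system; it is not the authors' R implementation. The routing of flows (infection term normalised by the group size, the fraction 1 − ρ of exposed people moving directly to R, a per-group fatality fraction `tau`) and all numerical values are assumptions of this sketch.

```python
def euler_step(state, beta, n, theta, rho, gamma, tau, dt):
    """One Forward Euler step; state maps 'S','E','I','R','D' to per-group lists."""
    S, E, I, R, D = (state[k] for k in "SEIRD")
    groups = range(len(S))
    new = {k: list(state[k]) for k in "SEIRD"}
    for a in groups:
        # Force of infection on group a from all groups b
        force = sum(beta[a][b] * I[b] / n[b] for b in groups)
        infections = dt * force * S[a]
        leaving_e = dt * theta * E[a]   # rate parameters are scaled by dt
        leaving_i = dt * gamma * I[a]
        new["S"][a] = S[a] - infections
        new["E"][a] = E[a] + infections - leaving_e
        new["I"][a] = I[a] + rho * leaving_e - leaving_i  # fractions unscaled
        new["R"][a] = R[a] + (1 - rho) * leaving_e + (1 - tau[a]) * leaving_i
        new["D"][a] = D[a] + tau[a] * leaving_i
    return new

DT = 0.25  # quarter-day step
state = {"S": [9990.0, 10000.0], "E": [10.0, 0.0], "I": [0.0, 0.0],
         "R": [0.0, 0.0], "D": [0.0, 0.0]}
n = [10000.0, 10000.0]
beta = [[0.3, 0.1], [0.1, 0.2]]        # hypothetical effective contact rates
for _ in range(40):                    # 40 quarter-day steps = 10 days
    state = euler_step(state, beta, n, theta=1 / 5, rho=0.8,
                       gamma=1 / 10, tau=[0.01, 0.05], dt=DT)
total = sum(sum(state[k]) for k in "SEIRD")
```

Because every outflow of one compartment appears as an inflow of another, the total population is preserved at each step, mirroring the closedness of the continuous system.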
3.1 Model fitting
We suggest fitting the model in the following consecutive steps:
Determine a timespan {1, …, T} during which no lockdown measures had been in place, and determine the cumulative number of infections during this time.
Based on plausible ranges for the involved compartment parameters and the initial state of the compartment model, fit the contact intensity model with regard to the cumulative number of infections during {1, …, T}.
In order to derive the secondary attack rate w from the contact rates λab given in van de Kassteele et al. (2017), we fit the proposed compartment model to the reported cases during a timespan {1, …, T} of no lockdown. This step is necessary because the social contact rates λab do not incorporate the specific transmission characteristics of SARS-CoV-2, such as the average length of the infectious period and the average infection probability per contact. We assume that w is not specific to age or sex. We employ $Q(w) = \sum_{t=1}^{T} \left(I^{\mathrm{cum}}(t) - \hat{I}^{\mathrm{cum}}(t \mid w)\right)^2$ as a least-squares criterion function in order to determine the optimal value $\hat{w} = \arg\min_{w} Q(w)$, where Icum are the observed cumulative infections, and $\hat{I}^{\mathrm{cum}}(t \mid w)$ are the estimated cumulative infections based on the epidemiological model given w. Hence, $\hat{w}$ is the scalar parameter for which the cumulative infections are best predicted retrospectively. Note that the observed cumulative number of infections is usually recorded for each day, while the step size Δt in the model may be different. Thus, appropriate matching of observed and estimated values is necessary.
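The fitting idea can be sketched with a deliberately simplified one-group model and a grid search over w, shown below. The model structure, the parameter values, the synthetic "observed" series and the grid are all hypothetical; the sketch only illustrates the least-squares criterion and the matching of daily observations to quarter-day model steps.

```python
def cumulative_infections(w, days, lam=10.0, n=1_000_000.0, e0=20.0,
                          theta=1 / 5, gamma=1 / 10, rho=0.8, dt=0.25):
    """Daily cumulative infections of a one-group SEIR sketch (toy parameters)."""
    s, e, i, cum = n - e0, e0, 0.0, 0.0
    steps_per_day = round(1 / dt)
    daily = []
    for step in range(1, days * steps_per_day + 1):
        new_exposed = dt * w * lam * s * i / n
        new_infectious = dt * rho * theta * e
        s -= new_exposed
        e += new_exposed - dt * theta * e
        i += new_infectious - dt * gamma * i
        cum += new_infectious
        if step % steps_per_day == 0:   # keep only full-day values
            daily.append(cum)
    return daily

def q(w, observed):
    """Least-squares criterion: squared distance of observed and predicted."""
    predicted = cumulative_infections(w, len(observed))
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))

observed = cumulative_infections(0.13, days=21)   # synthetic stand-in for data
grid = [k / 1000 for k in range(50, 301)]         # candidate w in [0.05, 0.30]
w_hat = min(grid, key=lambda w: q(w, observed))
```

A grid search is used here only because it is transparent; any one-dimensional optimiser could take its place, and with real data the minimum of the criterion would of course not be zero.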
This fitting method requires that the number of infections for the geographical region considered is sufficiently large, such that the mechanics of the compartment model are plausible. Note that potential under-ascertainment may not substantially change the optimal value of w as long as the proportion of detected cases does not vary strongly over time. Furthermore, the suggested fitting method is based on the assumption that the probability of virus transmission is independent of age and sex, given that a contact has occurred. If different propensities of virus transmission are allowed for, the contact matrix may be correspondingly adjusted by introducing parameters w1, …, wab for each group combination or w1, …, wa, if the probability of transmission only depends on the contact group. The criterion function is likewise extended as (w1, …, wab) ↦ Q(w1, …, wab). However, optimisation in this extended model requires a sufficiently large number of transmissions and detailed information on the recorded infections, and may lead to impractically vague estimates otherwise. Therefore, we suggest employing the simpler model with univariate w first.
3.2 Sensitivity analysis and parameter uncertainty
In order to account for parameter uncertainty, we develop uncertainty intervals for the number of people in each compartment. As a cautionary remark, note that these intervals are not to be equated to confidence intervals in the classical sense. Though the resulting intervals are conceptually comparable to Bayesian credibility intervals, they are to be distinguished in that no prior distribution is explicitly assumed here. Note that these intervals do not reflect uncertainty in terms of the underlying infection data.
We predict the number of cases in each age-specific compartment using a Monte Carlo simulation method. For each simulated run, all parameters are independently drawn from their respective range, yielding an instance of a hypothetical parameter setup. Given these parameters, the SEIRD ODE model is approximated using the Forward Euler Method and known initial states, as described above. After NR such simulated runs, the prediction intervals for all relevant values are constructed from the pseudo-empirical trajectories of the compartment model as point-wise quantile ranges for each t. For instance, an 80% prediction interval for the number of infectious people in group a at time t is [Ia,10%(t), Ia,90%(t)].
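The construction of point-wise intervals can be sketched as below, using a simplified one-group model. The parameter ranges, the use of uniform draws within each range, the number of runs and the helper `infectious_trajectory` are assumptions made for illustration.

```python
import random
import statistics

def infectious_trajectory(beta, theta, gamma, rho, days=30, dt=0.25,
                          n=100_000.0, e0=50.0):
    """Daily number of infectious people in a one-group sketch (Forward Euler)."""
    s, e, i = n - e0, e0, 0.0
    path = []
    for step in range(1, int(days / dt) + 1):
        new_exposed = dt * beta * s * i / n
        leaving_e = dt * theta * e
        s -= new_exposed
        e += new_exposed - leaving_e
        i += rho * leaving_e - dt * gamma * i
        if step % round(1 / dt) == 0:
            path.append(i)
    return path

rng = random.Random(1)   # fixed seed for reproducibility
N_R = 500                # number of simulated runs
runs = []
for _ in range(N_R):
    # Each parameter is drawn independently from its (hypothetical) range
    runs.append(infectious_trajectory(
        beta=rng.uniform(0.2, 0.4),
        theta=1 / rng.uniform(4.0, 6.0),
        gamma=1 / rng.uniform(8.0, 12.0),
        rho=rng.uniform(0.7, 0.9)))

# Point-wise 80% interval: 10% and 90% quantiles across runs for each day
intervals = []
for day_values in zip(*runs):
    deciles = statistics.quantiles(day_values, n=10)
    intervals.append((deciles[0], deciles[-1]))
```

Since the quantiles are taken separately for each time point, the lower and upper bounds are in general not trajectories of any single simulated run.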
4 Analytical approach and scenarios
First, we fitted the model to observed COVID-19 infections for the period 21 February to 13 March 2020, using transition rates from the literature as described in Section 2. We estimated the model parameter w, also termed the secondary attack rate, which reflects the probability of infection per contact, by least squares between observed and predicted values, as described in Section 3.1. Second, we developed three scenarios, starting our projections on 15 August 2020 and, using quarter-days as base time, ending on 31 October 2020. The first scenario, which is our baseline scenario, assumed that the age- and sex-specific contacts were down by 80%, i.e. only 20% of the contacts estimated by van de Kassteele et al. (2017) were realized between the start and end of the projection. This applied to all age groups and to both sexes. This scenario should reflect continuous distancing measures as were present in mid-August. The second scenario assumed that contacts at working ages 30–59 were increased by 4 percentage points (PP), and among those aged 60–69 by 2 PP, equaling a decline of 76% and 78%, respectively. All other ages remained at 80% contact reduction. This should reflect the return from home office settings and the opening of shops, cafes, restaurants, etc. The third scenario assumed an additional increase in contact rates among ages 10–29 by 4 PP, which should reflect the opening of schools and venues mainly visited by young individuals. We explored the following age-specific outcomes:
Number of active infections which were defined as the number of individuals in compartment I by sex,
Cumulative number of deaths out of compartment I by sex,
Excess number of deaths in scenarios 2 and 3 in comparison to scenario 1 by sex,
Sex ratio of incidence defined as male/female ratio of the number of new COVID-19 cases divided by the total population (Sa)
Sex ratio of mortality rate defined as male/female ratio of the number of deaths out of compartment I divided by the total population (Sa).
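The scenario definitions and the sex ratios above can be sketched as follows. The ten-year age group labels follow the groups reported in the results; how the per-age reductions are combined into pairwise mitigation effects between two groups is not specified here and is therefore left out. The function names are hypothetical.

```python
AGE_GROUPS = ["0-9", "10-19", "20-29", "30-39", "40-49",
              "50-59", "60-69", "70-79", "80+"]

def contact_reduction(scenario):
    """Contact reduction per age group (same for both sexes) in scenario 1, 2 or 3."""
    reduction = {age: 0.80 for age in AGE_GROUPS}      # baseline: contacts down 80%
    if scenario >= 2:
        for age in ("30-39", "40-49", "50-59"):
            reduction[age] = 0.76                      # +4 PP contacts, working ages
        reduction["60-69"] = 0.78                      # +2 PP contacts
    if scenario >= 3:
        for age in ("10-19", "20-29"):
            reduction[age] = 0.76                      # +4 PP contacts, young ages
    return reduction

def sex_ratio(events_men, population_men, events_women, population_women):
    """Male/female ratio of a rate, i.e. events divided by population size."""
    return (events_men / population_men) / (events_women / population_women)

scenario_3 = contact_reduction(3)
# Example with the infection rates per 1000 reported for ages 10-19
ratio_10_19 = sex_ratio(33.5, 1000, 36.7, 1000)
```

A ratio below 1, as in the example, indicates a higher rate among women than men.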
5 Results
Fitting our model to COVID-19 infections observed during our fitting period (21 February to 13 March 2020) results in an estimate of the secondary attack rate w ≈ 13%.
We started with 10,572 active infections on 15 August 2020, and under Scenario 1 this figure increased to approximately 19,814 (Figure 3) (men: 9,811; women: 10,002). The number of active infections was highest at age 30–39 (men: 1,680; women: 1,857), followed by age 10–19 (men: 1,649; women: 1,742), and age 40–49 (men: 1,606; women: 1,594). The cumulative number of deaths increased from 9,258 to 9,960, with 5,520 men and 4,440 women. By 31 October 2020, infection rates (Table 2) were highest among the 10–19-year-olds (men 33.5 and women 36.7 per 1000 individuals), followed by ages 20 to 49 (18.1–27.3) and ages 0–9 (22.2–23.1). At ages above 50, infection rates declined rapidly, almost halving from individuals in their fifties (16.2–18.9) to those in their sixties (8.7–10.4), while at older ages the decline proceeded at a much slower pace (ages 70–79: 6.7–6.5; ages 80+: 4.9–5.6). Sex ratios of infections were below 1 in the age interval 10 to 39, indicating a higher risk of infection among women. From age 50 onwards they were generally above 1, thus turning the disadvantage towards men. As expected, death rates (Table 2) increased with age, with a decline at the oldest ages probably reflecting health selection or better protection of the oldest old. They were more than twice as high among men as among women, again with the exception of the oldest age group, where men might be positively selected by health.
Scenario 2 assumed increased contacts at working ages and arrived at 28,520 active infections by 31 October 2020, and therefore 8,706 more active infections than in Scenario 1 (men 4,434; women 4,274). These additional infections stemmed from all ages, even if the risk of infection increased most among the working ages. Sex ratios of infection rates remained unchanged, because we increased contact rates by the same proportion for both genders. The additional infections translated into an additional 422 deaths (men: 234; women: 188); among women, three quarters of these deaths occurred at ages 80 and above; among men, 53%, reflecting their higher mortality already at younger ages. Sex ratios of death rates remained unchanged as compared to Scenario 1, reflecting our model assumption of a parallel increase in contact rates for both genders.
Scenario 3, with increased contacts at young and working ages, resulted in 42,369 active infections and thus 22,555 more than in Scenario 1 (men 11,147; women 11,410), which translated into an additional 566 deaths, with the majority occurring at ages 80 and above (women 75%; men 53%). There was little change in sex ratios as compared to the other two scenarios.
6 Discussion
Incorporating age- and sex-specific contact rates in a COVID-19 compartment model permits exploration of the effects of changes in mitigation measures on the two genders. We developed three scenarios which assumed ongoing distancing measures versus easing of contact restrictions at working ages, and among adolescents and young adults. Our projections do not set out to forecast the actual number of COVID-19 infections over a time span of about two months; rather, they assess the effect of increased contacts on the infection and mortality risks of the two genders and the various age groups. The fit of our model to the baseline period in February and March results in an estimated secondary attack rate w ≈ 13%, putting our findings in close agreement with the rates reported in Guangzhou, where the household w varied between 12% and 17%, and the non-household w between 6% and 9% (Jing et al. (2020)), although higher attack rates of up to 35% have been reported, e.g. for meals and holiday visits (Liu et al. (2020b)). Three important lessons can be learned from our scenarios.
First, even a small change in contact rates has a large impact on infections and deaths. In our projections we assumed an increase in contacts of 2 to 4 PP. This reflects the fact that without non-pharmaceutical mitigation measures (NPMM) such as masks, physical distance between individuals, better air ventilation and hygiene, and without contact tracing, infection rates would return to the initial exponential increase, which corresponded to a reproduction number of 3.3 to 3.8, as observed at the beginning of the pandemic (Lin et al. (2020), Liu et al. (2020) and Alimohamadi et al. (2020), RKI (2020)). However, the presence of NPMM also mitigates the effect of the increase in contacts due to the return to the office and the opening of shops, restaurants, schools, and venues visited by young adults, leaving it far from the initial impact. In our present scenarios, both effects, the change in contact rates and the change in their impact, are captured in the reduction matrix (mab), which is multiplied with the matrix of the contact rates. An alternative approach, and one possibility to extend this analysis, would be to develop separate scenarios for changes in the secondary attack rate w due to NPMM and for changes in the contact rates (mab). At any rate, our scenarios show that small changes already have large impacts on infections and deaths. This implies that the impact of contacts must be diminished considerably to allow increases in contacts without returning to exponential growth of infections, hence underlining the high importance of the NPMM in the current phase of the pandemic.
Second, due to intergenerational contacts, any easing of measures at working and young ages will inevitably lead to an increase in infections and deaths, the latter mainly at old ages. Over all ages, and relative to the deaths projected for the same period under Scenario 1, deaths will increase by 60% when contacts increase at working ages, and by 80% when contacts also rise among the young. The vast majority of these increases occur at old ages, with 75% among women and 60% among men, whereby the fatality among men is more than twice as high as among women. Thus, elderly men are at a particular risk of death due to increased contacts. However, our model assumptions are based on fatality rates at the beginning of the pandemic, which may have changed because of better treatment options for critically severe COVID-19 cases using, e.g., dexamethasone (Cain and Cidlowski (2020)). Thus, we might overestimate mortality under current knowledge and treatment options. Still, increases in contacts need to be accompanied by special measures protecting the elderly from death, without negative physical and mental health consequences due to quarantine and isolation measures (Galea et al. (2020)). In contrast to deaths, infections will mainly increase at young and middle ages, which have a lower risk of severe COVID-19 symptoms or even asymptomatic disease courses.
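The percentage increases in deaths can be retraced from the figures in Section 5 if they are read as excess deaths relative to the deaths that accumulate under Scenario 1 during the projection period; this reading is an interpretation and is not stated explicitly in the text.

```latex
\begin{aligned}
\text{Deaths under Scenario 1:}\quad & 9{,}960 - 9{,}258 = 702, \\
\text{Scenario 2:}\quad & \frac{422}{702} \approx 0.60, \\
\text{Scenario 3:}\quad & \frac{566}{702} \approx 0.81 .
\end{aligned}
```

Under this reading, the reported increases of 60% and 80% correspond to 422 and 566 additional deaths, respectively.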
Third, small changes in contact rates will not change the sex ratios in infections and deaths. At all ages, men will have more than twice the mortality risk from COVID-19, while the risk of infection is higher among working-age women than among working-age men. At old ages, men have a higher infection risk. Note that, in absolute numbers, more women are diagnosed with COVID-19 at old age due to their higher life expectancy. Here a more substantial question arises, namely whether COVID-19 infection rates are indeed gender-specific. German COVID-19 infection rates, as in any other country, are biased by the time lag of reporting and by the differential availability of PCR tests over time and across subgroups of the population (RKI (2020)). Gender-specific diagnoses in favour of women may reflect that higher contact intensities of women may have led to a higher rate of PCR tests and therefore to a smaller number of undiagnosed cases. In addition, women are more health-conscious than men (Oksuzyan et al. (2020)) and may have sought PCR testing to a higher degree even when presenting with weaker symptoms. On the other hand, Takahashi et al. (2020) found sex-specific differences in the immune response to COVID-19 infections. For a further discussion of potential sex-specific mechanisms modulating the course of disease, see also Gebhard et al. (2020). Thus, we can conclude that both biological and social factors contribute to sex- and gender-specific infection and mortality rates and that they are stable given small changes in contact rates.
We focused on the practical emulation of the dynamic behaviour and process of the spreading of COVID-19 while incorporating specific epidemiological information on the virus and disease. To achieve this aim we used a compartment modeling framework, which has become a standard approach in epidemiology due to its flexibility and accessibility. The main advantage of this modeling framework is that a considerable amount of demographic and epidemiological information can be incorporated while the essential model structure and implementation remain relatively simple. Similarly, it is possible to extend the model to incorporate parameter uncertainty, as described above. Furthermore, we want to emphasize the Markov-like property of compartment modeling in the sense that current compartment sizes on a specific date are sufficient for deducing the subsequent behaviour of the epidemiological process, which makes the framework particularly attractive for forecasting and investigating hypothetical scenarios. However, one drawback of compartment modelling is that it is inherently based on an averaging rationale, which treats population groups as homogeneous and in which the average number of contacts in each group is a determining parameter. In contrast to truly stochastic models (such as agent-based models), no random or systematic individual deviations from the fundamental contact patterns are taken into consideration. In addition, geographical and spatial information are not explicitly considered in compartment modeling, and this further limits the scope of the forecasting results.
In general, assessing the impact of introducing or easing different lockdown measures is remarkably difficult, especially because several aspects are usually changed simultaneously and the general behaviour of the population may change dynamically at the same time. Some efforts have been made to address these issues in the literature; however, we advise against using the proposed model for such purposes. One main reason is that the initial state for forecasting and fitting of the model relies primarily on available data sources, which are in the form of reported count data. In addition to the generally limited validity of observational data, there is still insufficient knowledge on the specific characteristics of COVID-19 and the actual current spread of the virus. Naturally, other modeling approaches face the same issues of data quality.
In our COVID-19 forecasts, the number of infections and the number of deaths differ only slightly from models which do not differentiate by sex (data not shown). However, age- and sex-specific models provide better insight into the risk populations of infections and mortality. This helps to target health policy measures under scarce resources, such as who should be tested and vaccinated first. Both biological sex and social gender appear to affect COVID-19 infection rates and their outcomes; this needs to be acknowledged in health policy decisions and medical treatment. To further explore social factors on COVID-19 transmission, more information that includes socio-demographic data is needed.
Data Availability
All data that has been used in the manuscript is publicly available. An R implementation of the proposed model is available via Github: https://github.com/AchimDoerre/Covid-19