Optimal Allocation of Limited Test Resources for the Quantification of COVID-19 Infections
==========================================================================================

* Michail Chatzimanolakis
* Pascal Weber
* George Arampatzis
* Daniel Wälchli
* Ivica Kičić
* Petr Karnakov
* Costas Papadimitriou
* Petros Koumoutsakos

## Abstract

The systematic identification of infected individuals is critical for the containment of the COVID-19 pandemic. Presently, the spread of the disease is mostly quantified by the reported numbers of infections, hospitalizations, recoveries and deaths; these quantities inform epidemiology models that provide forecasts for the spread of the epidemic and guide policy making. The veracity of these forecasts depends on the discrepancy between the numbers of reported and unreported, yet infectious, individuals.

We combine Bayesian experimental design with an epidemiology model and propose a methodology for the optimal allocation of limited testing resources in space and time, which maximizes the information gain for such unreported infections. The proposed approach is applicable at the onset and spreading of the epidemic and can forewarn of a possible recurrence of the disease after relaxation of interventions. We examine its application in Switzerland; the open source software is, however, readily adaptable to countries around the world.

We find that following the proposed methodology can lead to vastly less uncertain predictions for the spread of the disease. Estimates of the effective reproduction number and of the future number of unreported infections are improved, which in turn can provide timely and systematic guidance for the effective identification of infectious individuals and for decision-making.

Keywords
*   Bayesian Optimal Experimental Design
*   Epidemiology
*   COVID-19

## 1. Introduction

The identification of unreported individuals infected by the SARS-CoV-2 virus is critical for the quantification, forecasting and planning of interventions during the COVID-19 pandemic [1]. Presently the spread of the disease is mostly quantified by the reported numbers of infections, hospitalizations, recoveries and deaths [2]. These quantities inform epidemiology models that provide short term forecasts for the spread of the epidemic, help quantify the role of possible interventions and guide policy making. The veracity of these forecasts depends on the discrepancy between the numbers of reported and unreported, yet infectious, individuals.

In recent months the estimation of unreported infections has been the subject of several testing campaigns [3, 4]. While there is valuable information being gathered, their estimates rely on testing individuals that are either already symptomatic or have been selected based on certain criteria (hospital visits, airport arrivals, geographic vicinity to researchers, etc.). Generic, randomized tests of the population are broadly applied but they have been hampered either by delays [5] or by insufficient numbers of test kits [6]. There is broad recognition that efficient testing strategies are critical for the timely identification of infectious individuals and the optimal allocation of resources [7]. However, targeted testing entails bias while randomized tests require access to a high percentage of the population with commensurate high costs. The quality of the data as well as the ways they are incorporated in the epidemiology models is critical for their predictions and for estimating their uncertainties [8]. Having the capability to minimize these uncertainties by suitably distributing in space and time a given number of test kits is the subject of this work. This optimal allocation of testing resources and the respective increase of the fidelity of forecasting models are essential to effective policy making throughout the pandemic.

Here, we present a methodology for the OPtimal Allocation of LImited Testing resourceS (OPALITS) that maximizes the information gain over any prior knowledge regarding infections. The method relies on forecasts by epidemiological models with parameters adjusted through Bayesian inference as data become available through suitable surveys [9]. The forecasts are combined with Bayesian experimental design [10, 11, 12] to determine the optimal test allocation in space and time for various objectives (minimize prediction uncertainty, maximize information gain of unreported infections). We emphasize that the proposed OPALITS is applicable in all stages of the pandemic, regardless of the availability of data.

We employ the *SEI<sub>r</sub>I<sub>u</sub>R* model [13] that quantifies the spread of a disease in a country’s population distributed in a number of communities that are interacting through mobility networks. The *SEI<sub>r</sub>I<sub>u</sub>R* model predicts the number of susceptible (*S*), exposed (*E*), infectious reported (*I<sub>r</sub>*), unreported (*I<sub>u</sub>*), and removed (*R*) individuals from the population. Here we focus on Switzerland and consider its cantons as the respective communities. The model parameters are: the relative transmission rate between reported and unreported infectious individuals (*µ*), the virus latency period (*Z*), the infectious period (*D*) and the reporting rate (*α*). The transmission rate (*β*) and the mobility factor (*θ*) are considered to be time dependent in order to account for government interventions. For all stages of the epidemic, the respective uncertainties of the model parameters are quantified and propagated using Bayesian inference. At the onset of the epidemic, the uncertainty is quantified through prior probability distributions. As data of daily infections become available, the uncertainty in model parameters is updated through Bayesian inference. The parameter probability distributions are used to propagate uncertainties in the model forecasts and can assist decision makers in quantifying risks associated with the progression of the disease. The proper quantification of uncertainty bounds in the model parameters has a profound effect on predictions of the disease dynamics [8]. Large uncertainty bounds around the most probable parameter values hinder the decision process for identifying effective interventions.

The OPALITS aims to assign limited test-kit resources to acquire data that would reduce the model prediction uncertainties. Minimizing the uncertainty of the model parameters leads to more reliable predictions for quantities such as the reproduction number [14]. Moreover, the reduced model uncertainties help minimize risks associated with the decision making process including timing, extent of interventions and probability of exceeding hospital capacity.

We quantify the information gain from these tests using a utility function [15, 12] based on the Kullback-Leibler divergence between the inferred posterior distribution and the current prior distribution of the model parameters. The prior can be formulated using the posterior distribution estimated from daily data of the infectious reported individuals up to the current date (see Materials and Methods). Hence, at any stage of the epidemic, the OPALITS provides guidance for the time and location/community where testing needs to be carried out to maximize the expected information gain regarding infections in a population.

We demonstrate the simplicity and applicability of the present method in estimating the spread of the coronavirus disease in the cantons of Switzerland. We find that the OPALITS methodology outperforms non-specific, randomized testing of sub-populations throughout the COVID-19 pandemic. The proposed strategy is readily applicable to other countries and the employed open source software can readily accommodate different epidemiological models.

## 2. Results

### Optimal Allocation of Limited Testing Resources (OPALITS) during the COVID-19 pandemic

We present the optimal test-kit allocation strategy for three stages of the epidemic: (i) starting phase (blue), (ii) containment after enforcement of interventions (red) and (iii) relaxing of interventions and monitoring for a possible second outbreak (green) (Fig. 2). The strategy relies on Bayesian experimental design and can operate when no data are available (as at the start of the epidemic) as well as when data have been accumulated, as in the last two stages of the epidemic. Testing campaigns rely on acquiring randomized samples from a population. The collected data, together with epidemiological models, help determine quantities of interest, such as the basic reproduction number of the disease [14]. By suitably adapting the testing campaign, the data can help reduce the model uncertainty, thus enabling improved estimates regarding the severity of the epidemic.

A testing campaign consists of a set ***s*** of surveys *s<sub>i</sub>* = (*k<sub>i</sub>*, *t<sub>i</sub>*), which are labeled by *i* = 1, …, *M<sub>y</sub>* and performed in locations *k<sub>i</sub>* ∈ 𝒞 and on days *t<sub>i</sub>* ∈ 𝒯, where 𝒞 and 𝒯 are the sets of all available locations and days, respectively. In this paper a survey aims to determine the number of unreported infectious individuals in a particular location on a particular day. In the following we assume limited testing resources, where *N* test-kits are available and each test-kit corresponds to testing one person. The goal is to allocate these test-kits in different times and locations so that we maximize the information gain regarding forecasts of the epidemiology model. The locations are the different Swiss cantons, and 𝒞 := {ZH, BE, LU, …} is the set of the strings with canton name abbreviations.

The results of the survey in a canton enable the estimation of a desired quantity of interest, such as the size of the unreported infected population (*I<sub>u</sub>*). The number of samples needed to estimate population proportions within a given confidence interval, error tolerance, and probability of proportion is given by Cochran’s formula [9] corrected for a finite population size. Using Cochran’s formula with confidence level 99%, error tolerance 1% and probability of infection 0.1 we find that approximately 5950 samples would be required to survey the largest Swiss canton (Zurich). All the other cantons need up to 14% fewer samples, with the exception of the smallest canton, which needs 27% fewer samples (figure S7 of the Supplementary Information). Hence we assume the minimum sample size is the same for all cantons. Assuming random sampling of a population with higher probability (up to 0.9) of infection or requiring tighter error bounds would have implied even more samples according to Cochran’s formula. We note that as of October 2020, 1500 tests per one million people are performed on a daily basis in Switzerland [16]. This amounts to approximately 460 individual tests per canton, which is about an order of magnitude less than what would be required from Cochran’s formula for an informative random sampling. In turn, by using the proposed OPALITS we can compensate for this lack of test kits with an optimal and systematic process.
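The sample-size estimate can be checked with a short calculation. The sketch below assumes the standard form of Cochran’s formula with a finite-population correction; the function name and the two canton populations (roughly 1.5 million for Zurich and roughly 16,000 for the smallest canton) are illustrative assumptions, not values taken from the paper.

```python
from statistics import NormalDist


def cochran_sample_size(population, confidence=0.99, margin=0.01, proportion=0.1):
    """Cochran sample size with finite-population correction (assumed standard form)."""
    # Two-sided critical value of the standard normal for the confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    # Sample size for an infinite population.
    n0 = z * z * proportion * (1.0 - proportion) / margin ** 2
    # Correction for a finite population.
    return n0 / (1.0 + (n0 - 1.0) / population)


# Approximate populations; these are assumptions made for illustration only.
zurich = cochran_sample_size(1_500_000)
smallest = cochran_sample_size(16_000)
reduction = 1.0 - smallest / zurich

print(f"largest canton: {zurich:.0f} samples")
print(f"smallest canton: {smallest:.0f} samples ({100 * reduction:.0f}% fewer)")
```

Under these assumed populations the result for the largest canton is close to the 5950 samples quoted above, and the smallest canton requires roughly a quarter fewer samples.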

We outline the application of the proposed approach to a country with distinct administrative units (cantons in the case of Switzerland) (see figure 1). First, we determine how many cantons will be surveyed, given the number of available test-kits *N*. Then, the sequential optimization of the expected utility function is performed (see Materials and Methods) to identify optimal survey locations (cantons). We then distribute the test-kits to the identified cantons and test a random subset of their population on the suggested day. After collecting the results from all the surveys we update the prior distributions of the model parameters. The collected data lead to the maximal information gain in the model parameters. This in turn translates into minimal uncertainty in predictions made with the model for quantities such as the number of unreported infections.

![Figure 1:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/11/10/2020.11.09.20228320/F1.medium.gif)


Figure 1: Schematic for the deployment of the Optimal Allocation of Limited Testing Resources (OPALITS) methodology

![Figure 2:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/11/10/2020.11.09.20228320/F2.medium.gif)


Figure 2: Testing scenarios for the COVID-19 outbreak in Switzerland.
Daily reported Coronavirus cases in Switzerland are plotted as gray bars. The periods before (blue), during (red) and after (green) imposing non-pharmaceutical interventions are marked with color.

The expected information gain of a particular strategy for selecting the survey locations/times ***s*** is quantified by a utility function *Û*(***s***) [15]. The maximum of this function corresponds to an optimal strategy that yields the most information about the quantities of interest. The expected utility function can be understood as a measure of the difference between prior knowledge of the model parameters and the posterior knowledge, after surveys have been conducted in a set of locations and dates. Given such a set, the utility function estimates the expected difference, equivalently the information gain, by taking the expectation over all possible survey results.
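To make the definition concrete, the following sketch estimates an expected information gain with a nested Monte Carlo estimator. It is a toy example under stated assumptions, not the estimator used in this work: a scalar parameter with a standard normal prior, a linear survey response `g * theta`, and Gaussian measurement noise. All names (`expected_information_gain`, `g`, `sigma`) are hypothetical. For this toy model the expected Kullback-Leibler divergence between posterior and prior has a closed form, which serves as a check.

```python
import math
import random


def log_likelihood(y, theta, g, sigma):
    """Log density of y ~ Normal(g * theta, sigma^2)."""
    z = (y - g * theta) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def log_mean_exp(values):
    """Numerically stable log of the arithmetic mean of exp(values)."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values) / len(values))


def expected_information_gain(g, sigma=1.0, n_outer=400, n_inner=400, seed=0):
    """Nested Monte Carlo estimate of E_y[KL(posterior || prior)] in nats."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_outer):
        theta = rng.gauss(0.0, 1.0)            # parameter drawn from the prior
        y = g * theta + rng.gauss(0.0, sigma)  # simulated survey outcome
        # Evidence p(y | design) approximated with fresh prior samples.
        inner = [log_likelihood(y, rng.gauss(0.0, 1.0), g, sigma)
                 for _ in range(n_inner)]
        total += log_likelihood(y, theta, g, sigma) - log_mean_exp(inner)
    return total / n_outer


def exact_gain(g, sigma=1.0):
    """Closed-form expected information gain for the linear-Gaussian toy model."""
    return 0.5 * math.log(1.0 + (g / sigma) ** 2)


informative = expected_information_gain(2.0)
uninformative = expected_information_gain(0.5)
print(informative, exact_gain(2.0))
print(uninformative, exact_gain(0.5))
```

A design whose outcome responds more strongly to the parameter (larger `g`) yields a larger expected gain, which is the sense in which some survey locations and days are more informative than others.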

The OPALITS relies on forecasts by suitable epidemiological models. In turn, these forecasts rely on prior information and their predictions are further adjusted as data become available in a Bayesian inference framework [17]. The set of Ordinary Differential Equations (ODEs) describing the *SEI<sub>r</sub>I<sub>u</sub>R* model [13] is integrated to produce the model output. The uncertainty of the model output and its discrepancy from the available data is quantified through a parametrized error model. The resulting stochastic model and its quantified uncertainties are then used to identify the optimal spatio-temporal allocation of limited test resources.
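As an illustration of the kind of model being integrated, the sketch below advances a single, well-mixed population with a classical fourth-order Runge-Kutta scheme. It is a simplified assumption: the coupling between cantons through the mobility network is omitted, the right-hand side is an assumed form chosen to be consistent with the compartments and parameters described above, and the parameter values and function names are hypothetical.

```python
def seiir_rhs(state, beta, mu, alpha, Z, D):
    """Right-hand side for one well-mixed population (assumed form, no mobility)."""
    S, E, Ir, Iu, R = state
    N = S + E + Ir + Iu + R
    # Unreported infectious individuals transmit at a rate reduced by the factor mu.
    infection = beta * S * (Ir + mu * Iu) / N
    dS = -infection
    dE = infection - E / Z
    dIr = alpha * E / Z - Ir / D          # a fraction alpha gets reported
    dIu = (1.0 - alpha) * E / Z - Iu / D  # the rest remains unreported
    dR = (Ir + Iu) / D
    return (dS, dE, dIr, dIu, dR)


def rk4_step(state, dt, **params):
    """One classical Runge-Kutta step of size dt (in days)."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = seiir_rhs(state, **params)
    k2 = seiir_rhs(shifted(state, k1, dt / 2.0), **params)
    k3 = seiir_rhs(shifted(state, k2, dt / 2.0), **params)
    k4 = seiir_rhs(shifted(state, k3, dt), **params)
    return tuple(x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate(initial_state, days, steps_per_day=10, **params):
    """Return the state at the end of each day, starting with day 0."""
    dt = 1.0 / steps_per_day
    state = tuple(initial_state)
    trajectory = [state]
    for _ in range(days):
        for _ in range(steps_per_day):
            state = rk4_step(state, dt, **params)
        trajectory.append(state)
    return trajectory


# Hypothetical parameter values and initial condition, for illustration only.
params = dict(beta=1.1, mu=0.5, alpha=0.2, Z=3.7, D=3.5)
initial = (99_990.0, 0.0, 0.0, 10.0, 0.0)  # S, E, Ir, Iu, R
trajectory = simulate(initial, days=30, **params)
print("unreported infectious on day 30:", round(trajectory[-1][3]))
```

In the setting of this paper each canton would carry its own set of compartments, and the parameter values would be drawn from the prior or posterior distributions rather than fixed.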

#### Case 1: Beginning of the epidemic - Optimal testing without data

At the start of an epidemic, there are no data and we assume no other prior information regarding the spread of the pathogen in a country. The initial conditions for the number of unreported infections ![Graphic][1] were selected with non-zero values for the cantons of Aargau, Bern, Basel-Landschaft, Basel-Stadt, Fribourg, Geneva, Grisons, St. Gallen, Ticino, Vaud, Valais and Zurich based on their population and their large number of interconnections. Due to the lack of any prior information and relevant data, all the parameters are assumed to follow uniform prior distributions (see table S5 for details).

The first infectious person in Switzerland was reported on February 25th in the canton of Ticino ![Graphic][2] with no initial reported infections in all other cantons. The initial number of exposed individuals is set proportional to the number of unreported infections ![Graphic][3] in accordance with the value of *R* ≈ 3 reported in [18] in the initial stage of the disease. The rest of the population is assumed to be susceptible. The methodology involves parameters of interest (***ϑ*** = (*β, µ, α, Z, D, θ, c*)) and nuisance parameters (![Graphic][4]) that the testing strategy does not aim to determine (see Materials and Methods section for definitions).

The estimated expected utility functions *Û*(***s***) for up to four surveys in the cantons of Switzerland for a time horizon of 8 days, 𝒯 = {Feb 25, …, Mar 3}, are shown in Figure 3. Higher values for expected utility are estimated in cantons with larger population, reflecting the larger relative uncertainty for cantons with only few reported cases. This implies that smaller cantons, with lower mobility rates, are less preferred for performing tests since their contribution to the information gain is not significant. This reflects the fact that the assumed covariance matrix is shared among cantons (see Materials and Methods), which implies a smaller relative error when surveying larger cantons with a consequently higher number of infections. The Bayesian analysis enables the inference of the particular cantons and days for which a survey should be performed in order to maximize the information gain. Accordingly, the most informative survey should have been made in Zurich on March 2nd. The optimal location and time for the second survey is determined to be the canton of Vaud on the 27th of February. As expected, the information gained from tests in the canton of Vaud is less than the information gained from the canton of Zurich. The information that would have been gained by surveying the next two selected cantons of Vaud and Basel-Landschaft on March the 3rd and February the 28th, respectively, is progressively reduced to a small level that, given the testing costs, does not justify carrying out surveys in more than 4 cantons. The values of the optimal times are listed in table S1 in the Supplementary Information.

![Figure 3:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/11/10/2020.11.09.20228320/F3.medium.gif)


Figure 3: Expected information gain during start of epidemic.
The blue curve corresponds to the utility of making one survey. The green curve is the utility when a second survey is added, provided the location and time of the first survey correspond to the maximum of the blue curve (found in the canton of Zurich, on March 2nd). Similarly, the yellow and red curves show the utilities for a third and fourth survey, when the locations and times of the previous surveys are fixed to their optimal values. The fixed dates and locations of each survey are plotted with black dashed lines. The shaded areas indicate the difference from the expected information gain of the previous survey, which becomes thinner as additional surveys do not yield a further significant information gain.

The results indicate that the proposed OPALITS methodology selects certain populous and well interconnected cantons at specific times to acquire the most information for estimating the model parameters.
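The sequential selection described above, where each new survey is chosen with all previous surveys held fixed, can be sketched as a greedy loop. The example is a toy under stated assumptions: the utility is the closed-form expected information gain of a scalar linear-Gaussian model with independent noise, so it does not reproduce the redundancy between cantons that arises in the full model. The canton weights, the `sensitivity` table and the stopping threshold `min_gain` are hypothetical.

```python
import math


def expected_gain(surveys, sensitivity, noise_var=1.0):
    """Closed-form expected information gain (nats) of a set of surveys.

    Toy assumption: scalar parameter, unit Gaussian prior, linear response
    and independent Gaussian noise for every survey.
    """
    total = sum(sensitivity[s] ** 2 for s in surveys)
    return 0.5 * math.log(1.0 + total / noise_var)


def greedy_surveys(candidates, sensitivity, max_surveys, min_gain):
    """Add surveys one at a time, keeping earlier choices fixed."""
    chosen, gains = [], []
    current = 0.0
    for _ in range(max_surveys):
        best, best_utility = None, current
        for survey in candidates:
            if survey in chosen:
                continue
            utility = expected_gain(chosen + [survey], sensitivity)
            if utility > best_utility:
                best, best_utility = survey, utility
        # Stop when the marginal gain no longer justifies another survey.
        if best is None or best_utility - current < min_gain:
            break
        chosen.append(best)
        gains.append(best_utility - current)
        current = best_utility
    return chosen, gains


# Hypothetical sensitivities: larger cantons and later days respond more strongly.
weights = {"ZH": 1.5, "VD": 1.0, "BE": 0.9, "TI": 0.4}
candidates = [(canton, day) for canton in weights for day in range(3)]
sensitivity = {(c, d): weights[c] * (1.0 + 0.2 * d) for c, d in candidates}

chosen, gains = greedy_surveys(candidates, sensitivity, max_surveys=4, min_gain=0.1)
for survey, gain in zip(chosen, gains):
    print(survey, round(gain, 3))
```

The marginal gains shrink with every added survey, which mirrors the thinning shaded areas in Figure 3 and gives a natural criterion for deciding how many surveys to fund.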

#### Case 2: Exponential spreading and optimal testing strategy during non-pharmaceutical interventions

When the spreading of the coronavirus entered an exponential growth stage, several governments (including the Swiss) decided to take non-pharmaceutical interventions such as requesting social distancing, closing schools and restaurants, or ordering a complete lockdown in order to contain the epidemic. Here, the goal of the OPALITS is to propose surveys that would help to better assess the effectiveness of these interventions.

In this case, probability distributions of model parameters are informed using data from the existing spread of COVID-19. The daily reported infections in Switzerland [19] from the 25th of February up to the 17th of March 2020 are used to update the distributions, specified in the previous phase, by using Bayesian inference. The marginal posteriors are plotted in figure S1 of the Supplementary Information. The *SEI<sub>r</sub>I<sub>u</sub>R* model represents the non-pharmaceutical interventions with a time-dependent transmission rate *β* and mobility factor *θ*. These parameters are calibrated by the data and provide an estimate of the timing and effectiveness of the interventions [8].

Figure 4 shows the maximum values of the information gain for each survey for 𝒯 = {Mar 17, …, Mar 30}. For cantons with a small population and low connectivity to other cantons a low information gain is found. The opposite can be observed for cantons with large population and strong connections to other cantons. The values for the maximum utility in time for the measurements are listed in Table S2. If only a single canton were to be selected (due to limited availability of test-kits in the country), then a survey in the canton of Vaud carried out on the 30th of March would be preferred over surveys in any of the cantons of Zurich, Bern or Geneva (blue in figure 4). If two surveys could be afforded, the OPALITS methodology proposes to carry them out in the same canton (Vaud) on the 17th and on the 30th of March (blue and green in figure 4). Note that the canton of Zurich, ranked as the next preferred canton for a single survey (blue in figure 4), is not selected by the methodology since part of the information that would be gained from testing is already contained in surveys performed in Vaud. In case more test kits were available, in addition to the two tests in Vaud, the optimal location and time for a third survey would have been the canton of Grisons on the 30th of March (yellow in figure 4). The canton of Zurich is proposed as the fourth location to be surveyed, also on the 30th of March. However, the information gain from the fourth survey in the canton of Zurich is approximately 10% of the total information gained from the first three surveys when these are carried out optimally.

![Figure 4:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/11/10/2020.11.09.20228320/F4.medium.gif)


Figure 4: Optimal testing strategy for effect of non-pharmaceutical interventions.
The maximum gain of information is plotted on the map of Switzerland using an exponential colormap. Here blue corresponds to taking one survey, green to adding a second, yellow to a third and red to a fourth. Below the map we plot the magnitude of the expected information gain of each survey, along with the optimal measurement dates per canton.

The results suggest that surveys in two locations/times provide significant information for assessing the effectiveness of interventions. Further tests in more locations/times did not add substantial information. It is evident that a trade-off between the required information gain and the cost of testing is decisive for the number of necessary surveys and respective test kits.

#### Case 3: Optimal monitoring for a second outbreak

After the relaxation of measures that assisted in mitigating the initial spread of the disease, it is critical to monitor the population for a possible second outbreak. The OPALITS methodology supports such monitoring with surveys of the population based on data up to and after the release of the measures.

First, Bayesian inference is performed with data available from February the 25th up to June the 6th to update the uniform priors; the resulting marginal posteriors are shown in figure S2 of the Supplementary Information. This date coincides with the first stage of major release of measures in Switzerland [20]. The effects of interventions are modeled by a parametrized time-dependent transmission rate and mobility factor (see Materials and Methods). The inferred probability distributions of these additional parameters are taken into account as the OPALITS maximizes the information gain. Note that 𝒯 = {Jun 7, …, Jun 14} in this case.

Subsequently, data from February the 25th up to July the 9th are included, repeating the Bayesian inference and estimating the marginal distributions and predictions shown in Figures S3 and S4, 𝒯 = {Jul 10, …, Jul 17}. The results indicate that the relaxation of measures correlates with an increase in the number of reported infections (Figure 5). The information gain for each canton indicates the most informative surveys should be performed a week after performing the inference. The provided information could then assist in estimating the severity of a second outbreak as indicated by the maximum of the utility in time (Tables S3 and S4). Given that tests should be carried out in four locations and times, the methodology promotes optimal surveys for two different times, within a week, in the cantons of Zurich and Vaud. First, surveys should be performed in Zurich, providing high information gain for both considered cases. The next two surveys are to be performed in Zurich and Vaud, with a rank that depends on the considered case, while the fourth test should be performed in Vaud. We find that the information gain from the last test is approximately 10% of the cumulative information gain from the first three surveys. The number of surveys can be then selected according to the available test-kits *N*.

![Figure 5:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/11/10/2020.11.09.20228320/F5.medium.gif)


Figure 5: Optimal testing strategy to monitor a second outbreak.
Bayesian inference determines the parameters of the first infection wave using the data (black dots) of the daily new reported infections up to the 6th of June (upper plot) and to the 9th of July (lower plot). The 99% confidence intervals are plotted in gray. The proposed testing strategy is plotted with vertical bars at the optimal days found. Here blue indicates the utilities for the first survey. The green bars correspond to the gain in utility when adding a second survey assuming the first was chosen in the optimal location, while the yellow and red bars correspond to adding a third and fourth survey.

#### Case 4: Effectiveness of Optimal Testing

We demonstrate the importance of following the OPALITS by comparing it with a non-specific testing campaign that is based on heuristics. We first re-examine the situation at the start of an epidemic and assume that the available resources allow for two surveys. Surveys are simulated by evaluating the epidemiological model with the maximum a-posteriori estimate (MPE) of the parameters obtained from the inference in phase II (exponential growth) of the epidemic. We used data of the first 21 days of the infection spread in Switzerland [19] (February 25th to March 17th). After evaluating the model, artificial surveys are obtained by adding a stochastic error term.

For the optimal strategy, data are collected by consulting figure 3. Thus, the two surveys are performed in the cantons of Zurich and Vaud, on the 2nd of March and the 27th of February respectively. For a non-specific strategy, the cantons of Ticino and Bern were selected, on the 28th of February. We remark that these are, respectively, the canton where the first infection was reported and the canton of the country’s capital. These artificial data, obtained for the two strategies, are added to the real data of the daily reported cases from the first 8 days after the outbreak in Ticino. For the expanded data-set 𝒟 the posterior distributions ![Graphic][5] are found by sampling the model parameters using nested sampling [14].

The resulting one- and two-dimensional marginalized posterior distributions for both strategies are shown in figure 6. We note that the dispersion coefficient *r* (defined in the Materials and Methods) in the error model for the real data (the reported infections) and the correlation parameter are almost the same for both strategies. However, the model parameters show significant differences even when only two new data-points are added to a set of 208 data-points. The posterior distributions of the parameters of interest are propagated through the epidemiology model to provide the uncertainties in the number of unreported infectious individuals. In figure 7 the model output for the total number of unreported infections is plotted together with a 99% confidence interval along with the true value of the unreported cases obtained by using the selected parameters. The predictions from the OPALITS have a much higher certainty, with a confidence interval that is up to four times narrower than the one from a non-specific strategy. The same figure also shows the relative histogram plots for the effective reproduction number, which for the employed model is given by *R<sub>t</sub>* = *βDα* + *βDµ*(1 − *α*) [13]. Not only is the histogram more peaked when data are optimally collected, but the mean values of the two histograms also differ. When data are optimally collected, the mean value found for the effective reproduction number is 2.1, whereas when the non-specific strategy is followed the mean value is 3.2. A mean value of 3.2 could lead to stricter non-pharmaceutical interventions, which might prove unnecessary and harmful for the economy.
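The propagation of parameter samples to the effective reproduction number can be sketched as follows. The formula is the one stated above; the samples are hypothetical stand-ins drawn from uniform ranges, not the posterior samples of this study, and the helper names are illustrative.

```python
import random


def effective_reproduction_number(beta, D, alpha, mu):
    """R_t = beta*D*alpha + beta*D*mu*(1 - alpha), as stated in the text."""
    return beta * D * alpha + beta * D * mu * (1.0 - alpha)


def summarize(values, level=0.99):
    """Mean and central interval of a list of samples (simple order statistics)."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[int((1.0 - level) / 2.0 * n)]
    upper = ordered[min(n - 1, int((1.0 + level) / 2.0 * n))]
    return sum(ordered) / n, lower, upper


# Hypothetical parameter samples standing in for posterior draws.
rng = random.Random(1)
samples = [
    dict(beta=rng.uniform(0.8, 1.2), D=rng.uniform(3.0, 4.0),
         alpha=rng.uniform(0.1, 0.3), mu=rng.uniform(0.4, 0.6))
    for _ in range(2000)
]
rt_values = [effective_reproduction_number(**s) for s in samples]
mean, lower, upper = summarize(rt_values)
print(f"mean R_t = {mean:.2f}, 99% interval = [{lower:.2f}, {upper:.2f}]")
```

A narrower posterior over the parameters translates directly into a narrower interval for the reproduction number, which is the effect compared between the two strategies in figure 7.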

![Figure 6:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/11/10/2020.11.09.20228320/F6.medium.gif)


Figure 6: Marginal posterior distributions for two strategies.
The diagonal shows the histogram for the marginal distribution for every parameter. Purple indicates posterior for the survey following the optimal testing strategy, gray the one for the non-specific strategy. The lower half and upper half show the samples of the joint distribution of two parameters for the optimal and the non-specific strategy respectively. Here black indicates low density and yellow high density.

![Figure 7:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/11/10/2020.11.09.20228320/F7.medium.gif)


Figure 7: Prediction uncertainty for different testing strategies.
Top: The black dots show the actual unreported infectious individuals for an artificial spread in Switzerland. The error bounds show the 99% confidence intervals of the model output for samples of the parameters with data obtained by optimal (purple) and non-specific testing (gray). Bottom: Relative frequency histograms for the effective reproduction number, predicted with data obtained by optimal (purple) and non-specific testing (gray).

Further comparisons, demonstrating the value of the OPALITS, include model predictions with higher certainty, as indicated by confidence intervals that are narrower than the ones obtained from a non-specific strategy (figures S5 and S6, see Supplementary Information). Narrower uncertainty bounds provide higher confidence for decisions related to possible interventions to contain the epidemic.

## 3. Discussion

We introduce a systematic approach to identify optimal times and locations for epidemiological surveys to quantify infectious individuals in a country’s population during the COVID-19 epidemic. The proposed OPALITS methodology exploits prior information and available data to maximize the expected information gain in quantities of interest and to minimize uncertainties in the forecasts of epidemiological models.

The present study addresses the need for an accurate assessment of COVID-19 infections [21] and it is shown to be far more accurate than the currently applied random testing. The proposed methodology is, to the best of our knowledge, the first method to propose an optimal spatio-temporal allocation of limited test-kit resources. A first study for the estimate of unobserved COVID-19 infections [5] in the USA indicated that early testing would have decreased the surveillance gap during a critical phase of the epidemic. More recently a number of studies have emerged that address the optimal allocation of resources. The “Test and Contain” process suggested in [7] addresses an idealized population of 10’000 and solves an allocation problem using predictions of the SIR model. The authors assume isolation of the positively identified individuals and show that just one test a day can reduce the peak of infected individuals by 27%. This study is similar to ours in casting the test allocation problem in an optimization framework, but it uses linear programming, in contrast to the information maximization that we propose. However, their approach is not data informed and does not address a realistic country scenario. Another study [22] focused on test-kit allocation in the Philippines. The authors use a statistical approach and non-linear programming to determine the optimal percentage allocation of COVID-19 test-kits among accredited testing centers in the Philippines, aiming to give all infected individuals an equitable chance of being tested. Their goal of optimal percentage allocation differs from ours, which is the optimal allocation of test-kits in space and time.

The proposed method is demonstrated by focusing on the outbreak of the epidemic in Switzerland. We compare OPALITS with random testing and demonstrate its advantages in producing forecasts with far reduced uncertainties. We note that the existing testing capacity of 1500 tests per million people in Switzerland can be better allocated than the ongoing random testing. Moreover we show that the present methodology will be of particular importance to countries with testing capacity that is far lower than that of Switzerland [16].

The methodology relies on Bayesian experimental design using prior information and available data of reported infections along with forecasts from the $SEI_rI_uR$ model. We compute the optimal testing strategy for three phases of the epidemic. At the onset of the epidemic the method identifies the most crucial dates and locations for randomized tests in the country’s population. The deployment of OPALITS at this phase would have allowed authorities to perform randomized testing in a period of high uncertainty, well in advance of the disease outbreak. Moreover, the presented approach is applicable to any newly arising epidemic and can be used to identify important surveying locations and a general protocol of action whenever an unknown disease starts to spread. In the case of COVID-19, such a course of action would limit early inaccurate estimates of metrics such as the virus mortality rate, estimated at around 3% in early March 2020 by the World Health Organization [23] and currently believed to be lower than 1% [24].

During the period of non-pharmaceutical interventions the proposed strategy would help quantify their effectiveness, assisting decision making for further interventions or for the retraction of measures that may be harmful to the economy. In this study, available data for the daily reported infections prior to any interventions, combined with the proposed methodology, indicated that conducting two surveys after measures are imposed is sufficient. This can help to identify the new virus dynamics quickly and adjust interventions accordingly. Similarly, OPALITS can assist in monitoring for a recurrence of the disease after preventive measures have been relaxed and help guide further planning of interventions. Since massive testing for a new disease might not be possible during its first outbreak and cheap individual tests might become available only later, applying the proposed methodology at this point provides a useful guideline on how to use the individual tests to conduct large-scale surveys. For instance, in Switzerland it was not before mid-April 2020 that rapid COVID-19 tests were released on the market [25]. The suggested course of action in this case would be to collect data on the reported cases before that date and, once cheap individual tests become available, use these data to inform the proposed approach and find an OPALITS to be applied during a possible lockdown.

There are a number of issues that the model should be able to accommodate in the future. These include accounting for virological test sensitivity, delays in the reporting of the test results and bias in the estimate of the unreported infected individuals (Cochran’s formula). Further developments may include models that account for different transmission dynamics in the cantons, while the classical Bayesian inference methods may be replaced with hierarchical Bayesian methods to account for heterogeneous data.

We remark that the proposed OPALITS is not tied to a particular type of data or model, nor to Switzerland. The open source code is modular, scalable and readily adaptable to different scenarios for the epidemic and to countries around the world. We believe that the present work can be a valuable tool for decision makers to allocate resources efficiently for testing the population, providing a reliable quantification of the spread of the disease and designing effective interventions.

## Data Availability

The open source software is available on GitHub.

[https://github.com/cselab/optimal-testing](https://github.com/cselab/optimal-testing) 

## Author contributions

Conceptualization: C.P., P.Ko.; Data curation: M.C., P.W.; Formal Analysis: M.C., P.W., G.A., D.W., C.P.; Funding acquisition: P.Ko.; Investigation: M.C., P.W., C.P., G.A., P.Ko.; Methodology: P.W., M.C., C.P., G.A., P.Ko.; Project administration: P.Ko.; Resources: P.Ko.; Software: P.W., M.C., D.W., I.K., P.Ka.; Supervision: C.P., P.Ko., G.A.; Validation: M.C., P.W.; Visualization: P.W., M.C., P.Ka.; Writing – original draft: P.W., M.C., C.P., P.Ko.; Writing – review & editing: P.W., M.C., C.P., P.Ko., G.A., D.W., I.K., P.Ka.

## Competing interests

The authors have no competing interests.

## Data and materials availability

All data are available in the manuscript or the Supplementary Materials. The code is available at [https://github.com/cselab/optimal-testing](https://github.com/cselab/optimal-testing).

## Supplementary Information

### 1. Materials and Methods

The optimal time (day) and location (canton) for surveying a population to detect infectious individuals is determined via Bayesian optimal experimental design [1]. This optimal testing allocation (OPALITS) relies on combining Bayesian inference and utility theory with forecasting models of the epidemic. We remark that the OPALITS does not depend on a particular epidemiological model or type of data. The methodology is applicable at all stages of the epidemic (inception to re-occurrence). It can operate without data at the early stages of the pandemic and takes advantage of data available at later stages of the pandemic. The methodology is rendered computationally efficient using a sequential optimization algorithm [2].

#### Bayesian Inference from randomized testing

We consider a testing campaign consisting of a set $s$ of surveys $s_i = (k_i, t_i)$, $i = 1, \dots, M_y$, performed in location $k_i \in \mathcal{C}$ and on day $t_i \in \mathcal{T}$. These surveys measure a quantity of interest (QoI), denoted by $\boldsymbol{y}(s) = (y_1, \dots, y_{M_y})$. Here, $y_i$ is the number of unreported infectious individuals, measured through survey $s_i$. The QoI can be predicted by a model ![Graphic][6] (here the $SEI_rI_uR$ epidemiological model) that depends on parameters of interest $\boldsymbol{\vartheta} \in \mathbb{R}^N$ and nuisance parameters ![Graphic][7]. The distinction between model and nuisance parameters is discussed in later sections. We note that both sets of parameters are uncertain and the proposed method aims to reduce the uncertainty only in the parameters of interest.

A stochastic error term $\varepsilon(s)$ links the model prediction with the QoI

![Formula][8]

The error $\varepsilon(s)$ is assumed to follow a zero-mean multivariate normal distribution $\mathcal{N}(0, \Sigma)$ with covariance matrix ![Graphic][9]. The elements of the covariance matrix ($\Sigma_{s,s'}$) correspond to surveys taken at $s = (k, t)$ and $s' = (k', t')$ and are given by

![Formula][10]

where $\delta_{kk'}$ is the Kronecker delta, which is 1 for $k = k'$ and 0 otherwise. The correlation time $\tau \in [0.5, 3.5]$ is considered a nuisance parameter. These assumptions about the covariance imply that surveys in different locations are not correlated, while those in the same location have an exponentially decaying temporal correlation. The latter avoids clustering of surveys in small time intervals [3]. The factor $\sigma_t \in \mathbb{R}$ is assumed proportional to the expectation of the QoI, taken over all possible survey locations and over the range of model and nuisance parameters

![Formula][11]

where $s_i = (i, t)$. The parameter $c \in [0, 0.25]$ is considered a model parameter. The expectation ![Graphic][12] is taken with respect to all parameters $\boldsymbol{\vartheta}$ and ![Graphic][13] that follow the prior probability distribution with density ![Graphic][14].
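The covariance structure described above can be illustrated with a short sketch. It assumes the elementwise form $\Sigma_{s,s'} = \sigma_t \sigma_{t'} \delta_{kk'} \exp(-|t - t'|/\tau)$, which matches the verbal description (no correlation across locations, exponential decay in time) but is an assumption about the exact expression. The function name `survey_covariance` and all example values are hypothetical.

```python
import math


def survey_covariance(surveys, sigma, tau):
    """Covariance matrix for surveys given as (location, day) pairs.

    Assumed form (not taken verbatim from the paper):
        Sigma[s][s'] = sigma(t) * sigma(t') * delta(k, k') * exp(-|t - t'| / tau)
    `sigma` is a callable mapping a day to the error scale on that day.
    """
    n = len(surveys)
    cov = [[0.0] * n for _ in range(n)]
    for a, (k_a, t_a) in enumerate(surveys):
        for b, (k_b, t_b) in enumerate(surveys):
            if k_a == k_b:  # Kronecker delta: only same-location surveys correlate
                decay = math.exp(-abs(t_a - t_b) / tau)
                cov[a][b] = sigma(t_a) * sigma(t_b) * decay
    return cov


# Hypothetical example: two surveys in canton "ZH" one day apart, one in "TI".
surveys = [("ZH", 10), ("ZH", 11), ("TI", 10)]
cov = survey_covariance(surveys, sigma=lambda t: 5.0, tau=2.0)
for row in cov:
    print([round(v, 3) for v in row])
```

In this toy case the two surveys in the same canton on adjacent days are strongly correlated, so the second one carries little independent information. This is the mechanism by which the correlation term discourages clustering of surveys in time.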

Under these assumptions, the conditional probability of $\boldsymbol{y}$ on ![Graphic][15] and $s$ is given by

![Formula][16]

where $|\Sigma(s)|$ is the determinant of the covariance matrix and ![Graphic][17].
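For reference, a zero-mean Gaussian error with covariance $\Sigma(s)$ implies the standard multivariate normal density written out below. The symbols $F$ for the model output and $\boldsymbol{\varphi}$ for the nuisance parameters are placeholders chosen for this note and are not necessarily the authors' notation.

```latex
p(\boldsymbol{y} \mid \boldsymbol{\vartheta}, \boldsymbol{\varphi}, s)
  = \frac{1}{\sqrt{(2\pi)^{M_y}\,|\Sigma(s)|}}
    \exp\!\left(-\tfrac{1}{2}\,\boldsymbol{r}^{\top}\,\Sigma(s)^{-1}\,\boldsymbol{r}\right),
\qquad
\boldsymbol{r} = \boldsymbol{y} - F(\boldsymbol{\vartheta}, \boldsymbol{\varphi}; s).
```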

In the present study, the QoI measured by a survey is the number of unreported infectious individuals in a particular canton on a particular date. This implicitly assumes that there are no restrictions on when the survey can be conducted and that there are no observational delays, which means that the QoI is obtained instantaneously. Neither assumption is restrictive, however. Restrictions on the possible survey dates can be accounted for by simply excluding those dates from the dates on which the utility function is evaluated. Also, a delay of one day (meaning that two days are needed to survey a canton $k$, starting from day $t$) would mean that ![Graphic][18] is measured. In other words, when there is a delay the measured quantity can still be mapped to a model quantity, which allows us to perform Bayesian inference. Several types of measurements (rapid testing [4], PCR [5], swabs [6]) are being proposed for testing asymptomatic individuals. We emphasize that our methodology is compatible with any of these types. Data-related issues such as uncertainties, test sensitivities and delays in processing can be accommodated in the Bayesian inference framework and in the input to the SEIR model.

#### Expected Information Gain

The most informative surveys $\boldsymbol{y}$ provide the least uncertainty in the estimates of the model parameters $\boldsymbol{\vartheta}$. Starting with a user-postulated prior distribution $p(\boldsymbol{\vartheta})$, Bayesian learning is used to update the uncertainties in the model parameters, leading to a posterior distribution ![Graphic][19], based on the information contained in the test data $\boldsymbol{y}$. The Kullback–Leibler (KL) divergence between the posterior ![Graphic][20] and the prior distribution $p(\boldsymbol{\vartheta})$ of the model parameters measures the distance between the two distributions. Informative data produce posterior distributions that differ from the prior; greater differences lead to higher information gain. Therefore, the most informative data $\boldsymbol{y}$ correspond to the testing strategy (measurement locations and times) with the highest information gain [7, 8].

The OPALITS is identified by maximizing a utility function [1]. One choice is the KL divergence ![Graphic][21] quantifying the information gain from the data [1]. However, since data are not available in the experimental design phase, the utility function is selected here to be the expected KL divergence ![Graphic][22] over all data generated by the model prediction error equation 1. To account also for the uncertainty in the nuisance parameters ![Graphic][23], encoded in the prior distribution ![Graphic][24], the expectation is taken with respect to ![Graphic][25] as well, which results in the utility function [1]

![Formula][26]

By using Bayes’ theorem

![Formula][27]

the utility function can be simplified to

![Formula][28]

Note that the expected utility depends only on the locations and times of the measurements via $s$. The term ![Graphic][29] is the model evidence given by

![Formula][30]

The choice of the prior distribution $p(\boldsymbol{\vartheta})$ for the parameters allows prior knowledge from epidemiology to be incorporated. If no information is available from data, as is the case at the beginning of an epidemic, a uniform prior distribution can be assumed. Table S5 summarizes our choice of prior distributions for all the involved uncertain quantities. If data $\boldsymbol{d}$ of the daily number of reported infectious individuals are available, Bayesian inference can be used to inform the prior distribution, as described later on. In this case, the prior $p(\boldsymbol{\vartheta})$ in equation 7 is replaced by the distribution $p(\boldsymbol{\vartheta}|\boldsymbol{d})$ informed from the data $\boldsymbol{d}$.
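As a reading aid, the standard expected-information-gain identity behind this construction can be written as follows. This is a sketch of the textbook form [1], not a transcription of the numbered equations; the symbol $\boldsymbol{\varphi}$ for the nuisance parameters is a placeholder chosen for this note.

```latex
\begin{aligned}
U(s) &= \mathbb{E}_{\boldsymbol{y} \mid s}\!\left[
          D_{\mathrm{KL}}\!\left(p(\boldsymbol{\vartheta} \mid \boldsymbol{y}, s)\,\|\,p(\boldsymbol{\vartheta})\right)\right]
      = \int\!\!\int p(\boldsymbol{\vartheta} \mid \boldsymbol{y}, s)\,
        \ln\frac{p(\boldsymbol{\vartheta} \mid \boldsymbol{y}, s)}{p(\boldsymbol{\vartheta})}\,
        \mathrm{d}\boldsymbol{\vartheta}\; p(\boldsymbol{y} \mid s)\,\mathrm{d}\boldsymbol{y} \\
     &= \int\!\!\int \ln\frac{p(\boldsymbol{y} \mid \boldsymbol{\vartheta}, s)}{p(\boldsymbol{y} \mid s)}\,
        p(\boldsymbol{y} \mid \boldsymbol{\vartheta}, s)\, p(\boldsymbol{\vartheta})\,
        \mathrm{d}\boldsymbol{\vartheta}\,\mathrm{d}\boldsymbol{y}, \\
p(\boldsymbol{y} \mid \boldsymbol{\vartheta}, s)
     &= \int p(\boldsymbol{y} \mid \boldsymbol{\vartheta}, \boldsymbol{\varphi}, s)\,
        p(\boldsymbol{\varphi} \mid \boldsymbol{\vartheta})\,\mathrm{d}\boldsymbol{\varphi},
\qquad
p(\boldsymbol{y} \mid s)
      = \int p(\boldsymbol{y} \mid \boldsymbol{\vartheta}, s)\, p(\boldsymbol{\vartheta})\,\mathrm{d}\boldsymbol{\vartheta}.
\end{aligned}
```

The second line follows from the first by substituting Bayes' theorem, $p(\boldsymbol{\vartheta} \mid \boldsymbol{y}, s) = p(\boldsymbol{y} \mid \boldsymbol{\vartheta}, s)\,p(\boldsymbol{\vartheta}) / p(\boldsymbol{y} \mid s)$, so that the posterior never has to be evaluated explicitly.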

In the present work, the assumed nuisance parameters are the correlation time $\tau$ and the initial condition of the unreported infections in the cantons of Aargau, Bern, Basel-Landschaft, Basel-Stadt, Fribourg, Geneva, Grisons, St. Gallen, Ticino, Vaud, Valais and Zurich

![Formula][31]

with prior distributions ![Graphic][32] and $\tau \sim \mathcal{U}([0.5, 3.5])$.

#### Epidemiological Model

Here we employ the $SEI_rI_uR$ epidemiological model [9] to forecast the dynamics of the coronavirus outbreak in Switzerland

![Formula][33]

where ![Graphic][34] and ![Graphic][35] denote the number of individuals in canton $k \in \{1, \dots, K\}$ that are susceptible, exposed, reported infectious and unreported infectious, respectively. We denote by $K$ the number of cantons (26 in Switzerland) and by $N_k$ the total population of canton $k$, while the population mobility between cantons $k$ and $l$ is denoted by $M_{kl}$, with values obtained from the Swiss Federal Statistical Office [10]. The model parameters are the transmission rate ($\beta$), the relative transmission rate between reported and unreported infectious individuals ($\mu$), the virus latency period ($Z$), the infectious period ($D$), the reporting rate ($\alpha$) and the mobility factor ($\theta$).
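To make the roles of the parameters concrete, the following is a minimal sketch of the compartment structure for a single, well-mixed population. It deliberately omits the inter-canton mobility terms ($M_{kl}$, $\theta$) of the full model, adds a removed compartment $R$ purely for bookkeeping, and uses forward Euler time stepping. The function names and all numerical values are illustrative assumptions, not the inferred parameters of the study.

```python
def seiriur_step(state, params, dt):
    """One forward-Euler step of a single-population SEIrIuR sketch (no mobility)."""
    S, E, Ir, Iu, R = state
    beta, mu, Z, D, alpha = params
    N = S + E + Ir + Iu + R
    new_exposed = beta * S * (Ir + mu * Iu) / N  # new infections per day
    onset = E / Z                                # exposed becoming infectious
    dS = -new_exposed
    dE = new_exposed - onset
    dIr = alpha * onset - Ir / D                 # a fraction alpha gets reported
    dIu = (1.0 - alpha) * onset - Iu / D         # the rest stays unreported
    dR = (Ir + Iu) / D
    return (S + dt * dS, E + dt * dE, Ir + dt * dIr, Iu + dt * dIu, R + dt * dR)


def simulate(state, params, days, dt=0.05):
    """Return the state at the end of each day, starting with the initial state."""
    steps_per_day = round(1.0 / dt)
    trajectory = [state]
    for _ in range(days):
        for _ in range(steps_per_day):
            state = seiriur_step(state, params, dt)
        trajectory.append(state)
    return trajectory


# Illustrative values only: beta, mu, Z, D, alpha.
params = (1.0, 0.5, 3.5, 3.5, 0.15)
initial = (1_000_000.0 - 10.0, 0.0, 0.0, 10.0, 0.0)  # ten unreported infections
trajectory = simulate(initial, params, days=30)
print("unreported infectious after 30 days:", round(trajectory[-1][3], 1))
```

With these illustrative values most infections are never reported, which is exactly the quantity the surveys are meant to measure.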

We employ different time-dependent expressions for the transmission rate and the mobility factor for each stage of the epidemic. Constants are chosen for the start of an epidemic, while for the monitoring of interventions the following expressions are used:

![Formula][36]

where $b$, $b_1$, $\theta$ and $\theta_1$ are the transmission rates and mobility factors before and after the intervention. Time $t = 0$ corresponds to the 25th of February 2020, and $\delta_1 = 21$ to the 17th of March 2020, when the lockdown was announced in Switzerland [11]. Finally, for the third case (monitoring of a second outbreak) we assume that

![Formula][37]

As in equation 10, $b$ is the transmission rate before the intervention, while $b_1 = c_1 b$ and $b_2 = c_2 b$ with $c_1, c_2 \in [0, 1]$ are the transmission rates after the two interventions. Similarly, $\theta$ is the mobility factor before any interventions took place, while $\theta_1 = c_3 \theta$ and $\theta_2 = c_4 \theta$ with $c_3, c_4 \in [0, 1]$ are the mobility factors after the two interventions. Moreover, $\delta_1$ and $\delta_2$ correspond to the days of the interventions. The day when the measures are loosened is denoted by $\delta_3$. After that day, the transmission rate increases gradually

![Formula][38]

with $\lambda \in [0, 0.03]$, while the mobility factor regains its initial value of $\theta$.

#### Estimation of the Expected Information Gain

The calculation of the expected utility from equation 7 is performed with Monte-Carlo integration. Samples from the prior distribution are denoted by $\boldsymbol{\vartheta}^{(i)} \sim p(\boldsymbol{\vartheta})$ and by ![Graphic][39], while samples on the measurement space are denoted by ![Graphic][40], where $i \in \{1, \dots, N_\vartheta\}$ and $j \in \{1, \dots, N_y\}$. With these samples, an estimate of the expected utility is computed as

![Formula][41]

In our implementation the samples $\boldsymbol{\vartheta}^{(i)}$ and ![Graphic][42] remain the same for different values of $s$. Thus, the model evaluations ![Graphic][43] are carried out only once and are stored and reused in the iterations of the optimization. This allows the computational cost of the model evaluations to be separated from the cost of computing the utility, which scales as ![Graphic][44].
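The nested Monte Carlo idea can be sketched on a toy problem where the answer is known in closed form. The example below assumes a scalar linear model $y = g\,\vartheta + \varepsilon$ with $\vartheta \sim \mathcal{N}(0, 1)$ and $\varepsilon \sim \mathcal{N}(0, \sigma^2)$, for which the expected information gain is $\tfrac{1}{2}\ln(1 + g^2/\sigma^2)$. It has no nuisance parameters, and the names `expected_information_gain` and `toy_model` are hypothetical; it is not the estimator of the accompanying software.

```python
import math
import random


def log_normal_pdf(x, mean, std):
    """Log-density of a univariate normal distribution."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))


def expected_information_gain(model, design, thetas, noise_std, n_y, rng):
    """Nested Monte Carlo estimate of the expected KL divergence for one design."""
    # Model evaluations are done once per prior sample and reused below.
    predictions = [model(theta, design) for theta in thetas]
    total = 0.0
    for pred in predictions:
        for _ in range(n_y):
            y = pred + rng.gauss(0.0, noise_std)  # synthetic measurement
            log_lik = log_normal_pdf(y, pred, noise_std)
            # Evidence p(y | design), averaged over the same prior samples.
            evidence = sum(
                math.exp(log_normal_pdf(y, p, noise_std)) for p in predictions
            ) / len(predictions)
            total += log_lik - math.log(evidence)
    return total / (len(predictions) * n_y)


def toy_model(theta, g):
    """Toy linear model: the design g scales the sensitivity to theta."""
    return g * theta


rng = random.Random(0)
thetas = [rng.gauss(0.0, 1.0) for _ in range(500)]  # prior samples
estimate = expected_information_gain(toy_model, 2.0, thetas, 1.0, 2, rng)
exact = 0.5 * math.log(1.0 + 2.0 ** 2 / 1.0 ** 2)
print("estimate:", round(estimate, 3), "closed form:", round(exact, 3))
```

In this sketch the inner evidence sum makes the cost grow with the square of the number of prior samples, while the model itself is evaluated only once per sample.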

#### Optimal Location and Time of Testing

We define the optimal survey times and locations as

![Formula][45]

where ![Graphic][46] with ![Graphic][47] denote the locations ![Graphic][48] and times ![Graphic][49] for the optimal surveys with $i \in \{1, \dots, M_y\}$. For a grid search, the associated computational cost is ![Graphic][50] and thus grows exponentially with the number of surveys. This curse of dimensionality is avoided by using a sequential optimization method [2] to approximate the global optimum by iteratively solving

![Formula][51]

where $s = (k, t)$ is the location and time to be estimated sequentially starting with $n = 1$ and

![Formula][52]

Following this, we define the expected information gain for survey $n$ as

![Formula][53]
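The sequential strategy can be sketched as a greedy loop: each new survey is chosen to maximize the utility given the surveys already fixed. The code below is a generic illustration under stated assumptions. `toy_utility` is a hypothetical stand-in for the expected information gain (it rewards later days and discounts surveys close in time to earlier ones in the same canton), and the candidate set is made up.

```python
import math


def sequential_design(candidates, utility, n_surveys):
    """Greedy approximation: add one survey at a time, keeping earlier picks fixed."""
    chosen, gains = [], []
    for _ in range(n_surveys):
        remaining = [c for c in candidates if c not in chosen]
        best = max(remaining, key=lambda c: utility(chosen + [c]))
        gains.append(utility(chosen + [best]) - utility(chosen))
        chosen.append(best)
    return chosen, gains


def toy_utility(design, tau=2.0):
    """Hypothetical utility of an ordered list of (canton, day) surveys."""
    signal = {"ZH": 1.0, "TI": 0.8}
    total = 0.0
    for i, (k, t) in enumerate(design):
        value = signal[k] * (1.0 + 0.1 * t)
        # Discount information already captured by earlier same-canton surveys.
        for k_prev, t_prev in design[:i]:
            if k_prev == k:
                value *= 1.0 - math.exp(-abs(t - t_prev) / tau)
        total += value
    return total


candidates = [(k, t) for k in ("ZH", "TI") for t in range(5)]
chosen, gains = sequential_design(candidates, toy_utility, n_surveys=3)
print(chosen)
print([round(g, 3) for g in gains])
```

Each greedy step scans the candidate set once, so the number of utility evaluations grows linearly with the number of surveys, whereas an exhaustive search over all combinations grows exponentially. The price is that the result is an approximation of the global optimum.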

#### Quantification of Uncertainty

A data-informed prior $p(\boldsymbol{\vartheta}|\boldsymbol{d})$ of the model parameters $\boldsymbol{\vartheta}$ can be computed from available data ![Graphic][54], collected at $M_d$ locations and days. Here, the available data $\boldsymbol{d}$ refer to the daily number of reported infectious individuals and are distinct from the data $\boldsymbol{y}$ of the number of unreported infectious individuals. The latter are obtained from testing strategies at selected populations using optimal experimental design. The data are linked to a distinct model output ![Graphic][55] through the following error model

![Formula][56]

where $\mathcal{NB}$ is the negative binomial distribution with mean $f$ and dispersion $\nu$. Also, $s_i = (k_i, t_i)$ is the location and time at which the data point $d_i$ was collected. The choice of a different error model, compared to equation 1, is based on the assumption that the data are independent and identically distributed. Such an assumption would not be acceptable in the measurement model in equation 1, as it may result in uncorrelated measurements that can become clustered in small time intervals [3].
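A negative binomial likelihood of this kind can be sketched as follows. The parameterization is an assumption made for illustration: the distribution is specified by its mean and a dispersion $\nu$ such that the variance equals $\text{mean} + \text{mean}^2/\nu$. The function names and the example numbers are hypothetical.

```python
import math


def neg_binomial_logpmf(d, mean, nu):
    """log P(d) for a negative binomial with the given mean and dispersion nu.

    Assumed parameterization: variance = mean + mean**2 / nu.
    """
    return (
        math.lgamma(d + nu) - math.lgamma(nu) - math.lgamma(d + 1)
        + nu * math.log(nu / (nu + mean))
        + d * math.log(mean / (nu + mean))
    )


def log_likelihood(data, predictions, nu):
    """Sum of independent negative binomial log-probabilities."""
    return sum(neg_binomial_logpmf(d, f, nu) for d, f in zip(data, predictions))


# Hypothetical daily reported cases and model predictions for one canton.
data = [3, 7, 12, 20]
predictions = [4.0, 8.0, 13.0, 18.0]
print(round(log_likelihood(data, predictions, nu=5.0), 3))
```

Because the data points are treated as independent, the log-likelihood is a plain sum over locations and days, in contrast to the correlated Gaussian error used for the survey measurements.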

The data ![Graphic][57] are the daily number of reported infections per canton in Switzerland [12], which corresponds to the following model quantity

![Formula][58]

The posterior distribution, which is used subsequently as a data-informed prior, is obtained using Bayes’ theorem

![Formula][59]

and is sampled with a nested sampling algorithm [13]. Note the difference to equation 6 and the optimal testing methodology, where we are interested in reducing the uncertainty in ![Graphic][60], which excludes the nuisance parameters ![Graphic][61]. For the dispersion parameter in equation 18, it is assumed that ![Graphic][62]. The coefficient $r$ is unknown and included in the parameter set, where $r \sim \mathcal{U}([0, 2])$.

![Figure 1:](https://www.medrxiv.org/content/medrxiv/early/2020/11/10/2020.11.09.20228320/F8.medium.gif)

Figure 1: Marginal posterior distributions with data up to 17th of March 2020.
The data used correspond to the daily reported infectious persons in the cantons of Switzerland. The marginals with a canton label XY correspond to the initial condition ![Graphic][63] for the unreported cases in that canton.

The three inferences performed are summarized in table S5, which shows the model parameters involved in each case. The histograms of the resulting samples are shown in figures S1, S2, and S3.

We remark that, using the present methodology, the inferred date for the beginning of the intervention is $\delta_1 = 22.5$, which is the 18th of March 2020, corresponding well with the 17th of March 2020 on which the lockdown was introduced in Switzerland [11]. Moreover, we infer a significant reduction in the mobility factor, which indicates that traffic between cantons was also minimized. For inference III we plot the fit using the inferred parameters in figure S4. The daily reported cases per canton are shown, together with the data used for the inference.

![Figure 2:](https://www.medrxiv.org/content/medrxiv/early/2020/11/10/2020.11.09.20228320/F9.medium.gif)

Figure 2: Marginal posterior distributions with data up to 6th of June 2020.
The data used correspond to the daily reported infectious persons in the cantons of Switzerland. The marginals with a canton label XY correspond to the initial condition ![Graphic][64] for the unreported cases in that canton.

![Figure 3:](https://www.medrxiv.org/content/medrxiv/early/2020/11/10/2020.11.09.20228320/F10.medium.gif)

Figure 3: Marginal posterior distributions with data up to 9th of July 2020.
The data used correspond to the daily reported infectious persons in the cantons of Switzerland. The marginals with a canton label XY correspond to the initial condition ![Graphic][65] for the unreported cases in that canton.

![Figure 4:](https://www.medrxiv.org/content/medrxiv/early/2020/11/10/2020.11.09.20228320/F11.medium.gif)

Figure 4: Maximum a-posteriori prediction with data up to 9th of July 2020.
The red points correspond to the daily reported cases per canton and the blue curve shows the maximum a-posteriori prediction. The 99% confidence interval is plotted in green and is based on the samples shown in figure S3.

![Figure 5:](https://www.medrxiv.org/content/medrxiv/early/2020/11/10/2020.11.09.20228320/F12.medium.gif)

Figure 5: Comparison of prediction uncertainty per canton.
The predictions are based on optimal strategies and non-specific testing for collection of data. They are also based on the $SEI_rI_uR$ model output. The error bounds show the 99% confidence intervals of the unreported infectious model output for samples of the parameters with data obtained by optimal (purple) and standard testing (gray). The black dots show the actual unreported infectious for an artificial spread in Switzerland.

![Figure 6:](https://www.medrxiv.org/content/medrxiv/early/2020/11/10/2020.11.09.20228320/F13.medium.gif)

Figure 6: Comparison of propagated uncertainty per canton.
The predictions are based on optimal strategies and non-specific testing. The $SEI_rI_uR$ model output with added model error for the unreported infectious is shown. The error bounds show the 99% confidence intervals of the model output with added model error for samples of the parameters with data obtained by optimal (purple) and standard testing (gray). The black dots show the actual unreported infectious for an artificial spread in Switzerland.

[Table 1:](http://medrxiv.org/content/early/2020/11/10/2020.11.09.20228320/T1) Maximum expected information gain for the outbreak of a new disease.
The corresponding optimal dates are shown in parentheses.

[Table 2:](http://medrxiv.org/content/early/2020/11/10/2020.11.09.20228320/T2) Maximum expected information gain of non-pharmaceutical interventions.
The corresponding optimal dates are shown in parentheses.

[Table 3:](http://medrxiv.org/content/early/2020/11/10/2020.11.09.20228320/T3) Maximum expected information gain for monitoring of a second outbreak with uninformed $b_3$.
The corresponding optimal dates are shown in parentheses.

Assume we want to estimate the proportion of a population with some margin of error $d$ and a small risk $\alpha$, i.e., we want $\Pr(|P - p| \geq d) = \alpha$. Here, the proportion corresponds to the proportion of the population that is infected but unreported. The minimum number of samples to achieve this is given by Cochran’s formula [14],

![Formula][66]

where $z_\alpha$ is the inverse of the standard normal cumulative distribution function evaluated at $1 - \alpha/2$. This formula assumes that the population is of infinite size. To correct for a finite population of size $N$, we compute

![Formula][67]

Figure 7 presents the minimum number of samples needed to survey the cantons of Switzerland for $d = 0.01$ and $\alpha = 0.01$. Notice that $\alpha = 0.01$ corresponds to a 99% confidence level.
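The sample-size calculation can be sketched in a few lines. The code assumes the usual textbook form of Cochran's formula, $n_0 = z_\alpha^2\, p(1-p)/d^2$, with the finite-population correction $n = n_0 / (1 + (n_0 - 1)/N)$, and uses the probability of infection $p = 0.1$ quoted in the caption of Figure 7. The function name `cochran_sample_size` and the two population sizes are illustrative assumptions (roughly a Zurich-sized and a very small canton), not official statistics.

```python
from statistics import NormalDist


def cochran_sample_size(p, d, alpha, population=None):
    """Minimum sample size for estimating a proportion p within margin d.

    Assumes the textbook form of Cochran's formula; if `population` is given,
    the finite-population correction is applied.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n0 = z * z * p * (1.0 - p) / (d * d)
    if population is None:
        return n0
    return n0 / (1.0 + (n0 - 1.0) / population)


# Assumed population sizes, for illustration only.
n_large = cochran_sample_size(p=0.1, d=0.01, alpha=0.01, population=1_500_000)
n_small = cochran_sample_size(p=0.1, d=0.01, alpha=0.01, population=16_000)
print(round(n_large), round(n_small), round(1.0 - n_small / n_large, 2))
```

Under these assumptions the large-canton result lands close to the 5950 samples quoted in the text, and the small-canton result is roughly a quarter lower, in line with the reduction reported for the smallest canton.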

If more than 26 × 5950 = 154700 test-kits are available, then the maximum information gain is achieved by deploying all tests uniformly in all cantons. However, when it is not realistic to conduct over 154700 tests, we consider testing with limited resources. For example, 30000 available tests are enough to survey 5 cantons (5 × 5950 = 29750). The question we then answer is which 5 of the 26 cantons we should test, given that we must test a minimum of 5950 individuals per canton.

Distributing fewer than this number of tests (5950) in a canton will not provide a statistically reliable estimate for the number of unreported infections there. Thus, in such a case, the measured unreported infections should not be used to estimate the expected information gain.

Finally, we note that in this work we ignore the bias in the estimate of $I_u$. This means that the estimates of unreported infected enter the Bayesian framework without explicitly accounting for this known error.

[Table 4:](http://medrxiv.org/content/early/2020/11/10/2020.11.09.20228320/T4) Maximum expected information gain to monitor a second outbreak with informed $b_3$.
The corresponding optimal dates are shown in parentheses.

[Table 5:](http://medrxiv.org/content/early/2020/11/10/2020.11.09.20228320/T5) Parameters and prior distributions used in Bayesian inference.
Here the data corresponds to the daily reported infections. In all cases, data are used from the 25th of February 2020, when the first reported case was found in the canton of Ticino. Inference I uses data up to the day non-pharmaceutical interventions were announced (17th of March 2020). Inference II uses data up to the day measures were relaxed (6th of June 2020). Inference III uses data up to the 9th of July 2020. The choice of prior distributions is consistent with the choice found in [9]; the ranges used in our study are slightly extended.

![Figure 7:](https://www.medrxiv.org/content/medrxiv/early/2020/11/10/2020.11.09.20228320/F14.medium.gif)

Figure 7: Estimated sample size using Cochran’s formula [14] for every canton for confidence level 99%, margin of error 1% and probability of infection 0.1. The cantons are sorted in descending order of their population. The maximum sample size is estimated for Zurich and is equal to 5950. All the other cantons need up to 14% fewer samples, with the exception of the smallest canton, which needs 27% fewer samples.

## Acknowledgments

We acknowledge discussions with Fabian Wermelinger, Lucas Amoudruz, Martin Boden (ETHZ). Sergio Martin (ETHZ) provided technical assistance with the software.

*   Received November 9, 2020.
*   Revision received November 9, 2020.
*   Accepted November 10, 2020.


*   © 2020, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution-NonCommercial-NoDerivs 4.0 International), CC BY-NC-ND 4.0, as described at [http://creativecommons.org/licenses/by-nc-nd/4.0/](http://creativecommons.org/licenses/by-nc-nd/4.0/)

## References

1.  [1].Du Toit A. Outbreak of a novel coronavirus. Nature Reviews Microbiology 2020;18(3):123–.
    
    
2.  [2]. Verity Rea. Estimates of the severity of coronavirus disease 2019: a model-based analysis. The Lancet Infectious Diseases 2020;20.
    
    
3.  [3].Department of Health and Social Care, UK. Real-time Assess-ment of Community Transmission findings. [https://www.imperial.ac.uk/medicine/research-and-impact/groups/react-study/real-time-assessment-of-community-transmission-findings/](https://www.imperial.ac.uk/medicine/research-and-impact/groups/react-study/real-time-assessment-of-community-transmission-findings/); 2020. Online, Accessed 2020-10-18.
    
    
4.  [4].Lohse S, Pfuhl T, Berkó-Göttel B, Rissland J, Geißler T, Gßrtner B, et al. Pooling of samples for testing for sars-cov-2 in asymptomatic people. The Lancet Infectious Diseases 2020;20(11):1231 –2. doi: [https://doi.org/10.1016/S1473-3099(20)30362-5](https://doi.org/10.1016/S1473-3099(20)30362-5).
    
    
5.  [5].Perkins TA, Cavany SM, Moore SM, Oidtman RJ, Lerch A, Poterek M. Estimating unobserved sars-cov-2 infections in the united states. Pro-ceedings of the National Academy of Sciences 2020;117(36):22597–602. doi:10.1073/pnas.2005476117.
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NDoicG5hcyI7czo1OiJyZXNpZCI7czoxMjoiMTE3LzM2LzIyNTk3IjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjAvMTEvMTAvMjAyMC4xMS4wOS4yMDIyODMyMC5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 

6.  [6].Abdalhamid B, Bilder CR, McCutchen EL, Hinrichs SH, Koepsell SA, Iwen PC. Assessment of Specimen Pooling to Conserve SARS CoV-2 Test-ing Resources. American Journal of Clinical Pathology 2020;153(6):715–8. doi:10.1093/ajcp/aqaa064.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/ajcp/aqaa064&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=32304208&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F11%2F10%2F2020.11.09.20228320.atom) 

7.  [7].Jonnerby J, Lazos P, Lock E, Marmolejo-Cossío F, Ramsey CB, Sridhar D. Test and contain: A resource-optimal testing strategy for covid-19. In: AI for Social Good Workshop. 2020,.
    
    
8.  [8].Karnakov P, Arampatzis G, Kicic I, Wermelinger F, Wälchli D, Papadim-itriou C, et al. Data-driven inference of the reproduction number for covid-19 before and after interventions for 51 european countries. Swiss Medical Weekly 2020;150. doi:[https://doi.org/10.4414/smw.2020.20313](https://doi.org/10.4414/smw.2020.20313).
    
    
9.  [9].Cochran W. Sampling Techniques. Wiley, New York; 1963.
    
    
10. [10].Chaloner K, Verdinelli I. Bayesian Experimental Design: A Review. Sta-tistical Science 1995;10(2):273–304.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1214/ss/1177009939&link_type=DOI) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1995UN93700003&link_type=ISI) 

11. [11].Huan X, Marzouk YM. Simulation-based optimal bayesian experimental design for nonlinear systems. Journal of Computational Physics 2013;232(1):288–317. doi:[https://doi.org/10.1016/j.jcp.2012.08.013](https://doi.org/10.1016/j.jcp.2012.08.013).
    
    
12. [12].Ryan EG, Drovandi CC, McGree JM, Pettitt AN. A review of modern computational algorithms for bayesian optimal design. International Statistical Review 2016;84(1):128–54. doi:10.1111/insr.12107.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1111/insr.12107&link_type=DOI) 

13. [13].Li R, Pei S, Chen B, Song Y, Zhang T, Yang W, et al. Substantial un-documented infection facilitates the rapid dissemination of novel coronavirus (sars-cov-2). Science 2020;368(6490):489–93. doi:10.1126/science.abb3221.
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6Mzoic2NpIjtzOjU6InJlc2lkIjtzOjEyOiIzNjgvNjQ5MC80ODkiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMC8xMS8xMC8yMDIwLjExLjA5LjIwMjI4MzIwLmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 

14. [14].Speagle JS. Dynesty: a dynamic nested sampling package for estimating bayesian posteriors and evidences. Monthly Notices of the Royal Astro-nomical Society 2020;493(3):3132–3158. doi:10.1093/mnras/staa278.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/mnras/staa278&link_type=DOI) 

15. [15].Lindley D. On a measure of the information provided by an experiment. Annals of Mathematical Statistics 1956;27(4):986–1005. doi:10.1214/aoms/1177728069}.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1214/aoms/1177728069&link_type=DOI) 

16. [16].Hasell J, Mathieu E, Beltekian D, Macdonald B, Giattino C, Ortiz-Ospina E, et al. A cross-country database of covid-19 testing. Sci Data 2020;7(345). doi:[https://doi.org/10.1038/s41597-020-00688-8](https://doi.org/10.1038/s41597-020-00688-8).
    
    
17. [17].Dehning J, Zierenberg J, Spitzner FP, Wibral M, Neto JP, Wilczek M, et al. Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions. Science 2020;369(6500). doi:10.1126/science.abb9789.
    

18. [18].Liu Y, Gayle AA, Wilder-Smith A, Rocklöv J. The reproductive number of COVID-19 is higher compared to SARS coronavirus. Journal of Travel Medicine 2020;27(2). doi:[https://doi.org/10.1093/jtm/taaa021](https://doi.org/10.1093/jtm/taaa021).
    
    
19. [19].Kanton Zürich, Statistisches Amt. SARS-CoV-2 Cases communicated by Swiss Cantons and Principality of Liechtenstein (FL). [https://github.com/openZH/covid_19](https://github.com/openZH/covid_19); 2020. Online.
    
    
20. [20].Bundesamtes für Gesundheit. Coronavirus: Informationen vom Bundesamtes für Gesundheit. [https://www.bag.admin.ch/bag/de/home/krankheiten/ausbrueche-epidemien-pandemien/aktuelle-ausbrueche-epidemien/novel-cov.html](https://www.bag.admin.ch/bag/de/home/krankheiten/ausbrueche-epidemien-pandemien/aktuelle-ausbrueche-epidemien/novel-cov.html); 2020. Online.
    
    
21. [21].Angulo FJ, Finelli L, Swerdlow DL. Reopening Society and the Need for Real-Time Assessment of COVID-19 at the Community Level. JAMA 2020;323(22):2247–8. doi:10.1001/jama.2020.7872.
    

22. [22].Buhat CAH, Duero JCC, Felix EFO, Rabajante JF, Mamplata JB. Optimal allocation of COVID-19 test kits among accredited testing centers in the Philippines. medRxiv 2020. doi:10.1101/2020.04.14.20065201.
    

23. [23].World Health Organization. Q&A: Influenza and COVID-19 - similarities and differences. [https://www.who.int/emergencies/diseases/novel-coronavirus-2019/question-and-answers-hub](https://www.who.int/emergencies/diseases/novel-coronavirus-2019/question-and-answers-hub); 2020. Online.
    
    
24. [24].Stringhini S, Wisniak A, Piumatti G, Azman AS, Lauer SA, Baysson H, et al. Seroprevalence of anti-SARS-CoV-2 IgG antibodies in Geneva, Switzerland (SEROCoV-POP): a population-based study. The Lancet 2020;396(10247):313–9. doi:[https://doi.org/10.1016/S0140-6736(20)31304-0](https://doi.org/10.1016/S0140-6736(20)31304-0).
    

25. [25].Launch of the first rapid test for COVID-19 on the Swiss market. [https://www.startupticker.ch/en/news/april-2020/launch-of-the-first-rapid-test-for-covid-19-on-the-swiss-market](https://www.startupticker.ch/en/news/april-2020/launch-of-the-first-rapid-test-for-covid-19-on-the-swiss-market); 2020. Online.
    
    
## References

1.  [1].Ryan EG, Drovandi CC, McGree JM, Pettitt AN. A review of modern computational algorithms for Bayesian optimal design. International Statistical Review 2016;84(1):128–54. doi:10.1111/insr.12107.
    

2.  [2].Papadimitriou C. Optimal sensor placement methodology for parametric identification of structural systems. Journal of Sound and Vibration 2004;278(4):923–47. doi: [https://doi.org/10.1016/j.jsv.2003.10.063](https://doi.org/10.1016/j.jsv.2003.10.063).
    

3.  [3].Papadimitriou C, Lombaert G. The effect of prediction error correlation on optimal sensor placement in structural dynamics. Mechanical Systems and Signal Processing 2012;28:105–27. doi: [https://doi.org/10.1016/j.ymssp.2011.05.019](https://doi.org/10.1016/j.ymssp.2011.05.019); Interdisciplinary and Integration Aspects in Structural Health Monitoring.
    

4.  [4].Chau CH, Strope JD, Figg WD. COVID-19 Clinical Diagnostics and Testing Technology. Pharmacotherapy: The Journal of Human Pharmacology and Drug Therapy 2020;40(8):857–68. doi:10.1002/phar.2439.
    

5.  [5].Long C, Xu H, Shen Q, Zhang X, Fan B, Wang C, et al. Diagnosis of the Coronavirus disease (COVID-19): rRT-PCR or CT? European Journal of Radiology 2020;126:108961. doi: [https://doi.org/10.1016/j.ejrad.2020.108961](https://doi.org/10.1016/j.ejrad.2020.108961).
    

6.  [6].Heller L, Mota CR, Greco DB. COVID-19 faecal-oral transmission: Are we asking the right questions? Science of The Total Environment 2020;729:138919. doi: [https://doi.org/10.1016/j.scitotenv.2020.138919](https://doi.org/10.1016/j.scitotenv.2020.138919).
    

7.  [7].Lindley D. On a measure of the information provided by an experiment. Annals of Mathematical Statistics 1956;27(4):986–1005. doi:10.1214/aoms/1177728069.
    

8.  [8].Karni E, Schmeidler D. Utility theory with uncertainty. In: Handbook of Mathematical Economics; vol. 4. Elsevier; 1991, p. 1763–831.
    
    
9.  [9].Li R, Pei S, Chen B, Song Y, Zhang T, Yang W, et al. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science 2020;368(6490):489–93. doi:10.1126/science.abb3221.
    

10. [10].Bundesamtes für Statistik. Commuters for work purposes. [https://www.bfs.admin.ch/bfsstatic/dam/assets/8507281/master](https://www.bfs.admin.ch/bfsstatic/dam/assets/8507281/master); 2020. Online.
    
    
11. [11].Bundesamtes für Gesundheit. Coronavirus: Informationen vom Bundesamtes für Gesundheit. [https://www.bag.admin.ch/bag/de/home/krankheiten/ausbrueche-epidemien-pandemien/aktuelle-ausbrueche-epidemien/novel-cov.html](https://www.bag.admin.ch/bag/de/home/krankheiten/ausbrueche-epidemien-pandemien/aktuelle-ausbrueche-epidemien/novel-cov.html); 2020. Online.
    
    
12. [12].Kanton Zürich, Statistisches Amt. SARS-CoV-2 Cases communicated by Swiss Cantons and Principality of Liechtenstein (FL). [https://github.com/openZH/covid_19](https://github.com/openZH/covid_19); 2020. Online.
    
    
13. [13].Speagle JS. Dynesty: a dynamic nested sampling package for estimating Bayesian posteriors and evidences. Monthly Notices of the Royal Astronomical Society 2020;493(3):3132–3158. doi:10.1093/mnras/staa278.
    

14. [14].Cochran W. Sampling Techniques. Wiley, New York; 1963.

 [1]: /embed/inline-graphic-1.gif
 [2]: /embed/inline-graphic-2.gif
 [3]: /embed/inline-graphic-3.gif
 [4]: /embed/inline-graphic-4.gif
 [5]: /embed/inline-graphic-5.gif
 [6]: /embed/inline-graphic-6.gif
 [7]: /embed/inline-graphic-7.gif
 [8]: /embed/graphic-8.gif
 [9]: /embed/inline-graphic-8.gif
 [10]: /embed/graphic-9.gif
 [11]: /embed/graphic-10.gif
 [12]: /embed/inline-graphic-9.gif
 [13]: /embed/inline-graphic-10.gif
 [14]: /embed/inline-graphic-11.gif
 [15]: /embed/inline-graphic-12.gif
 [16]: /embed/graphic-11.gif
 [17]: /embed/inline-graphic-13.gif
 [18]: /embed/inline-graphic-14.gif
 [19]: /embed/inline-graphic-15.gif
 [20]: /embed/inline-graphic-16.gif
 [21]: /embed/inline-graphic-17.gif
 [22]: /embed/inline-graphic-18.gif
 [23]: /embed/inline-graphic-19.gif
 [24]: /embed/inline-graphic-20.gif
 [25]: /embed/inline-graphic-21.gif
 [26]: /embed/graphic-12.gif
 [27]: /embed/graphic-13.gif
 [28]: /embed/graphic-14.gif
 [29]: /embed/inline-graphic-22.gif
 [30]: /embed/graphic-15.gif
 [31]: /embed/graphic-16.gif
 [32]: /embed/inline-graphic-23.gif
 [33]: /embed/graphic-17.gif
 [34]: /embed/inline-graphic-24.gif
 [35]: /embed/inline-graphic-25.gif
 [36]: /embed/graphic-18.gif
 [37]: /embed/graphic-19.gif
 [38]: /embed/graphic-20.gif
 [39]: /embed/inline-graphic-26.gif
 [40]: /embed/inline-graphic-27.gif
 [41]: /embed/graphic-21.gif
 [42]: /embed/inline-graphic-28.gif
 [43]: /embed/inline-graphic-29.gif
 [44]: /embed/inline-graphic-30.gif
 [45]: /embed/graphic-22.gif
 [46]: /embed/inline-graphic-31.gif
 [47]: /embed/inline-graphic-32.gif
 [48]: /embed/inline-graphic-33.gif
 [49]: /embed/inline-graphic-34.gif
 [50]: /embed/inline-graphic-35.gif
 [51]: /embed/graphic-23.gif
 [52]: /embed/graphic-24.gif
 [53]: /embed/graphic-25.gif
 [54]: /embed/inline-graphic-36.gif
 [55]: /embed/inline-graphic-37.gif
 [56]: /embed/graphic-26.gif
 [57]: /embed/inline-graphic-38.gif
 [58]: /embed/graphic-27.gif
 [59]: /embed/graphic-28.gif
 [60]: /embed/inline-graphic-39.gif
 [61]: /embed/inline-graphic-40.gif
 [62]: /embed/inline-graphic-41.gif
 [63]: F8/embed/inline-graphic-42.gif
 [64]: F9/embed/inline-graphic-43.gif
 [65]: F10/embed/inline-graphic-44.gif
 [66]: /embed/graphic-38.gif
 [67]: /embed/graphic-39.gif