Abstract
Small populations (e.g., hospitals, schools or workplaces) are characterized by high contact heterogeneity and stochasticity, both of which affect pathogen transmission dynamics. Empirical individual contact data provide unprecedented information to characterize such heterogeneity and are increasingly available, but they are usually collected over a limited period and can suffer from observation bias. We propose an algorithm to stochastically reconstruct realistic temporal networks from individual contact data in healthcare settings (HCS) and test this approach using real data previously collected in a long-term care facility (LTCF).
Our algorithm generates full networks from recorded close-proximity interactions, using hourly inter-individual contact rates and information on individuals’ wards, the categories of staff involved in contacts, and the frequency of recurring contacts. It also provides data augmentation by reconstructing contacts for days when some individuals are present in the HCS without having contacts recorded in the empirical data. Recording bias is formalized through an observation model, to allow direct comparison between the augmented and observed networks. We validate our algorithm using data collected during the i-Bird study, and compare the empirical and reconstructed networks.
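To make the reconstruction idea concrete, the following is a minimal Python sketch under stated assumptions, not the authors' implementation (which is available in the repository listed under Data Availability). It assumes that contact counts are Poisson-distributed around an hourly rate per pair of categories, and that each contact goes to a previously met partner with a fixed probability. The names `draw_poisson`, `reconstruct_day`, `hourly_rate`, `p_recurring` and `past_partners` are hypothetical, and a category could stand for a combination of ward and staff type.

```python
import math
import random
from collections import defaultdict


def draw_poisson(lam, rng):
    """Draw a Poisson(lam) count with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def reconstruct_day(categories, hourly_rate, p_recurring, past_partners, rng):
    """Generate one day of contacts as a set of (hour, person_a, person_b).

    categories:    dict person -> category (hypothetical labels)
    hourly_rate:   dict (hour, cat_a, cat_b) -> assumed mean number of contacts
                   that one person of cat_a has with cat_b during that hour
    p_recurring:   assumed probability that a contact repeats a past partner
    past_partners: dict person -> set of earlier partners (updated in place)
    """
    by_category = defaultdict(list)
    for person, category in categories.items():
        by_category[category].append(person)

    contacts = set()
    for (hour, cat_a, cat_b), rate in sorted(hourly_rate.items()):
        for person in by_category[cat_a]:
            candidates = [p for p in by_category[cat_b] if p != person]
            if not candidates:
                continue
            # Partners already met before this hour, restricted to cat_b.
            known = [p for p in candidates if p in past_partners.get(person, set())]
            for _ in range(draw_poisson(rate, rng)):
                if known and rng.random() < p_recurring:
                    partner = rng.choice(known)       # recurring contact
                else:
                    partner = rng.choice(candidates)  # any present individual
                contacts.add((hour, min(person, partner), max(person, partner)))
                past_partners.setdefault(person, set()).add(partner)
                past_partners.setdefault(partner, set()).add(person)
    return contacts


if __name__ == "__main__":
    people = {"p1": "patient", "p2": "patient", "n1": "nurse", "n2": "nurse"}
    rates = {(9, "patient", "nurse"): 1.5, (10, "nurse", "nurse"): 0.5}
    day = reconstruct_day(people, rates, 0.7, {}, random.Random(42))
    print(sorted(day))
```

Because contacts are stored as a set of unordered pairs per hour, repeated draws of the same pair within an hour collapse into a single contact; a fuller version would also need contact durations and presence schedules, which this sketch leaves out.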
The algorithm reproduced network characteristics substantially more accurately than random graphs. The reconstructed networks closely reproduced the assortativity by ward (first–third quartiles, observed: 0.54–0.64; synthetic: 0.52–0.64) and the hourly staff and patient contact patterns. Importantly, the observed temporal correlation was also well reproduced (0.39–0.50 vs 0.37–0.44), indicating that our algorithm could recreate a realistic temporal structure. The algorithm consistently recreated unobserved contacts to generate full reconstructed networks for the LTCF.
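As an illustration of the two summary statistics quoted above, the sketch below computes a ward assortativity coefficient and a temporal correlation from plain edge lists. The definitions are assumptions, since the abstract does not state them: Newman's attribute assortativity for wards, and the average overlap of each node's neighbourhood between consecutive snapshots for temporal correlation. The function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def ward_assortativity(edges, ward):
    """Newman's attribute assortativity for an undirected edge list.

    Returns 1 when all contacts stay within a ward and negative values
    when contacts mostly cross wards. Returns NaN if undefined.
    """
    mix = defaultdict(float)
    for u, v in edges:
        # Count each undirected edge in both directions.
        mix[(ward[u], ward[v])] += 1.0
        mix[(ward[v], ward[u])] += 1.0
    total = sum(mix.values())
    if total == 0:
        return float("nan")
    wards = {w for pair in mix for w in pair}
    trace = sum(mix.get((w, w), 0.0) for w in wards) / total
    share = {w: sum(x for (a, _), x in mix.items() if a == w) / total for w in wards}
    expected = sum(s * s for s in share.values())
    if expected == 1.0:
        return float("nan")  # only one ward has contacts
    return (trace - expected) / (1.0 - expected)


def temporal_correlation(snapshots, nodes):
    """Mean neighbourhood overlap between consecutive snapshots.

    A node with no contacts in either snapshot contributes 0 (an assumed
    convention; other conventions exclude such nodes).
    """
    adjacency = []
    for edges in snapshots:
        adj = defaultdict(set)
        for u, v in edges:
            adj[u].add(v)
            adj[v].add(u)
        adjacency.append(adj)

    total, terms = 0.0, 0
    for prev, nxt in zip(adjacency, adjacency[1:]):
        for node in nodes:
            a, b = prev.get(node, set()), nxt.get(node, set())
            if a and b:
                total += len(a & b) / math.sqrt(len(a) * len(b))
            terms += 1
    return total / terms if terms else float("nan")


if __name__ == "__main__":
    ward = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
    day1 = [("a1", "a2"), ("b1", "b2"), ("a1", "b1")]
    day2 = [("a1", "a2"), ("b1", "b2")]
    print(ward_assortativity(day1, ward))
    print(temporal_correlation([day1, day2], list(ward)))
```

Applying both functions to observed and reconstructed networks, snapshot by snapshot, gives distributions that can be compared through their quartiles, as in the ranges reported above.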
To conclude, we propose an approach to generate realistic temporal contact networks and reconstruct unobserved contacts from summary statistics computed using individual-level interaction networks. This approach could be applied and extended to generate contact networks for other HCS using limited empirical data, to subsequently inform individual-based epidemic models.
Author summary
Contact networks are the most informative representation of contact heterogeneity, and therefore of infectious disease transmission risk, in small populations. However, the data collection required is costly and complex, is usually limited to only a few days, and is likely to yield partially observed data, making the practical integration of networks into models challenging. In this article, we present an approach leveraging empirical individual contact data to stochastically reconstruct realistic temporal networks in healthcare settings. The algorithm accounts for population specificities, including the hourly distribution of contact rates between different individuals (staff categories, patients) and the probability of repeated contact between the same individuals. We illustrate and validate this algorithm using a real contact network measured in a long-term care facility. Our approach reproduces observed network characteristics and hourly staff-patient contact patterns more accurately than random graphs informed by the same data. The algorithm recreates unobserved contacts, providing data augmentation for times with missing information. This method should improve the usability and reliability of contact networks, and therefore promote the integration of empirical contact data in individual-based models.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
AD, LT and LO received funding from the French National Research Agency (SPHINX-17-CE36-0008-01). DG received funding from the National Clinical Research Program and the Investissement d'Avenir program, Laboratoire d'Excellence "Integrative Biology of Emerging Infectious Diseases" (ANR-10-LABX-62-IBEID).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
Improved description of the algorithm in the Methods, added references and discussion on previous reconstruction algorithms
Data Availability
The relevant contact networks and analysis code are available in the following GitHub repository: https://github.com/qleclerc/network_algorithm