A machine learning-based approach to determine infection status in recipients of BBV152 whole virion inactivated SARS-CoV-2 vaccine for serological surveys
Abstract
Data science has been an invaluable part of the COVID-19 pandemic response, with applications ranging from tracking viral evolution to understanding the effectiveness of interventions. Asymptomatic breakthrough infections have been a major problem during the ongoing global surge of the Delta variant. Serological discrimination of vaccine response from infection has so far been limited to the Spike protein vaccines used in higher-income regions. Here, we show for the first time how statistical and machine learning (ML) approaches can discriminate SARS-CoV-2 infection from the immune response to an inactivated whole virion vaccine (BBV152, Covaxin, India), thereby permitting real-world vaccine effectiveness assessments from cohort-based serosurveys in Asia and Africa, where such vaccines are commonly used. Briefly, we accessed serial data on Anti-S and Anti-NC antibody concentrations, along with age, sex, number of doses, and number of days since the last vaccine dose, for 1823 Covaxin recipients. An ensemble ML model, incorporating a consensus clustering approach alongside a support vector machine (SVM) model, was built on the 1063 samples for which reliable qualifying data existed and was then applied to the entire dataset. Of 1448 subjects who self-reported as uninfected, 724 were classified as infected. Because the vaccine contains wild-type virus, the antibodies it induces neutralize the wild type much better than the Delta variant. We therefore determined the relative ability of a random subset of such samples to neutralize the Delta variant versus the wild-type strain. In 100 of 156 samples (64.1%) where the ML prediction differed from the self-reported uninfected status, the Delta variant was neutralized more effectively than the wild type, which cannot happen without infection. The fraction rose to 71.8% (28 of 39) in subjects predicted to be infected during the surge, which is concordant with the percentage of sequences classified as Delta (75.6%-80.2%) over the same period.
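The proportions quoted in the abstract can be re-derived with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' analysis code: the helper names `percent` and `delta_favoured` are hypothetical, and the strict "Delta neutralized better than wild type" comparison is a simplifying assumption about how a single sample would be flagged. The neutralization readouts passed to `delta_favoured` are made-up values, not study data.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


def delta_favoured(delta_neutralization: float, wild_type_neutralization: float) -> bool:
    """Flag a sample that neutralizes Delta more effectively than wild type.

    Assumption: a simple strict comparison of two readouts on the same scale.
    """
    return delta_neutralization > wild_type_neutralization


# Counts taken from the abstract.
reclassified = percent(724, 1448)   # self-reported uninfected, classified as infected
delta_overall = percent(100, 156)   # tested samples favouring Delta
delta_surge = percent(28, 39)       # same, for subjects predicted infected during the surge

print(f"Reclassified as infected: {reclassified}%")
print(f"Delta-favouring (tested subset): {delta_overall}%")
print(f"Delta-favouring (surge subset): {delta_surge}%")

# Hypothetical readouts, for illustration only.
print(delta_favoured(120.0, 80.0))
print(delta_favoured(40.0, 95.0))
```

Under these assumptions, half of the self-reported uninfected subjects are reclassified, and the Delta-favouring share moves from about 64% in the tested subset to about 72% in the surge subset.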
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study was funded by the MLP2007 project grant of CSIR.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The Institutional Human Ethics Committee of CSIR-IGIB gave ethical approval for this work (vide approval CSIR-IGIB/IHEC/2019-20).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
All data produced in the present study are available upon reasonable request to the authors.