RT Journal Article SR Electronic T1 Development of second and third-trimester population-specific machine learning pregnancy dating model (Garbhini-GA2) derived from the GARBH-Ini cohort in north India JF medRxiv FD Cold Spring Harbor Laboratory Press SP 2021.10.02.21264450 DO 10.1101/2021.10.02.21264450 A1 Damaraju, Nikhita A1 Xavier, Ashley A1 Vijayram, Ramya A1 Desiraju, Bapu Koundinya A1 Misra, Sumit A1 Khurana, Ashok A1 Wadhwa, Nitya A1 , A1 Rengaswamy, Raghunathan A1 Thiruvengadam, Ramachandran A1 Bhatnagar, Shinjini A1 Sinha, Himanshu YR 2021 UL http://medrxiv.org/content/early/2021/10/04/2021.10.02.21264450.abstract AB Background: The prevalence of preterm birth (PTB) is high in low- and middle-income countries (LMICs) such as India. In LMICs, a large proportion of pregnant women seek antenatal care for the first time beyond 14 weeks of pregnancy, so accurate estimation of gestational age (GA) using measures derived from ultrasonography scans in the second and third trimesters is of paramount importance. Different models have been developed globally to estimate GA, and LMICs currently use Hadlock’s formula, which was derived from data on a North American cohort. This study aimed to develop a population-specific model using data from GARBH-Ini, a multidimensional and ongoing pregnancy cohort established in a district hospital in North India for studying PTB. Methods: Data obtained by longitudinal ultrasonography across all trimesters of pregnancy were used to develop and validate GA models for the second and third trimesters. The first-trimester GA estimated by ultrasonography was considered the Gold Standard. The second- and third-trimester GA model, named Garbhini-GA2, is a multivariate random forest model using five ultrasonographic parameters routinely measured during this period. 
The Garbhini-GA2 model was compared to the Hadlock and INTERGROWTH-21st models in the TEST set by estimating root-mean-squared error, bias and PTB rate. Findings: Garbhini-GA2 reduced the GA estimation error by 23-45% compared to the published models. Furthermore, the PTB rate estimated using Garbhini-GA2 was more accurate than that from the published formulae, which overestimated the rate by 1·5-2·0 times. Interpretation: The Garbhini-GA2 model is the first of its kind developed solely using Indian population data. The higher accuracy of GA estimation by Garbhini-GA2 emphasises the need to apply population-specific GA formulae to improve antenatal care and obtain better PTB rate estimates. Funding: Centre for Integrative Biology and Systems Medicine, IIT Madras; Department of Biotechnology, Government of India; Grand Challenges India, BIRAC. Evidence before this study: The appropriate delivery of antenatal care and accurate delivery date estimation are heavily dependent on accurate pregnancy dating. Unlike GA estimation using crown-rump length in the first trimester, dating using foetal biometry during the second and third trimesters is prone to inaccuracies. This is a public health concern, particularly in low- and middle-income countries like India, where nearly 40% of pregnant women seek their first antenatal care beyond 14 weeks of gestation. The dating formulae used in LMICs were developed using foetal biometry data from the Caucasian population, and these formulae are prone to error when used in ethnically different populations. Added value of this study: This study developed a dating model, the Garbhini-GA2 model, for the second and third trimesters of pregnancy using multiple candidate biometric predictors measured in a North Indian population. When evaluated internally, this model outperformed the currently used dating models by reducing the errors in the estimation of gestational age by 25-40%. 
Further, Garbhini-GA2 estimated a PTB rate similar to that estimated by the Gold Standard in our population, while the published formulae overestimated the PTB rates. Implications of all the available evidence: Our Garbhini-GA2 model, after due validation in independent cohorts across the Southeast Asian regions, has the potential to be quickly translated for clinical use across the region. Precise dating will help obstetricians and neonatologists plan antenatal and neonatal care more accurately. From an epidemiologist's standpoint, using the Garbhini-GA2 dating formulae will improve the precision of the estimates of pregnancy outcomes that heavily depend on gestational age, such as preterm birth, small for gestational age and stillbirth in our population. Additionally, our dating models will improve phenotyping by reducing the risk of misclassification between outcomes for mechanistic and biomarker research. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: This study was funded by an alumni endowment from Prakash Arunachalam to the Initiative for Biological Systems Engineering, IIT Madras (BIO/18-19/304/ALUM/KARH). The GARBH-Ini cohort study is funded by the Department of Biotechnology, Government of India (BT/PR9983/MED/97/194/2013) and, for some components of the biospecimen and ultrasound repository, by the Grand Challenges India-All Children Thriving Program (supported by the Programme Management Unit), Biotechnology Industry Research Assistance Council, Department of Biotechnology, Government of India (BIRAC/GCI/0114/03/14-ACT). 
The data analysis exercise was supported by the Grand Challenges India- ki' Data Challenge for Maternal and Child Health grant (supported by the Programme Management Unit), Biotechnology Industry Research Assistance Council, Department of Biotechnology, Government of India (BT/kiData0394/06/18). Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Ethics approvals were obtained from the institutional ethics committees of Gurugram Civil Hospital, Safdarjung Hospital, Translational Health Science and Technology Institute, and Indian Institute of Technology Madras. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. The datasets used and analysed in the current study are available from the corresponding author upon reasonable request. All the code used for this paper is available at https://github.com/HimanshuLab/Garbhini-GA2.
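The three comparison metrics named in the Methods (root-mean-squared error, bias and PTB rate) can be illustrated with a minimal sketch. This is not the authors' code, which is in the repository above. The function names, the toy values, the sign convention for bias (estimate minus reference) and the 37-week preterm threshold are assumptions made for illustration; the abstract does not state them.

```python
import math

# Assumption: preterm birth taken as delivery before 37 completed weeks;
# the abstract itself does not state the threshold used.
PRETERM_THRESHOLD_WEEKS = 37.0


def rmse(estimated, reference):
    """Root-mean-squared error between estimated and reference GA (weeks)."""
    diffs = [e - r for e, r in zip(estimated, reference)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def bias(estimated, reference):
    """Mean signed error (estimate minus reference); negative means underestimation."""
    return sum(e - r for e, r in zip(estimated, reference)) / len(estimated)


def ga_at_birth(ga_at_scan_weeks, days_scan_to_delivery):
    """Project GA at the scan forward to delivery using the elapsed days."""
    return [ga + d / 7.0 for ga, d in zip(ga_at_scan_weeks, days_scan_to_delivery)]


def ptb_rate(ga_at_birth_weeks, threshold=PRETERM_THRESHOLD_WEEKS):
    """Fraction of births whose GA at delivery falls below the threshold."""
    n_preterm = sum(1 for ga in ga_at_birth_weeks if ga < threshold)
    return n_preterm / len(ga_at_birth_weeks)


# Hypothetical toy data (weeks at scan, days from scan to delivery).
reference_ga = [20.0, 24.0, 30.0, 34.0]   # stand-in for the Gold Standard dating
estimated_ga = [19.0, 24.0, 29.0, 34.0]   # stand-in for a formula that underestimates
days_to_delivery = [126, 91, 49, 21]

error = rmse(estimated_ga, reference_ga)
mean_bias = bias(estimated_ga, reference_ga)
reference_ptb = ptb_rate(ga_at_birth(reference_ga, days_to_delivery))
estimated_ptb = ptb_rate(ga_at_birth(estimated_ga, days_to_delivery))

print(f"RMSE: {error:.3f} weeks, bias: {mean_bias:.3f} weeks")
print(f"PTB rate: reference {reference_ptb:.2f}, estimated {estimated_ptb:.2f}")
```

In this toy data, underestimating GA at the scan by one week moves one birth below the threshold, which shows how a dating formula with negative bias can inflate the estimated PTB rate.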