Abstract
Miscarriage is the most common type of pregnancy loss and occurs mostly in the first 12 weeks of pregnancy, owing to known factors of various kinds. Pregnancy risk assessment aims to quantify evidence in order to reduce such maternal morbidities during pregnancy, and personalized decision support systems are the cornerstone of high-quality, patient-centered care, serving to improve diagnosis, treatment selection, and risk assessment. However, the growing number of patient-level observations and the sparsity of the data require more effective ways of representing clinical knowledge, so that known information is encoded in a form that supports inference and reasoning. Whereas knowledge embedding representations have been widely explored for open-domain data, few efforts have applied them in the clinical domain. In this study, we discuss the differences among multiple embedding strategies and demonstrate how these methods can assist in the clinical risk assessment of miscarriage, both before pregnancy and especially in its earlier stages. Our experiments show that simple knowledge embedding approaches that use domain-specific metadata perform better than complex embedding strategies, although both improve on a population probabilistic baseline in AUPRC, F1-score, and a proposed normalized version of these evaluation metrics that better reflects accuracy for unbalanced datasets.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
No funding was received.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Data controllers of the InfoSaude system (the Municipal Health Secretary of Florianopolis, which is part of the Brazilian Public Healthcare System) have granted us permission to use and perform analysis on a (semi)de-identified version of this dataset, and no time limit has been set for data usage.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
The dataset is not fully de-identified and is not publicly available.