Evaluation synthesis analysis can be accelerated through text mining, searching, and highlighting: A case-study on data extraction from 631 UNICEF evaluation reports

Lena Schmidt; Pauline Addis; Erica Mattellone; Hannah O’Keefe; Kamilla Nabiyeva; Uyen Kim Huynh; Nabamallika Dehingia; Dawn Craig; Fiona Campbell

doi:10.1101/2024.08.27.24312630

Abstract

Background The United Nations Children’s Fund (UNICEF) is the United Nations agency dedicated to promoting and advocating for the protection of children’s rights, meeting their basic needs, and expanding their opportunities to reach their full potential. It achieves this by working with governments, communities, and other partners through programmes that safeguard children from violence, provide access to quality education, ensure that children survive and thrive, provide access to water, sanitation and hygiene, and provide life-saving support in emergency contexts. Programmes are evaluated under the UNICEF Evaluation Policy¹, and the publicly available reports² include a wealth of information on results, recommendations, and lessons learned.

Objective To critically explore UNICEF’s impact, a systematic synthesis of evaluations was conducted to summarise UNICEF’s main achievements and areas for improvement, reflecting key recommendations, lessons learned, enablers, and barriers to achieving its goals, and to steer its future direction and strategy. Because the evaluations are extensive, a fully manual analysis was not feasible, so a semi-automated approach was taken.

Methods This paper examines the automation techniques used to increase the feasibility of undertaking broad evaluation synthesis analyses. Our semi-automated human-in-the-loop methods supported the extraction of data for 64 outcomes across 631 evaluation reports,³ each of which comprised hundreds of pages of text. The outcomes are derived from the five goal areas within UNICEF’s 2022-2025 Strategic Plan. For text pre-processing we implemented PDF-to-text extraction, section parsing, and sentence mining via a neural network. Data extraction was supported by a freely available text-mining workbench, SWIFT-Review. Here, we describe using comprehensive adjacency-search-based queries to rapidly filter reports by outcome and to highlight relevant sections of text to expedite data extraction.
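To make the idea of an adjacency search concrete, the following is a minimal Python sketch of how two terms can be required to occur within a few words of each other in order to filter reports by outcome. It is an illustration only and is not SWIFT-Review's query syntax or the authors' implementation: the function names (`tokenize`, `adjacent`, `filter_reports`), the prefix matching used to mimic truncated search terms, and the example texts are all assumptions made for this sketch.

```python
import re
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def adjacent(text: str, term_a: str, term_b: str, max_distance: int) -> bool:
    """Return True if a token starting with term_a and a token starting
    with term_b occur within max_distance tokens of each other, in
    either order. Prefix matching stands in for truncated search terms
    (an assumption of this sketch)."""
    tokens = tokenize(text)
    pos_a = [i for i, tok in enumerate(tokens) if tok.startswith(term_a)]
    pos_b = [i for i, tok in enumerate(tokens) if tok.startswith(term_b)]
    return any(
        i != j and abs(i - j) <= max_distance
        for i in pos_a
        for j in pos_b
    )


def filter_reports(
    reports: Dict[str, str], term_a: str, term_b: str, max_distance: int
) -> List[str]:
    """Return the ids of reports whose text satisfies the adjacency query."""
    return [
        report_id
        for report_id, text in reports.items()
        if adjacent(text, term_a, term_b, max_distance)
    ]


# Hypothetical example texts, invented for illustration.
example_reports = {
    "r1": "Enrolment of girls in primary school increased.",
    "r2": "The school was built in 2019. Separately, water access "
          "for girls improved across many districts.",
}
matching = filter_reports(example_reports, "girl", "school", 3)
```

In this sketch, only the first example text matches, because its two terms are three tokens apart, whereas in the second they are nine tokens apart. Tightening or loosening `max_distance` trades precision against the number of reports retrieved.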

Results While the methods used were not expected to produce 100% complete results for each outcome, they offer useful automation methods for researchers facing otherwise infeasible evaluation synthesis tasks. Using deep learning, we reduced the text volume to 8% of its original size (recall 0.93) and rapidly identified relevant evaluations across outcomes with a median precision of 0.6. All code is available and open-source.
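For readers less familiar with the metrics reported above, the sketch below shows how recall, precision, a median precision across outcomes, and the retained share of text are conventionally computed. The function names and all numbers in the example are made up for illustration; they are not the study's data, and the sketch does not reproduce the authors' evaluation code.

```python
from statistics import median
from typing import List


def recall(true_pos: int, false_neg: int) -> float:
    """Share of all truly relevant items that the method retrieved."""
    return true_pos / (true_pos + false_neg)


def precision(true_pos: int, false_pos: int) -> float:
    """Share of retrieved items that are truly relevant."""
    return true_pos / (true_pos + false_pos)


def retained_fraction(kept_sentences: int, total_sentences: int) -> float:
    """Share of the original text that remains after filtering."""
    return kept_sentences / total_sentences


def median_precision(per_outcome_precision: List[float]) -> float:
    """Median of the precision values measured per outcome."""
    return median(per_outcome_precision)


# Toy counts, invented for illustration only (not taken from the study).
toy_recall = recall(true_pos=8, false_neg=2)            # 8 / 10
toy_precision = precision(true_pos=8, false_pos=2)      # 8 / 10
toy_retained = retained_fraction(kept_sentences=25, total_sentences=100)
toy_median = median_precision([0.5, 0.75, 1.0])
```

A high recall alongside a small retained fraction indicates that most relevant sentences survive the reduction, while the per-outcome precision indicates how much of the retrieved material a human reviewer still has to discard.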

Conclusions When the classic approach of systematically extracting information for all outcomes across all texts exceeds available resources, the proposed automation methods can be employed to speed up the process while retaining scientific rigour and reproducibility.

Strengths and limitations of this study

- Systematic impact evaluation syntheses are a vital tool to critically evaluate and plan the future work of organisations such as UNICEF, but they are often not feasible due to the size, structure, and number of evaluation reports.
- To increase the feasibility of analysis, we describe a semi-automated human-in-the-loop system that was applied in a synthesis of 631 evaluations across 64 outcomes.
- The proposed open-source code and methods made an evaluation synthesis feasible by reducing text and streamlining the identification of relevant reports for each outcome.
- By making code open-source and adaptable we aim to encourage accelerated, yet transparent and reproducible results.
- While the methods cannot produce 100% complete or correct results for each outcome, they offer useful automation methods for researchers facing otherwise infeasible evaluation synthesis tasks.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This project was funded by the UNICEF Evaluation Office. The UNICEF Evaluation Office operates independently within the organization, with a mandate to produce impartial and rigorous evidence that informs UNICEF's policies, advocacy efforts, and programmes. For further details, please refer to the revised evaluation policy, available at https://www.unicef.org/executiveboard/revised-evaluation-policy-unicef-srs-2023. LS, PA, HO, DC, and FC were in part supported by the NIHR Innovation Observatory (National Institute for Health and Care Research (NIHR) [HSRIC-2016-10009/Innovation Observatory]). The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Footnotes

We re-uploaded the paper to remove errors with cross-references to figures within the text. None of the content has changed.
¹ E/ICEF/2023/27 (undocs.org) (last accessed 06/08/2024)
² Evaluation reports | UNICEF Evaluation (last accessed 06/08/2024)
³ https://www.unicef.org/evaluation/reports (last accessed 06/08/2024)

Data availability statement

All programming code for the automations described in this paper is available on GitHub: https://github.com/NIHRIO/EvaluationSynthesisMethods

The weights for the trained SPECTER model for UNICEF data are available here: https://drive.google.com/drive/folders/1-0VXJcY_GKBNq6-5GprPvwdnTc4Raud3?usp=sharing

The SWIFT-Review project is available as Appendix 3. It can be loaded and used with the free desktop application available here: https://www.sciome.com/swift-review/

The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.