ABSTRACT
Objective To describe the development of INSIGHT, a real-world data quality tool to assess completeness, consistency, and fitness-for-purpose of observational health data sources.
Material and Methods We designed a three-level pipeline with data quality assessments (DQAs) to be performed in ConcePTION Common Data Model (CDM) instances. The pipeline has been coded using R.
Results INSIGHT is an open-source tool that identifies potential data quality issues in CDM-standardized instances through the systematic execution and summary of over 588 configurable DQAs. Level 1 focuses on compliance with the ConcePTION CDM specifications. Level 2 evaluates the temporal plausibility of events and uniqueness of records. Level 3 provides an overview of distributions, outliers, and trends over time. The DQAs are run locally and assessed centrally by a data quality revisor together with the data access provider’s representatives.
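As an illustration of the kind of Level 2 assessment described above, the following is a minimal sketch in Python of a temporal plausibility check and a record-uniqueness check. It is not INSIGHT's code (the tool is written in R), and the record layout, field names, and function names are hypothetical rather than ConcePTION CDM definitions.

```python
from collections import Counter
from datetime import date

# Hypothetical toy data; field names are illustrative, not ConcePTION CDM columns.
persons = {"p1": date(1990, 5, 1), "p2": date(2001, 3, 12)}  # person_id -> birth date
events = [
    {"person_id": "p1", "event_code": "A01", "event_date": date(2015, 7, 4)},
    {"person_id": "p2", "event_code": "B02", "event_date": date(1999, 1, 1)},  # before birth
    {"person_id": "p1", "event_code": "A01", "event_date": date(2015, 7, 4)},  # exact duplicate
]


def events_before_birth(events, persons):
    """Temporal plausibility: return events dated earlier than the person's birth date."""
    return [e for e in events if e["event_date"] < persons[e["person_id"]]]


def duplicate_records(events):
    """Uniqueness: return the distinct records that occur more than once."""
    counts = Counter(
        (e["person_id"], e["event_code"], e["event_date"]) for e in events
    )
    return [key for key, n in counts.items() if n > 1]


implausible = events_before_birth(events, persons)
duplicates = duplicate_records(events)
print(f"{len(implausible)} implausible event(s), {len(duplicates)} duplicated record(s)")
```

Checks of this shape only flag candidate issues; as the abstract notes, the flagged records would still need review together with the data access provider's representatives.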
Discussion INSIGHT aligns with recent conceptual frameworks that identify five dimensions of data quality: reliability, extensiveness, coherence, timeliness, and relevance. Data quality is the sum of several internal and external features of the data. While DQAs provide reassurance about fitness-for-purpose for secondary-use data sources, improvements in the data collection and generation stages are essential to reduce bias, misclassification, and measurement errors, thereby enhancing overall data quality for Real World Evidence.
Conclusion INSIGHT aims to support clinical and regulatory decision-making for medicines and vaccines by evaluating the quality of observational health data sources as part of fitness-for-purpose assessment. Assessing and improving data quality will enhance the reliability and quality of the generated evidence.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
The ConcePTION project has received funding from the Innovative Medicines Initiative 2 Joint Undertaking under grant agreement No 821520. This Joint Undertaking receives support from the European Union's Horizon 2020 research and innovation programme and EFPIA.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
All data produced in the present study are available upon reasonable request to the authors.