Abstract
Objective Crafting high-quality value sets is time-consuming and requires a range of clinical, terminological, and informatics expertise. Despite widespread agreement on the importance of reusing value sets, value set repositories suffer from clutter and redundancy, greatly complicating efforts at reuse. When users encounter multiple value sets with the same name or ostensibly representing the same clinical condition, it can be difficult to choose among them or to determine whether the differences between them are due to error or to deliberate choice.
Methods This paper offers a view of value set development and reuse based on a field study of researchers and informaticists. The results emerge from an analysis of relevant literature, reflective practice, and the field research data.
Results Qualitative analysis of our study data, the relevant literature, and our own professional experience led us to three dichotomous concepts that frame an understanding of diverse practices and perspectives surrounding value set development:
Permissible values versus analytic value sets;
Prescriptive versus descriptive approaches to controlled medical vocabulary use; and
Semantic and empirical types of value set development and evaluation practices and the data they rely on.
This three-fold framework opens up the redundancy problem, explaining why multiple value sets may or may not be needed and advancing academic understanding of value set development.
Conclusion The paper catalogues the methods and practices used and provides practical aid in managing the value set development process. It offers recommendations for improving that process and for software innovation to support it. For value set repositories to become more rather than less useful over time, software must channel user efforts into either improving existing value sets or making new ones only when absolutely necessary.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
SG was partially supported while doing this work by a National Science Foundation (https://www.nsf.gov/) training grant, DGE-1632976. Beyond providing the stipend, NSF had no role or influence on the research or the manuscript.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The IRB of the University of Maryland gave ethical approval for this work (IRB #1405794-8).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
The primary difference is in using the term value set instead of code set.
Data Availability
Survey data have been anonymized and made available at: Gold S. Value sets and the problem of redundancy in value set repositories. Survey data. OSF. 2024. doi:10.17605/OSF.IO/ABTJU. Interview data cannot be anonymized and are not included, to protect participant privacy.