ABSTRACT
Background Computable phenotypes are increasingly important tools for patient cohort identification. As part of a study of risk of chronic opioid use after surgery, we used a Resource Description Framework (RDF) triplestore as our computable phenotyping platform, hypothesizing that the unique affordances of triplestores may aid in making complex computable phenotypes more interoperable and reproducible than traditional relational database queries.
Methods To identify and model risk for new chronic opioid users post-surgery, we loaded several heterogeneous data sources into a Blazegraph triplestore: (1) electronic health record data; (2) claims data; (3) American Community Survey data; and (4) Centers for Disease Control and Prevention Social Vulnerability Index, opioid prescription rate, and drug poisoning rate data. We then ran a series of queries, one for each rule in our “new chronic opioid user” phenotype definition, to arrive at our qualifying cohort.
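As an illustration only, the general idea of keeping heterogeneous sources in one schemaless set of triples and narrowing a cohort rule by rule can be sketched in plain Python. This is not the study's Blazegraph or SPARQL implementation: every identifier below (the `ex:` predicates, the patient and county names, the two placeholder rules, and the SVI scores) is hypothetical and chosen only to show the pattern.

```python
from typing import Any, Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, Any]  # (subject, predicate, object)

# Hypothetical triples standing in for data from different sources.
# None of these identifiers or values come from the study.
TRIPLES: Set[Triple] = {
    ("patient:1", "ex:hadSurgery", "surgery:a"),           # EHR-like fact
    ("patient:1", "ex:opioidFillAfterSurgery", True),      # claims-like fact
    ("patient:1", "ex:livesIn", "county:x"),
    ("patient:2", "ex:hadSurgery", "surgery:b"),
    ("patient:2", "ex:livesIn", "county:y"),
    ("county:x", "ex:sviScore", 0.7),                      # county-level fact
    ("county:y", "ex:sviScore", 0.2),
}


def match(s: Optional[str] = None, p: Optional[str] = None, o: Any = None) -> List[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Placeholder rule 1: the denominator is every subject with a surgery triple.
denominator = {s for s, _, _ in match(p="ex:hadSurgery")}

# Placeholder rule 2: keep subjects flagged with a post-surgery opioid fill.
cohort = {s for s in denominator if match(s=s, p="ex:opioidFillAfterSurgery", o=True)}

# Link each cohort member to county-level data through the shared county node.
county_svi: Dict[str, float] = {}
for patient in cohort:
    for _, _, county in match(s=patient, p="ex:livesIn"):
        for _, _, score in match(s=county, p="ex:sviScore"):
            county_svi[patient] = score

print(sorted(cohort), county_svi)
```

Because every fact has the same three-part shape, a new source adds triples without changing a schema, and each rule is a further pattern match over the same set.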
Results Of the 4,163 patients in the denominator, our computable phenotype identified 248 patients as new chronic opioid users after their index surgical procedure. After validation against charts, 228 of the 248 were revealed to be true positive cases, giving our phenotype a PPV of 0.92.
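The reported PPV follows directly from the counts above. A minimal check, assuming the standard definition of PPV as true positives divided by all cases the phenotype flagged (the function name is illustrative):

```python
def positive_predictive_value(true_positives: int, flagged: int) -> float:
    """PPV = true positives / all cases flagged by the phenotype."""
    if flagged <= 0:
        raise ValueError("flagged must be a positive count")
    return true_positives / flagged


flagged = 248          # patients identified by the computable phenotype
true_positives = 228   # confirmed on chart review
false_positives = flagged - true_positives

ppv = positive_predictive_value(true_positives, flagged)
print(f"false positives = {false_positives}, PPV = {ppv:.2f}")
```

With these counts, 20 flagged patients were false positives and the PPV rounds to 0.92.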
Conclusion We successfully used the triplestore to execute the new chronic opioid user phenotype logic, and in doing so noted some advantages of the triplestore in terms of schemalessness, interoperability, and reproducibility. Future work will use the triplestore to create the planned risk model and will leverage additional links to ontologies and ontological reasoning.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
The project described was supported by the National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, through Grant Award Number UL1TR002489. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. The database infrastructure used for the claims data portion of this project was funded by the Cecil G. Sheps Center for Health Services Research; the Department of Health Policy and Management, UNC Gillings School of Global Public Health; the CER Strategic Initiative of UNC's Clinical and Translational Science Award (UL1TR002489); and the UNC School of Medicine.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
IRB of UNC Chapel Hill gave ethical approval for this work.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
The electronic health record and claims data analyzed in this study are not publicly available due to restrictions of the Health Insurance Portability and Accountability Act (HIPAA). American Community Survey data (2012-2016) are available from the US Census Bureau at https://www.census.gov/acs/www/data/data-tables-and-tools/data-profiles/2016. Social Vulnerability Index data (2016) are available from the Centers for Disease Control and Prevention at https://svi.cdc.gov/data-and-tools-download.html. US county opioid prescribing and drug poisoning rate data (2016) are available from the Centers for Disease Control and Prevention at https://www.cdc.gov/drugoverdose/maps/rxcounty2016.html and https://data.cdc.gov/NCHS/NCHS-Drug-Poisoning-Mortality-by-County-United-Sta/pbkm-d27e, respectively.
LIST OF ABBREVIATIONS
- ACS: American Community Survey
- CDC: Centers for Disease Control and Prevention
- CDWH: Carolina Data Warehouse for Health
- CPT: Current Procedural Terminology
- EHR: Electronic health record(s)
- FHIR: Fast Healthcare Interoperability Resources
- HITECH Act: Health Information Technology for Economic and Clinical Health Act
- HL7: Health Level 7
- ICD: International Statistical Classification of Diseases and Related Health Problems
- JSON: JavaScript Object Notation
- LOD: Linked Open Data
- LOINC: Logical Observation Identifiers Names and Codes
- NDC: National Drug Code
- NoSQL: Not only SQL
- OWL: Web Ontology Language
- PPV: Positive predictive value
- RDF: Resource Description Framework
- SPARQL: SPARQL Protocol and RDF Query Language
- SQL: Structured query language
- SVI: Social Vulnerability Index
- UNC: University of North Carolina at Chapel Hill
- URI: Uniform resource identifier
- VANDF: Department of Veterans Affairs National Drug File
- W3C: World Wide Web Consortium
- XML: Extensible Markup Language