Abstract
Introduction The FDA Adverse Event Reporting System (FAERS) receives drug names in various forms, including brand names, active ingredients, abbreviations, and misspellings, which makes standardizing drug nomenclature challenging. The lack of consensus on standardization strategies, and the lack of transparency about how they are applied, reduce the replicability and accuracy of disproportionality analyses based on FAERS data.
Aim We developed an open-source drug-to-ingredient dictionary, the DiAna dictionary (short for Disproportionality Analysis), and linked it to the WHO Anatomical Therapeutic Chemical (ATC) classification system.
Methods We retrieved all drug names reported to the FAERS from 2004 to December 2022. Using existing dictionaries such as RxNorm and string editing techniques, we automatically translated the drug names to active ingredients. Manual revision was performed to correct errors and improve translation accuracy. The resulting DiAna dictionary was linked to the ATC classification, proposing a primary ATC code for each ingredient.
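The translation step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation (their code is in R and is available in the repository cited under Data Availability). The toy dictionary, the function names `clean_name` and `translate`, and the similarity cutoff of 0.85 are assumptions chosen for illustration only. The sketch cleans a raw drug string, looks for an exact dictionary match, and falls back on approximate string matching to catch misspellings.

```python
import re
from difflib import get_close_matches

# Hypothetical toy dictionary (cleaned drug name -> active ingredient).
# The real DiAna dictionary is far larger and manually reviewed.
TOY_DICTIONARY = {
    "tylenol": "paracetamol",
    "paracetamol": "paracetamol",
    "acetaminophen": "paracetamol",
    "humira": "adalimumab",
    "adalimumab": "adalimumab",
}


def clean_name(raw):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    name = raw.lower()
    name = re.sub(r"[^a-z0-9 ]+", " ", name)
    return re.sub(r"\s+", " ", name).strip()


def translate(raw, dictionary=TOY_DICTIONARY, cutoff=0.85):
    """Return the active ingredient for a raw drug name, or None.

    An exact match on the cleaned name is tried first. If it fails, the
    closest dictionary key above the (assumed) similarity cutoff is used.
    """
    name = clean_name(raw)
    if name in dictionary:
        return dictionary[name]
    close = get_close_matches(name, list(dictionary), n=1, cutoff=cutoff)
    return dictionary[close[0]] if close else None


if __name__ == "__main__":
    for raw in ["  TYLENOL. ", "acetominophen", "adalimumabb", "xyzzy"]:
        print(repr(raw), "->", translate(raw))
```

Approximate matches of this kind are the sort of automatic translation that still needs manual revision, because a high string similarity does not guarantee that two names refer to the same ingredient.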
Results We retrieved 18,151,842 reports, with 74,143,411 drug entries. We automatically translated and manually checked the 14,832 most frequent terms (those occurring at least 200 times, covering 96.88% of total drug entries), mapping them to 6,282 unique active ingredients. Automatic unchecked translations extend the standardization to 346,854 terms (98.94%). After linking to the ATC classification, the most prominent drug classes in FAERS reports were immunomodulating (37.40%) and nervous system drugs (29.19%).
Conclusion We present the DiAna dictionary as an open-source tool and encourage experts to provide input and feedback. Regular updates can improve research quality and promote a common pharmacovigilance toolbox, ultimately advancing drug safety and improving study interpretability.
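The coverage figures above follow from a simple calculation: keep the terms that occur at least a given number of times and divide the drug entries they account for by the total number of entries. The sketch below shows that arithmetic on invented toy data. The function name `coverage_at_threshold` and the example counts are assumptions for illustration and do not reproduce the FAERS figures.

```python
from collections import Counter


def coverage_at_threshold(term_counts, min_occurrences):
    """Return (number of terms kept, percentage of entries they cover).

    term_counts maps each distinct drug term to its number of occurrences.
    """
    total = sum(term_counts.values())
    kept = {t: c for t, c in term_counts.items() if c >= min_occurrences}
    covered = sum(kept.values())
    pct = 100.0 * covered / total if total else 0.0
    return len(kept), pct


# Invented toy data: 10 drug entries spread over 4 distinct terms.
toy_entries = ["aspirin"] * 5 + ["humira"] * 3 + ["asprin"] + ["tylenol"]
toy_counts = Counter(toy_entries)

n_terms, pct = coverage_at_threshold(toy_counts, min_occurrences=3)
print(f"{n_terms} terms cover {pct:.2f}% of entries")
```

Because term frequencies are highly skewed, a relatively small set of frequent terms can cover most entries, which is why manual checking was focused on terms above a frequency threshold.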
Key points
Drug name standardization impacts signal detection accuracy.
The DiAna dictionary standardizes drug names in FAERS for improved data control.
DiAna’s transparency and flexibility improve interpretability.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study did not receive any funding.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
https://fis.fda.gov/extensions/FPD-QDE-FAERS/FPD-QDE-FAERS.html
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
Availability of data and material: The dictionary and the linkage to the ATC classification are available at https://osf.io/zqu89/?view_only=237d052047c142cabd5d8ea2e765efc6. Code availability: All processing, analyses, and visualizations were performed in R (version 4.2.1). The code for using the dictionary to clean the FAERS database is available at https://github.com/fusarolimichele/DiAna.