Abstract
Background The National Institutes of Health Stroke Scale (NIHSS) scores have been used to evaluate Acute Ischemic Stroke (AIS) severity in clinical settings. Through the International Classification of Diseases, Tenth Revision Code (ICD-10), documentation of NIHSS scores has been made possible for administrative purposes and has since been increasingly adopted in insurance claims. Per CMS guidelines, the stroke ICD-10 diagnosis code must be documented by the treating physician, but ICD-10 NIHSS scores can be documented by any healthcare provider involved in the patient’s care. Accuracy of the administratively collected NIHSS compared to expert clinical evaluation as documented in the Paul Coverdell registry is however still uncertain.
Methods Leveraging a linked dataset comprised of the Paul Coverdell National Acute Stroke Program (PCNASP) clinical registry and probabilistically matched individuals on Medicare Claims data, we sampled patients aged 65 and above admitted for AIS across nine states, from 2016 to 2019. We excluded those lacking documentation for either clinical or ICD-10 based NIHSS scores. We then examined score concordance from both databases and measured discordance as the absolute difference between the PCNASP and ICD-10-based NIHSS scores.
Results Among 66,837 matched patients, mean NIHSS scores for PCNASP and Medicare ICD-10 were 7.26 (95% CI: 7.20 – 7.32) and 7.40 (95% CI: 7.34 – 7.46), respectively. Concordance between the two scores was high as indicated by an intraclass correlation coefficient of 0.93.
Conclusion The high concordance between clinical and ICD-10 NIHSS scores highlights the latter’s potential as measure of stroke severity derived from structured claims data.
Introduction
The National Institutes of Health Stroke Scale (NIHSS) has long served as a vital tool for assessing the severity of acute ischemic strokes1. This well-established instrument has been employed in numerous clinical trials related to stroke and has garnered validation as a reliable predictor of stroke outcomes2–4. Beyond its direct clinical utility, hospitals have also been recording the NIHSS score for administrative risk adjustment and claims purposes, which was enabled by the tenth revision of the International Classification of Diseases (ICD) codes5.
However, during the initial national optional reporting period in the United States, NIHSS scores were documented through administrative claims in only 15% of hospital admissions for acute ischemic stroke5.
A notable gap in the existing literature pertains to an evaluation of the accuracy of ICD-based NIHSS scores when compared to clinical NIHSS scores. The incorporation of ICD-10 based NIHSS is advocated for its effectiveness in hospital risk adjustment6 and in enhancing the measurement of quality of care and patient outcomes7. A recent study took a step in this direction by assessing the validity of ICD-10 based NIHSS scores using data from a single center stroke registry8. While this study revealed a high level of agreement between ICD-10 based NIHSS scores and clinical NIHSS scores (R2 = 0.86), it is worth noting that ICD-10 NIHSS scores were available for only a limited subset of patients (n =395 out of 1357, 29.1%).
Furthermore, it is essential to recognize that single institution studies do not fully capture the potential diversity of ICD coding practices across different states and patient populations 8. In light of these considerations, we examined the concordance between the NIHSS score reported for clinical assessment in the Paul Coverdell National Acute Stroke Program (PCNASP) and the score reported in Medicare claims database, with data gathered over multiple years across 9 states.
Objective
To evaluate the concordance between Acute Ischemic Stroke (AIS) severity scores from the clinically derived NIHSS scores obtained from PCNASP registry and administratively derived ICD-10 NIHSS scores obtained from Medicare claims.
Methodology
Data Source
We established a linkage between the Paul Coverdell National Acute Stroke Program (PCNASP) registry and Medicare Claims data. PCNASP aims to collect and track data on stroke-related cases to enhance patient care quality9. The PCNASP registry includes data from 2008 to 2020 across 9 US states (California, Georgia, Massachusetts, Michigan, Minnesota, New York, Ohio, Washington, and Wisconsin), and captures the NIHSS scores as reported by clinicians or hospital staff.
Medicare, a national health insurance program administered by the Centers for Medicare & Medicaid Services (CMS), primarily serves individuals aged 65 or older. From October 2016, Medicare implemented the use of ICD10’s R29.7xx codes to group NIHSS scores creating distinct severity categories that aid stroke outcome research10. The Medicare Provider Analysis and Review (MEDPAR) database contains extensive information about beneficiaries and as well a broad range of information, including patient demographics, admission and discharge dates, diagnosis, and procedure codes, provider identifiers, comorbidities, and, since 2016, measurement of stroke severity through ICD-10 based NIHSS scores11. This dataset is utilized for administrative purposes s well as for healthcare research.
Linking Databases
Due to the lack of common unique patient identifiers across the two databases, we applied a validated probabilistic matching strategy to link individuals in the PCNASP and Medicare datasets12. This approach utilized variables such as age, gender, admission and discharge dates, diagnosis code, hospitals, and state. After the linking process, we retained only those patients with unique matches, excluding cases where PCNASP IDs corresponded to multiple Medicare Beneficiary IDs, and vice versa.
Study Population
With the linked dataset, we selected patients aged 65 or older who were hospitalized for AIS and discharged under ICD-10 codes within the I63 groups. 13. Patients were required to have both PCNASP (clinical) and Medicare (ICD-10-based) NIHSS scores recorded. Since the ICD-10-based NIHSS score codes were introduced with the ICD-10th revision, we limited our analysis to cases from 2016 to 2019 that used ICD-10 diagnosis and procedure codes. The cut-off in 2019 was chosen to avoid the COVID19 era and beyond
Measurements
NIHSS scores are the primary outcome measured. The scores range from 0-42 with increasing values denoting more severe deficits14. In ICD-10, the R29.7xx code group allows for direct input of integer values in that range and reflects the identical scoring criteria.
Per CMS guidelines, while stroke ICD-10 diagnosis code must be documented by the treating physician, documentation of ICD-10 NIHSS scores may be based on medical record documentation from any healthcare providers involved in the patient’s care15.
Analysis Plan
We assessed the concordance between the NIHSS scores documented in PCNASP and in Medicare claims data in the MEDPAR files. For our analysis in Medicare data, our main emphasis was on the ICD-10 NIHSS score documented at the moment of admission. Any instances where multiple ICD-10 NIHSS scores were present at admission or where there was no ICD-10 NIHSS score at admission were excluded from our analysis.
Statistical Analysis
Our analysis focused on evaluating the concordance between the ordinal values of the PCNASP and Medicare NIHSS scores. We assessed the concordance of these two scores using the Intraclass Correlation Coefficient (ICC3) with a two-way mixed-effects model16. Lastly, we calculated discordance as the absolute difference between the the PCNASP and Medicare NIHSS scores. Our study adhered to the ethical guidelines outlined by the Massachusetts General Hospital Institutional Review Board (IRB) and obtained IRB approval under the protocol number 2020P003963. We followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines for observational studies17 (Supplementary materials).
Results
From an initial dataset of 233,908 patients, 66,837 patients with 67,197 AIS admissions met our inclusion criteria. These patient data came from 477 hospitals across nine states, with state-specific demographics and results provided in the Supplementary materials (Table S1)
Table 1 details the breakdown of PCNASP (clinical) and Medicare (ICD-10-based) NIHSS scores and their demographic characteristics. The mean age was 79.1 ± 8.67 years. Our population was predominantly white (81.0%).
The mean PCNASP clinical NIHSS score was 7.26 (95%CI: 7.20 – 7.32), while the mean Medicare ICD-10-based NIHSS score was 7.40 (95%CI: 7.34 – 7.46). Both scores had the same median of 4.0 with interquartile range of (2.0 - 11.0).
The two scores demonstrated high concordance, as indicated by ICC3 of 0.93. As illustrated in Figure 1, the Bland-Altman plot demonstrates a minimal mean difference between the clinical and ICD-10 NIHSS scores, consistent with a high level of agreement rates.
In state-specific data, New York led in AIS cases (n = 15,080), followed by Ohio (n = 11,364), and Washington with the lowest count (n = 3,568) (Table S1). Notably, California recorded the highest mean clinical and ICD-10 NIHSS scores at 8.32 and 8.29, respectively, while Minnesota reported the lowest clinical NIHSS of 6.44, and Massachusetts the lowest ICD-10 NIHSS of 6.52.
The discordance, which is the absolute difference between the PCNASP and Medicare NIHSS scores, was 1.10 points (95% CI: 1.08 – 1.12) for the cohort. This discordance varied by state, with the highest average absolute score difference observed in California with 1.55 points (95% CI: 1.47 – 1.63) and the lowest in Massachusetts with 0.61 points (95% CI: 0.55 – 0.65) (Table S1).
Discussion
In our investigation involving 66,837 AIS patients from 477 hospitals across nine states, we probed the concordance between clinical NIHSS score as registered in the Paul Coverdell Registry and its ICD-10-based administrative records counterpart in Medicare claims. Despite state-specific variations, our cohort consistently demonstrated an overall high concordance score of 0.93. Notably, our findings aligned with a smaller study that highlighted minimal discordance between the two scores in the state of New York8.
Bland-Altman plot revealed that the association between stroke severity and documentation concordance was non-linear. Particularly, we observed greater consistency in documentation at the extreme ends of the stroke severity spectrum. This could be due to the clinical characteristics and consequences of the stroke being more distinct and easier to document accurately at the extreme ends of the severity spectrum.
Our study identified regional and state-specific discordance discrepancies, partly explained by variations in patient populations and demographics. However, it is worth stressing that participation in the PCNASP database is voluntary and different numbers of hospitals participated in each state over time. Additionally, potential differences in stroke assessment, and in the ways through which claims are filed to Medicare by providers could also vary by location and practice 18–20.
While our approach of treating stroke severity score as a ordinal variable facilitated a rigorous statistical analysis, it’s essential to acknowledge that the slight variations we observed may not carry substantial clinical significance. Nevertheless, these documentation differences could become significant in borderline values where minimal score variations would result in change in stroke severity with subsequent reimbursement repercussions.
In addition, it is likely that with higher clinical NIHSS scores, larger absolute numerical variances can be noted as there is more granularity (i.e., more points) available in the score for strokes of greater. Therefore, larger variations for patients with higher scores may carry the same clinical impact as smaller variations in the less severe cases. Limitations
Our study predominantly analyzed data from a substantial population sample spanning 2016 to 2019, a period preceding the COVID-19 pandemic. This timeframe allows for an insightful examination of stroke-related documentation and claims in a pre-pandemic context. It is important to note, however, that stroke presentations, outcomes, and associated documentation practices have evolved during and post-COVID21–24. While these changes may influence the applicability of our findings, they remain relevant for understanding pre-pandemic trends and could offer valuable insights for future healthcare practices.
Moreover, our outcome measurements within the pre-pandemic period are not without limitations. The reliance on administrative documentation, often constrained by time and system complexities, introduces potential inaccuracies10,18,25–27. These limitations underscore the need for cautious interpretation of our findings. Furthermore, the generalizability of our results is restricted to the healthcare settings and regions included in our study. Variations in patient demographics, stroke care protocols, and documentation practices across different healthcare institutions and regions may limit the applicability of our findings to other contexts.
Conclusion
Our study revealed a remarkable alignment between clinically assessed NIHSS scores and ICD-10 NIHSS scores derived through administrative means, thus bolstering the validity of the latter as a viable proxy for gauging stroke severity. Although certain clinical and demographic factors did exhibit a slight discordance, their practical clinical significance is arguably minimal.
Conducting extensive, multi-center analyses on a larger scale with the ICD-10 NIHSS score serving as a gauge of stroke severity can provide valuable insights into stroke epidemiology and further advancements in stroke management.
Data Availability
Data are not available for public use
List: Non-standard Abbreviations and Acronyms
Abbreviation Meaning
- AIS
- Acute Ischemic Stroke
- ICD-10
- International Classification of Diseases, Tenth Revision Code
- LOS
- Length of Hospital Stay
- NIHSS
- National Institutes of Health Stroke Scale
- PCNASP
- Paul Coverdell National Acute Stroke Program
Funding
This study was funded by the NIH (1R01AG073410-01)
Disclosures
Mohamed Taha, Mamoon Habib and Victor Lomachinsky have no conflict of interest to disclose.
J.P.N. receives funding from NIH (2P01-AG032952, T32-AG51108).
L.H.S. is a scientific consultant regarding trial design and conduct on late window thrombolysis and a member of the steering committee for Genentech (TIMELESS NCT03785678); user interface design and usability to LifeImage;; member of a Data Safety Monitoring Board for Penumbra (MIND NCT03342664); principal investigator, multicenter trial of stroke prevention for Medtronic (Stroke AF NCT02700945); principal investigator, StrokeNet Network NINDS (New England Regional Coordinating Center U24NS107243).
D.B. Receives support from the NIH (5P30 AG062421-03, 2P01AG036694-11, 5U01AG032984-12, 1U24NS100591-04, 1R01AG058063-04, R01AG063975-03, 5R01AG062282-04, 3R01AG062282-03S1, 5R01AG066793-02, 1U19AG062682-03, 2P01AG032952-11, 2T32MH017119-34 Billing Agreement 010289.0001, 3P01AG032952-12S3, 1U01AG068221-01, 1U01AG076478-01, 5R01AG048351-05) and reports no conflict of interest.
L.M.V.R.M. receives support from the Centers for Diseases Control and Prevention (U48DP006377), the National Institutes of Health (NIH-NIA 1R01AG073410-01, R01AG082693, U01AG076478, P01 AG032952-11), and the Epilepsy Foundation of America and reports no conflict of interest.
Tables and Figures
STROBE Statement—checklist of items that should be included in reports of observational studies