Abstract
AImedReport is a proof-of-concept team-based documentation strategy that consolidates available AI research reporting guidelines and centrally tracks and organizes any information provided as part of following various guidelines. It functions to assist teams by a) outlining phases of the AI lifecycle and clinical evaluation; b) iteratively developing a comprehensive documentation deliverable and historical archive; and c) addressing translation, implementation, and accountability gaps. By acting as a hub for determining what information to capture, it helps navigate team responsibilities, simplify compliance with evaluation and reporting measures, and fulfill requirements to support clinical trial documentation and publications. Here, we give an overview of this system and describe how it can be used to address documentation and collaboration challenges in AI translation.
Introduction
The core of AI research in healthcare is carried out by AI data scientists, AI engineers, and clinicians; however, successfully evaluating and translating AI technologies into healthcare requires collaboration beyond this group. Throughout ideation, development, and validation, successful translation requires engaging experts from many domains, including AI ethicists, quality management professionals, systems engineers, and more.1-5 We found through a scoping review that prioritizing proactive evaluation of AI technologies, multidisciplinary collaboration, and adherence to investigation and validation protocols, transparency and traceability requirements, and guiding standards and frameworks is expected to help address present barriers to translation.6 However, Lu et al7 found, through a systematic review assessing clinical prediction model adherence to reporting guidelines, that no consensus exists regarding which model details are essential to report: some reporting items are commonly requested across reporting guidelines, whereas others are unique to specific guidelines. Unless there are clear, consistent, and unified best practices, together with communication and collaboration across domains, there will be gaps in development, accountability, and implementation.6-10 Documentation is a crucial part of reporting and translation, but its coordinated maintenance throughout the AI lifecycle remains a challenge.6,9-11
We have established a proof-of-concept team-based documentation strategy for AI translation to simplify compliance with evaluation and research reporting standards through the development of AImedReport, a reporting guideline documentation repository. AImedReport organizes available reporting guidelines by phase of the AI lifecycle, consolidates reporting items from different guidelines, assigns specific roles to team members, and guides which information to capture when knowledge is generated (Appendix A).
Method of Development
We established a centralized documentation repository by first conducting a scoping review6 to investigate and understand the existing landscape of AI documentation and available resources (reporting guidelines, protocols, standards, frameworks, etc.). The scoping review showed that documentation resources were fragmented across several reporting guidelines, which prompted us to consolidate and organize these resources into AImedReport, a tool that structures available reporting guidelines in accordance with the AI lifecycle, reduces repetitive documentation burden, and promotes knowledge continuity. AImedReport draws on six research reporting guidelines: CONSORT-AI,12 DECIDE-AI,13 ML Test Score,14 Model Card,15 SPIRIT-AI,16 and TRIPOD17 (Table 1). The items that make up each reporting guideline are included in AImedReport as “Reporting Items” and describe considerations for teams to document and maintain.
AImedReport was conceptualized in 2022 and designed in concert with the 2022 AI Evaluation Framework by Overgaard et al.,1 which outlines clinical AI research and development stages. This alignment was established to support compliance with reporting standards and to provide a reference for the entire AI development lifecycle, helping to inform development phases, engage stakeholders, and support interpretability, knowledge continuity, transparency, and trust. Although it is based on the Overgaard et al.1 framework, AImedReport is versatile enough that it could potentially suit other frameworks, such as van der Vegt’s18 SALIENT framework for broader AI implementation.
Each Reporting Item was mapped to one of the five phases of the AI Evaluation Framework1 (Prepare, Develop, Validate, Deploy, and Maintain) to streamline documentation when knowledge is generated. The “Prepare” phase focuses on metadata related to the owner, defining the model’s purpose and clinical impact, data preparation, and planning for model development. The “Develop” phase centers on model development and evaluation, usability related to inputs, risk and bias assessment, and protocol development for validation studies. The “Validate” phase catalogs information about how the model validation was designed and executed, summative usability testing, generation of user education, and planning for deployment. The “Deploy” phase focuses on clinical validation and generating training materials. Lastly, the “Maintain” phase covers planning for post-deployment surveillance and maintenance as well as quality monitoring and auditing. Reporting Items were grouped into these five phases and then further classified into subgroups by identifying common themes (e.g., Prepare-Purpose and Clinical Impact; Develop-Model Development and Evaluation; Deploy-Clinical Validation). For each Reporting Item, we also defined which team or team members need to be involved at each phase and in what ways (e.g., reporting, maintaining documentation, or utilization), as shown in Table 2.
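The phase and subgroup mapping described above can be illustrated with a minimal Python sketch. It is not the AImedReport implementation, which is housed in a spreadsheet. The phase names and the three subgroup names come from the text; the class names, the helper `group_by_phase`, the item descriptions, and the pairing of each guideline with a phase are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    """The five lifecycle phases named in the text, in order."""
    PREPARE = "Prepare"
    DEVELOP = "Develop"
    VALIDATE = "Validate"
    DEPLOY = "Deploy"
    MAINTAIN = "Maintain"


@dataclass(frozen=True)
class ReportingItem:
    guideline: str    # source reporting guideline, e.g. "TRIPOD"
    phase: Phase      # lifecycle phase the item is mapped to
    subgroup: str     # common theme within the phase
    description: str  # what the team should document (hypothetical wording)


def group_by_phase(items):
    """Index items as phase -> subgroup -> list of items, keeping phase order."""
    grouped = {phase: defaultdict(list) for phase in Phase}
    for item in items:
        grouped[item.phase][item.subgroup].append(item)
    return grouped


# Hypothetical entries: the guideline-to-phase pairings and descriptions are
# illustrative assumptions, not rows copied from AImedReport.
ITEMS = [
    ReportingItem("Model Card", Phase.PREPARE,
                  "Purpose and Clinical Impact", "Intended use of the model"),
    ReportingItem("TRIPOD", Phase.DEVELOP,
                  "Model Development and Evaluation", "Model performance measures"),
    ReportingItem("DECIDE-AI", Phase.DEPLOY,
                  "Clinical Validation", "Description of the clinical setting"),
]

if __name__ == "__main__":
    for phase, subgroups in group_by_phase(ITEMS).items():
        for subgroup, entries in subgroups.items():
            for entry in entries:
                print(f"{phase.value}-{subgroup}: [{entry.guideline}] {entry.description}")
```

Keeping the source guideline on every item means a phase-ordered view like this one could also be filtered back into a per-guideline checklist, which is one way a single record might serve several reporting guidelines at once.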
Discussion
The interactions among AI technologies, their users, and the implementation environments define the overall potential effectiveness of AI interventions within healthcare, especially because these tools are complex interventions designed as clinical decision support systems, not autonomous agents.8,19,20 A tailored, step-by-step approach may support the transition of AI technologies from being evaluated on statistical performance to being evaluated on clinical validity. To address this translational gap, AImedReport was developed to assist teams in several key areas, including a) outlining phases of the AI lifecycle and clinical evaluation, b) developing a comprehensive documentation deliverable and historical archive, and c) addressing translation, implementation, and accountability gaps. This is achieved by consolidating the existing landscape of research reporting guidelines into a repository that acts as a centralized documentation hub and provides a standardized list of considerations and accountability assignments as the solution advances across the development lifecycle. AImedReport (Appendix A) is presently a prototype tool housed in a spreadsheet, but it is planned to be made available as a Web resource and a software platform. This will likely further enhance the tool’s usability, reproducibility, and convenience by automating the documentation process, supporting task completion, generating deliverables in accordance with relevant reporting measures, and making communication and updates to model documents centrally available across teams.
Introducing such a platform not only allows for transparent communication of evaluation and reporting measures but also accommodates the changes and modifications anticipated during development and maintenance. Each “Reporting Item” can be assigned to a team or team member to define who is responsible, accountable, consulted, and informed; those assigned can then use the Reporting Item description as a reference to satisfy their role.21 For example, project managers, user experience researchers, and machine learning operations (MLOps) teams can contribute the model overview, goals, and future state from their respective perspectives and reference one another’s vision. Similarly, data scientists, AI ethicists, informatics teams, and clinical practice committees may use documented demographic data of patient populations to assess items such as bias, differential model performance, appropriate clinical location, and potential clinical workflow location. During deployment and maintenance, the primary user and updater of the documentation will be an MLOps team, which ensures that requirements set by previous groups are met and monitors input and output metrics for drift, volume, and appropriate use. The tool can also facilitate interoperability between organizations: because it provides a standardized format, documentation can be transferred across organizations and research governing bodies for consumption, auditing, and monitoring. AImedReport also serves as a source of information describing completed evaluation and research reporting measures and can therefore fulfill reporting requirements to support clinical trial documentation and other publications. Additional descriptions of roles and responsibilities included within AImedReport can be found in Appendix B.
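The responsible, accountable, consulted, and informed assignment described above could be recorded per Reporting Item as in the following minimal Python sketch. The names `Raci`, `ItemAssignment`, `assign`, `teams_with`, and `problems` are hypothetical, as is the particular assignment of teams shown. The check for exactly one accountable team reflects a common convention for this kind of responsibility matrix and is an assumption, not a rule stated for AImedReport.

```python
from dataclasses import dataclass, field
from enum import Enum


class Raci(Enum):
    RESPONSIBLE = "R"
    ACCOUNTABLE = "A"
    CONSULTED = "C"
    INFORMED = "I"


@dataclass
class ItemAssignment:
    item: str                                  # Reporting Item description
    roles: dict = field(default_factory=dict)  # team name -> Raci role

    def assign(self, team, role):
        """Record one team's role; returns self so calls can be chained."""
        self.roles[team] = role
        return self

    def teams_with(self, role):
        """Teams holding the given role, in alphabetical order."""
        return sorted(team for team, r in self.roles.items() if r is role)

    def problems(self):
        """Return human-readable gaps in the assignment (empty list if none)."""
        issues = []
        if not self.teams_with(Raci.RESPONSIBLE):
            issues.append("no responsible team")
        # Assumption: one accountable owner per item (common convention).
        if len(self.teams_with(Raci.ACCOUNTABLE)) != 1:
            issues.append("expected exactly one accountable team")
        return issues


# Hypothetical assignment for the demographic-data example in the text.
demographics = (
    ItemAssignment("Demographic data of patient populations")
    .assign("Data scientists", Raci.RESPONSIBLE)
    .assign("Clinical practice committee", Raci.ACCOUNTABLE)
    .assign("AI ethicists", Raci.CONSULTED)
    .assign("Informatics team", Raci.CONSULTED)
)

if __name__ == "__main__":
    print(demographics.teams_with(Raci.CONSULTED))
    print(demographics.problems())
```

A check such as `problems()` is one way a software version of the tool might flag Reporting Items that still lack an owner before a phase is closed out.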
This paper focused on describing the theorization and development of AImedReport as a proof of concept to aid in evaluating, consolidating, and understanding available documentation resources to support AI reporting and facilitate communication across a multidisciplinary team. AImedReport primarily concentrated on research reporting guidelines to address the immediate gaps identified within documentation practices. We note recent progress as the field rapidly advances towards enhancing implementation strategies within multidisciplinary teams. For example, in a study by van der Vegt et al.18 titled “Implementation Frameworks for End-to-End Clinical AI: Derivation of the SALIENT Framework,” an extensive mapping exercise was conducted to synchronize various guidelines with an AI implementation framework. We suggest that AImedReport could further contribute to such implementation endeavors as a valuable resource. Planned future work will continue to converge with and align to various frameworks, such as the SALIENT framework18 and ABCDS,22 and with organizations, including the Office of the National Coordinator,23 Food & Drug Administration,24 Coalition for Health AI,25 National Academy of Medicine,5 Health AI Partnership,26 National Institute of Standards and Technology,27 World Health Organization,28 and others.
We believe that AImedReport can be used in its current formative state by researchers and healthcare organizations to adhere to evaluation and research reporting standards, as well as to bridge some of the reporting and documentation requirements for products necessitating design controls under good manufacturing practices of the quality system regulation, such as those that may be Software as a Medical Device (SaMD).28-30 Future iterations of AImedReport will better align translational science and regulatory science so that documentation can be used directly by teams pursuing regulated pathways, aligning with the information needed by regulatory review groups, accreditation commissions, and regulatory bodies (e.g., Food & Drug Administration).
Next Steps and Conclusion
Our multidisciplinary team developed AImedReport as a strategic effort to address collaboration and documentation challenges in AI translation. AImedReport functions to assist teams by a) outlining phases of the AI lifecycle and clinical evaluation, b) iteratively developing a comprehensive documentation deliverable and historical archive, and c) addressing translation, implementation, and accountability gaps. By consolidating the existing landscape of research reporting guidelines into a repository, AImedReport acts as a centralized documentation hub that provides a standardized list of considerations and accountability assignments to guide information capture when knowledge is generated and to simplify compliance with evaluation and reporting measures as AI technologies advance across the lifecycle. Completed measures documented within AImedReport may also serve as a source of information to fulfill reporting requirements to support clinical trial documentation and other publications. The integration of AImedReport into existing IT infrastructure and reporting platforms has undergone phased development, starting with the creation of a Model Documentation Framework presented at the AMIA 2022 Clinical Informatics Conference, refined through feedback from the Coalition for Health AI in 2022,31 and forming the foundation for collaborative efforts across various AI evaluation considerations. Mayo Clinic’s regulatory and systems engineering teams are adapting the AImedReport framework to fit within regulatory infrastructure, aiming to scale multidisciplinary reporting for enterprise-wide AI applications.
This integration process involves continued interdisciplinary collaboration and evaluation to ensure scalability and applicability across Mayo Clinic departments and disciplines. Future work will include expanding AImedReport beyond a proof-of-concept phase and supporting various frameworks and organizations to enhance usability, including direct alignment of translational and regulatory sciences through FDA SaMD documentation.
Data Availability
All data produced in the present work are contained in the manuscript.
https://docs.google.com/spreadsheets/d/1jenXP5miRxcteV6XRU71e-A7sJB46Ztz/edit#gid=603674031
Appendix
Appendix A AImedReport
AImedReport comprises reporting items organized by product lifecycle phase to guide translation and promote transparent and explainable AI/ML-based MMS documentation. The live document is accessible via the link listed under Data Availability.
Appendix B Example AI Research Team Roles and Responsibilities
Footnotes
Financial support and conflict of interest disclosure: None to disclose.
Abstract Presentation: AMIA 2022 Clinical Informatics Conference; Houston, TX; May 2022
Revisions based on reviewer feedback
Abbreviations
- AI: artificial intelligence
- MLOps: machine learning operations
- UX: user experience