Abstract
Verbatim transcription of qualitative data is a cornerstone of analytic quality and rigor, yet the time and energy required for such transcription can drain resources, delay analysis and hinder the timely dissemination of qualitative insights. In recent years, software programs have presented a promising mechanism to accelerate transcription, but the broad application of such programs has been constrained due to expensive licensing or “per-minute” fees, data protection concerns, and limited availability of such programs in many languages. In this article, we outline our process of developing and adapting a free, open-source, speech-to-text algorithm (Whisper by OpenAI) into a usable and accessible tool for qualitative transcription. Our program, which we have dubbed “Vink” for voice to ink, is available under a permissive open-source license (and thus free of cost). We assessed Vink’s reliability in transcribing authentic interview audio data in 14 languages, and identified high accuracy and limited correction times in most languages. A majority (9 out of 12) of reviewers evaluated the software performance positively, and all reviewers whose transcript had a word-error-rate below 20% (n=9) indicated that they were likely or very likely to use the tool in their future research. Our usability assessment indicates that Vink is easy-to-use, and we are continuing further refinements based on reviewer feedback to increase user-friendliness. With Vink, we hope to contribute to facilitating rigorous qualitative research processes globally by reducing time and costs associated with transcription, and expanding the availability of this transcription software into several global languages. With Vink running on the researcher’s computers, data privacy issues arising within many other solutions do not apply.
Summary box
What is already known on this topic: Transcription is a key element to ensure quality and rigor of qualitative data for analysis. Current practices, however, often entail high costs, variable quality, data privacy concerns, stress for human transcribers, or long delays of analysis.
What this study adds: We present the development and assessment of a transcription tool (Vink) for qualitative research drawing upon an open-source automatic speech recognition system developed by OpenAI and trained on multilingual audio data (Whisper). Initial validation in real-life data from 14 languages shows high accuracy in several languages, and an easy-to-use interface.
How this study might affect research, practice or policy: Vink overcomes limitations of transcription by providing a ready to use, open source and free-of-cost tool, with minimal data privacy concerns, as no data is uploaded to the web during transcription.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
Shannon McMahon, The Scientific Software Center Heidelberg (funded as part of the Excellence Strategy of the German Federal and State Governments), the National Institute of Allergy and Infectious Diseases, NIH, USA (U01AI152087) for the Rapid Research in Diagnostics Development for TB Network (R2D2 TB Network) as well as the Ministry of Science, Research and the Arts Baden-Wuerttemberg
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The reviewers of the automatically generated transcripts did only submit technical information about the transcript. They did not provide any personal information. All assessment for perceived usefulness was done anonymously and did not include any personal or individually identifiable information. The assessment for usability of the application was done within the research group. The institutional review board of the medical faculty, University of Heidelberg, Germany, therefore exempted this study from ethical review.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
Additional data are available upon reasonable request and after approval of our IRB Out of privacy concerns we did not share the audio recordings, nor the original and corrected transcripts of these audio recordings. This also applies to the audio recordings of the usability assessment. To request access please contact the corresponding author Hannah Tolle (h.tolle{at}stud.uni-heidelberg.de).