A Dataset of Robot-Patient and Doctor-Patient Medical Dialogues for Spoken Language Processing Tasks
MeDial-Speech is a new dataset intended to support spoken language processing in medical consultations. It includes over 111 hours of dialogue data from robot-patient and doctor-patient interactions, covering four health conditions. The dataset aims to improve the training and evaluation of medical AI systems.
- MeDial-Speech is a dataset designed for training Med-AIs in medical consultations.
- The dataset contains 111+ hours of speech data collected from realistic environments.
- It covers four health conditions: Lewy body dementia, heart failure, shoulder pain, and angina.
arXiv:2605.26747 (Computer Science > Artificial Intelligence), submitted on 26 May 2026
Authors: Heriberto Cuayahuitl, Grace Jang
Abstract: Large Language Models (LLMs) have brought huge improvements to Artificial Intelligence (AI), which can be applied to general-purpose tasks. However, their application to textual or spoken medical consultations is still an open research problem.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.