The goal of this randomized clinical trial is to learn whether an "ambient AI scribe" (Voa Health) can reduce documentation burden and improve physician well-being and patient experience in outpatient clinics. The AI scribe listens to the audio of the consultation and produces a draft of the clinical note that the physician reviews and edits. In this study, consultations are randomized to 2 groups: usual documentation (without AI) or documentation assisted by the AI scribe. Adult patients seen in participating clinics, and their physicians, are invited to take part. For both groups, the consultation audio is recorded and, at the end of the visit, physicians and patients complete short questionnaires about well-being, workload, communication, empathy, and satisfaction. The questionnaires are based on internationally used scales (such as PFI, Mini-Z, NASA-TLX, CARE, PSQ-18, and CAT) but adapted to keep them brief and feasible in routine care. The main questions are whether the AI scribe lowers the time and effort needed to document the visit, improves physician professional fulfillment and reduces burnout, and whether it affects how patients perceive the communication, empathy, and overall quality of the consultation. No drugs or devices are being tested. The results are expected to guide hospitals on the safe and effective use of ambient AI scribes in real-world clinical practice.
This study is a randomized controlled trial designed to assess the impact of an ambient artificial-intelligence (AI) scribe on physician well-being, documentation workload, and patient experience in routine outpatient care.
The intervention consists of using the Voa Health ambient AI scribe during clinical encounters. The system records the audio of the consultation and generates a structured draft clinical note in real time, aligned with specialty-specific templates that reflect the routine workflow of each clinic (for example, different templates for general cardiology, heart failure, and dyslipidemia). At the end of the visit, the physician reviews, edits, and signs the draft in the electronic medical record (EMR), remaining fully responsible for the accuracy and completeness of the documentation.
In the control condition, physicians conduct consultations and document encounters using their usual methods without AI support. For study purposes, audio may still be recorded in the control arm, but no AI-generated note is displayed or used by the clinician.
The unit of randomization is the individual consultation. For participating physicians, eligible visits are automatically allocated to one of two parallel arms: (1) usual documentation without AI and (2) documentation assisted by the ambient AI scribe. Randomization is designed to preserve the existing organization of each clinic and to avoid interference with scheduling or patient flow. Clinical care, diagnostic and therapeutic decisions, and follow-up procedures are not dictated by the protocol and follow usual practice; the only experimental element is the use (or non-use) of the AI scribe for documentation and the collection of audio and questionnaires.
Adult patients seen in participating outpatient clinics, and their physicians, are invited to take part. After informed consent, the entire consultation is audio-recorded.
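The record does not state how the automatic, consultation-level allocation is generated. As an illustration only, the sketch below uses permuted blocks, one common way to keep two parallel arms balanced. The arm labels, block size, seed, and the function name `blocked_allocation` are assumptions made for this example and are not taken from the protocol.

```python
import random

# Hypothetical arm labels; the registry only names the two conditions in prose.
ARMS = ("usual_documentation", "ai_scribe")


def blocked_allocation(n_consultations, block_size=4, seed=0):
    """Return a list of arm labels, one per consultation, in visit order.

    Assumption: permuted-block randomization with a 1:1 ratio. The actual
    trial may use simple, stratified, or another allocation scheme.
    """
    if block_size % len(ARMS) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # seeded so the sequence is reproducible
    allocation = []
    while len(allocation) < n_consultations:
        # Each block holds every arm equally often, in random order.
        block = list(ARMS) * (block_size // len(ARMS))
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n_consultations]


if __name__ == "__main__":
    for visit_number, arm in enumerate(blocked_allocation(8), start=1):
        print(visit_number, arm)
```

With a block size of 4, every run of four consecutive consultations contains two visits per arm, so the arms stay balanced even if recruitment stops early.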
Immediately after each visit, both patient and physician are asked to complete brief, structured questionnaires that capture the main outcomes of interest. To keep data collection feasible in a busy ambulatory setting, the instruments were built from subsets of items derived from internationally used scales, while keeping the number of questions per consultation small.
For physicians, items are drawn from the Professional Fulfillment Index (PFI), the Mini-Z 2.0 survey, and the 4-item Physician Task Load / NASA-TLX. These items assess professional fulfillment and burnout (physical and emotional exhaustion), perceived sufficiency of time for documentation, work in the EMR outside direct patient contact, perceived documentation burden, and temporal demand of the visit. Additional study-specific items evaluate the perceived quality and completeness of the final note, time required to edit the AI-generated draft, confidence that key clinical details were captured, occurrence of potential AI "hallucinations" (information not actually stated in the visit), and the perceived impact of documentation on attention to the patient.
For patients, questionnaires use items derived from the Consultation and Relational Empathy (CARE) Measure, the Patient Satisfaction Questionnaire Short-Form (PSQ-18), and the Communication Assessment Tool (CAT). These items cover domains such as active listening, understanding of patient concerns, clarity of explanations, adequacy of time spent with the physician, perceived empathy, overall satisfaction with care, and understanding of diagnosis and treatment. In the AI arm, one additional item specifically asks whether the use of AI during the consultation helped, did not change, or hindered the clarity of communication with the physician.
The primary outcomes are physician-reported well-being and perceived documentation workload when using the ambient AI scribe compared with usual documentation.
Key secondary outcomes include patient-reported experience and satisfaction, physician-rated quality and completeness of notes, time required for documentation and for editing AI-generated drafts, and the frequency and clinical relevance of AI-related documentation errors or hallucinations. All outcomes are measured at the level of the individual consultation, immediately after each visit.
The trial is initially conducted in multiple outpatient clinics of Hospital de Clínicas of the Federal University of Paraná (UFPR), Brazil, across different medical specialties. In each service, structured note templates are developed in collaboration with local clinical leaders so that the AI-generated drafts reflect the real-world flow of that specialty without changing the standard of care. Medical students and residents trained in the protocol may support consent and questionnaire administration under the supervision of attending physicians, to ensure consistent and feasible data collection.
Data are stored on secure, access-controlled servers, with linkage between audio recordings, questionnaires, and EMR notes managed through coded identifiers. A data monitoring committee, independent of the development team of the AI scribe, periodically reviews aggregated data for protocol adherence, data quality, and any safety concerns related to the use of AI in documentation (for example, systematic documentation errors that could potentially affect patient care). Because the intervention is limited to documentation support and clinicians remain responsible for all clinical decisions and for finalizing the notes, the overall risk of participation is considered minimal.
The protocol allows for future expansion to other outpatient services and collaborating centers that adopt the same randomization, data collection procedures, and outcome definitions.
The results are expected to provide pragmatic evidence on how ambient AI scribes can be implemented safely and effectively in real-world clinical practice, particularly regarding their impact on physician well-being, documentation workload, and the patient's experience of the consultation.
Study Type
INTERVENTIONAL
Allocation
RANDOMIZED
Purpose
HEALTH_SERVICES_RESEARCH
Masking
NONE
Enrollment
300
Use of an ambient artificial-intelligence (AI) scribe during outpatient consultations. The Voa Health system records the audio of the visit and generates a structured draft clinical note based on specialty-specific templates that follow the usual flow of each clinic. After the consultation, the physician reviews, edits, and signs the note in the electronic medical record. The AI does not make diagnostic or therapeutic decisions; it only assists documentation. All other aspects of clinical care follow routine practice.
Clinical documentation performed using usual methods without AI support (standard care). Physicians document the encounter in the electronic medical record as they normally do (typing, dictation, or handwritten notes as applicable). Audio of the visit may be recorded for study purposes, but no AI-generated draft note is shown to the clinician. After the consultation, physicians and patients complete the same brief questionnaires about workload, well-being, communication, empathy, and satisfaction.
Complexo Hospital de Clínicas da UFPR (CHC-UFPR)
Curitiba, Paraná, Brazil
RECRUITING
Physician documentation workload during the visit
Physician-reported documentation burden, measured immediately after each consultation using brief 5-point Likert-type items adapted from internationally used instruments (NASA-TLX, Mini-Z) and study-specific items. Items assess: (1) administrative burden of the consultation, (2) time available to focus on the patient, (3) mental demand of the consultation (adapted NASA-TLX), (4) interference of the documentation process with patient interaction, and (5) disruption of workflow due to documentation adjustments. Individual item scores and a composite burden score (mean of items; higher values = greater burden) will be compared between consultations with the ambient AI scribe and consultations with usual documentation without AI.
Time frame: Immediately after each outpatient consultation (same day)
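The composite burden score described for this outcome (mean of the 5-point items, higher values = greater burden) can be sketched as below. This is an illustration under stated assumptions: items are coded 1 to 5, and any item worded in the opposite direction (for example, "time available to focus on the patient") is reverse-scored before averaging. The record does not say whether or how items are reverse-scored, so the `reverse_items` parameter and the function name `composite_score` are hypothetical.

```python
from statistics import mean


def composite_score(item_scores, reverse_items=()):
    """Mean of 5-point Likert items for one consultation.

    item_scores: list of integers coded 1..5, in questionnaire order.
    reverse_items: zero-based positions of items to reverse-score
        (assumption; the registry entry does not specify this step).
    """
    if not item_scores:
        raise ValueError("at least one item score is required")
    oriented = []
    for position, score in enumerate(item_scores):
        if not 1 <= score <= 5:
            raise ValueError(f"item {position} is outside the 1-5 range: {score}")
        # Reverse-scoring maps 1<->5 and 2<->4 and leaves 3 unchanged.
        oriented.append(6 - score if position in reverse_items else score)
    return mean(oriented)


if __name__ == "__main__":
    # Made-up responses for one consultation, not trial data.
    print(composite_score([2, 4, 3, 2, 1], reverse_items=(1,)))
```

Validating the range before averaging keeps a miscoded response from silently shifting the composite.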
Physician well-being / exhaustion during the visit
Physician physical exhaustion immediately after the consultation, assessed with a single item derived from the Professional Fulfillment Index (PFI): "I feel physically exhausted after this consultation," rated on a 5-point agreement scale (strongly disagree to strongly agree; higher values = greater exhaustion). Scores will be compared between consultations with the ambient AI scribe and consultations with usual documentation without AI.
Time frame: Immediately after each outpatient consultation (same day)
Patient experience of communication and empathy
Patient-reported experience of the consultation, including active listening, understanding of concerns, clarity of explanations, perceived empathy, time spent with the physician, and overall satisfaction. Items are derived from the CARE Measure, PSQ-18, and Communication Assessment Tool (CAT), rated on 5-point Likert scales. A composite score (mean of item scores; higher values = better experience) will be compared between consultations with the ambient AI scribe and consultations with usual documentation without AI.
Time frame: Immediately after each outpatient consultation (same day)
Patient understanding of diagnosis and treatment
Single item asking patients how much they understood about their diagnosis and treatment after the consultation, rated on a 5-point scale from 1 (nothing) to 5 (completely). Scores will be compared between consultations with the ambient AI scribe and consultations with usual documentation without AI.
Time frame: Immediately after each outpatient consultation (same day)
Physician-rated quality and completeness of clinical notes
Physician global rating of the final clinical note for the consultation (clarity, organization, and completeness), using a 5-point scale adapted from Mini-Z (poor, marginal, satisfactory, good, excellent). Scores will be compared between consultations with the ambient AI scribe and consultations with usual documentation without AI.
Time frame: Immediately after finalizing documentation for each consultation (same day)
Time required for documentation outside direct patient contact
Physician self-reported time spent working on the electronic medical record outside direct patient contact for that consultation (for example, after the patient leaves the room), including review and editing of AI-generated notes when applicable. Time is rated in ordered categories (almost none; minor edits; moderate edits; extensive edits). Distributions of categories will be compared between the ambient AI scribe and usual documentation conditions.
Time frame: Immediately after each outpatient consultation (same day)
Proportion of consultations with AI-related hallucinations in documentation
Among consultations in the ambient AI scribe arm, physicians indicate whether the AI inserted any information in the draft note that was not actually mentioned in the consultation. The outcome is the proportion of AI-assisted consultations with at least one reported hallucination.
Time frame: Immediately after each AI-assisted consultation (same day)
Proportion of clinical notes with documentation errors in blinded external review
De-identified clinical notes from both study arms (AI scribe and usual documentation) will be randomly mixed and independently reviewed by at least two clinicians who are blinded to group allocation. Reviewers will classify whether each note contains any clinically relevant documentation error, including fabricated or incorrect information ("hallucinations") that does not match the plausible content of a standard outpatient visit. The primary variable is the proportion of notes with at least one such error in each arm; proportions will be compared between AI-scribe and usual documentation conditions.
Time frame: Within 30 days after finalizing each clinical note
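The record states that error proportions will be compared between arms but does not name a statistical method. As a minimal sketch, the function below computes each arm's proportion, the risk difference, and a normal-approximation (Wald) 95% confidence interval. The choice of a Wald interval, the function name `proportion_difference`, and the counts in the usage line are assumptions for illustration; the counts are invented and are not trial results. The sketch also ignores clustering of consultations within physicians, which a full analysis would likely need to address.

```python
import math


def proportion_difference(errors_ai, n_ai, errors_usual, n_usual, z=1.96):
    """Compare the proportion of notes with at least one error between arms.

    Returns (p_ai, p_usual, difference, (ci_low, ci_high)), where the
    difference is AI arm minus usual-documentation arm and the interval
    uses the unpooled normal approximation.
    """
    if n_ai <= 0 or n_usual <= 0:
        raise ValueError("each arm needs at least one reviewed note")
    p_ai = errors_ai / n_ai
    p_usual = errors_usual / n_usual
    diff = p_ai - p_usual
    # Standard error of a difference between two independent proportions.
    se = math.sqrt(p_ai * (1 - p_ai) / n_ai + p_usual * (1 - p_usual) / n_usual)
    return p_ai, p_usual, diff, (diff - z * se, diff + z * se)


if __name__ == "__main__":
    # Invented counts for illustration only.
    print(proportion_difference(errors_ai=10, n_ai=100, errors_usual=5, n_usual=100))
```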