This study will evaluate whether Xuanwu-NeuroAid 2.0, a large language model for emergency neurology, can improve 30-day diagnostic quality in adults with acute neurological symptoms. Physicians will be randomly assigned to AI-assisted care or usual care. In the AI-assisted group, the model will provide diagnostic and management suggestions, while physicians will make all final clinical decisions. The usual-care group will receive standard emergency neurology care without large language model assistance.
This multicenter, prospective, cluster randomized trial will evaluate whether Xuanwu-NeuroAid 2.0, a domain-specific large language model for emergency neurology, can improve diagnostic quality in adults presenting with acute neurological symptoms. Physicians will be randomized to AI-assisted care or usual care. In the AI-assisted group, the model will provide diagnostic and management suggestions based on available clinical information, while physicians will remain responsible for all final clinical decisions. The model's recommendations may be disregarded when they are considered inappropriate. The usual-care group will receive standard emergency neurology care without large language model assistance.
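The cluster design described above, in which physicians rather than patients are the unit of randomization, can be sketched as follows. This is only an illustration: the record does not state the allocation ratio, stratification, or randomization software, so the 1:1 split, the fixed seed, and the names `randomize_physicians` and `arm_for_patient` are all assumptions.

```python
import random


def randomize_physicians(physician_ids, seed=0):
    """Assign each physician (the cluster) to one study arm.

    Assumption: simple 1:1 allocation with no stratification. The trial
    record does not specify the actual procedure.
    """
    rng = random.Random(seed)          # fixed seed makes the list reproducible
    shuffled = sorted(physician_ids)   # sort first so input order is irrelevant
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    allocation = {}
    for position, physician_id in enumerate(shuffled):
        allocation[physician_id] = "ai_assisted" if position < half else "usual_care"
    return allocation


def arm_for_patient(treating_physician_id, allocation):
    """A patient inherits the arm of the treating physician."""
    return allocation[treating_physician_id]
```

Because allocation happens at the physician level, every patient seen by the same physician falls in the same arm, which is what distinguishes this design from individual patient randomization.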
Study Type
INTERVENTIONAL
Allocation
RANDOMIZED
Purpose
HEALTH_SERVICES_RESEARCH
Masking
SINGLE
Enrollment
1,360
Xuanwu-NeuroAid 2.0 is a large language model used to support emergency neurology evaluation and management. It generates diagnostic and management suggestions based on available clinical information, including history, physical examination, laboratory results, and imaging data. Physicians may interact with the model multiple times, but remain responsible for all final clinical decisions. Its recommendations may be overridden when considered inappropriate.
Nanyang Nanshi Hospital
Nanyang, Henan, China
RECRUITING
Liuyang Jili Hospital
Guankou, Hunan, China
RECRUITING
Xuanwu Hospital, Capital Medical University
Beijing, China
NOT_YET_RECRUITING
Diagnostic-care quality risk
A composite outcome defined as the occurrence of any of the following: a diagnostic discrepancy within 30 days; unplanned medical care within 30 days; harms related to the index emergency care within 30 days; or all-cause death within 30 days.
Time frame: Day 30
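The composite definition above amounts to a logical OR over its four 30-day components. A minimal sketch, assuming each component has already been adjudicated to a yes/no value; the class and field names are hypothetical and do not come from the trial record.

```python
from dataclasses import dataclass


@dataclass
class ThirtyDayFollowUp:
    """Hypothetical per-patient record of the four component events."""
    diagnostic_discrepancy: bool
    unplanned_medical_care: bool
    care_related_harm: bool
    all_cause_death: bool


def diagnostic_care_quality_risk(follow_up: ThirtyDayFollowUp) -> bool:
    """Return True if any component event occurred within 30 days."""
    return any((
        follow_up.diagnostic_discrepancy,
        follow_up.unplanned_medical_care,
        follow_up.care_related_harm,
        follow_up.all_cause_death,
    ))
```

A patient with several component events still counts once toward the composite, since the outcome is binary per patient.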
Diagnostic discrepancy
Diagnostic discrepancy is defined as disagreement between the emergency department final diagnosis and the 30-day expert-adjudicated reference diagnosis.
Time frame: Day 30
Quality of diagnostic testing and management recommendations
The quality of diagnostic testing and management recommendations made during the index emergency department encounter will be assessed by blinded expert adjudicators after completion of 30-day follow-up, using a 5-point Likert scale. Higher scores indicate higher quality of diagnostic testing and management recommendations.
Time frame: Day 30
Patient satisfaction with emergency care
Patient satisfaction with emergency care will be assessed after the index emergency department encounter using a 5-point Likert scale, ranging from 1 to 5, with higher scores indicating greater satisfaction.
Time frame: Immediately after the index emergency department encounter, within 48 hours
Time spent per patient encounter
Time spent per patient encounter is defined as the duration from the start of physician evaluation to completion of the emergency department final diagnosis and disposition decision during the index emergency department encounter.
Time frame: Time from start of physician evaluation to completion of final diagnosis and disposition decision, within 48 hours.
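The duration defined above is the difference between two timestamps. A minimal sketch, assuming both are recorded as `datetime` values in the same time zone; the function name and the choice of minutes as the unit are assumptions, and the 48-hour window is not enforced here.

```python
from datetime import datetime


def encounter_minutes(evaluation_start: datetime, decision_complete: datetime) -> float:
    """Minutes from start of physician evaluation to completion of the
    final diagnosis and disposition decision."""
    if decision_complete < evaluation_start:
        raise ValueError("decision time precedes evaluation start")
    return (decision_complete - evaluation_start).total_seconds() / 60.0


# Example: evaluation starts at 10:00, disposition is decided at 10:45.
start = datetime(2025, 1, 1, 10, 0)
end = datetime(2025, 1, 1, 10, 45)
print(encounter_minutes(start, end))  # 45.0
```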
Clinician-reported workload
Clinician-reported workload will be assessed by the treating physician using a single-item 5-point Likert scale, ranging from 1 to 5. Higher scores indicate greater perceived workload.
Time frame: Immediately after each index emergency department encounter, within 48 hours.
EQ-5D-5L at 30 days
Health-related quality of life will be assessed using the EuroQol five-dimension five-level questionnaire. The EQ-5D-5L utility index score will be derived according to the applicable value set. The maximum score is 1, indicating full health; higher scores indicate better health status.
Time frame: Day 30