Domain-Specific Large Language Model Assistance for Emergency Neurological Diagnosis and Treatment (DEMAND)

Capital Medical University

1,360 enrolled

Overview

This study will evaluate whether Xuanwu-NeuroAid 2.0, a large language model for emergency neurology, can improve 30-day diagnostic quality in adults with acute neurological symptoms. Physicians will be randomly assigned to AI-assisted care or usual care. In the AI-assisted group, the model will provide diagnostic and management suggestions, while physicians will make all final clinical decisions. The usual-care group will receive standard emergency neurology care without large language model assistance.

This multicenter, prospective, cluster randomized trial will evaluate whether Xuanwu-NeuroAid 2.0, a domain-specific large language model for emergency neurology, can improve diagnostic quality in adults presenting with acute neurological symptoms. Physicians will be randomized to AI-assisted care or usual care. In the AI-assisted group, the model will provide diagnostic and management suggestions based on available clinical information, while physicians will remain responsible for all final clinical decisions and may disregard the model's recommendations when they consider them inappropriate. The usual-care group will receive standard emergency neurology care without large language model assistance.

Study Type

INTERVENTIONAL

Allocation

RANDOMIZED

Purpose

HEALTH_SERVICES_RESEARCH

Masking

SINGLE

Enrollment

1,360

Conditions

Nervous System Diseases

Eligibility

Sex: ALL

Min age: 18 Years


Inclusion Criteria:

* Age ≥18 years;
* Presentation to the emergency neurology service with acute neurological symptoms;
* Written informed consent provided by the patient or a legally authorized representative.

Exclusion Criteria:

* Presentation primarily for trauma;
* Pregnancy;
* Requiring immediate life-saving interventions;
* Estimated life expectancy of less than 30 days;
* Participation in another clinical trial within the previous 30 days or in a trial that could interfere with the study or outcome assessment;
* Any condition that, in the opinion of the investigators, would interfere with the conduct of the trial or the interpretation of the results.
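The criteria above can be read as a simple screening rule: a patient qualifies only if every inclusion criterion holds and no exclusion criterion applies. The following is a minimal illustrative sketch of that logic, not part of the trial protocol. The `Candidate` class and all of its field names are hypothetical, and each field is assumed to have already been judged by study staff.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Inclusion-related fields (hypothetical names)
    age_years: int
    acute_neuro_symptoms: bool  # presented to emergency neurology with acute symptoms
    written_consent: bool       # from the patient or a legally authorized representative
    # Exclusion-related fields (hypothetical names), defaulting to "not present"
    primarily_trauma: bool = False
    pregnant: bool = False
    needs_immediate_lifesaving_intervention: bool = False
    life_expectancy_under_30_days: bool = False
    conflicting_trial_participation: bool = False
    investigator_exclusion: bool = False


def is_eligible(c: Candidate) -> bool:
    """Return True if all inclusion criteria hold and no exclusion criterion applies."""
    included = c.age_years >= 18 and c.acute_neuro_symptoms and c.written_consent
    excluded = any([
        c.primarily_trauma,
        c.pregnant,
        c.needs_immediate_lifesaving_intervention,
        c.life_expectancy_under_30_days,
        c.conflicting_trial_participation,
        c.investigator_exclusion,
    ])
    return included and not excluded
```

For example, `is_eligible(Candidate(45, True, True))` is true, while the same patient with `pregnant=True` is excluded.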

Outcomes

Primary Outcomes

Diagnostic-care quality risk

A composite outcome defined as the occurrence of any of the following: a diagnostic discrepancy within 30 days; unplanned medical care within 30 days; harms related to the index emergency care within 30 days; or all-cause death within 30 days.

Time frame: Day 30
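Because the primary outcome is an "any of" composite, a patient counts as an event if at least one of the four components occurred within 30 days. The sketch below illustrates that rule only. The `FollowUp30d` class and its field names are hypothetical, each component is assumed to be fully ascertained, and the handling of missing data and the trial's actual statistical analysis are not addressed.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FollowUp30d:
    # One flag per component of the composite (hypothetical field names)
    diagnostic_discrepancy: bool
    unplanned_medical_care: bool
    care_related_harm: bool
    death: bool


def composite_event(f: FollowUp30d) -> bool:
    """True if any component of the composite occurred within 30 days."""
    return any([
        f.diagnostic_discrepancy,
        f.unplanned_medical_care,
        f.care_related_harm,
        f.death,
    ])


def composite_proportion(records: Iterable[FollowUp30d]) -> float:
    """Crude proportion of patients with a composite event (no adjustment for clustering)."""
    flags = [composite_event(r) for r in records]
    if not flags:
        raise ValueError("no follow-up records supplied")
    return sum(flags) / len(flags)
```

A patient with several components is counted once, so the composite proportion can be lower than the sum of the component proportions.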

Secondary Outcomes

Diagnostic discrepancy

Diagnostic discrepancy is defined as disagreement between the emergency department final diagnosis and the 30-day expert-adjudicated reference diagnosis.

Time frame: Day 30

Quality of diagnostic testing and management recommendations

The quality of diagnostic testing and management recommendations made during the index emergency department encounter will be assessed by blinded expert adjudicators after completion of 30-day follow-up, using a 5-point Likert scale. Higher scores indicate higher quality of diagnostic testing and management recommendations.

Time frame: Day 30

Patient satisfaction with emergency care

Patient satisfaction with emergency care will be assessed after the index emergency department encounter using a 5-point Likert scale, ranging from 1 to 5, with higher scores indicating greater satisfaction.

Time frame: Immediately after the index emergency department encounter, within 48 hours

Time spent per patient encounter

Time spent per patient encounter is defined as the duration from the start of physician evaluation to completion of the emergency department final diagnosis and disposition decision during the index emergency department encounter.

Time frame: Time from start of physician evaluation to completion of final diagnosis and disposition decision, within 48 hours.
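As an illustration of the definition above, encounter time is the difference between two timestamps. This is a minimal sketch, assuming both timestamps are recorded in the same time zone. The function and parameter names are hypothetical, and reporting the result in minutes is an assumption rather than something the record specifies.

```python
from datetime import datetime


def encounter_minutes(evaluation_start: datetime, decision_complete: datetime) -> float:
    """Minutes from start of physician evaluation to the final diagnosis and disposition decision."""
    if decision_complete < evaluation_start:
        raise ValueError("decision time precedes start of evaluation")
    return (decision_complete - evaluation_start).total_seconds() / 60
```

For example, an evaluation that starts at 10:00 and reaches a disposition decision at 10:45 on the same day gives 45.0 minutes.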

Clinician-reported workload

Clinician-reported workload will be assessed by the treating physician using a single-item 5-point Likert scale, ranging from 1 to 5. Higher scores indicate greater perceived workload.

Time frame: Immediately after each index emergency department encounter, within 48 hours.

EQ-5D-5L at 30 days

Health-related quality of life will be assessed using the EuroQol five-dimension five-level questionnaire. The EQ-5D-5L utility index score will be derived according to the applicable value set. The maximum score is 1, indicating full health; higher scores indicate better health status.

Time frame: Day 30

Locations (3)