This proof-of-concept randomized controlled trial evaluates a reinforcement learning (RL)-based clinical decision support system for intraoperative hemodynamic management during non-cardiac surgery.
Background: Intraoperative hypotension is common during general anesthesia and is associated with adverse outcomes including acute kidney injury, myocardial injury, and increased mortality. Current hemodynamic management relies on the individual anesthesiologist's clinical judgment, which can vary in consistency and timeliness. An RL-based system that learns optimal vasoactive agent dosing strategies from clinical data may help standardize and improve real-time hemodynamic decision-making.
Purpose: The primary objective is to evaluate whether the RL-based decision support system can learn intraoperative hemodynamic management decisions comparable to those of experienced anesthesiologists, as measured by the mean absolute error (MAE) between RL-recommended and clinician-executed vasoactive agent doses. The secondary objective is to assess whether RL-guided management improves clinical hemodynamic outcomes, including the time-weighted average of hypotension and the percentage of time with mean arterial pressure within the target range.
Participants: Adult patients (aged 18 to 85 years, ASA I-IV) scheduled for elective non-cardiac surgery under general anesthesia with continuous invasive arterial blood pressure monitoring.
Procedures: Participants will be randomly assigned (1:1) to one of two groups. In the RL-guided group, the anesthesiologist will receive real-time vasoactive agent dosing recommendations from the decision support system displayed on a bedside screen; the anesthesiologist retains full clinical autonomy over all final decisions. In the standard care group, the anesthesiologist will manage hemodynamics according to institutional standard practice without input from the system. The patient and the outcomes assessor will be masked to group assignment. Data collection covers the intraoperative period and 30-day postoperative follow-up.
Rationale: Intraoperative hypotension, commonly defined as mean arterial pressure (MAP) below 65 mmHg, occurs in up to 60% of patients undergoing general anesthesia and is independently associated with acute kidney injury, myocardial injury, stroke, and 30-day mortality. Current approaches to hemodynamic management are reactive and rely on individual clinician judgment. Machine learning-based prediction systems (e.g., the Hypotension Prediction Index) have shown potential in reducing hypotension burden, but do not provide actionable dosing recommendations. Reinforcement learning (RL) offers a fundamentally different approach: learning optimal sequential decision-making policies from clinical data. The RL-PRAIS system (Reinforcement Learning-based Perioperative Real-time Anesthesia Intelligent System) consists of a Transformer-based patient state encoder and a model-based RL framework (patient model plus policy model) that generates real-time vasoactive agent dosing recommendations.
Hypothesis: The RL-PRAIS system can learn intraoperative hemodynamic management decisions that approximate those of experienced anesthesiologists and, when deployed as a decision support tool, can reduce the burden of intraoperative hypotension compared to standard care.
Study Design: This is a prospective, dual-center, parallel-group, randomized, controlled, proof-of-concept superiority trial. The study follows a four-phase stepwise validation framework: (1) model development using retrospective electronic health records (n = 7,216), (2) retrospective validation (n = 75), (3) prospective deployment study (n = 40), and (4) proof-of-concept randomized controlled trial (n = 40, 20 per arm). This registration covers Phase 4 (the RCT).
Objectives:
Primary Objective: To evaluate the concordance between RL-recommended vasoactive agent dosing and attending anesthesiologist-executed dosing, quantified by mean absolute error (MAE) in norepinephrine-equivalent units.
Secondary Objectives: To compare intraoperative hemodynamic outcomes between the RL-guided and standard care groups, including time-weighted average of MAP below 65 mmHg (TWA-MAP < 65), percentage of time with MAP in the target range of 65-100 mmHg (MAP TIR), incidence of hypotensive events, cumulative vasoactive agent consumption, MAP variability, and clinician acceptance rate of RL recommendations.
Exploratory Objectives: To collect pilot data on 30-day major adverse cardiac or cerebrovascular events (MACCE), acute kidney injury, perioperative myocardial injury, PACU length of stay, hospital length of stay, and 30-day all-cause mortality.
Interventions:
Intervention Group (RL-Guided): The RL-PRAIS system provides real-time dosing recommendations for vasoactive agents (norepinephrine, phenylephrine, and ephedrine) based on continuous invasive arterial blood pressure monitoring and patient state features. Recommendations are displayed on a bedside screen at 1-minute intervals. The attending anesthesiologist retains full clinical autonomy and is responsible for all final dosing decisions.
Control Group (Standard Care): The anesthesiologist manages intraoperative hemodynamics according to institutional standard practice and clinical judgment, with continuous invasive arterial blood pressure monitoring but without input from the RL-PRAIS system.
Study Population: Adult patients aged 18 to 85 years, ASA physical status I-IV, scheduled for elective non-cardiac surgery under general anesthesia with continuous invasive arterial blood pressure monitoring. Surgery is expected to last at least 2 hours. Key exclusion criteria include emergency surgery, severe obesity (BMI >= 35), and allergy to study-related agents.
Statistical Considerations: The sample size of 40 participants (20 per arm) is designed for a proof-of-concept evaluation, consistent with the benchmarking study (Wang et al., Nature Medicine 2023, RL-DITR, n = 16 in feasibility RCT). The primary analysis will compare MAE between groups using a two-sided paired or independent t-test (or non-parametric equivalent). Secondary hemodynamic outcomes will be compared using the Wilcoxon rank-sum test. Exploratory outcomes will be reported descriptively without formal hypothesis testing.
Study Type
INTERVENTIONAL
Allocation
RANDOMIZED
Purpose
TREATMENT
Masking
DOUBLE
Enrollment
40
An AI-based software system that provides real-time, on-screen recommendations to the anesthesiologist regarding drug administration and adjustments to maintain optimal patient physiological parameters.
Standard anesthesia management according to institutional guidelines. The attending anesthesiologist makes all clinical decisions based on their expertise and judgment without input from the investigational AI system.
Peking University People's Hospital
Beijing, Beijing Municipality, China
Beijing Tsinghua Changgung Hospital
Beijing, Beijing Municipality, China
Mean Absolute Error (MAE) of RL-Recommended Vasoactive Agent Dosing Versus Anesthesiologist-Executed Dosing
The primary outcome is the mean absolute error (MAE) between vasoactive agent doses recommended by the RL-PRAIS system and doses actually administered by the attending anesthesiologist. Vasoactive agents (norepinephrine, phenylephrine, ephedrine) are converted to norepinephrine-equivalent units (NEq, mcg/kg/min). MAE is computed at each 1-minute decision epoch during hemodynamic optimization and averaged across all epochs per patient. Concordance rate is also reported as the proportion of epochs where the RL-recommended dose falls within a clinically acceptable range of the executed dose.
Time frame: From the onset of hemodynamic optimization (15 minutes after surgical incision) until surgical wound closure
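The primary outcome described above can be illustrated with a minimal Python sketch. It assumes doses have already been converted to norepinephrine-equivalent units and aligned per 1-minute epoch. The `tolerance` parameter is hypothetical: the record does not define the "clinically acceptable range", so an absolute band in mcg/kg/min is used here purely for illustration.

```python
from statistics import fmean


def mae_and_concordance(recommended, executed, tolerance):
    """Return (MAE, concordance rate) for one patient.

    recommended, executed: NEq doses (mcg/kg/min), one value per 1-minute epoch.
    tolerance: ASSUMED absolute band (mcg/kg/min) standing in for the
               "clinically acceptable range", which the record does not specify.
    """
    if len(recommended) != len(executed) or not recommended:
        raise ValueError("dose series must be non-empty and of equal length")
    # Absolute dose difference at each decision epoch.
    errors = [abs(r - e) for r, e in zip(recommended, executed)]
    mae = fmean(errors)
    # Fraction of epochs where the recommendation lies within the band.
    concordance = sum(err <= tolerance for err in errors) / len(errors)
    return mae, concordance


# Illustrative values only, not trial data.
mae, concordance = mae_and_concordance(
    recommended=[0.05, 0.10, 0.08, 0.00],
    executed=[0.05, 0.08, 0.12, 0.02],
    tolerance=0.025,
)
```

A relative band (for example, a percentage of the executed dose) would be an equally plausible reading of "clinically acceptable range"; the absolute band is only the simpler choice for a sketch.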
Time-Weighted Average of Mean Arterial Pressure Below 65 mmHg (TWA-MAP < 65 mmHg)
The time-weighted average of hypotension during the hemodynamic optimization period, calculated as the area under the MAP threshold of 65 mmHg (depth in mmHg multiplied by duration in minutes) divided by the total hemodynamic optimization time (minutes). This composite metric integrates both the severity and duration of intraoperative hypotension. Continuous invasive arterial blood pressure data sampled at 5-second intervals are used for calculation via the trapezoidal rule. A lower value indicates better hemodynamic management.
Time frame: From the onset of hemodynamic optimization (15 minutes after surgical incision) until surgical wound closure
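As a rough illustration of the TWA calculation described above, the following Python sketch integrates the depth below the threshold with the trapezoidal rule over evenly spaced samples. The function name and the handling of threshold crossings (depth is clipped at zero at each sample, without interpolating the exact crossing time) are assumptions, not details from the record.

```python
def twa_below_threshold(map_values, threshold=65.0, dt_seconds=5.0):
    """Time-weighted average of MAP below `threshold`, in mmHg.

    map_values: MAP samples (mmHg) at a fixed interval of `dt_seconds`.
    """
    if len(map_values) < 2:
        raise ValueError("at least two samples are needed")
    # Depth below the threshold at each sample; zero when MAP is at or above it.
    depth = [max(0.0, threshold - m) for m in map_values]
    dt_min = dt_seconds / 60.0
    # Trapezoidal rule: area in mmHg * minutes.
    area = sum(
        (depth[i] + depth[i + 1]) / 2.0 * dt_min
        for i in range(len(depth) - 1)
    )
    total_minutes = dt_min * (len(depth) - 1)
    return area / total_minutes


# Illustrative values only, not trial data.
twa_65 = twa_below_threshold([70, 60, 60, 70])
twa_55 = twa_below_threshold([70, 60, 60, 70], threshold=55.0)
```

Because the threshold is a parameter, the same sketch would apply unchanged to other cut-offs such as 60 or 55 mmHg.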
Percentage of Time with MAP in Target Range (MAP TIR 65-100 mmHg)
The proportion of total hemodynamic optimization time during which mean arterial pressure remains within the target range of 65 to 100 mmHg, expressed as a percentage. A higher percentage indicates more stable hemodynamic control.
Time frame: From the onset of hemodynamic optimization (15 minutes after surgical incision) until surgical wound closure
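The time-in-range percentage above might be computed as in this small Python sketch. It assumes evenly spaced samples, so that the share of samples approximates the share of time, and it treats both bounds as inclusive; neither detail is stated in the record.

```python
def map_time_in_range(map_values, low=65.0, high=100.0):
    """Percentage of evenly spaced MAP samples within [low, high] mmHg."""
    if not map_values:
        raise ValueError("map_values must be non-empty")
    # With a fixed sampling interval, sample share approximates time share.
    in_range = sum(low <= m <= high for m in map_values)
    return 100.0 * in_range / len(map_values)


# Illustrative values only, not trial data.
tir = map_time_in_range([60, 65, 80, 100, 105])  # 3 of 5 samples in range
```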
Incidence of Intraoperative Hypotensive Events
The number of discrete hypotensive events per patient, defined as episodes where MAP falls below 65 mmHg for at least 1 minute continuously. An event is considered terminated when MAP returns to 65 mmHg or above for at least 1 minute.
Time frame: From the onset of hemodynamic optimization (15 minutes after surgical incision) until surgical wound closure
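The event definition above amounts to a small state machine, sketched here in Python under stated assumptions: samples arrive at a fixed 5-second interval, each sample is taken to represent one full interval (so 12 consecutive samples count as 1 minute), and a below-threshold run shorter than 1 minute is discarded as soon as MAP recovers. The function name and parameters are illustrative.

```python
import math


def count_hypotensive_events(map_values, threshold=65.0,
                             dt_seconds=5.0, min_duration_seconds=60.0):
    """Count discrete hypotensive events in an evenly sampled MAP series."""
    # Number of consecutive samples that make up the minimum duration.
    need = math.ceil(min_duration_seconds / dt_seconds)
    events = 0
    in_event = False
    below_run = 0  # consecutive samples below the threshold
    above_run = 0  # consecutive samples at or above the threshold
    for m in map_values:
        if m < threshold:
            below_run += 1
            above_run = 0
            # An event starts once MAP has stayed low for the minimum duration.
            if not in_event and below_run >= need:
                in_event = True
                events += 1
        else:
            above_run += 1
            below_run = 0
            # An event ends only after a sustained recovery.
            if in_event and above_run >= need:
                in_event = False
    return events


# Illustrative series: a brief 15-second recovery does not split the first event,
# and a 30-second dip is too short to count.
series = ([70] * 12 + [60] * 12 + [70] * 3 + [60] * 5 + [70] * 12
          + [60] * 6 + [70] * 12 + [60] * 12)
n_events = count_hypotensive_events(series)  # 2
```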
TWA-MAP Below 60 mmHg and Below 55 mmHg
Time-weighted average of MAP below 60 mmHg and below 55 mmHg, calculated using the same method as TWA-MAP < 65 mmHg but with lower thresholds. These metrics capture moderate and severe intraoperative hypotension, respectively.
Time frame: From the onset of hemodynamic optimization (15 minutes after surgical incision) until surgical wound closure
Cumulative Vasoactive Agent Consumption
Total cumulative dose of all vasoactive agents administered during the hemodynamic optimization period, converted to norepinephrine-equivalent units (NEq, mcg/kg). This metric serves as a safety indicator to assess whether RL-guided management leads to excessive or reduced pharmacological intervention compared to standard care.
Time frame: From the onset of hemodynamic optimization (15 minutes after surgical incision) until surgical wound closure
Mean Arterial Pressure Variability
Coefficient of variation (CV) of mean arterial pressure during the hemodynamic optimization period, calculated as the standard deviation of MAP divided by the mean MAP, expressed as a percentage. A lower CV indicates more stable hemodynamic control.
Time frame: From the onset of hemodynamic optimization (15 minutes after surgical incision) until surgical wound closure
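For the coefficient of variation defined above, a minimal Python sketch follows. The record does not say whether the sample or population standard deviation is used; the sample standard deviation (`statistics.stdev`) is assumed here.

```python
from statistics import fmean, stdev


def map_coefficient_of_variation(map_values):
    """CV of MAP as a percentage: standard deviation divided by mean, times 100.

    ASSUMPTION: sample standard deviation (n - 1 denominator).
    """
    if len(map_values) < 2:
        raise ValueError("at least two samples are needed")
    return 100.0 * stdev(map_values) / fmean(map_values)


# Illustrative values only, not trial data.
cv = map_coefficient_of_variation([70, 80, 90])
```

With the dense 5-second sampling described in this record, the difference between the sample and population formulas would be negligible over a multi-hour procedure.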
Clinician Acceptance Rate of RL Recommendations (Experimental Arm Only)
The proportion of RL-PRAIS dosing recommendations that are accepted and executed by the attending anesthesiologist without modification, expressed as a percentage of total recommendation epochs. Partial acceptance (dose modification in the recommended direction) is recorded separately.
Time frame: From the onset of hemodynamic optimization (15 minutes after surgical incision) until surgical wound closure
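One possible way to tabulate full and partial acceptance is sketched below in Python. The record does not define how "accepted without modification" or "in the recommended direction" are operationalized, so two assumptions are made: a recommendation counts as accepted when the executed dose equals it within a small numerical tolerance `eps`, and direction is judged relative to the dose running just before the epoch (`current`). All names are hypothetical.

```python
def acceptance_rates(current, recommended, executed, eps=1e-9):
    """Return (full acceptance %, partial acceptance %) over all epochs.

    current:     dose in effect just before each epoch (ASSUMED reference point)
    recommended: RL-recommended dose per epoch
    executed:    dose actually given per epoch
    """
    n = len(recommended)
    if n == 0 or len(current) != n or len(executed) != n:
        raise ValueError("series must be non-empty and of equal length")
    full = 0
    partial = 0
    for cur, rec, exe in zip(current, recommended, executed):
        if abs(exe - rec) <= eps:
            # Executed exactly as recommended.
            full += 1
        elif (rec - cur) * (exe - cur) > 0:
            # Dose moved in the recommended direction, but to a different value.
            partial += 1
    return 100.0 * full / n, 100.0 * partial / n


# Illustrative values only, not trial data.
full_pct, partial_pct = acceptance_rates(
    current=[0.05, 0.05, 0.05, 0.05],
    recommended=[0.10, 0.10, 0.10, 0.05],
    executed=[0.10, 0.08, 0.05, 0.05],
)
```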