The American Society of Anesthesiologists (ASA) Physical Status Classification System is widely used to assess perioperative risk, but it does not explicitly include frailty as a standardized variable. In daily clinical practice, anesthesiologists may implicitly incorporate frailty-related information into ASA classification based on individual clinical judgment, which may lead to variability between evaluators. In recent years, large language models (LLMs), a type of artificial intelligence, have been increasingly used in medical decision-support research. Unlike human clinicians, these models process information in a structured and explicit manner, without relying on intuition or implicit reasoning. The primary objective of this study is to compare ASA Physical Status classifications assigned by anesthesiologists and by two different large language models using standardized preoperative clinical data from adult patients undergoing elective surgery. A secondary objective is to evaluate how the addition of a frailty index influences ASA classification decisions made by human experts and artificial intelligence models. This prospective observational study aims to improve understanding of differences in clinical reasoning between anesthesiologists and artificial intelligence systems and to explore the role of frailty in perioperative risk assessment.
This study is designed as a prospective, observational, comparative, multirater investigation evaluating differences in ASA Physical Status Classification between anesthesiologists and large language models (LLMs). The ASA Physical Status Classification System is a cornerstone of perioperative risk assessment; however, it lacks explicit incorporation of frailty, a multidimensional concept reflecting reduced physiological reserve and vulnerability. In clinical practice, anesthesiologists often integrate frailty-related information implicitly into ASA assessments, potentially contributing to interobserver variability. Large language models process clinical information using explicit, structured inputs and do not rely on experiential or intuitive reasoning. This characteristic provides a unique opportunity to explore cognitive differences between human experts and artificial intelligence in clinical classification tasks.
Adult patients (≥18 years) scheduled for elective surgery will be included. Emergency cases, pediatric patients, and individuals with insufficient clinical data to allow ASA classification will be excluded. For each patient, standardized preoperative clinical data will be collected, including demographic characteristics, body mass index, comorbidities, regular medications, and type of planned surgical procedure.
ASA Physical Status Classification will be independently assigned by four board-certified anesthesiologists with at least five years of clinical experience, as well as by two large language models developed by different organizations. All evaluations will be conducted using the same standardized dataset, and evaluators will be blinded to each other's assessments.
The study will be conducted in two sequential phases. In the first phase, ASA classification will be performed using standard clinical data alone. In the second phase, a validated frailty index will be added to the same patient dataset, and the evaluation process will be repeated. This design will allow assessment of how frailty information affects ASA classification decisions in human and artificial intelligence evaluators.
Large language models will be prompted using a predefined, standardized prompt that remains unchanged throughout the study. Models will be instructed to generate a single ASA Physical Status category (I-V) without providing explanations or additional commentary, and no iterative prompting or feedback will be allowed.
Interrater agreement among anesthesiologists, between artificial intelligence models, and between human and artificial intelligence evaluators will be analyzed using Cohen's kappa and Fleiss' kappa statistics, as appropriate. Changes in ASA classification following the addition of frailty information will be evaluated using paired statistical methods. Statistical significance will be defined as p < 0.05.
By comparing ASA classification patterns between anesthesiologists and large language models, both with and without frailty data, this study aims to clarify the role of implicit and explicit reasoning in perioperative risk assessment and to contribute to the development of future artificial intelligence-assisted clinical decision-support systems.
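The two agreement statistics named in the description can be illustrated with a minimal Python sketch. It is not the study's analysis code: the function names and the toy ratings are hypothetical, and the unweighted form of Cohen's kappa is an assumption, since the record does not say whether a weighted variant will be used for the ordinal ASA scale.

```python
from collections import Counter

ASA_CLASSES = ["I", "II", "III", "IV", "V"]


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same patients."""
    n = len(rater_a)
    # Observed proportion of patients on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in ASA_CLASSES) / (n * n)
    if expected == 1:
        return float("nan")  # undefined when both raters use a single class
    return (observed - expected) / (1 - expected)


def fleiss_kappa(ratings_per_patient):
    """Fleiss' kappa; each inner list holds one patient's ratings from all raters."""
    n_patients = len(ratings_per_patient)
    n_raters = len(ratings_per_patient[0])
    counts = [Counter(r) for r in ratings_per_patient]
    # Per-patient agreement: share of rater pairs that chose the same class.
    p_i = [
        (sum(c[k] ** 2 for k in ASA_CLASSES) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_patients
    # Overall proportion of all ratings that fall in each class.
    p_j = [sum(c[k] for c in counts) / (n_patients * n_raters) for k in ASA_CLASSES]
    p_e = sum(p * p for p in p_j)
    if p_e == 1:
        return float("nan")  # undefined when every rating is the same class
    return (p_bar - p_e) / (1 - p_e)


# Toy data only (hypothetical ratings, not study results).
anesthesiologist = ["I", "II", "II", "III"]
model = ["I", "II", "III", "III"]
pairwise = cohen_kappa(anesthesiologist, model)
```

Cohen's kappa applies to a pair of raters (for example, the two models, or one anesthesiologist against one model), whereas Fleiss' kappa covers a fixed number of raters per patient, such as the four anesthesiologists or all six raters together.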
Study Type: Observational
Enrollment: 200
This is an observational study with no clinical intervention. No treatment, procedure, drug, or device is assigned as part of the study. ASA Physical Status Classification is assessed using existing preoperative clinical data.
Istanbul Provincial Health Directorate Fatih Sultan Mehmet Training and Research Hospital
Istanbul, Turkey (Türkiye)
Interrater Agreement in ASA Physical Status Classification Between Anesthesiologists and Large Language Models
Agreement in ASA Physical Status Classification (ASA I-V) between four anesthesiologists and two large language models based on standardized preoperative clinical data, assessed using interrater agreement statistics.
Time frame: At the time of preoperative evaluation, prior to surgery
Effect of Frailty Information on ASA Physical Status Classification
Change in ASA Physical Status Classification assigned by anesthesiologists and large language models after the addition of a frailty index to standardized preoperative clinical data.
Time frame: At the time of preoperative evaluation, prior to surgery
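As an illustration of how a change in classification between the two phases might be summarized for a single rater, the sketch below counts upward and downward reclassifications and applies an exact two-sided sign test. The record specifies only "paired statistical methods", so the choice of the sign test is an assumption made here for illustration; the function names and sample ratings are hypothetical.

```python
from math import comb

ASA_TO_INT = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5}


def reclassification_summary(phase1, phase2):
    """Count patients whose ASA class rose, fell, or stayed the same."""
    diffs = [ASA_TO_INT[b] - ASA_TO_INT[a] for a, b in zip(phase1, phase2)]
    return {
        "higher": sum(d > 0 for d in diffs),
        "lower": sum(d < 0 for d in diffs),
        "unchanged": sum(d == 0 for d in diffs),
    }


def sign_test_p_value(higher, lower):
    """Exact two-sided sign test; unchanged pairs are dropped as ties."""
    n = higher + lower
    if n == 0:
        return 1.0  # no reclassifications, so no evidence of a shift
    k = min(higher, lower)
    # Probability of a split at least this lopsided under a fair coin.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy data only (hypothetical ratings from one rater, not study results).
without_frailty = ["II", "II", "III", "I", "III"]
with_frailty = ["III", "II", "IV", "I", "III"]
summary = reclassification_summary(without_frailty, with_frailty)
p_value = sign_test_p_value(summary["higher"], summary["lower"])
```

The sign test uses only the direction of each change; a method that also uses its size, such as the Wilcoxon signed-rank test, would be an equally plausible reading of the protocol.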
Agreement Between Large Language Models in ASA Physical Status Classification
Interrater agreement in ASA Physical Status Classification between two different large language models using identical standardized preoperative clinical datasets, with and without frailty information.
Time frame: At the time of preoperative evaluation, prior to surgery
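Because each model is required to return a single ASA category (I-V) with no commentary, a validation step before agreement analysis could reject any non-conforming output. The sketch below shows one way to do this. The function name, the accepted formats (an optional "ASA" prefix and trailing period), and the decision to return `None` for anything else are all assumptions, not part of the registered protocol.

```python
import re
from typing import Optional

# Accepts "III", "ASA III", "asa iii", or "ASA III." and nothing else.
_ASA_PATTERN = re.compile(r"(?:ASA\s*)?(I{1,3}|IV|V)\.?", re.IGNORECASE)


def parse_asa_response(raw: str) -> Optional[str]:
    """Return the ASA class as an upper-case Roman numeral, or None if invalid."""
    match = _ASA_PATTERN.fullmatch(raw.strip())
    return match.group(1).upper() if match else None


# A conforming answer is normalized; an answer with commentary is rejected.
clean = parse_asa_response("ASA III")
rejected = parse_asa_response("ASA III because of poorly controlled diabetes")
```

Using `fullmatch` rather than a search means a response containing an explanation is flagged instead of being silently truncated to its first numeral.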
Agreement Among Anesthesiologists in ASA Physical Status Classification
Interrater agreement in ASA Physical Status Classification among four board-certified anesthesiologists based on standardized preoperative clinical data, with and without frailty information.
Time frame: At the time of preoperative evaluation, prior to surgery