The goal of this study is to learn how accurately two artificial intelligence (AI) models, Gemini 2.5 Pro and ChatGPT-5.1, can interpret ultrasound videos of the Transversus Abdominis Plane (TAP) block, a regional anesthesia technique used for pain control after surgery.

The main questions this study aims to answer are:
How accurately can each AI model identify anatomical structures on TAP block ultrasound videos?
Can the AI models correctly evaluate the spread of local anesthetic and determine whether the block is successful?
How closely do the AI models' answers match the evaluations of expert anesthesiologists?

No additional procedures will be performed on patients. TAP blocks will be done as part of routine clinical care, and the ultrasound videos will be recorded and de-identified. Participants will not need to do anything extra for the study.

Experienced anesthesiologists will review the videos and provide expert answers. The AI models will be given the same videos and asked the same questions. A second expert, who does not know which answers came from humans or AI, will compare all responses.

The results will help researchers understand whether advanced AI systems can safely support clinicians in interpreting ultrasound-guided regional anesthesia procedures and improve education and decision-making in anesthesia practice.
This study aims to evaluate how two advanced artificial intelligence (AI) models, Gemini 2.5 Pro and ChatGPT-5.1, interpret ultrasound videos of Transversus Abdominis Plane (TAP) block procedures. TAP blocks are performed as part of routine clinical care by experienced anesthesiologists. The ultrasound videos recorded during these procedures serve as the data source for this study. No additional procedures or patient involvement are required beyond standard care.

Ultrasound Video Processing
All ultrasound recordings will be fully de-identified by removing patient names, dates, and any other identifying information. Gemini 2.5 Pro will receive original video files. ChatGPT-5.1 will receive high-resolution GIF segments generated from the same recordings. Both models will be given identical structured prompts consisting of eight clinically relevant questions about anatomic structures, needle placement, local anesthetic spread, dermatomal effects, and potential safety concerns.

Expert Participation
Two anesthesiology experts will participate independently:

Expert A will review each ultrasound video and answer the same set of eight clinical questions. These answers will serve as the primary human clinical reference.

Expert B will independently evaluate all responses, those from Expert A, Gemini, and ChatGPT-5.1, after they have been anonymized and randomly ordered. Expert B will not know whether a response originated from an AI model or a human expert. Expert B will assess anatomical accuracy, clarity, clinical appropriateness, and overall content quality for each answer.

If Expert A and Expert B disagree on the interpretation or quality assessment of any response, a third expert (Expert C), who is also experienced in ultrasound-guided regional anesthesia, will independently review the relevant responses. Expert C's evaluation will be used to resolve discrepancies and establish the final consensus.
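The anonymization and random ordering of responses before Expert B's review could be organized roughly as follows. This is only an illustrative sketch, not the study's actual procedure: the function name `blind_responses`, the response codes (`R01`, `R02`, ...), the fixed seed, and the placeholder answer texts are all assumptions made for the example.

```python
import random


def blind_responses(responses, seed=None):
    """Shuffle (source, text) pairs and replace each source with an opaque code.

    Returns (blinded, key):
      blinded: list of dicts holding only a code and the answer text
      key:     dict mapping each code back to its source, kept away from the rater
    """
    rng = random.Random(seed)      # seeded so the ordering can be reproduced and audited
    shuffled = list(responses)     # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)

    blinded, key = [], {}
    for i, (source, text) in enumerate(shuffled, start=1):
        code = f"R{i:02d}"         # hypothetical code format
        blinded.append({"code": code, "text": text})
        key[code] = source
    return blinded, key


# Hypothetical answers to one question for one video (placeholder text only).
responses = [
    ("Expert A", "placeholder answer 1"),
    ("Gemini 2.5 Pro", "placeholder answer 2"),
    ("ChatGPT-5.1", "placeholder answer 3"),
]

blinded, key = blind_responses(responses, seed=42)
for item in blinded:
    print(item["code"], item["text"])   # what the blinded rater would see
```

Keeping the unblinding key separate from the blinded list is the design point here: the rater's view never contains the source label, and the key is consulted only after scoring is complete.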
Data Collected
For each TAP block video, the following information will be recorded:
Ultrasound and procedural details.
Patient demographic descriptors (age, sex, BMI, ASA classification), used only for general characterization of the sample.
AI-related performance features such as response completeness, relevance, confidence level, and response time.
Study Type
OBSERVATIONAL
Enrollment
40
Analysis of de-identified ultrasound videos by the Gemini 2.5 Pro artificial intelligence model. The model receives standardized prompts and provides anatomical interpretation, block success assessment, and clinical suggestion outputs.
Analysis of de-identified ultrasound videos by the ChatGPT-5.1 artificial intelligence model. The model receives the same standardized questions and produces anatomical and clinical interpretations for comparison.
Health Science University İstanbul Kanuni Sultan Süleyman Education and Training Hospital
Istanbul, Istanbul, Turkey (Türkiye)
RECRUITING
Anatomical Interpretation Accuracy
For each ultrasound video, the ability of both AI models (ChatGPT-5.1 and Gemini 2.5 Pro) to correctly identify key anatomical structures of the lateral TAP block (internal oblique, transversus abdominis, fascial plane, needle tip) will be evaluated. The accuracy of each model will be compared with the expert-defined reference answer.
Time frame: At the time of video analysis.
Block Success Interpretation
Assessment of whether each AI model correctly determines block success based on needle placement and local anesthetic spread, compared with expert reference evaluation.
Time frame: At the time of video analysis.
Needle Plane Evaluation
Determination of whether AI models correctly assess the needle tip location and whether it is within the correct interfascial plane (IO-TA fascia), compared with the expert reference.
Time frame: At the time of video analysis.
Dermatomal Level Prediction
Comparison of each AI model's predicted dermatomal coverage (e.g., T10-T12) with the expert-provided reference dermatomal level.
Time frame: At the time of video analysis.
Risk Awareness Assessment
Evaluation of whether each AI model correctly identifies potential risks on ultrasound images (e.g., peritoneal proximity, vascular structures). Scored on a three-point scale: 0 = no risk awareness, 1 = partial risk identification, 2 = complete and appropriate risk identification.
Time frame: At the time of video analysis.
Recommendation Quality
Assessment of the appropriateness of each model's suggestions (e.g., need for additional injection, repositioning) based on the ultrasound appearance. Scored qualitatively by an expert evaluator on a 0-10 scale.
Time frame: At the time of video analysis.
Agreement Between Experts
Evaluation of whether Expert A and Expert B provide consistent judgments for each parameter, with discrepancies resolved by Expert C when needed. Recorded as agreement, or as disagreement resolved by the third expert.
Time frame: During expert evaluation phase.
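One way to quantify agreement between two raters and to list the items that would be referred to Expert C is sketched below. The record does not name an agreement statistic, so the use of Cohen's kappa here is an assumption, as are the function names and the example scores, which are invented for illustration and are not study data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same items on a categorical scale."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)

    # Observed agreement: fraction of items with identical scores.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of the raters' marginal proportions, summed over categories.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    categories = count_a.keys() | count_b.keys()
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    if expected == 1.0:            # both raters used a single identical category
        return 1.0
    return (observed - expected) / (1 - expected)


def disagreements(ratings_a, ratings_b):
    """Indices of items where the two raters differ (candidates for third-expert review)."""
    return [i for i, (a, b) in enumerate(zip(ratings_a, ratings_b)) if a != b]


# Invented scores on a 0-2 scale for six videos (illustrative only).
expert_a = [2, 1, 2, 0, 1, 2]
expert_b = [2, 1, 1, 0, 1, 2]

kappa = cohen_kappa(expert_a, expert_b)
to_review = disagreements(expert_a, expert_b)
print(f"kappa = {kappa:.3f}, items for Expert C: {to_review}")
```

For these invented scores, the raters agree on five of six items, chance agreement is 13/36, and kappa works out to 17/23 (about 0.739), with the third item flagged for review. For ordinal scales such as 0-10, a weighted kappa or an intraclass correlation may be more appropriate; that choice is left open here.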
AI Response Time
Time required for each AI model to generate answers to the eight standardized questions, measured in seconds (continuous variable).
Time frame: Captured automatically during model output.