This study aims to evaluate whether targeted video feedback generated by an artificial intelligence (AI)-based surgical performance assessment model can support improvement in technical skills among cardiac surgeons performing coronary artery bypass grafting (CABG). This is a single-group, self-controlled, pre-post interventional study. Participating surgeons will submit a baseline CABG surgical video, which will be assessed by both an AI model and blinded human expert raters using standardized scoring criteria. Based on the AI assessment, surgeons will receive personalized video feedback highlighting operative steps associated with lower technical performance. After a one-month self-directed review period, a follow-up CABG surgical video will be submitted and evaluated using the same process. Changes in human-rated technical skill scores between baseline and follow-up will be used to assess the potential educational impact of AI-generated video feedback.
Coronary artery bypass grafting (CABG) is a complex surgical procedure that requires a high level of technical skill from cardiac surgeons. Variability in surgical technique may influence procedural quality and patient outcomes. Recent advances in artificial intelligence (AI) have enabled automated assessment of surgical performance using operative video data, creating new opportunities for objective feedback and surgical education. This study aims to evaluate whether targeted video feedback generated by an AI-based surgical performance assessment model can help cardiac surgeons improve their technical skills in CABG procedures. Participating surgeons whose baseline technical performance ranked in the lower half of the AI scoring system will receive personalized video feedback highlighting operative steps and maneuvers associated with lower performance scores. In this single-group, self-controlled study, each participating surgeon will submit a baseline CABG surgical video, which will be independently evaluated by both the AI model and a panel of experienced cardiac surgeons using standardized scoring criteria. After receiving AI-generated video feedback, surgeons will be given one month to review and reflect on the feedback without additional formal training or coaching. A follow-up CABG surgical video will then be submitted and assessed using the same evaluation process. The primary outcome of the study is the change in technical skill scores assigned by human expert raters between the baseline and follow-up videos. Secondary outcomes include surgeons' self-assessments of AI-identified performance deficits, agreement between AI-generated feedback and human expert feedback, and selected patient postoperative in-hospital outcomes. The findings of this study may inform the role of AI-assisted video feedback as a scalable educational tool for surgical skill development.
Study Type
INTERVENTIONAL
Allocation
NA
Purpose
OTHER
Masking
NONE
Enrollment
100
Participants in this study will receive a personalized educational intervention consisting of AI-generated video feedback based on their baseline coronary artery bypass grafting (CABG) surgical videos. The AI model analyzes surgical performance and identifies specific operative steps with lower technical skill scores. Curated video clips highlighting these areas are provided to the surgeons for self-review and reflection. No additional formal training or coaching is given during the one-month intervention period, after which a follow-up surgical video is submitted for re-evaluation.
Fuwai Hospital
Beijing, Beijing Municipality, China
Change in Human Expert-Rated Technical Skill Score Between Baseline and Follow-Up CABG Videos
The primary outcome is the change in technical skill scores assigned by a panel of blinded human expert raters, who independently evaluate anonymized coronary artery bypass grafting (CABG) surgical videos submitted at baseline and one month after the surgeon receives AI-generated video feedback. Scoring uses a standardized rubric that assesses overall surgical technical performance across seven domains: respect for tissue, time and motion, instrument handling, knowledge of instruments, use of assistants, flow of operation and forward planning, and knowledge of the specific procedure. Each domain is scored on a 5-point Likert scale, where 1 indicates poor performance and 5 indicates excellence. In addition, each rater provides an overall impression score (1-5) to capture a holistic assessment of surgical performance. The 7-domain sum score and the overall impression score are each scaled to 100 points, and the final score is a weighted sum: 70% from the scaled 7-domain sum score and 30% from the scaled overall impression score. Higher scores indicate better performance.
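The weighted final score described above can be sketched as follows. This is an illustration only, not the study's scoring software. The record does not state how each component is scaled to 100 points, so the sketch assumes simple proportional scaling (domain sum divided by its maximum of 35, overall impression divided by 5). The names `DOMAINS`, `final_score`, and `score_change` are hypothetical.

```python
# Hypothetical sketch of the 70/30 weighted score; the scaling rule is an assumption.
DOMAINS = (
    "respect_for_tissue",
    "time_and_motion",
    "instrument_handling",
    "knowledge_of_instruments",
    "use_of_assistants",
    "flow_of_operation_and_forward_planning",
    "knowledge_of_specific_procedure",
)


def final_score(domain_scores, overall_impression):
    """Combine one rater's 7 domain scores (1-5 each) and overall impression (1-5)."""
    if set(domain_scores) != set(DOMAINS):
        raise ValueError("expected exactly the seven rubric domains")
    for value in list(domain_scores.values()) + [overall_impression]:
        if not 1 <= value <= 5:
            raise ValueError("Likert scores must lie between 1 and 5")
    # Assumed proportional scaling of each component to a 0-100 range.
    domain_scaled = sum(domain_scores.values()) * 100 / (5 * len(DOMAINS))
    overall_scaled = overall_impression * 100 / 5
    return 0.7 * domain_scaled + 0.3 * overall_scaled


def score_change(baseline, follow_up):
    """Primary-outcome style change: follow-up score minus baseline score."""
    return follow_up - baseline
```

Under these assumptions, a rater who gives 4 in every domain and an overall impression of 3 yields 0.7 × 80 + 0.3 × 60 = 74 points. How scores from several raters are combined is not specified in the record and is left out here.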
Time frame: Baseline, 1 month
Surgeon self-assessments of the AI feedback
Surgeons will review the AI-generated video clips highlighting technical performance deficits and complete a self-assessment questionnaire evaluating their satisfaction with the AI feedback.
Time frame: 1 month
Consistency between AI feedback and human expert feedback
This outcome assesses the consistency between the AI-generated surgical performance feedback and the evaluations provided by human expert raters. The AI score is generated by a two-stage deep learning framework, and the human expert score is generated using the validated 7-domain rating scale detailed in the primary outcome. The intraclass correlation coefficient (ICC) will be used to assess the consistency between the surgical technique scores produced by the AI model and those assigned by human expert raters.
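As an illustration of the agreement analysis, the sketch below computes one common form of the ICC from a table of scores. The record does not say which ICC form will be used, so the choice here of ICC(2,1) (two-way random effects, absolute agreement, single measurement) is an assumption, as is the function name `icc_2_1`.

```python
# Hypothetical sketch: ICC(2,1) from a two-way ANOVA decomposition, pure Python.
def icc_2_1(ratings):
    """ratings: one row per video, one column per rater, e.g. [ai_score, human_score]."""
    n = len(ratings)                      # number of videos (subjects)
    k = len(ratings[0]) if n else 0       # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 videos and 2 raters, with no missing scores")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for videos (rows), raters (columns), and residual error.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_rows - ms_error) / denominator
```

With this form, a constant offset between AI and human scores lowers the ICC even when the two rank videos identically, because absolute agreement is being measured rather than consistency alone.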
Time frame: Baseline, 1 month
Postoperative in-hospital outcomes: the incidence of major complications
Patient postoperative in-hospital outcomes will be collected and analyzed to explore any associations with changes in surgeon technical performance following the AI feedback intervention. This measure is the incidence of major complications, a composite outcome of death, acute kidney injury, myocardial infarction, and stroke.
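A composite outcome counts each patient at most once, however many component events occur. A minimal sketch of that counting rule follows. The record layout (a dict of boolean flags per patient) and the names `COMPONENTS`, `has_major_complication`, and `composite_incidence` are assumptions made for illustration.

```python
# Hypothetical sketch: incidence of the composite major-complication outcome.
COMPONENTS = ("death", "acute_kidney_injury", "myocardial_infarction", "stroke")


def has_major_complication(patient):
    """True if any component event is recorded for this patient."""
    return any(patient.get(component, False) for component in COMPONENTS)


def composite_incidence(patients):
    """Proportion of patients with at least one component event."""
    if not patients:
        raise ValueError("no patients supplied")
    return sum(has_major_complication(p) for p in patients) / len(patients)
```

A patient with both death and stroke recorded therefore contributes one event to the composite, not two.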
Time frame: Baseline, 1 month
Postoperative in-hospital outcomes: the incidence of death
Association between changes in surgeon technical performance following the AI feedback intervention and the incidence of death.
Time frame: Baseline, 1 month
Postoperative in-hospital outcomes: the incidence of acute kidney injury
Association between changes in surgeon technical performance following the AI feedback intervention and the incidence of acute kidney injury.
Time frame: Baseline, 1 month
Postoperative in-hospital outcomes: the incidence of myocardial infarction
Association between changes in surgeon technical performance following the AI feedback intervention and the incidence of myocardial infarction.
Time frame: Baseline, 1 month
Postoperative in-hospital outcomes: the incidence of stroke
Association between changes in surgeon technical performance following the AI feedback intervention and the incidence of stroke.
Time frame: Baseline, 1 month
Postoperative in-hospital outcomes: the incidence of secondary thoracotomy
Association between changes in surgeon technical performance following the AI feedback intervention and the incidence of secondary thoracotomy.
Time frame: Baseline, 1 month
Postoperative in-hospital outcomes: the incidence of IABP implantation
Association between changes in surgeon technical performance following the AI feedback intervention and the incidence of intra-aortic balloon pump (IABP) implantation.
Time frame: Baseline, 1 month
Postoperative in-hospital outcomes: the incidence of ECMO implantation
Association between changes in surgeon technical performance following the AI feedback intervention and the incidence of extracorporeal membrane oxygenation (ECMO) implantation.
Time frame: Baseline, 1 month
Postoperative in-hospital outcomes: the incidence of bedside hemofiltration
Association between changes in surgeon technical performance following the AI feedback intervention and the incidence of bedside hemofiltration.
Time frame: Baseline, 1 month
Postoperative in-hospital outcomes: the incidence of peritoneal dialysis
Association between changes in surgeon technical performance following the AI feedback intervention and the incidence of peritoneal dialysis.
Time frame: Baseline, 1 month
Postoperative in-hospital outcomes: the incidence of tracheotomy
Association between changes in surgeon technical performance following the AI feedback intervention and the incidence of tracheotomy.
Time frame: Baseline, 1 month