Effect of Intelligent Tutor Induced Pausing on Learning Simulated Surgical Skills

McGill University · 129 enrolled

Overview

Traditional training of surgical technical skills relies on mentorship from experienced surgeons, who continuously evaluate trainee performance and correct it with verbal instructions to prevent errors and potential patient harm. These educators may also pause the procedure to explain the risks associated with the trainee's actions, and may personally demonstrate proper techniques to the students. Studies examining pausing while providing medical care report that these approaches allow for learning. An artificial intelligence (AI) tutoring system, the Intelligent Continuous Expertise Monitoring System (ICEMS), improves learning in a simulated surgical operation by providing trainees with verbal instructions upon error identification. However, the effect of including a pause during this AI teaching has not been studied. Therefore, the ICEMS post-error identification methodology has been altered to include a pause with the intelligent tutor voice instruction. The aim of this study is to determine the effect of pausing on surgical skill acquisition and transfer among pre-medical and medical students. This will be done by comparing their performance in repeated simulated tumour resection tasks.

Background: Surgical skill assessment is shifting from a quantitative, time-based approach towards a qualitative evaluation of a trainee's competency. During surgical procedures, instructors continuously monitor trainee performance and use various teaching methods to enhance the acquisition of surgical skills. One such method is pausing the operation, either to outline the risks associated with the trainee's performance or to personally demonstrate best-practice technique. Pausing in such situations has been shown to allow learners to re-assess best practice, interrupt negative momentum, and allow for learning. Specifically, pausing after an error can prevent the introduction of new information that may interfere with learners' ability to reflect on the error, and can reduce stress before they continue.

Rationale: The ICEMS, an AI tutoring system, was developed by our group using a Long Short-Term Memory deep learning algorithm to assess surgical performance and provide guidance. It was then integrated with the NeuroVR simulation platform. Using this AI system, the provision of verbal feedback on error identification demonstrated the potential of intelligent tutoring to improve learning in two previous randomized controlled trials (RCTs). However, these RCTs did not incorporate a pause after error identification. To further emulate the mentorship of an experienced surgeon in a clinical setting, the ICEMS platform has been modified to both initiate a pause when a learner error is identified and provide a video demonstrating expert performance.

Research aims: To compare the effect of incorporating a pause after intelligent tutor instruction with that of intelligent tutor instruction alone on medical and pre-medical students' surgical skill acquisition and skill transfer.

Hypotheses:
1. The pause with video group will significantly improve its composite score between the first and sixth repetition of the practice scenario.
2. The pause with video group will have a composite score statistically higher than the control group in the sixth repetition of the practice scenario.
3. The pause with video group will have a composite score statistically non-inferior to the pause without video group in the sixth repetition of the practice scenario.
4. The pause with video group will have a global OSATS score statistically higher than the control group in the realistic scenario.
5. The pause with video group will have a global OSATS score statistically non-inferior to the pause without video group in the realistic scenario.
6. The pause with video group will not elicit a difference in emotional stress or cognitive load compared to the control group.

Specific objectives:
1. To assess whether AI-mediated real-time tailored feedback combined with pausing methodology is statistically superior to AI-mediated real-time tailored feedback alone in improving medical students' surgical skills on two virtual reality surgery tasks.
2. To determine whether different emotions and cognitive load are elicited by the pausing methodology as compared to only hearing the AI-mediated feedback.

Design: A three-arm, single-blinded randomized controlled trial comparing AI feedback with pausing methodology and an expert demonstration video, AI feedback with pausing methodology only, and AI feedback alone.

Setting: Neurosurgical Simulation and Artificial Intelligence Learning Centre.

Participants: Students enrolled in a Quebec medical school in a preparatory year or in the first or second year.

Task: Using the NeuroVR surgical simulator by CAE Healthcare, resect a simulated practice tumour six times and a complex simulated realistic brain tumour once using an ultrasonic aspirator and bipolar pincers while minimizing bleeding and preserving the surrounding simulated healthy brain structures.
Intervention: A 90-minute training session in which participants will make seven simulated subpial tumour resection attempts (six repetitions of a simple practice scenario and one attempt at a complex realistic scenario). All participants will receive auditory feedback from the ICEMS; the groups differ in what follows the feedback:
1. Group 1: continue performing the procedure (i.e., no pause).
2. Group 2: a pause followed by a reflection period.
3. Group 3: a pause followed by an expert-level demonstration video and a reflection period.

Auditory feedback will be based on four metrics:
1. Instrument tip separation distance
2. Low bipolar force
3. High aspirator force
4. High bipolar force

Initially, feedback will be given only on the first metric (instrument tip separation distance). Once a repetition is completed without receiving feedback, the subsequent repetition will assess the next metric in the list above, and so on.

Main outcomes and measures: The two co-primary outcomes are:
1. The improvement in surgical performance, as measured by the composite score computed by the previously validated evaluation module of the ICEMS. Performance improvement is measured by the difference in composite score across the six repetitions of the practice scenario. Transfer of learning will be measured by the participant's composite score for the complex realistic scenario.
2. The performance score of the participants in the complex realistic scenario, assessed by two blinded experts using the Objective Structured Assessment of Technical Skills (OSATS) global rating scale.

The secondary outcome is the difference in the strength of emotions elicited, measured before the practice scenario, immediately before the realistic scenario, and after completion of all attempts using Duffy's Medical Emotional Scale (MES). Cognitive load will also be measured following completion of all tasks using Leppink's Cognitive Load Index (CLI).
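The staged feedback rule described under Intervention can be sketched as a small state machine. This is an illustrative reading only, not the ICEMS implementation: the class and function names are hypothetical, and the sketch assumes that exactly one metric is active per repetition and that the schedule stays on the fourth metric once it is reached.

```python
from dataclasses import dataclass

# Metric order as listed in the intervention description.
FEEDBACK_METRICS = (
    "instrument tip separation distance",
    "low bipolar force",
    "high aspirator force",
    "high bipolar force",
)


@dataclass
class FeedbackSchedule:
    """Hypothetical tracker for the single metric assessed in a repetition."""

    index: int = 0

    @property
    def active_metric(self) -> str:
        return FEEDBACK_METRICS[self.index]

    def finish_repetition(self, feedback_given: bool) -> None:
        # Advance only after a repetition completed with no feedback.
        # Assumption: remain on the last metric once the list is exhausted.
        if not feedback_given and self.index < len(FEEDBACK_METRICS) - 1:
            self.index += 1


def simulate(feedback_flags):
    """Return the metric assessed in each repetition.

    feedback_flags[i] is True if feedback was triggered in repetition i.
    """
    schedule = FeedbackSchedule()
    assessed = []
    for feedback_given in feedback_flags:
        assessed.append(schedule.active_metric)
        schedule.finish_repetition(feedback_given)
    return assessed
```

For example, `simulate([True, False, False])` keeps the first metric for the first two repetitions and moves to low bipolar force in the third, because only the second repetition finished without feedback.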
Both the emotion and cognitive load measures are self-reported.

Statistical Analysis Plan: Participant data will be anonymized and stored. The ICEMS will assess the participant's surgical performance and provide a performance score at 0.2-second intervals throughout each repetition of the simulated surgical task. An average composite score will then be calculated for each repetition. Using ANCOVA, improvement in performance and participant learning will be assessed by comparing the composite score of the first practice scenario repetition (baseline) and the composite score of the sixth repetition (summative). The composite score of the complex realistic scenario will be used to assess the transfer of learning using a one-way ANOVA. With an effect size of 0.25 and a significance level of 0.05, a total sample size of 129 provides 80% power to detect a significant interaction. Videos of participant performance in the complex realistic scenario will be evaluated by two blinded expert raters using the OSATS global rating scale. The OSATS score will be analyzed between groups using a one-way ANOVA to compare efficiency of learning and skill retention. Emotional changes before, during, and after learning in the simulated scenarios will be evaluated using a two-way mixed ANOVA, while a one-way ANOVA will be used to assess cognitive load after learning.
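The one-way ANOVA named in the analysis plan compares the mean score of the three arms through an F statistic. The following is a minimal standard-library sketch of that computation; the function name is hypothetical, and the record does not specify the software or any post hoc procedure the investigators will use.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: list of lists, one list of scores per study arm.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    all_scores = [x for g in groups for x in g]
    grand_mean = fmean(all_scores)

    # Variation of the arm means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own arm mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up toy scores for three arms, for illustration only (not study data).
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
```

With three arms and all 129 enrolled participants analysed, the degrees of freedom would be 2 and 126; the p-value then follows from the F distribution with those degrees of freedom.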

Outcomes

Primary Outcomes

Change in performance

Evaluated by comparing the average composite score, calculated by the ICEMS, from each repetition of the practice scenario. Scores range from expert/skilled level (a score of 1.00) to novice/less-skilled level (a score of -1.00).

Time frame: 1 day of study
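As a rough illustration of how a per-repetition composite score could be derived from the 0.2-second performance scores, the sketch below averages the samples of one repetition and takes the raw difference between the first and sixth repetition. The function names are hypothetical, and the raw difference is a descriptive simplification: the registered analysis uses ANCOVA rather than a simple subtraction.

```python
from statistics import fmean

SAMPLE_INTERVAL_S = 0.2  # scoring interval stated in the analysis plan


def repetition_score(samples):
    """Average the per-interval scores of one repetition.

    Each sample is expected to lie between -1.00 (novice) and 1.00 (expert).
    """
    if not samples:
        raise ValueError("a repetition needs at least one scored interval")
    if any(not -1.0 <= s <= 1.0 for s in samples):
        raise ValueError("scores must lie in [-1.0, 1.0]")
    return fmean(samples)


def change_in_performance(repetition_scores):
    """Raw difference between the last and first repetition averages."""
    return repetition_scores[-1] - repetition_scores[0]


def repetition_duration_s(samples):
    """Approximate repetition length implied by the number of samples."""
    return len(samples) * SAMPLE_INTERVAL_S
```

A positive value from `change_in_performance` indicates movement towards the expert end of the scale between the first and last repetition.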

Transfer of learning

Time frame: 1 day of study

Objective Structured Assessment of Technical Skills (OSATS) global rating scale

Performance score of the participants in the complex realistic scenario, assessed by two blinded experts using the OSATS global rating scale on a 7-point Likert scale (1 = novice to 7 = expert). Efficacy in learning with pausing methodology and an expert-level demonstration video will be compared to pausing methodology alone and to no pausing methodology.

Time frame: 1 day of study

Secondary Outcomes

Differences in strength of emotions elicited

Measured by Duffy's Medical Emotional Scale (MES) before, during, and after learning. Participants will self-report the intensity of each emotion on a 5-point Likert scale (1 = not at all to 5 = very strong).

Time frame: 1 day of study

Difference in Cognitive Load

Measured using Leppink's Cognitive Load Index (CLI) after the intervention. Participants will self-report their level of agreement with each statement on a 5-point Likert scale (1 = strongly disagree to 5 = strongly agree).

Time frame: 1 day of study
