Patients increasingly consult artificial intelligence (AI) chatbots such as ChatGPT for health information before clinical visits, yet the impact of an actual orthopedic consultation on patient trust in AI-derived information remains unknown. This prospective longitudinal observational study quantifies how a single orthopedic outpatient consultation modifies patient trust in AI chatbots, the concordance between AI-derived and physician-delivered information, and patient anxiety, using a paired pre-post survey design supplemented by a matched physician-side assessment. Adult patients (18 years and older) presenting to two orthopedic outpatient clinics in Cyprus complete a brief pre-consultation questionnaire (T0) capturing demographics, AI use patterns, prior AI consultation regarding the current complaint, baseline trust, expectations, and anxiety. Immediately after their consultation they complete a second questionnaire (T1) assessing concordance with physician advice, trust change, consultation facilitation, post-consultation anxiety, and future intention. The consulting physician completes a brief 30-second post-visit form capturing whether AI was discussed, the medical accuracy of AI-derived information conveyed by the patient, and the effect of the AI discussion on consultation duration. The primary outcomes are the paired within-patient change in AI trust between T0 and T1 and physician-patient concordance on AI versus physician advice. Target enrollment is 180 to obtain 150 paired completed assessments.
Background and Rationale: Cross-sectional surveys have documented increasing patient use of AI chatbots for health information seeking. However, no published study has assessed how an actual physician consultation modifies patient trust in AI in a paired pre/post design, nor has any study captured the physician perspective on the same encounter in a matched dyad. Routine clinical encounters may be the primary mechanism by which patients calibrate their trust in AI-derived medical information.

Setting and Population: Two university-affiliated orthopedic outpatient clinics in North Cyprus.

Procedures:
* T0 (pre-consultation, waiting room, approximately 5 minutes): 14-item self-report questionnaire.
* Consultation: usual care.
* T1 (post-consultation, departure, approximately 5 minutes): 10-item self-report questionnaire.
* Physician form (post-consultation, approximately 30 seconds): 5-item brief assessment.
* Patient and physician forms are linked by an anonymous Participant ID.

Statistical Analysis Plan: Paired t-tests or Wilcoxon signed-rank tests for paired continuous outcomes; McNemar or Stuart-Maxwell test for paired categorical outcomes; Cohen's kappa for inter-rater agreement (AI versus physician); multinomial logistic regression for predictors of trust shift. All analyses are two-sided with alpha = 0.05, performed in SPSS version 28.

Data Management: Anonymous CSV stored locally, encrypted, and retained for 5 years per institutional policy. De-identified participant-level data will be available upon reasonable request after publication.

Pilot: No formal pilot study is conducted. Instead, the first 20 participants will be prospectively monitored for protocol feasibility (mean completion time, drop-out rate, item-level missing data) as an embedded running pilot.
Study Type
OBSERVATIONAL
Enrollment
180
University of Kyrenia, Dr. Suat Gunsel Hospital - Orthopedic Outpatient Clinic
Kyrenia, Cyprus
Near East University Hospital - Orthopedic Outpatient Clinic
Nicosia, Cyprus
Mean within-patient change in self-reported trust in artificial intelligence-derived health information, measured by a study-specific 5-point Likert item (T0.11) and a study-specific 3-level categorical change item (T1.4).
Trust in AI-derived health information is assessed pre-consultation by a study-specific single-item 5-point Likert scale (item T0.11: "How much do you trust the AI's answer?"; anchors 1 = not at all, 5 = completely), administered only to patients who reported pre-consultation AI use (item T0.9 = Yes). Post-consultation, trust change is reassessed by a study-specific 3-level categorical item (item T1.4: increased trust / unchanged / decreased trust). For paired analysis, the post-consultation score is derived by mapping T1.4 categories to integer shifts (+1 / 0 / -1, with floor 1 and ceiling 5) relative to T0.11. Unit of measure: Likert score points on a 1-5 scale (continuous derived score) and proportion of patients per 3-level category. Primary analysis: paired Wilcoxon signed-rank test on the derived continuous score; sensitivity analysis: McNemar test on the 3-level categorical change.
Time frame: Baseline (within 15 minutes pre-consultation in the orthopaedic outpatient waiting room) and immediately after the consultation (within 15 minutes of consultation exit, same-day index visit).
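The derivation of the post-consultation trust score described above can be sketched as follows. This is an illustration only, not the study's analysis code: the function name, the category labels used as dictionary keys, and the example data are hypothetical. Only the +1 / 0 / -1 mapping and the floor of 1 and ceiling of 5 come from the record.

```python
# Hypothetical labels for the three T1.4 response categories.
SHIFT = {"increased": 1, "unchanged": 0, "decreased": -1}


def derived_post_trust(t0_11: int, t1_4: str) -> int:
    """Apply the T1.4 shift to the baseline T0.11 score, clamped to 1..5."""
    if not 1 <= t0_11 <= 5:
        raise ValueError("T0.11 must be an integer from 1 to 5")
    return min(5, max(1, t0_11 + SHIFT[t1_4]))


# Made-up (T0.11, T1.4) pairs, for illustration only.
pairs = [(3, "increased"), (5, "increased"), (1, "decreased"), (4, "unchanged")]
post = [derived_post_trust(baseline, change) for baseline, change in pairs]
print(post)  # [4, 5, 1, 4]
```

One consequence of the floor and ceiling is visible in the example: a patient who starts at 5 and reports increased trust, or starts at 1 and reports decreased trust, contributes a zero difference to the paired test on the derived score.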
Patient-physician concordance on artificial intelligence-versus-physician medical advice agreement, measured by Cohen's kappa coefficient between a study-specific 4-category patient item (T1.2) and a study-specific 5-point physician-rated AI medical accu
Concordance is assessed by Cohen's kappa coefficient comparing patient-reported AI-physician concordance (item T1.2: fully concordant / partially concordant / discordant / physician did not address; dichotomized to concordant vs. non-concordant) and physician-reported AI medical accuracy (item H2: 5-point Likert anchored 1 = entirely incorrect to 5 = entirely correct; dichotomized at ≥ 3 as concordant). Unit of measure: kappa coefficient (range -1 to +1) with 95% confidence interval, and percentage of dyads classified as concordant on each instrument.
Time frame: Immediately after the consultation (within 15 minutes of consultation exit), for both patient (T1.2) and physician (H2) forms; same-day index visit.
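A minimal sketch of the kappa calculation on the two dichotomized ratings is given below. The H2 cut-off (3 or higher counts as concordant) is taken from the record. Which T1.2 answers count as "concordant" is not stated there, so the set used here (fully and partially concordant) is an assumption, as are all names and the example dyads. The confidence interval is not computed in this sketch.

```python
# ASSUMPTION: the record does not say which T1.2 answers are "concordant";
# here "fully" and "partially" concordant are treated as concordant.
CONCORDANT_T1_2 = frozenset({"fully concordant", "partially concordant"})


def dichotomize_patient(t1_2: str, concordant=CONCORDANT_T1_2) -> int:
    return int(t1_2 in concordant)


def dichotomize_physician(h2: int) -> int:
    # From the record: H2 >= 3 is classified as concordant.
    return int(h2 >= 3)


def cohen_kappa(a: list, b: list) -> float:
    """Cohen's kappa for two binary (0/1) ratings of the same dyads."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("ratings must be non-empty and of equal length")
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Made-up (T1.2, H2) dyads, for illustration only.
dyads = [
    ("fully concordant", 5), ("partially concordant", 2), ("discordant", 1),
    ("physician did not address", 2), ("fully concordant", 4), ("discordant", 3),
]
patient = [dichotomize_patient(t) for t, _ in dyads]
physician = [dichotomize_physician(h) for _, h in dyads]
kappa = cohen_kappa(patient, physician)
print(f"{kappa:.3f}")  # 0.333
```

Passing a different `concordant` set to `dichotomize_patient` shows how sensitive kappa is to the dichotomization rule.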
Mean within-patient change in self-reported anxiety, measured by an 11-point 0-to-10 visual analogue scale anchored 0 = no anxiety and 10 = worst possible anxiety (items T0.14 baseline, T1.5 post-consultation).
Anxiety is measured pre-consultation (item T0.14) and post-consultation (item T1.5) using the same 0-to-10 visual analogue scale. Within-patient change is calculated as T1.5 minus T0.14. Unit of measure: scale points (range -10 to +10). Analysis: paired t-test with Wilcoxon signed-rank as sensitivity analysis; Cohen's d effect size reported.
Time frame: Baseline (within 15 minutes pre-consultation) and immediately after the consultation (within 15 minutes of consultation exit), same-day index visit.
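The within-patient change and its effect size can be illustrated with a short sketch. The record does not say which paired variant of Cohen's d is intended; this example assumes the mean of the differences divided by the standard deviation of the differences (often written d_z). Function names and data are hypothetical.

```python
import statistics


def anxiety_changes(t0_14: list, t1_5: list) -> list:
    """Per-patient change, defined in the record as T1.5 minus T0.14."""
    if len(t0_14) != len(t1_5):
        raise ValueError("paired scores must have equal length")
    return [post - pre for pre, post in zip(t0_14, t1_5)]


def cohen_d_paired(changes: list) -> float:
    # ASSUMPTION: d_z = mean(change) / sample SD(change).
    sd = statistics.stdev(changes)
    if sd == 0:
        raise ValueError("effect size is undefined when all changes are equal")
    return statistics.mean(changes) / sd


# Made-up 0-to-10 anxiety scores, for illustration only.
pre = [7, 5, 8, 6, 4]
post = [4, 5, 6, 3, 4]
changes = anxiety_changes(pre, post)
mean_change = statistics.mean(changes)
effect_size = cohen_d_paired(changes)
print(changes, mean_change, round(effect_size, 2))
```

Negative values indicate lower anxiety after the consultation, because the change is computed as post minus pre.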
Percentage of enrolled patients reporting pre-consultation artificial intelligence use for the current orthopaedic complaint, measured by a study-specific single-item yes/no question (T0.9).
Proportion of enrolled patients responding "Yes" to item T0.9 ("Before today's appointment, did you ask an AI chatbot a question about this health concern?"). Unit of measure: percentage of participants, reported with exact (Clopper-Pearson) 95% confidence interval.
Time frame: Baseline (within 15 minutes pre-consultation, same-day index visit).
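For readers unfamiliar with the exact (Clopper-Pearson) interval, the sketch below computes it with the standard library only, by bisecting on the binomial tail probabilities. It is an illustration under stated assumptions, not the study's code: the function names and the counts in the example are hypothetical, and statistical packages normally obtain the same bounds from beta distribution quantiles.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_increasing(f, lo: float = 0.0, hi: float = 1.0, iters: int = 100) -> float:
    """Bisection for the root of a function that is increasing on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05) -> tuple:
    """Exact two-sided (1 - alpha) confidence interval for x successes in n."""
    if not 0 <= x <= n or n == 0:
        raise ValueError("need 0 <= x <= n and n > 0")
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _root_increasing(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else _root_increasing(
        lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper


# Made-up counts: 60 of 150 patients answering "Yes" to T0.9.
low, high = clopper_pearson(60, 150)
print(f"{60 / 150:.1%} (95% CI {low:.1%} to {high:.1%})")
```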
Percentage of pre-consultation artificial-intelligence users whose physician independently confirmed that AI was raised during the consultation, measured by a study-specific yes/no physician item (H1).
Among patients responding "Yes" to T0.9, the proportion in whom the treating physician independently reported "Yes" to item H1 ("Did the patient raise AI during this consultation?"). Unit of measure: percentage of patients with exact 95% confidence interval.
Time frame: Baseline (T0.9, pre-consultation) and immediately after the consultation (H1, within 15 minutes of consultation exit), same-day index visit.
Percentage of consultations in which the physician reported that the artificial-intelligence discussion shortened, did not change, or prolonged the encounter, measured by a study-specific 3-category physician item (H3).
Among consultations in which the patient raised AI (H1 = Yes), the physician's categorical rating of effect on consultation duration (H3: "shortened" / "no change" / "prolonged"). Unit of measure: percentage of consultations per category (descriptive).
Time frame: Immediately after the consultation (within 15 minutes of consultation exit), same-day index visit.
Mean patient rating of how prior artificial-intelligence use facilitated the consultation, measured by a study-specific 5-point Likert item (T1.4b: 1 = much more difficult, 5 = much easier).
Among patients with T0.9 = Yes, patient-reported facilitation by prior AI use (item T1.4b). Unit of measure: Likert score points (mean with standard deviation), and percentage of participants endorsing scores ≥ 4.
Time frame: Immediately after the consultation (within 15 minutes of consultation exit), same-day index visit.
Mean patient-reported future intention to use and to recommend artificial intelligence for health information, measured by two study-specific 5-point Likert items (T1.7 future use; T1.8 recommendation to a friend).
Future-use intention (item T1.7: 1 = definitely will not, 5 = definitely will) and recommendation intention (item T1.8: 1 = definitely will not, 5 = definitely will). Unit of measure: Likert score points (mean with standard deviation), and percentage of participants endorsing scores ≥ 4 on each item.
Time frame: Immediately after the consultation (within 15 minutes of consultation exit), same-day index visit.