The goal of this observational study is to develop and pre-validate a machine learning algorithm to predict early recovery of mobility in patients undergoing hip or knee joint replacement surgery. The primary research question is: Can a machine learning model accurately classify patients with faster versus slower recovery of autonomous mobility in the first days after joint replacement surgery? Patients who have undergone elective hip or knee arthroplasty and received post-operative physiotherapy will have their clinical and perioperative data collected retrospectively (2020-2023) and prospectively (March 2026-December 2027). The algorithm will be trained on retrospective data and tested prospectively to evaluate its predictive performance for early mobilization and length of hospital stay.
This observational study aims to develop and pre-validate a machine learning algorithm to predict early mobility recovery and hospital length of stay in patients undergoing elective hip or knee arthroplasty. The study includes a retrospective phase (2020-2023) using existing clinical and physiotherapy data, and a prospective phase (March 2026-December 2027) to validate the model in routine clinical practice.

Data Collection and Outcomes:
Mobility recovery: assessed by the ability to ascend and descend three steps within the first four postoperative days, recorded in the physiotherapy diary and electronic health record.
Length of stay: considered regular if discharged by the fifth postoperative day; longer stays are defined as prolonged.
Predictors: Baseline demographics (age, sex, BMI, ASA score, preoperative hemoglobin) and clinical/perioperative characteristics (type of surgery and anesthesia, initiation of physiotherapy, pain level, urinary catheter use, orthostatic intolerance).

Sample Size: 943 patients total (600 retrospective, 343 prospective), based on model development requirements and AUROC estimation.

Data Analysis: The dataset will be split into training, validation, and test sets. Multiple supervised learning algorithms (e.g., logistic regression, random forest, gradient boosting) will be compared. Model performance will be evaluated using AUROC, sensitivity, specificity, precision, F1-score, and calibration. Missing data will be handled with imputation or native algorithm methods when supported.

Model Validation: Prospective data will be used to assess model discrimination and calibration, and to identify potential temporal or clinical biases. Retraining may be performed using combined datasets to improve generalizability.

Study Flow: Retrospective patients identified via hospital records; prospective patients identified on the first postoperative physiotherapy session, provided with study information, and consented. Predictive results are stored in a separate registry inaccessible to treating clinicians.

Participating Centers:
IRCCS Istituto Ortopedico Rizzoli, Bologna - patient enrollment.
Complex Structure of Medical Physics, Arcispedale S. Maria Nuova - data analysis and AI modeling.
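The two outcome definitions in the description (stairs within the first four postoperative days, discharge by the fifth postoperative day) can be read as simple binary labels. The following is a minimal sketch of that labeling, not the study's actual pipeline: the record type, its field names, and the convention that day 1 is the first postoperative day are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatientRecord:
    # Hypothetical schema; the registry entry does not specify field names.
    # Assumed convention: day 1 is the first postoperative day.
    stairs_day: Optional[int]  # day the patient first managed three steps, None if not recorded
    discharge_day: int         # postoperative day of discharge


def early_mobility(rec: PatientRecord, window_days: int = 4) -> bool:
    """True if three steps were ascended and descended within the window."""
    return rec.stairs_day is not None and rec.stairs_day <= window_days


def regular_stay(rec: PatientRecord, threshold_day: int = 5) -> bool:
    """True if discharged by the threshold day; otherwise the stay is prolonged."""
    return rec.discharge_day <= threshold_day


example = PatientRecord(stairs_day=3, discharge_day=6)
labels = (early_mobility(example), regular_stay(example))
```

In this example the patient counts as an early mobilizer (day 3 is within the four-day window) but has a prolonged stay (discharge on day 6).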
Study Type
OBSERVATIONAL
Enrollment
943
Application of a machine learning-based predictive algorithm to retrospectively and prospectively analyze clinical and perioperative data in patients undergoing hip or knee arthroplasty, without influencing clinical decision-making.
SAITeR IRCCS Istituto Ortopedico Rizzoli
Bologna, Italy
Area under the receiver operating characteristic curve (AUROC) for discrimination ability of the machine learning predictive model
The discrimination ability of the machine learning predictive model will be assessed using the area under the receiver operating characteristic curve (AUROC). AUROC summarizes the trade-off between sensitivity and specificity across all possible classification thresholds. AUROC ranges from 0 to 1: a value of 0.5 indicates no discrimination (chance level) and 1.0 indicates perfect discrimination. Higher values indicate better model performance. Values above 0.8 will be considered indicative of good discriminatory performance.
Time frame: Through study completion, an average of 2 years
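As an illustration of what AUROC measures, it equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that pairwise definition using only the Python standard library. The function name and the toy data are assumptions for illustration and are not taken from the study.

```python
def auroc(y_true, y_score):
    """AUROC via pairwise comparison of positive and negative scores.

    y_true: iterable of 0/1 labels; y_score: iterable of predicted scores.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy data (not study data): 3 of the 4 positive/negative pairs are ordered correctly.
toy_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

The pairwise loop is quadratic in sample size, which is acceptable for a cohort of a few hundred patients; a rank-based formulation gives the same value more efficiently.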
Calibration of the machine learning predictive model assessed by calibration plots
Agreement between predicted probabilities and observed outcomes will be evaluated using calibration plots. Calibration will be visually assessed by plotting predicted versus observed event probabilities.
Time frame: Through study completion, an average of 2 years
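The points of a calibration plot are usually obtained by grouping patients into bins of predicted probability and comparing the mean prediction in each bin with the observed event rate. The sketch below shows one way to compute those points. The equal-width binning, the number of bins, and the function name are assumptions made here; the record does not state how the plots will be constructed.

```python
def calibration_bins(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, observed event rate, count) per non-empty bin.

    Uses equal-width bins on [0, 1]; a probability of exactly 1.0 goes in the last bin.
    """
    bins = [[0.0, 0, 0] for _ in range(n_bins)]  # [sum of probabilities, events, n]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx][0] += p
        bins[idx][1] += y
        bins[idx][2] += 1
    return [(s / n, e / n, n) for s, e, n in bins if n > 0]


# Toy data (not study data): predictions of 0.1 and 0.9 that match outcomes closely.
points = calibration_bins([0, 0, 1, 1], [0.1, 0.1, 0.9, 0.9], n_bins=2)
```

Plotting the first element of each tuple against the second gives the calibration curve; points near the diagonal indicate that predicted and observed probabilities agree.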
Predictive performance of the machine learning model assessed by precision and F1-score
Predictive performance will be evaluated using precision and F1-score derived from the confusion matrix by comparing predicted class labels with observed outcomes. Precision reflects the proportion of correctly predicted positive cases among all predicted positives. The F1-score represents the harmonic mean of precision and recall (sensitivity). Higher values indicate better predictive performance.
Time frame: Through study completion, an average of 2 years
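For reference, precision, recall, and F1-score follow directly from the confusion-matrix counts of true positives (TP), false positives (FP), and false negatives (FN). The sketch below uses only the standard library. The function name, the choice to return 0.0 when a denominator is zero, and the toy labels are assumptions for illustration.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 from binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Assumed convention: return 0.0 when a ratio is undefined (zero denominator).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1


# Toy data (not study data): TP = 2, FP = 1, FN = 1.
precision, recall, f1 = precision_recall_f1([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```

With these counts, precision and recall are both 2/3, so their harmonic mean (the F1-score) is also 2/3.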