Introduction

Foot ulcers in diabetes mellitus (DM) are a common and serious complication that can lead to infection, amputation, and increased mortality. Early identification of patients at high risk is crucial in order to implement preventive measures at an early stage. The number of people with DM is increasing globally, from 540 million in 2021 to an estimated 780 million by 2045. Foot ulcers cause considerable suffering for the individual and entail substantial costs for the healthcare system.

Despite national guidelines recommending regular, structured foot examinations and risk classification to assess the risk of developing foot ulcers, current risk models do not take into account the complex interactions between risk factors and socioeconomic factors such as marital status, level of education, and place of residence. Data-driven advances and artificial intelligence (AI) offer new opportunities to refine risk identification, but their use in predicting the risk of diabetic foot ulcers remains limited.

The need for foot screening is considerable. In Sweden, there are approximately 600,000 patients with DM, and half of them live with an increased risk due to nerve damage in the feet. This means that, based on risk level, around 300,000 patients in Sweden may require preventive interventions, including medical foot care, customised footwear, and access to specialist care for those with foot ulcers. Improved preventive efforts are emphasised in the person-centred and integrated care pathway for people with diabetes at high risk of foot ulcers. However, accurate identification of foot ulcer risk is currently lacking.

Prevention leads not only to good quality of life for the individual but also to reduced healthcare costs. Estimates by Ragnarsson Tennvall show that a hard-to-heal ulcer costs approximately SEK 100,000 per year, while an amputation costs around SEK 300,000-500,000.
Given a prevalence of foot ulcers of 5% among patients with diabetes (about 30,000 of the approximately 600,000 patients in Sweden), the annual cost of ulcer care amounts to approximately SEK 3 billion. In addition, there are costs of approximately SEK 750 million for amputations, according to data from the quality register SwedAmp.

The aim of the study is to develop, test, and validate prediction models (statistical and AI-based) to identify patients with DM who are at risk of developing foot ulcers. The models will be based on retrospective electronic health record data from primary care in the Västra Götaland Region (VGR), as well as data from Statistics Sweden (SCB) concerning demographic factors such as marital status, level of education, occupation, and place of residence.

Methods

The study has two methodological approaches: AI-based modelling and statistical modelling.

AI-based approach

Machine learning models will be developed to predict which patients are at risk of developing diabetic foot ulcers. The models will be trained using cross-validation on a large dataset, and variables will be iteratively excluded. Conformal prediction will be used to quantify uncertainty in patient-level predictions. The resulting models will be analysed to identify the strongest predictors and will be compared with classical statistical modelling and findings from the literature.

Steps in AI modelling:
Data extraction: Electronic health record data from primary care in VGR, supplemented with sociodemographic data from SCB.
Data processing: Use of, among other variables, diagnostic codes (ICD-10), healthcare interventions (KVÅ codes), visit types, visit frequency, ECG parameters, and free-text data to construct predictors.
Model development: Prediction models will be developed and trained using cross-validation. Measures of uncertainty will be generated using conformal prediction.
Validation: A separate cohort will be used to test model performance (sensitivity, specificity, positive predictive value [PPV]).
Interpretation: The models will be reviewed for transparency and clinical interpretability in collaboration with patient representatives, clinicians, and researchers.

The results of the statistical and AI-based models will be compared with regard to their respective strengths and weaknesses.

Statistical modelling

Two populations will be analysed: patients with diabetes without foot ulcers and patients with diabetes with foot ulcers. Co-variation and causal relationships between risk factors and foot ulcers will be identified. A model describing causal pathways leading to ulcer development will be developed, and the degree of certainty of the model will be analysed.
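As an illustration of how conformal prediction can attach uncertainty to patient-level predictions in the AI-based approach, the following is a minimal sketch of split conformal prediction for a binary outcome. It is not the study's implementation: the function names, the calibration values, and the choice of nonconformity score (one minus the predicted probability of the observed outcome) are assumptions made for the example.

```python
import math


def conformal_threshold(calibration_scores, alpha):
    """Split-conformal threshold at miscoverage level alpha (e.g. 0.1 for 90% coverage)."""
    n = len(calibration_scores)
    # Finite-sample corrected rank of the calibration score used as the threshold.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration patients for this alpha: no label can be ruled out.
        return float("inf")
    return sorted(calibration_scores)[rank - 1]


def prediction_set(prob_ulcer, threshold):
    """Return every label whose nonconformity score (1 - probability) is within the threshold."""
    probs = {"ulcer": prob_ulcer, "no_ulcer": 1.0 - prob_ulcer}
    return {label for label, p in probs.items() if 1.0 - p <= threshold}


# Hypothetical calibration cohort: (predicted ulcer probability, observed outcome 1/0).
calibration = [(0.9, 1), (0.8, 1), (0.3, 0), (0.2, 0), (0.6, 1),
               (0.4, 0), (0.1, 0), (0.7, 1), (0.35, 0)]
# Nonconformity score = 1 - probability the model gave to the outcome that occurred.
scores = [1.0 - p if y == 1 else p for p, y in calibration]
threshold = conformal_threshold(scores, alpha=0.2)

for prob in (0.95, 0.50, 0.05):
    print(prob, sorted(prediction_set(prob, threshold)))
```

In this kind of scheme, a prediction set with exactly one label is a confident prediction, while an empty set or a set containing both labels marks a patient for whom the model cannot commit to a single prediction at the chosen confidence level.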
This register-based study uses data from Närhälsan's electronic health record system in the Västra Götaland Region (VGR), linked with data from Statistics Sweden (SCB). The research questions will be addressed through the development and validation of AI-based models. At a later stage of the process, the ability of the AI models to predict foot ulcers will be compared with that of statistical models.

From Asynja Whisp, Närhälsan's electronic health record system in VGR, data will be retrieved for all adult patients (18 years or older) who either have a diabetes diagnosis (ICD-10 codes E10-E14) or have been prescribed any diabetes medication after the age of 18, covering the period from 2014 to 30 June 2025. Based on, among other variables, diagnostic codes (ICD-10), procedure codes (KVÅ), visit types, visit frequency, ECG parameters, and free-text/clinical notes, predictors will be identified, such as neuropathy, impaired circulation, previous ulcers, antibiotic treatment, foot deformities, and skin status. The data will be validated and, if necessary, supplemented with additional parameters.

Methods to address the research questions

Machine learning-based models will be trained to predict the risk of developing foot ulcers. Cross-validation will be used to identify optimal hyperparameters for each model. In the first phase, the models' ability to discriminate between patients with diabetic foot ulcers and patients without foot ulcers will be evaluated. In the second phase, the models' ability to prospectively predict ulcer development will be assessed. Redundant variables will be excluded, and the models will be retrained in an iterative process to increase robustness. The models will be combined with conformal prediction to integrate uncertainty estimation into the predictions and to identify patients for whom the model is unsuitable for prediction. Finally, the most predictive variables will be identified using Shapley values (SHAP).

Statistical models

Using electronic health record data from the Asynja Whisp care information system in VGR primary care, together with SCB data and scientific and empirical evidence, variables and categories that constitute potential risk factors for foot ulcers will be identified. A case-control design will be applied, in which patients with diabetes who have developed foot ulcers (cases) are compared with a control group of people with diabetes who have not developed foot ulcers.

In the development of statistical prediction models, the workflow involves analysing populations, i.e. all patients with diabetes without foot ulcers compared with all patients with diabetes who have foot ulcers. This allows investigation of potential associations between the occurrence of foot ulcers in patients with diabetes and other factors. In collaboration with the medical profession, causal relationships underlying the occurrence of foot ulcers will be identified. A model will be developed that describes chains of causation leading to the occurrence of foot ulcers in patients with diabetes, and information will be provided on the degree of certainty of the model.

Based on the results of the models (both AI-generated and statistical), strengths and weaknesses of each approach will be compared. Validation of the developed models will be performed on an independent dataset to ensure that the results are generalisable and robust over time. The validation strategy ensures that the model performs well on new patients and not only on the dataset from which it was developed. Outcome measures for validation include sensitivity (how well the model identifies those who truly have a high risk of foot ulcers), specificity (how well the model avoids false alarms), and positive predictive value (PPV, the proportion of patients flagged as high risk who actually develop foot ulcers). Furthermore, the model will be interpreted to ensure transparency and clinical interpretability. Development, testing, and validation will be conducted in collaboration with patient representatives, clinicians, and researchers.
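The validation measures named above can be computed from a simple comparison of predicted and observed outcomes. The sketch below is only an illustration under stated assumptions: the function name `validation_metrics` and the ten example patients are hypothetical and do not come from the study data.

```python
def validation_metrics(observed, predicted):
    """Sensitivity, specificity, and PPV for binary labels (1 = foot ulcer, 0 = no foot ulcer)."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # ulcer, flagged
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # ulcer, missed
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # no ulcer, flagged (false alarm)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # no ulcer, not flagged

    def ratio(numerator, denominator):
        # Undefined when the denominator is zero (e.g. nobody was flagged).
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


# Hypothetical validation cohort of ten patients.
observed = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = validation_metrics(observed, predicted)
print(metrics)
```

Unlike sensitivity and specificity, PPV depends on how common foot ulcers are in the validation cohort, so it is best read alongside the cohort's ulcer prevalence.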
Study Type
OBSERVATIONAL
Enrollment
100,000
Region Västra Götaland
Jonsered, Sweden
RECRUITING
Performance of machine learning-based prediction models for diabetic foot ulcer risk
The primary outcome is the predictive performance of machine learning-based models developed to estimate the risk of diabetic foot ulceration in patients with diabetes. Models will be trained using supervised machine learning techniques, with optimal hyperparameters identified through cross-validation. In the initial evaluation phase, model performance will be assessed for the ability to discriminate between patients with and without existing diabetic foot ulcers. In a subsequent phase, the models' ability to prospectively predict the development of diabetic foot ulcers during follow-up will be evaluated. Model robustness will be improved through an iterative process in which redundant variables are excluded and models are retrained. Predictive performance will be quantified using established metrics such as discrimination, calibration, and classification accuracy. To account for uncertainty in individual predictions, the final models will be combined with Conformal Prediction meth
Time frame: From study start to 2027-12-31
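For the primary outcome, discrimination and calibration are commonly summarised with the area under the ROC curve (AUC) and the Brier score. The record does not state which specific metrics will be used, so the following is a sketch of these two common choices only; the function names and the four example patients are hypothetical.

```python
def auc(observed, probabilities):
    """Discrimination: probability that a random ulcer patient is scored above a random non-ulcer patient."""
    positives = [p for o, p in zip(observed, probabilities) if o == 1]
    negatives = [p for o, p in zip(observed, probabilities) if o == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one patient in each outcome group")
    # Count correctly ordered pairs; ties count as half.
    wins = sum(
        1.0 if pos > neg else 0.5 if pos == neg else 0.0
        for pos in positives
        for neg in negatives
    )
    return wins / (len(positives) * len(negatives))


def brier_score(observed, probabilities):
    """Calibration-sensitive error: mean squared difference between predicted risk and outcome."""
    return sum((p - o) ** 2 for o, p in zip(observed, probabilities)) / len(observed)


# Hypothetical predictions for four patients (1 = developed a foot ulcer).
observed = [1, 0, 1, 0]
probabilities = [0.8, 0.3, 0.4, 0.6]
print("AUC:", auc(observed, probabilities))
print("Brier score:", brier_score(observed, probabilities))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, while a lower Brier score indicates predicted risks that lie closer to the observed outcomes.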
Identification and interpretability of risk factors for diabetic foot ulcer development
The secondary outcome is the identification and validation of clinical, demographic, and socioeconomic variables that are potential risk factors for diabetic foot ulcer development in patients with diabetes. Variables and risk factor categories will be identified using electronic health record data from the primary care information system Asynja Whisp in Region Västra Götaland, linked with national registry data from Statistics Sweden (SCB), together with established scientific and empirical evidence. A case-control study design will be applied, in which patients with diabetes who develop foot ulcers are compared with a control group of patients with diabetes who do not develop foot ulcers. Population-level analyses will be conducted to examine associations and co-variation between the occurrence of diabetic foot ulcers and other relevant factors.
Time frame: From study start to 2027-12-31
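In a case-control comparison such as the one described for the secondary outcome, the association between a single risk factor and foot ulcers is often expressed as an odds ratio with a confidence interval. The record does not specify the statistical method, so the sketch below uses the standard log-based (Woolf) interval purely as an illustration; the function name and all counts are hypothetical.

```python
import math


def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio and approximate 95% confidence interval from a 2x2 case-control table."""
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this approximation")
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf method).
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts: neuropathy recorded in 40 of 100 cases (foot ulcer)
# and in 20 of 100 controls (no foot ulcer).
estimate, lower, upper = odds_ratio(40, 60, 20, 80)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that excludes 1 suggests an association between the factor and foot ulcers in the sampled data; on its own it does not establish a causal relationship.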