Artificial Intelligence-driven Tuberculosis Landscape Analysis & Stratification Research

N/A | Not Yet Recruiting | NCT07611695

Huashan Hospital | 31,600 enrolled

Overview

The goal of this observational study is to establish and validate a comprehensive AI-driven clinical decision support system (AI-CDSS) for whole-chain management of patients with pulmonary tuberculosis (TB). The main questions it aims to answer are: How well does the system predict outcomes at the key stages of TB diagnosis and treatment? Does the system deliver real-world benefits? The AI framework is intended to support clinicians' decisions, with the aim of improving cure rates and giving each patient more effective, personalized care.

This study establishes TB-ATLAS (Artificial Intelligence-driven Tuberculosis Landscape Analysis & Stratification Research), a modular framework for whole-chain TB management. The objective is to develop and validate an umbrella suite of AI-driven models that optimize clinical decision-making from initial diagnosis to post-treatment follow-up. The core hypothesis is that multimodal patient data can stratify TB phenotypes and predict critical clinical events, enabling precision medicine. Beyond the primary focus on distinguishing Easy-to-Treat (ETT) from Hard-to-Treat (HTT) categories, the system incorporates satellite modules for pre-DST (drug susceptibility testing) drug resistance risk, treatment adherence monitoring, adverse event (AE) early warning, and risk of post-TB lung disease (PTLD).

The study uses a retrospective-prospective cohort design. Using retrospective individual patient data (IPD) from clinical trials and real-world electronic health records (EHRs) covering more than 30,000 patients, the investigators will apply advanced AI methods, including foundation models for feature representation and multi-task learning for modular development. The model inputs integrate structured clinical variables, microbiological profiles, radiomics, and host signatures. Model interpretability is prioritized through SHAP/LIME to support clinical trust. Performance will be evaluated using AUROC and calibration metrics. External validation will take place in a prospective cohort (n≥1,600) to assess the system's performance in predicting real-world outcomes compared with standardized care.

The expected output is the TB-ATLAS Clinical Decision Support System (AI-CDSS). By providing evidence-based guidance on regimen intensity, resistance risk, and relapse monitoring, the platform is intended to facilitate the transition from "one-size-fits-all" standardized care towards individualized precision management and to improve clinical decision-making across diverse healthcare settings.

Study Type

OBSERVATIONAL

Eligibility

Sex: ALL

Inclusion Criteria for Model Development Cohort:
* Patient with clinically diagnosed or bacteriologically confirmed pulmonary tuberculosis (TB) who received TB treatment;
* Initiation of TB treatment on or after January 1, 2021;
* Complete key diagnosis and treatment data available in the electronic medical record system.

Inclusion Criteria for External Validation Cohort:
* Patient with clinically diagnosed or bacteriologically confirmed pulmonary tuberculosis (TB) who is planning to start TB treatment;
* Voluntary participation with a signed informed consent form (for adults ≥18 years); parental or guardian consent and a co-signed informed consent form are required for minors aged <18 years.

Exclusion Criteria:
* Co-morbidity confounding: the presence of another active, life-threatening disease (e.g. late-stage malignancy, non-HIV severe immunodeficiency) for which the expected survival or priority of treatment may substantially interfere with the attribution of TB treatment outcomes;
* Extremely poor treatment adherence: documented evidence that the patient either never initiated treatment or was permanently lost to follow-up within the early treatment period (<2 weeks), precluding the collection of any valid outcome data.

Outcomes

Primary Outcomes

Predictive Performance of the "Easy-to-Treat" versus "Hard-to-Treat" stratification model for pulmonary tuberculosis (PTB)

The area under the receiver operating characteristic curve (AUROC) of the model for discriminating between PTB patients classified as "Easy-to-Treat" and "Hard-to-Treat". "Easy-to-Treat" patients are defined as patients with PTB who can achieve a favorable outcome when treated with a short-course regimen (≤4 months for drug-sensitive TB, ≤6 months for rifampin-resistant TB). "Hard-to-Treat" patients are defined as patients with PTB who will experience an unfavorable outcome on the same short-course regimen.

Time frame: From treatment initiation to 6 months post-treatment
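
As an illustration of the primary metric, AUROC can be read as the probability that a randomly chosen "Hard-to-Treat" patient receives a higher predicted risk than a randomly chosen "Easy-to-Treat" patient, with ties counted as one half. The sketch below is a minimal pure-Python version under that definition; the function name, the label coding (1 = Hard-to-Treat) and the four-patient data are hypothetical and are not taken from the study protocol.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positives and negatives.

    labels: 1 = Hard-to-Treat, 0 = Easy-to-Treat (hypothetical coding).
    scores: predicted risk of being Hard-to-Treat.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Illustrative (made-up) data for four patients
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
auc = auroc(labels, scores)
print(f"AUROC = {auc:.2f}")  # 3 of 4 positive/negative pairs are ranked correctly: 0.75
```

The pairwise loop is quadratic in cohort size and is meant only to make the definition concrete; for a cohort of the size described here a rank-based implementation from an established statistics library would be the more practical choice.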

Secondary Outcomes

Brier Score of the "Easy-to-Treat" versus "Hard-to-Treat" Model

The Brier score will be used to assess the overall accuracy and reliability of the model's predicted probabilities. It is the mean squared difference between the predicted probabilities and the observed outcomes. The score ranges from 0 to 1, where 0 represents perfect accuracy and 1 represents total inaccuracy. Lower scores indicate better model performance.

Time frame: 6 months post-treatment
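
The Brier score described above can be sketched in a few lines. The function name, the outcome coding (1 = unfavorable outcome observed) and the example values are hypothetical and serve only to show the arithmetic.

```python
def brier_score(outcomes, probs):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(outcomes) != len(probs) or not outcomes:
        raise ValueError("outcomes and probs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(outcomes, probs)) / len(outcomes)


# Illustrative (made-up) data: 1 = unfavorable outcome observed
outcomes = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.6, 0.4]
score = brier_score(outcomes, probs)
print(f"Brier score = {score:.4f}")  # (0.01 + 0.04 + 0.16 + 0.16) / 4 = 0.0925
```

For reference, a model that always predicts 0.5 scores 0.25 regardless of the outcomes, so values should be read against that kind of uninformative baseline as well as against 0.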

Calibration Slope of the "Easy-to-Treat" versus "Hard-to-Treat" Model

The calibration slope will be calculated to evaluate the agreement between the model's predicted probabilities and the observed outcomes. The ideal calibration slope is 1, and values closer to 1 indicate better calibration. A slope below 1 suggests that predictions are too extreme (high risks overestimated and low risks underestimated), whereas a slope above 1 suggests that predictions are too moderate.

Time frame: 6 months post-treatment
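
One common way to obtain a calibration slope is to regress the observed outcome on the logit of the predicted probability with a logistic model and read off the coefficient. The sketch below assumes that definition, which the registry entry does not spell out, and fits the two-parameter model with a hand-written Newton-Raphson loop; all names and data are hypothetical.

```python
import math


def logit(p):
    # Predicted probabilities must lie strictly between 0 and 1
    return math.log(p / (1.0 - p))


def calibration_slope(outcomes, probs, iterations=50, tol=1e-10):
    """Fit logit(P(y=1)) = a + b * logit(p_hat); return (intercept a, slope b)."""
    x = [logit(p) for p in probs]
    a, b = 0.0, 1.0  # start at perfect calibration
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, outcomes):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g0 += yi - mu          # gradient w.r.t. intercept
            g1 += (yi - mu) * xi   # gradient w.r.t. slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            raise ValueError("singular information matrix (need varied predictions)")
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b


# Made-up example: observed event rates are 20% and 80% in two groups of ten
observed = [1] * 2 + [0] * 8 + [1] * 8 + [0] * 2
well_calibrated = [0.2] * 10 + [0.8] * 10   # predictions match the observed rates
overconfident = [0.1] * 10 + [0.9] * 10     # predictions are too extreme

_, slope_ok = calibration_slope(observed, well_calibrated)
_, slope_over = calibration_slope(observed, overconfident)
print(f"well calibrated: slope = {slope_ok:.3f}")
print(f"overconfident:   slope = {slope_over:.3f}")
```

In this toy setup the well-calibrated predictions give a slope of 1 and the overconfident ones a slope of about 0.63, matching the reading that slopes below 1 flag predictions that are too extreme.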

Area Under the Receiver Operating Characteristic (AUROC) Curve of the Pre-Drug Susceptibility Testing (Pre-DST) Drug Resistance Predictive Model

The AUROC will be used to evaluate the discrimination performance of the pre-DST (drug susceptibility testing) drug resistance predictive model. The score ranges from 0 to 1, where 0.5 indicates random guessing and 1 represents perfect discrimination. Higher scores indicate better discrimination.

Time frame: 6 months post-treatment

Area Under the Precision-Recall Curve (AUPRC) of the Pre-Drug Susceptibility Testing (Pre-DST) Drug Resistance Predictive Model

The AUPRC will be used to evaluate the predictive performance of the pre-DST (drug susceptibility testing) drug resistance predictive model, particularly under class imbalance. The score ranges from 0 to 1. Higher scores indicate better combined precision and recall.

Time frame: 6 months post-treatment
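
AUPRC is often estimated as average precision: the mean of the precision values taken at the rank of each true positive when patients are sorted by predicted risk. The sketch below uses that estimator as an assumption (the registry entry does not say which estimator will be used); the function name, label coding and data are hypothetical.

```python
def average_precision(labels, scores):
    """Average precision, a step-wise estimate of the area under the PR curve.

    labels: 1 = drug-resistant, 0 = drug-susceptible (hypothetical coding).
    Tied scores are ranked in input order in this simplified version.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive case")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_positives = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            true_positives += 1
            total += true_positives / rank  # precision at this recall step
    return total / n_pos


# Illustrative (made-up) data: 2 resistant cases among 5 patients
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.2]
ap = average_precision(labels, scores)
prevalence = sum(labels) / len(labels)
print(f"average precision = {ap:.3f}, prevalence baseline = {prevalence:.2f}")
```

Unlike AUROC, the chance-level reference for AUPRC is not 0.5 but the prevalence of the positive class (0.40 in this toy example), which is why the metric is informative when resistant cases are rare.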

F1-score of the Secondary Decision Models for Pre-Drug Susceptibility Testing (Pre-DST) Drug Resistance Prediction

The F1-score, the harmonic mean of precision and recall, will be used to evaluate the classification performance of the secondary decision models for pre-DST (drug susceptibility testing) drug resistance prediction. The score ranges from 0 to 1. Higher scores indicate better classification performance.

Time frame: 6 months post-treatment
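
Because the F1-score applies to binary decisions rather than probabilities, computing it requires a decision threshold. The sketch below assumes a threshold of 0.5 purely for illustration; the study's "secondary decision models" and their thresholds are not described in this entry, and the function name and data are hypothetical.

```python
def f1_at_threshold(labels, scores, threshold=0.5):
    """F1-score after converting predicted risks to 0/1 decisions."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    # Harmonic mean of precision and recall
    return 2.0 * precision * recall / (precision + recall)


# Illustrative (made-up) data: 1 = drug-resistant
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.2, 0.1]
f1 = f1_at_threshold(labels, scores)
print(f"F1 at threshold 0.5 = {f1:.3f}")  # precision = recall = 2/3, so F1 = 2/3
```

The result changes with the threshold, so any reported F1-score is only interpretable alongside the cut-off that produced it.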

Area Under the Receiver Operating Characteristic (AUROC) Curve of the Adherence Forecasting Model

The AUROC will be used to evaluate the discrimination performance of the adherence forecasting model. The score ranges from 0 to 1, where 0.5 indicates random guessing and 1 represents perfect discrimination. Higher scores indicate better discrimination.

Time frame: From treatment initiation until treatment completion, assessed up to 6 months

Area Under the Precision-Recall Curve (AUPRC) of the Adherence Forecasting Model

The AUPRC will be used to evaluate the predictive performance of the adherence forecasting model under class imbalance. The score ranges from 0 to 1. Higher scores indicate better combined precision and recall.

Time frame: From treatment initiation until treatment completion, assessed up to 6 months

F1-score of the Secondary Decision Models for Adherence Forecasting

The F1-score, the harmonic mean of precision and recall, will be used to evaluate the classification performance of the secondary decision models for adherence forecasting. The score ranges from 0 to 1. Higher scores indicate better classification performance.

Time frame: From treatment initiation until treatment completion, assessed up to 6 months

Area Under the Receiver Operating Characteristic (AUROC) Curve of the Treatment Response Predictive Model

The AUROC will be used to evaluate the overall discrimination performance of the treatment response predictive model. The score ranges from 0 to 1, where 0.5 indicates random guessing and 1 represents perfect discrimination. Higher scores indicate better discrimination.

Time frame: 6 months post-treatment

Area Under the Precision-Recall Curve (AUPRC) of the Treatment Response Predictive Model

The AUPRC will be used to evaluate the predictive performance of the treatment response predictive model, particularly under class imbalance. The score ranges from 0 to 1. Higher scores indicate better combined precision and recall.

Time frame: 6 months post-treatment

F1-score of the Secondary Decision Models for Treatment Response Prediction

The F1-score, the harmonic mean of precision and recall, will be used to evaluate the classification performance of the secondary decision models for treatment response prediction. The score ranges from 0 to 1. Higher scores indicate better classification performance.

Time frame: 6 months post-treatment

Area Under the Receiver Operating Characteristic (AUROC) Curve of the Adverse Event (AE) Predictive Model

The AUROC will be used to evaluate the overall discrimination performance of the adverse event predictive model. The score ranges from 0 to 1, where 0.5 indicates random guessing and 1 represents perfect discrimination. Higher scores indicate better discrimination.

Time frame: 6 months post-treatment

Area Under the Precision-Recall Curve (AUPRC) of the Adverse Event (AE) Predictive Model

The AUPRC will be used to evaluate the predictive performance of the adverse event predictive model, particularly under class imbalance. The score ranges from 0 to 1. Higher scores indicate better combined precision and recall.

Time frame: 6 months post-treatment

F1-score of the Secondary Decision Models for Adverse Event (AE) Prediction

The F1-score, the harmonic mean of precision and recall, will be used to evaluate the classification performance of the secondary decision models for adverse event prediction. The score ranges from 0 to 1. Higher scores indicate better classification performance.

Time frame: 6 months post-treatment

Area Under the Receiver Operating Characteristic (AUROC) Curve of the Relapse Predictive Model

The AUROC will be used to evaluate the overall discrimination performance of the relapse predictive model. Relapse is defined per the World Health Organization (WHO) standard. The score ranges from 0 to 1, where 0.5 indicates random guessing and 1 represents perfect discrimination. Higher scores indicate better discrimination.

Time frame: 6 months post-treatment

Area Under the Precision-Recall Curve (AUPRC) of the Relapse Predictive Model

The AUPRC will be used to evaluate the predictive performance of the relapse predictive model, particularly under class imbalance. Relapse is defined per the World Health Organization (WHO) standard. The score ranges from 0 to 1. Higher scores indicate better combined precision and recall.

Time frame: 6 months post-treatment

F1-score of the Secondary Decision Models for Relapse Prediction

The F1-score, the harmonic mean of precision and recall, will be used to evaluate the classification performance of the secondary decision models for relapse prediction. Relapse is defined per the World Health Organization (WHO) standard. The score ranges from 0 to 1. Higher scores indicate better classification performance.

Time frame: 6 months post-treatment

Area Under the Receiver Operating Characteristic (AUROC) Curve of the Post-Tuberculosis (TB) Lung Disease Predictive Model

The AUROC will be used to evaluate the overall discrimination performance of the post-tuberculosis (TB) lung disease predictive model. The score ranges from 0 to 1, where 0.5 indicates random guessing and 1 represents perfect discrimination. Higher scores indicate better discrimination.

Time frame: 6 months post-treatment

Area Under the Precision-Recall Curve (AUPRC) of the Post-Tuberculosis (TB) Lung Disease Predictive Model

The AUPRC will be used to evaluate the predictive performance of the post-tuberculosis (TB) lung disease predictive model, particularly under class imbalance. The score ranges from 0 to 1. Higher scores indicate better combined precision and recall.

Time frame: 6 months post-treatment

F1-score of the Secondary Decision Models for Post-Tuberculosis (TB) Lung Disease Prediction

The F1-score, the harmonic mean of precision and recall, will be used to evaluate the classification performance of the secondary decision models for post-tuberculosis (TB) lung disease prediction. The score ranges from 0 to 1. Higher scores indicate better classification performance.

Time frame: 6 months post-treatment
