The goal of this observational study is to evaluate the accuracy, completeness, and clinical consistency of cardiac magnetic resonance (CMR) imaging reports generated by a large language model, compared with expert radiologist reports, in patients undergoing routine clinical CMR examinations.

The main questions it aims to answer are:

- Can CMR reports automatically produced by a large multimodal model accurately reflect key imaging findings and diagnoses when compared with reports written by experienced cardiovascular radiologists?
- How do the generated reports perform in terms of clinical correctness, completeness, and linguistic clarity, as assessed by quantitative metrics and expert review?

Researchers will compare AI-generated CMR reports with ground-truth reports authored by board-certified cardiovascular radiologists to determine whether the automated system achieves comparable diagnostic accuracy and report quality across different cardiac pathologies.

Participants will:

- Undergo standard-of-care cardiac MRI examinations as part of routine clinical practice.
- Have their anonymized CMR image data and corresponding radiologist reports retrospectively collected.
- Contribute data used to generate automated CMR reports, which will then be evaluated against expert reports using objective metrics (e.g., diagnostic agreement, entity-level accuracy) and subjective clinical scoring by radiologists.
Study Type
OBSERVATIONAL
Enrollment
20,000
The intervention consists of an automated CMR report generation system based on a large multimodal deep learning model. The model takes de-identified CMR image data as input, including standard clinical sequences (e.g., cine and late gadolinium enhancement [LGE]), and automatically generates a free-text radiology report describing cardiac structure, function, and imaging findings. The generated reports are produced offline and retrospectively, and are not used for clinical decision-making or patient management. No changes are made to the imaging acquisition protocol or standard clinical workflow. For evaluation purposes, the AI-generated reports are compared with reference reports authored by experienced cardiovascular radiologists, using predefined quantitative accuracy metrics and expert qualitative assessment of clinical correctness, completeness, and readability. This intervention is intended solely for research and performance evaluation of automated report generation and does not influence patient care.
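As a rough illustration of the kind of entity-level accuracy metric mentioned above, the sketch below scores one AI-generated report against its reference by comparing sets of extracted finding entities. The helper name, the example finding strings, and the set-based matching are illustrative assumptions, not the study's actual scoring protocol.

```python
def entity_prf(ai_entities, ref_entities):
    """Precision, recall, and F1 over sets of extracted finding entities.

    A finding counts as a true positive only if it appears (as an exact
    string) in both the AI-generated and the reference report.
    """
    ai, ref = set(ai_entities), set(ref_entities)
    tp = len(ai & ref)
    precision = tp / len(ai) if ai else 0.0
    recall = tp / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical finding entities extracted from one paired report
ai_findings = {"LVEF reduced", "LGE present", "LV dilated"}
ref_findings = {"LVEF reduced", "LGE present", "mitral regurgitation"}
print(entity_prf(ai_findings, ref_findings))
```

In practice, finding extraction and matching would need normalization (synonyms, negation, laterality), so exact string matching here is only a placeholder for the study's predefined criteria.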
Fuwai Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College
Beijing, Beijing Municipality, China
Diagnostic Accuracy of AI-Generated Cardiac MRI Reports
The primary outcome is the diagnostic accuracy of automatically generated cardiac magnetic resonance (CMR) reports produced by a large multimodal model. Diagnostic accuracy is assessed by comparing AI-generated reports with reference reports written by board-certified cardiovascular radiologists. Agreement is evaluated at the level of key clinical findings and final imaging impressions, using predefined criteria. Accuracy metrics include correctness of major diagnoses and presence or absence of clinically relevant imaging findings.
Time frame: Baseline
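Diagnostic agreement on the presence or absence of a finding can be summarized with a chance-corrected statistic such as Cohen's kappa. The minimal sketch below computes kappa for paired binary labels (finding present = 1, absent = 0); the example labels are hypothetical, and the registry does not specify which agreement statistic the study uses.

```python
from collections import Counter


def cohens_kappa(ai_labels, ref_labels):
    """Chance-corrected agreement between AI and reference binary labels."""
    assert len(ai_labels) == len(ref_labels) and ai_labels
    n = len(ai_labels)
    # Observed proportion of exact agreements
    observed = sum(a == r for a, r in zip(ai_labels, ref_labels)) / n
    # Expected agreement by chance, from each rater's marginal label counts
    ai_counts, ref_counts = Counter(ai_labels), Counter(ref_labels)
    labels = set(ai_labels) | set(ref_labels)
    expected = sum(ai_counts[c] * ref_counts[c] for c in labels) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical: presence (1) / absence (0) of LGE in 8 paired reports
ai = [1, 0, 1, 1, 0, 0, 1, 0]
ref = [1, 0, 1, 0, 0, 0, 1, 0]
print(round(cohens_kappa(ai, ref), 3))  # → 0.75
```

With one disagreement in eight pairs, observed agreement is 0.875 and chance agreement is 0.5, giving kappa = 0.75; per-pathology kappa values could be reported alongside raw accuracy.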