This study focuses on improving early detection of Alzheimer's disease (AD) in patients with subjective cognitive decline (SCD), a preclinical stage of cognitive impairment, in the context of emerging disease-modifying therapies (DMTs). Current biomarkers, such as brain MRI, PET scans, and cerebrospinal fluid (CSF) markers, are highly accurate but costly, invasive, and not widely accessible. The study aims to provide cost-effective, scalable tools for early identification of individuals at risk, enabling personalized assessment and timely DMT administration.

Objectives:
* Evaluate the accuracy of innovative, easily accessible biomarkers in predicting biologically confirmed AD.
* Assess the predictive utility of previously studied methods for SCD patients.
* Explore new approaches, including automated speech analysis, to identify cognitive decline.
* Evaluate genetic contributions to AD risk.
* Integrate data from these modalities using machine learning to create a predictive model for AD in SCD patients.

Study Design: This is a multicenter, longitudinal, low-intervention study conducted at IRCCS Policlinico San Donato, San Donato Milanese, Milan, Italy (UO1) and the Center for Research and Innovation in Dementia, Careggi Hospital, Florence, Italy (UO2). Eligible participants are adults with SCD, intact daily functioning, and Mini-Mental State Examination (MMSE) scores >24. Exclusion criteria include neurological or systemic diseases, major psychiatric disorders, substance use, or prior head injury.

Participants undergo:
* Detailed medical and family history collection.
* Comprehensive assessment of neuropsychological function, personality, and independence in daily activities.
* EEG recording in resting state.
* Blood sampling for plasma biomarkers (Aβ42, Aβ40, p-tau181, p-tau217, t-tau, NfL, GFAP).
* CSF biomarker analysis (Aβ42, Aβ40, p-tau, t-tau).
* Genetic analysis of AD-related genes (PSEN1, PSEN2, APOE, TREM2, ABCA7, BDNF, HTT).
* Speech recording and analysis using standardized tasks to extract features for automated evaluation.

Procedures:
* Neuropsychological evaluations occur at baseline and at the two-year follow-up.
* Language recordings are conducted in controlled settings using standardized picture description tasks.
* EEG is recorded using 21-channel systems.
* Blood and CSF samples are collected, processed, and stored at -80°C for subsequent analysis at the respective institutional laboratories.
* Plasma biomarkers are analyzed with Simoa technology; CSF biomarkers are analyzed using chemiluminescent enzyme immunoassay (CLEIA).
* Genetic analyses employ PCR, high-resolution melting analysis (HRMA), sequencing, and capillary electrophoresis as appropriate for specific genes or polymorphisms.

The study expects to create a machine learning-based predictive model combining biomarker, neuropsychological, EEG, speech, and genetic data to improve early detection and guide personalized patient care.
1. INTRODUCTION

Alzheimer's disease (AD) research and clinical practice are at a turning point. As disease-modifying therapies (DMTs) for AD become available, neurologists, researchers, and health services will face a predictable, increasing demand for diagnostic evaluations of patients with cognitive impairment. In addition, there is a consensus that DMTs should be administered in the earliest stages of the disease to halt the disease process before neurodegeneration begins. For this reason, international research is focusing on subjective cognitive decline (SCD) and mild cognitive impairment (MCI), considered the earliest manifestations of AD and the optimal target population for future DMTs. However, both MCI and SCD are very common and heterogeneous conditions with different possible trajectories and many potential underlying causes.

Currently recognized biomarkers for the disease (brain MRI, PET neuroimaging, and cerebrospinal fluid [CSF] biomarkers) are highly accurate in identifying patients with SCD and MCI due to Alzheimer's disease, but their large-scale use is extremely limited by high cost, poor accessibility, and invasiveness. For this reason, previous studies have suggested considering demographic, cognitive, and genetic characteristics to estimate the risk of dementia. In addition, blood-based biomarkers are considered promising tools to enable assessment at the primary care level. However, none of these assessments or tools alone can guarantee sufficient accuracy to be used at the screening level.

2. OBJECTIVE OF THE STUDY

We aim to:
1. Evaluate the accuracy of more innovative and easily accessible biomarkers in the earliest stages of cognitive decline in predicting the presence of biologically diagnosed AD based on CSF biomarkers.
2. Clarify the utility of techniques that have already been studied in this context but have produced conflicting results, such as neuropsychological scores and electroencephalography (EEG).
3. Explore new analysis techniques not yet applied to this field, such as automated analysis of speech recordings (speech analysis).
4. Evaluate the contribution of genetic variants to AD risk in patients with SCD.
5. Combine the features and data extracted from these techniques using a machine learning approach to develop a predictive model of AD in patients with SCD.

3. STUDY DESIGN

This is a multicenter, longitudinal, low-intervention study. Patients will be recruited from the U.O.C. of Neurology at IRCCS Policlinico San Donato (henceforth referred to as UO1) and from the Center for Research and Innovation in Dementia (CRIDEM) at Azienda Ospedaliero-Universitaria Careggi in Florence, Italy (AOUC, henceforth referred to as UO2).

All recruited patients will undergo:
1. in-depth collection of family and clinical history;
2. extensive neuropsychological evaluation, including language recording, estimation of cognitive reserve, and assessment of depression and personality traits;
3. EEG recording in resting state;
4. analysis of the following blood biomarkers: Aβ42, Aβ40, p-tau181, p-tau217, t-tau, NfL, and GFAP;
5. analysis of the following genes: PSEN1, PSEN2, APOE, TREM2, ABCA7, BDNF, HTT;
6. analysis of the following biomarkers in cerebrospinal fluid (CSF): Aβ42, Aβ40, p-tau, and t-tau.

3.1. STUDY POPULATION

Patients who meet the following criteria will be recruited:
1. Age ≥18 years;
2. Clinical diagnosis of SCD according to SCD-I criteria [4];
3. Mini-Mental State Examination (MMSE) score greater than 24, adjusted for age and education level;
4. Normal functionality on the Activities of Daily Living (ADL) and Instrumental Activities of Daily Living (IADL) scales.

Patients with any of the following will be excluded:
a. History of head injury;
b. Ongoing neurological and/or systemic disease;
c. Symptoms of psychosis, major depression, or substance use disorder.

4. STUDY PROCEDURE

4.1. NEUROPSYCHOLOGICAL EVALUATION, ASSESSMENT OF DEPRESSION, AND ESTIMATION OF COGNITIVE RESERVE

The in-depth neuropsychological examination will be conducted at baseline (enrollment) and at follow-up (after two years) at IRCCS Policlinico San Donato (for patients enrolled at UO1) and at AOU Careggi (for patients enrolled at UO2). It will include: global cognitive measures (Mini-Mental State Examination); tasks exploring verbal and spatial short- and long-term memory, attention, language, constructive praxis, and executive function; subjective perception of memory deficits as described by Mazzeo et al.; indicators of cognitive reserve such as premorbid intelligence and recreational activities; depressive symptoms; an index of independence in activities of daily living; and personality traits.

4.2. RECORDING AND PRE-PROCESSING OF LANGUAGE

Language recording will occur during the Screening for Aphasia in Neurodegeneration (SAND) picture description task. In a silent and controlled environment, patients will be asked to describe the "Summer Time Picture" following a standardized procedure. Voice recordings will be made with a dedicated system at a sampling rate of 44,100 Hz, normalized to reach a peak of -1 dB, and resampled to 48 kHz. The WebRTC Voice Activity Detection (VAD) algorithm will be used to extract audio segments.

4.3. EEG RECORDING, PRE-PROCESSING, AND FEATURE EXTRACTION

Resting EEG data will be collected at IRCCS Policlinico San Donato (for patients enrolled at UO1) and at AOU Careggi (Florence, Italy) (for patients enrolled at UO2). The standard 21-channel system will be used. EEG recording will begin with 10 minutes with eyes closed, followed by alternating 3 minutes with eyes open and 3 minutes with eyes closed, repeated twice. Only the closed-eye segments will be used for analysis.

4.4. COLLECTION, HANDLING, AND STORAGE OF BIOLOGICAL SAMPLES

Blood samples will be collected by venipuncture in standard polypropylene tubes with EDTA (Sarstedt, Nümbrecht, Germany) at the Neurology Unit of Policlinico San Donato and the Neurology Unit of Careggi University Hospital. As per clinical routine for patients with cognitive decline, after informed consent for rachicentesis (lumbar puncture) has been acquired, CSF samples will be collected at 8 a.m. by lumbar puncture at the Neurology Unit of Policlinico San Donato and the Neurology Unit of Careggi University Hospital. Samples will be immediately centrifuged and stored at -80°C until analysis is performed.

Blood samples will be collected at two time points: at the time of the first clinical assessments (baseline collection) and two years after the first collection (follow-up collection).

Biological samples from patients enrolled at UO1, together with the associated data collected for the study, will be transferred to the BioCor Biobank at IRCCS Policlinico San Donato for processing and subsequent storage. Samples will be stored at the BioCor Biobank for up to 25 years. Biological samples from patients enrolled at UO2, together with the associated data collected for the study, will be transferred to the AOUC Neurogenetics Laboratory for processing and subsequent storage. Blood samples will be centrifuged within two hours at 1300 rcf at 4°C for 10 minutes, and the plasma will be isolated and stored at -80°C until analysis.

4.5. ANALYSIS OF BLOOD BIOMARKERS

Analysis of plasma biomarkers will be performed on the samples taken at baseline and at the two-year follow-up at the AOU Careggi Neurogenetics Laboratory using the Simoa kit for human samples provided by Quanterix Corporation (Lexington, MA, USA) on the automated Simoa SR-X platform (GBIO, Hangzhou, China), following the manufacturer's instructions. Plasma biomarker concentrations of all samples will be measured in a single run. Quality controls will be included in the analysis and tested together with the samples. A calibration curve will be determined from measurements of serially diluted calibrators provided by Quanterix.

4.6. ANALYSIS OF CSF BIOMARKERS

Concentrations of Aβ42, Aβ40, t-tau, and p-tau will be measured at the AOUC Neurogenetics Laboratory using a chemiluminescent enzyme immunoassay (CLEIA) analyzer, the LUMIPULSE G600 (Fujirebio, Tokyo, Japan). Cut-off values for CSF will be determined following the guidelines provided by Fujirebio.

4.7. GENETIC ANALYSIS

Genetic analyses will be performed at the AOUC Neurogenetics Laboratory. A standard automated method (QIAcube, QIAGEN) will be used to isolate DNA from blood samples. The selected genes will be analyzed as follows:
* APP, PSEN1, PSEN2: All coding exons and intron/exon boundaries will be amplified by polymerase chain reaction (PCR) using primers designed with Primer3 software.
* APOE genotypes: APOE genotypes will be investigated by HRMA. Samples with known APOE genotypes validated through DNA sequencing will be used as standard references.
* CAG repeat expansions in HTT: These will be determined by a PCR amplification assay using fluorescently labeled primers. Fragment size will be determined by capillary electrophoresis using the SeqStudio Genetic Analyzer (ThermoFisher) and GeneMapper version 4.0 software (Applied Biosystems). A set of HTT CAG alleles, whose lengths will be confirmed by DNA sequencing, will be used as the size standard.

5. DATA ANALYSIS

Data analysis will consist of the following steps:
1. Descriptive statistical analysis
2. Training and testing of the machine learning model

5.1. DESCRIPTIVE STATISTICAL ANALYSIS

The distribution of variables will be evaluated using the Shapiro-Wilk test. Patient groups will be characterized using means and standard deviations for normally distributed continuous variables, medians and interquartile ranges (IQRs) for non-normally distributed continuous variables, and frequencies or percentages for categorical variables; 95% confidence intervals will be reported. Depending on the distribution of the data, we will use ANOVA or nonparametric Kruskal-Wallis tests for between-group comparisons, and Pearson's or Spearman's correlation coefficient to assess correlations between numerical measures. Chi-square tests will be used to compare categorical data. Effect sizes will be calculated using Cohen's d for normally distributed numerical measures, η² for the Mann-Whitney U test, and Cramér's V for categorical data.

5.2. TRAINING AND TESTING OF THE MACHINE LEARNING MODEL

We will define a set of multimodal features including neuropsychological scores, indicators of cognitive reserve, personality traits, EEG features, plasma biomarkers, and genetic variants. A common metric will be defined based on the dispersion of each profile dimension, and we will then train a machine learning algorithm to associate each profile "vector" with biomarker profiles. Initially, we will perform this classification using standard machine learning procedures. In a separate set of analyses, we will follow a deep learning approach by training a multi-level feedforward artificial neural network (ANN) to predict patients' CSF biomarker profiles. The model with the best performance (as assessed by the AUCs of the ROC curves) will be tested on the test set to obtain an unbiased estimate of model performance.

6. SAMPLE SIZE

Assuming a sample AUC of 0.8, with 40 patients in the group with Alzheimer's disease (20%) and 160 in the group without (80%; percentages estimated from data of previous studies conducted at AOU Careggi [15]), the standard error method proposed by Hanley and McNeil (1982) yields a 95% confidence interval of width 0.173 (0.713-0.887). All other things being equal, assuming an AUC of 0.9, the confidence interval will have a width of 0.131 (0.834-0.966). Allowing for a drop-out rate of 20%, 250 subjects will be enrolled.
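The sample size figures in Section 6 can be reproduced with a short calculation. The sketch below is an illustration, not the study's own code: it applies the Hanley and McNeil (1982) standard error formula and assumes a normal-approximation interval with z = 1.96. The function names and the integer drop-out adjustment are choices made for this example.

```python
import math


def hanley_mcneil_se(auc: float, n_pos: int, n_neg: int) -> float:
    """Standard error of an AUC according to Hanley and McNeil (1982)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc**2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc**2)
        + (n_neg - 1) * (q2 - auc**2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def auc_confidence_interval(auc: float, n_pos: int, n_neg: int, z: float = 1.96):
    """Normal-approximation interval (z = 1.96 is an assumption of this sketch)."""
    se = hanley_mcneil_se(auc, n_pos, n_neg)
    return auc - z * se, auc + z * se


def enrollment_target(n_completers: int, dropout_percent: int) -> int:
    """Number to enroll so that n_completers remain after the given drop-out."""
    return math.ceil(n_completers * 100 / (100 - dropout_percent))


for auc in (0.8, 0.9):
    low, high = auc_confidence_interval(auc, n_pos=40, n_neg=160)
    print(f"AUC {auc}: {low:.3f}-{high:.3f}, width {high - low:.3f}")
# AUC 0.8: 0.713-0.887, width 0.173
# AUC 0.9: 0.834-0.966, width 0.131

print(enrollment_target(200, 20))  # 250
```

With 40 AD and 160 non-AD participants, the output matches the widths and limits quoted in Section 6, and a 20% drop-out allowance on 200 completers gives the planned enrollment of 250.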
Study Type
OBSERVATIONAL
Enrollment
250
IRCCS Policlinico San Donato
San Donato Milanese, Milan, Italy
RECRUITING
Diagnostic accuracy of biomarkers in detecting Alzheimer's disease
The accuracy of blood-based biomarkers will be evaluated for predicting a biological diagnosis of AD and for predicting progression of cognitive decline during follow-up. The biological diagnosis of AD will be defined by cerebrospinal fluid biomarker positivity, specifically an abnormal Aβ42/Aβ40 ratio and elevated CSF p-tau181. Progression of cognitive decline will be defined as worsening in at least one cognitive domain, loss of autonomy, or progression to MCI or dementia.
Time frame: 12-24 months
Accuracy of neuropsychological and neurophysiological measures in predicting AD pathology defined according to CSF biomarker profile.
Classical regression models and machine learning models will be used to predict AD pathology. AD pathology will be defined according to the CSF biomarker profile as follows: A+ if at least one of the core biomarkers (CSF Aβ42/Aβ40 ratio < 0.069, p-tau > 56.5 ng/L) is positive, and A- if none of the core biomarkers is positive. The following neuropsychological scores will be considered as predictors: global measurements, Digit and Visuo-spatial Span, Rey Auditory Verbal Learning Test, Short Story Recall, Rey-Osterrieth complex figure copy and recall, Trail Making Test, attentional matrices, Multiple Features Targets Cancellation, Category Fluency Task, Phonemic Fluency Task, the Screening for Aphasia in Neurodegeneration (SAND), and copying of drawings. Regarding neurophysiological measures, we will use the following EEG features as predictors: absolute and relative power in frequency bands (alpha, beta, theta, delta); peak frequency and individual alpha frequency; and power spectral density metrics.
Time frame: 12-24 months
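The A+/A- rule used in the outcome definition above can be written as a small function. This is a minimal sketch: the cut-offs (ratio < 0.069, p-tau > 56.5 ng/L) come from the text, while the `CsfProfile` class, its field names, and the example values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

RATIO_CUTOFF = 0.069  # CSF Aβ42/Aβ40 ratio: positive when below this value
PTAU_CUTOFF = 56.5    # CSF p-tau in ng/L: positive when above this value


@dataclass
class CsfProfile:
    """Hypothetical container for one patient's CSF measurements."""
    abeta42: float  # same unit as abeta40, so the ratio is unitless
    abeta40: float
    ptau: float     # ng/L


def classify(profile: CsfProfile) -> str:
    """Return "A+" if at least one core biomarker is positive, else "A-"."""
    ratio_positive = profile.abeta42 / profile.abeta40 < RATIO_CUTOFF
    ptau_positive = profile.ptau > PTAU_CUTOFF
    return "A+" if (ratio_positive or ptau_positive) else "A-"


# Illustrative values only, not study data.
print(classify(CsfProfile(abeta42=500, abeta40=10000, ptau=40.0)))  # A+ (low ratio)
print(classify(CsfProfile(abeta42=900, abeta40=10000, ptau=60.0)))  # A+ (high p-tau)
print(classify(CsfProfile(abeta42=900, abeta40=10000, ptau=40.0)))  # A-
```

Keeping the two cut-offs as named constants makes it easy to update them if the laboratory revises its reference values.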
Accuracy of automated speech analysis in predicting AD
Classical regression models and machine learning models will be used to predict AD pathology. AD pathology will be defined according to the CSF biomarker profile as follows: A+ if at least one of the core biomarkers (CSF Aβ42/Aβ40 ratio < 0.069, p-tau > 56.5 ng/L) is positive, and A- if none of the core biomarkers is positive. Automated speech analysis will be used to extract the following features, which will be used as predictors in the model: acoustic and voice quality measures (fundamental frequency, intensity, jitter, shimmer, formant frequencies); prosodic features (speech and articulation rate, pitch variability, pause frequency and duration, intonation patterns); temporal and fluency measures (phonation time, response latency, filled and unfilled pauses, disfluencies); and linguistic features derived from automatic transcripts, including lexical diversity, word frequency, part-of-speech distributions, syntactic complexity, and semantic coherence.
Time frame: 12-24 months
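Two of the speech features listed above, lexical diversity and pause statistics, are simple enough to sketch directly. The example below is an illustration under stated assumptions, not the study's pipeline: lexical diversity is computed as a plain type-token ratio, speech segments are assumed to arrive as (start, end) times in seconds from a voice activity detector, and the 0.25 s minimum pause length is an arbitrary threshold chosen for this example.

```python
import re


def lexical_diversity(transcript: str) -> float:
    """Type-token ratio: distinct words divided by total words."""
    tokens = re.findall(r"\w+", transcript.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def pause_features(segments, min_pause=0.25):
    """Summarize pauses between consecutive speech segments.

    segments: list of (start, end) times in seconds, sorted by start.
    min_pause: shortest gap counted as a pause (assumed threshold).
    """
    phonation_time = sum(end - start for start, end in segments)
    gaps = [
        nxt_start - end
        for (_, end), (nxt_start, _) in zip(segments, segments[1:])
    ]
    pauses = [g for g in gaps if g >= min_pause]
    mean_pause = sum(pauses) / len(pauses) if pauses else 0.0
    return {
        "phonation_time": phonation_time,
        "pause_count": len(pauses),
        "mean_pause_duration": mean_pause,
    }


# Illustrative inputs only, not study data.
print(lexical_diversity("the boy flies the kite"))  # 0.8
print(pause_features([(0.0, 2.0), (2.5, 4.0), (4.1, 6.0)]))
```

In the second call only the 0.5 s gap counts as a pause; the 0.1 s gap falls below the assumed threshold and is treated as continuous speech.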
Effect of genetic variants on the risk of AD in patients with SCD
A logistic regression model and a Cox regression model will be used to estimate the risk of AD. AD pathology will be defined according to the CSF biomarker profile as follows: A+ if at least one of the core biomarkers (CSF Aβ42/Aβ40 ratio < 0.069, p-tau > 56.5 ng/L) is positive, and A- if none of the core biomarkers is positive. Genetic variants in candidate genes (PSEN1, PSEN2, APOE, TREM2, ABCA7, BDNF, and HTT) will be considered as independent variables in the model.
Time frame: 12-24 months
A machine learning model to predict AD
Features identified in the previous sections (plasma biomarkers, neuropsychological, neurophysiological, and genetic measures) will be included as predictors in a multimodal machine learning model. A machine learning algorithm will be trained to predict AD pathology defined according to the CSF biomarker profile as follows: A+ if at least one of the core biomarkers (CSF Aβ42/Aβ40 ratio < 0.069, p-tau > 56.5 ng/L) is positive, and A- if none of the core biomarkers is positive. Thirty percent of the entire dataset will be reserved as a test set, while the remaining 70% will be used for training and validation. A five-fold cross-validation approach will be employed to train the models and optimize hyperparameters. The best-performing model, as evaluated by the area under the ROC curve (AUC), will be tested on the held-out test set to obtain an unbiased estimate of model performance.
Time frame: 12-24 months
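The data partitioning described above (a 30% held-out test set, five-fold cross-validation on the remaining 70%, and AUC as the selection metric) can be sketched with the standard library alone. This is a minimal illustration rather than the study's implementation: model fitting is omitted, stratification by class is an assumption of this sketch, and the 40/160 class sizes are taken from the sample size section for demonstration. In practice a dedicated machine learning library would likely handle these steps.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.30, seed=0):
    """Split indices into train/test, preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test += indices[:n_test]
        train += indices[n_test:]
    return sorted(train), sorted(test)


def stratified_folds(indices, labels, k=5, seed=0):
    """Distribute indices over k folds, class by class, round-robin."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    folds = [[] for _ in range(k)]
    for class_indices in by_class.values():
        rng.shuffle(class_indices)
        for j, i in enumerate(class_indices):
            folds[j % k].append(i)
    return folds


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [1] * 40 + [0] * 160  # 1 = A+, 0 = A- (illustrative class sizes)
train, test = stratified_split(labels)
folds = stratified_folds(train, labels)
print(len(train), len(test))   # 140 60
print([len(f) for f in folds])
```

Each fold serves once as the validation set while the other four are used for fitting; the test indices are touched only once, after the best model has been chosen by cross-validated AUC.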