Study design
To identify circRNAs in blood associated with AD, we generated RNA-sequencing (RNA-seq) data from 816 CU individuals and 405 participants with AD, covering the entire AD continuum (Fig. 1, Table 1, Supplementary Table 1 and Supplementary Fig. 1). The mean age at blood draw was similar between individuals with AD and CU individuals, with average ages of 76.8 and 74.5 years, respectively. A total of 717 samples had CSF Aβ and pTau181 measurements, 776 had amyloid-PET data, and 915 had plasma pTau217. We used multiple circRNA bioinformatic tools to perform high-quality and robust circRNA calls (Supplementary Table 2). Next, we analyzed if blood circRNA levels were associated with clinical AD. Blood circRNAs were considered significant if they were associated with AD after multiple-testing correction.
Schematic of the CU and AD groups within the logistic models predicting AD status of Knight-ADRC discovery blood samples. Sample sizes of biomarker-positive groups were independent of clinical AD status. AD-related diseases included PD, DLB and FTD. Model replication was performed in the Knight-ADRC replication cohort and the A4 cohort. AT, Aβ and Tau; DE, differential expression.
Next, we developed a predictive model for biomarker-confirmed status (CSF Aβ and Tau levels) using the associated circRNAs and benchmarked the same model against amyloid-PET status or plasma pTau217 levels. As up to 30% of CU individuals present with amyloid pathology and later develop disease, we leveraged the longitudinal clinical data to determine whether the circRNA model identified those who progressed to symptomatic AD and to estimate when circRNA levels change in relation to clinical disease onset. To benchmark our model against other established biomarkers, we performed comparison analyses against plasma pTau217, amyloid-PET and CSF biomarkers and analyzed whether integrating the circRNA model with the blood-based biomarker pTau217 led to better predictive power for progression to symptomatic AD.
We also performed sensitivity analyses stratified by sex, APOE4, and ancestry. To determine whether the model was specific to AD or captured neurodegeneration in general, we tested the model in additional Parkinson’s disease (PD), frontotemporal dementia (FTD) and dementia with Lewy bodies (DLB) samples.
Last, replication analyses were performed in additional independent cohorts: the Knight-ADRC and A4 datasets. The Knight-ADRC cohort is based at Washington University in St. Louis and conducts prospective studies on memory and aging for the treatment and prevention of AD. The A4 cohort examines anti-amyloid treatment and cognitive decline in individuals with preclinical AD and evidence of amyloid accumulation based on amyloid-PET. For the A4 cohort, blood RNA-seq data were available only for baseline samples from participants who were CU; we replicated the association of the circRNA model with biomarker-confirmed AD status and AD progression. All but one participant in the A4 cohort was recruited as CU; the average age at blood draw of CU individuals was 71.4 years, and the individual recruited at an early disease stage had an age at blood draw of 82 years.
Identification of blood circRNAs associated with AD
To identify circRNAs associated with AD in blood, we analyzed the levels of circRNAs and clinical AD status in individuals with AD (n = 405) and CU individuals (n = 829) from the Knight-ADRC cohort. A total of 1,601 circRNAs passed stringent quality control (QC) in the two bioinformatics programs used to identify and quantify circular RNAs: DCC (ref. 16) and CIRI2 (ref. 17). Of the 203 circRNAs with nominally significant association (P < 0.05) with clinical status using DCC counts, 35 circRNA transcripts passed false discovery rate (FDR) correction (Extended Data Fig. 1a), including circDNAJC6 (P = 1.88 × 10⁻⁸), circMBOAT2 (P = 3.80 × 10⁻⁴) and circPICALM (P = 1.23 × 10⁻³; Supplementary Table 3).
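The FDR step can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure, which the text does not name explicitly; the function name and the P values below are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical nominal P values for five circRNAs (illustration only).
example_p = [0.001, 0.008, 0.039, 0.041, 0.60]
q_values = benjamini_hochberg(example_p)
# A circRNA "passes FDR correction" here if its adjusted P value is <= 0.05.
significant = [q <= 0.05 for q in q_values]
```

In this toy example, two of the four nominally significant P values survive correction, which mirrors how 203 nominal associations can reduce to a much smaller FDR-significant set.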
To confirm that our results were robust, we performed sensitivity analyses using CIRI2 and CIRI3 counts (Supplementary Table 4) on the 203 circRNAs found to be associated with AD using DCC. The effect sizes of the 203 circRNAs when using CIRI2 or DCC were highly correlated (R² = 0.78, P < 0.001), as were the individual circRNA counts (Extended Data Fig. 1b). In addition, 34 of the 35 DCC circRNAs remained FDR significant, with consistent direction and high effect size correlation (Extended Data Fig. 1c and Supplementary Table 5). CIRI3 (ref. 18) identified all 34 of the circRNAs, and 32 passed stringent QC, with circUBAP2 and circSEC61A1 being just below the count filtering threshold. Of the 34 significant circRNAs, 23 passed FDR correction using CIRI3 counts, and 25 had nominally significant associations with clinical status (Extended Data Fig. 1d–g). Furthermore, the circRNA counts of this third tool were highly correlated with both DCC (R² = 0.84, P < 0.001) and CIRI2 (R² = 0.85, P < 0.001).
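The cross-tool agreement reported above can be sketched as a squared Pearson correlation between per-circRNA effect sizes. This is an assumption about how the correlation was computed; the function name and the effect sizes below are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical effect sizes for the same circRNAs from two quantification tools.
effects_tool_a = [0.42, -0.31, 0.18, 0.55, -0.12]
effects_tool_b = [0.39, -0.25, 0.22, 0.49, -0.20]
agreement = r_squared(effects_tool_a, effects_tool_b)
```

A value close to 1 indicates that the two tools rank and scale the effects similarly; consistent effect direction should still be checked separately, because squaring discards the sign of the correlation.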
Expression of blood circRNAs across tissues
We next examined the expression of these 34 AD-associated circRNAs in the brain and other tissues. The CircAtlas 3.0 database contains mean circRNA expression (counts per million) data for 33 different tissues. We performed percentile ranking of circRNA expression in the brain, and 30 of the 34 circRNAs had brain expression ranked at or above the 80th percentile of expression across tissues (Extended Data Fig. 2a–c and Supplementary Table 6). Moreover, we examined the relative levels of these circRNAs using real-time quantitative PCR (qPCR), and 14 of the 21 circRNAs with validated qPCR primers showed higher expression in the brain (Extended Data Fig. 2d–x). Together, most of the 34 circRNAs showed high expression in the brain, suggesting that these transcripts are brain enriched, as circRNAs are able to cross the blood–brain barrier.
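A minimal sketch of the percentile-ranking idea follows. The exact ranking definition is not stated in the text, so this assumes the percentile rank is the share of tissues with expression less than or equal to that of the brain; the tissue values are hypothetical.

```python
def percentile_rank(value, values):
    """Percentage of entries in `values` that are less than or equal to `value`."""
    return 100.0 * sum(v <= value for v in values) / len(values)


# Hypothetical mean expression (counts per million) of one circRNA in five tissues.
expression_cpm = {"brain": 12.0, "blood": 3.5, "liver": 0.8, "heart": 1.2, "lung": 2.0}
all_values = list(expression_cpm.values())
brain_rank = percentile_rank(expression_cpm["brain"], all_values)
# Flag the circRNA as highly expressed in brain at or above the 80th percentile.
brain_enriched = brain_rank >= 80.0
```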
CircRNA independence from linear mRNA in blood
Changes in circRNA expression can be independent of cognate linear mRNA, and we next determined whether these circRNA associations were separate from their cognate mRNA. Of the 34 circRNA transcripts, 33 transcripts (31 genes) had a cognate linear mRNA. The association of the circRNAs was considered independent of the linear forms if the circRNA retained a significant association (P < 0.05) in a multivariable model. Over 87% (29 of 33) of the blood circRNAs were considered to have an independent association from their cognate linear mRNA, including circDNAJC6, circPICALM and circNUP54 (Supplementary Table 7).
AD diagnostic accuracy using circRNAs in blood
We analyzed the diagnostic accuracy of the 34 circRNAs associated with AD (Supplementary Table 8). The Knight-ADRC discovery cohort included 405 individuals with AD and 816 CU individuals with APOE4 data available. As CU individuals may present with AD pathology and a small percentage of participants with memory problems may have non-AD dementia, we trained the circRNA model on distinguishing between CU individuals who are biomarker negative (Aβ−Tau− (A−T−)) and individuals with AD who are biomarker positive (Aβ+Tau+ (A+T+)). The circRNA model showed an area under the receiver operating characteristic (ROC) curve (AUC) of 0.945 for A−T− versus A+T+ biomarker-confirmed samples, which was higher than that of plasma pTau217 (AUC = 0.877; Fig. 2a, Supplementary Fig. 2 and Supplementary Table 9). Furthermore, combining the circRNA model with the pTau217 model led to an AUC of 0.967.
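For readers less familiar with the metric, the AUC equals the probability that a randomly chosen positive sample receives a higher model score than a randomly chosen negative sample. The sketch below computes it by direct pairwise comparison; the function name and scores are hypothetical and do not reproduce the study's model.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model scores for A+T+ AD and A-T- CU samples (illustration only).
ad_scores = [0.91, 0.78, 0.66, 0.40]
cu_scores = [0.35, 0.42, 0.10, 0.22, 0.05]
auc = roc_auc(ad_scores, cu_scores)
```

Here 19 of the 20 AD/CU pairs are ordered correctly, so the AUC is 0.95. The pairwise form is quadratic in sample size and is meant only to make the definition concrete.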
a,b, ROC curve analysis using the 34 circRNAs with biomarker-confirmed status (79 A+T+ AD, 252 A−T− CU; a) and amyloid-PET positivity (256 A+, 520 A−; b). c, Whisker plots of biomarker-confirmed status (79 A+T+ AD, 252 A−T− CU) and amyloid-PET (256 A+, 520 A−) positivity prediction using CSF Aβ42/pTau181, amyloid-PET, plasma pTau217, and the 34 blood circRNAs. CircRNA models were compared to known AD biomarkers alone. Whisker plots of the 34 blood circRNAs using Knight-ADRC (WU) discovery and replication (36 A+T+ AD, 174 A−T− CU; 147 A+, 303 A−) compared to the A4 replication (131 A+T+, 285 A−T−; 457 A+, 1,310 A−) cohort for biomarker-confirmed status (d) and amyloid-PET prediction (e). The circRNA models included library-size-normalized counts of the top status circRNAs and the covariates (age at draw, sex, median transcript integrity number (TIN) and number of APOE4 alleles). The biomarker-confirmed model in the A4 cohort is amyloid-PET and plasma pTau217 (A−T− versus A+T+); thus, pTau217 comparisons (AUC = 1) are excluded from the A4 whisker plot. The whiskers of the whisker plots show the 95% CI of the AUC.
The sex-stratified analyses (refs. 19,20) showed robust AUC for women (AUC = 0.903) and men (AUC = 0.909; Supplementary Tables 10 and 11). The APOE-stratified analyses (ref. 3) showed an AUC of 0.856 in APOE4− individuals and 0.883 in APOE4+ individuals (Supplementary Tables 12 and 13).
We next examined the ability of the circRNA model to identify individuals with brain amyloidosis based on amyloid-PET (520 A−, 256 A+). The same circRNA model (same circRNAs, weights and cutoff) showed an AUC of 0.757 (Fig. 2b), which improved to 0.931 when integrated with pTau217 (Fig. 2c–e). The integrated circRNA and pTau217 model showed predictive performance similar to that of the CSF biomarkers Aβ42 and pTau181 together (A+T+; Supplementary Fig. 3). Together, these data show that the 34 blood circRNAs have higher predictive ability for biomarker-confirmed status (Supplementary Table 14) than for amyloid positivity in the brain.
Robustness of the blood circRNA predictive model across ancestries
To determine if the circRNA model can also be applied to samples from diverse genetic backgrounds, we tested the same model in individuals of European (n = 978) and African (n = 92) ancestries and an additional 35 individuals from diverse backgrounds (African-American (n = 16), admixed American (n = 8), East Asian (n = 1), South Asian (n = 1) and mixed ancestries (n = 9); Supplementary Fig. 4). The model showed similar predictive ability across ancestries, including European (AUC = 0.906), African (AUC = 0.934; Fig. 3a) and mixed ancestries (AUC = 0.922). Together, the circRNA model shows robust AUC across ancestries.
a, ROC curve analysis predicting biomarker-confirmed status in European (EUR; n = 243; 68 AD, 175 CU), African (AFR; n = 38; 4 AD, 34 CU) and MIX (n = 41; 5 AD, 36 CU) ancestry samples. MIX combines the 38 African and 3 admixed blood samples with CSF biomarker data available. Populations were determined using the 1000 Genomes Project dataset as a reference. b, ROC curve analysis of logistic models predicting biomarker-confirmed status in Knight-ADRC blood samples using AD-related dementias (276 PD, 26 DLB and 11 FTD). c, Violin plot of AD versus CU prediction values over non-AD dementias. The dashed line represents the AUC threshold (0.342) for AD versus CU classification using the 34 blood circRNAs. The box plots display the Q1, median, Q3 and whiskers within the range of 1.5× the interquartile range. The significance levels from two-sided Wilcoxon rank-sum tests include P values equal to or less than 0.001 (***). The models include library-size-normalized counts of the 34 circRNAs and the covariates (age at blood draw, sex, median TIN and number of APOE4 alleles).
Specificity of the blood circRNA model for AD
To determine whether the 34-circRNA model is specific to AD, we also tested the same model in non-AD dementias (276 PD, 26 DLB and 11 FTD) using the same CU participants. The 34-circRNA model showed very low AUC values when applied to non-AD dementias (PD AUC = 0.413, DLB AUC = 0.545 and FTD AUC = 0.404; Fig. 3b,c). We further examined if there were any circRNAs that were nominally significant in all neurodegenerative diseases compared to controls (Extended Data Fig. 3a–d). When comparing across all diseases, 3 of the 34 predictive model circRNAs (circRBM23, circEPB41 and circNUP54) were at least nominally associated across all diseases. These results suggest that most of the circRNAs included in the predictive model are AD-specific, which could explain the low predictive power in other diseases.
Progression to symptomatic AD
As clinical data were available showing that several individuals (78 participants) progressed from CU to symptomatic AD after blood collection, we performed survival analyses to determine if the 34 blood circRNAs could also predict progression to symptomatic AD (Table 1). Using a Cox proportional hazard model, the circRNA model showed a hazard ratio (HR) of 2.92 (95% confidence interval (95% CI) 1.63–5.23), which was significantly higher than that of pTau217 alone (HR = 1.81, 95% CI 1.11–2.94; P = 0.002; Fig. 4a,b).
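The Kaplan–Meier curves referenced in Fig. 4a,b summarize the probability of remaining free of symptomatic AD over follow-up. As a minimal sketch of that estimator (not of the Cox model used for the HRs), assuming right-censored follow-up times, one could write the following; the function name and the toy data are hypothetical.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct event time.

    `events` holds 1 for an observed progression and 0 for a censored participant.
    """
    pairs = sorted(zip(durations, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        progressed = 0
        removed = 0
        # Group all participants who share the same follow-up time.
        while i < len(pairs) and pairs[i][0] == t:
            progressed += pairs[i][1]
            removed += 1
            i += 1
        if progressed:
            survival *= 1.0 - progressed / at_risk
            curve.append((t, survival))
        # Both progressors and censored participants leave the risk set.
        at_risk -= removed
    return curve


# Hypothetical follow-up (years) and progression indicators for five participants.
follow_up_years = [2, 3, 3, 5, 8]
progressed_flags = [1, 1, 0, 1, 0]
km_curve = kaplan_meier(follow_up_years, progressed_flags)
```

Comparing such curves between model-positive and model-negative groups gives a visual counterpart to the HRs reported in the text.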
a,b, Kaplan–Meier curves of AD progression using the 34 circRNAs and plasma pTau217 positivity in separate models (a) and in a combined model (b). The Cox proportional hazards models are two-sided tests. c, ROC analysis of logistic models predicting AD progression within 5 years in Knight-ADRC blood CU samples (n = 688; 78 progressors, 610 CU) using the 34 circRNAs compared to pTau217 and amyloid-PET alone. The circRNA models included circRNA counts and the covariates (age at blood draw, sex, median TIN and number of APOE4 alleles). d, Violin plot of prediction values over TTO (TTO: age at blood draw − age at onset (AAO)) subsets (2-year intervals: ‘−8 to −6’ (n = 5), ‘−6 to −4’ (n = 20), ‘−4 to −2’ (n = 33), ‘−2 to 0’ (n = 18)) from estimated onset in AD progressors compared to biomarker-confirmed status samples (79 A+T+ AD, 252 A−T− CU). The dashed line indicates the threshold for AD versus CU classification using the 34 blood circRNAs. The box plots display the Q1, median, Q3 and whiskers within the range of 1.5× the interquartile range. The significance levels from the two-sided Wilcoxon rank-sum test include P values equal to or less than 0.001 (***) and 0.01 (**). e, Whisker plot of progression to symptomatic AD HRs in Knight-ADRC (WU) discovery and replication (61 progressors, 371 CU) and A4 replication (97 progressors, 221 CU) cohorts. The whisker plots show the 95% CI for the HRs.
Similar results were found when we analyzed progression to symptomatic AD within 5 years, with the circRNA model showing a significantly higher AUC (AUC = 0.870; P = 1.86 × 10⁻⁵) than pTau217 alone (AUC = 0.676; Fig. 4c and Extended Data Fig. 4a,b). The sex- and APOE-stratified analyses also showed high AUCs (AUC > 0.80) that were significantly higher than those of pTau217 (P < 0.05). Thus, the 34 blood circRNAs predicted AD progression better than plasma pTau217.
Next, we analyzed whether combining pTau217 with the circRNA model could further improve the identification of individuals who progress to symptomatic AD. For these analyses, we compared the individuals who were negative for both biomarkers (pTau217 and the 34 circRNAs; n = 411) to those who were positive for both (n = 77), those positive for pTau217 but negative for the circRNA model (n = 146) and those showing the opposite pattern (n = 54; Fig. 4b). Our analyses indicated that individuals positive for both biomarkers indeed progressed faster than those in any of the previous models (HR = 4.83, 95% CI 2.19–10.68). Overall, only 15% of individuals negative for both biomarkers progressed to AD, compared to 84% of those positive for both. In addition, these analyses identified two intermediate groups: one with medium to low risk, defined by those positive for pTau217 but negative for the circRNA model (HR = 3.25, 95% CI 1.36–7.76), of whom 41% progressed to AD; and one with medium to high risk, defined by those positive for the circRNA model but negative for pTau217 (HR = 3.96, 95% CI 1.50–10.48), of whom 74% progressed to AD.
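The four-group comparison can be sketched as a simple two-marker stratification. The default circRNA cutoff below is the classification threshold quoted in the Fig. 3c caption (0.342); the pTau217 cutoff, the use of "greater than or equal to" for positivity, the function name and the example values are all hypothetical.

```python
from collections import Counter


def biomarker_group(circ_score, ptau217, circ_cutoff=0.342, ptau_cutoff=0.5):
    """Assign a participant to one of four groups from two dichotomized markers.

    circ_cutoff follows the threshold quoted in the Fig. 3c caption;
    ptau_cutoff is a placeholder, not a value from the study.
    """
    circ_pos = circ_score >= circ_cutoff
    ptau_pos = ptau217 >= ptau_cutoff
    if circ_pos and ptau_pos:
        return "both positive"
    if ptau_pos:
        return "pTau217 only"
    if circ_pos:
        return "circRNA only"
    return "both negative"


# Hypothetical (circRNA model score, pTau217 value) pairs.
participants = [(0.50, 0.9), (0.10, 0.2), (0.20, 0.8), (0.60, 0.1), (0.05, 0.3)]
group_counts = Counter(biomarker_group(c, p) for c, p in participants)
```

Each group can then be used as a stratum in a survival model, with the double-negative group as the reference.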
Determining the time of circRNA changes in relation to disease
Pseudotrajectories of samples progressing to symptomatic AD
We used survival modeling to calculate the time to onset (TTO) of AD progressors to infer when the overall 34-circRNA model changes in relation to clinical onset. TTO was defined as the age at blood draw minus the age at onset, so that negative values indicate blood draws taken before onset. We created bins based on TTO with 2-year intervals: ‘−8 to −6’ (n = 5), ‘−6 to −4’ (n = 20), ‘−4 to −2’ (n = 33) and ‘−2 to 0’ years (n = 18). The circRNAs showed significant changes starting at the ‘−4 to −2’ TTO range and continuing closer to onset (Fig. 4d). Together, these observations suggest that there is a linear and consistent increase in the overall circRNA levels that starts in the presymptomatic phase around 2–4 years before onset and continues increasing until symptomatic AD.
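The TTO binning can be sketched as follows. The sketch assumes that each 2-year bin includes its lower bound and excludes its upper bound, which the text does not specify; the function names and ages are hypothetical, and the labels use ASCII hyphens in place of the minus signs in the figure.

```python
import math


def time_to_onset(age_at_draw, age_at_onset):
    """TTO in years; negative when blood was drawn before clinical onset."""
    return age_at_draw - age_at_onset


def tto_bin(tto, width=2.0, earliest=-8.0):
    """Return a label such as '-4 to -2', or None outside [earliest, 0)."""
    if not (earliest <= tto < 0):
        return None
    # Lower edge of the bin containing tto (bins are [lower, lower + width)).
    lower = earliest + width * math.floor((tto - earliest) / width)
    return f"{lower:g} to {lower + width:g}"


# Hypothetical participant: blood drawn at 70.9 years, onset at 74.0 years.
example_bin = tto_bin(time_to_onset(70.9, 74.0))
```

Model prediction values can then be grouped by these labels and compared with those of the biomarker-confirmed groups, as in Fig. 4d.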
We next examined the association of the overall circRNA model with predicting progression of dementia severity based on the Clinical Dementia Rating (CDR; ref. 21). Within the samples with no cognitive impairment at the time of blood draw, the blood circRNAs showed an AUC of 0.781 for differentiating between samples from participants who did not progress (CDR = 0; n = 737) and those from participants who progressed to cognitive impairment (CDR > 0; n = 42) by the last clinical visit (Supplementary Fig. 5). Furthermore, the blood circRNA model showed better differentiation between participants who did not progress and participants who progressed to later stages of dementia (CDR = 0.5+; n = 34). Likewise, differentiating between participants with no cognitive impairment (CDR = 0) and those in the very early disease stage (CDR = 0.5) had an AUC of 0.722, suggesting that blood circRNAs capture changes in the early disease stage (Supplementary Fig. 6). Together, overall levels of the 34 circRNAs are associated with dementia severity.
Replication of blood circRNA prediction models
Orthogonal replication using CIRI2 and CIRI3 counts
We next examined whether the DCC-based models had predictive ability using read counts from the independent bioinformatic tools CIRI2 (ref. 17) and CIRI3 (ref. 18). Overall, analyses using any of the circRNA tools showed similar predictive ability, and prediction of biomarker-confirmed status using CIRI2 (AUC = 0.848) and CIRI3 (AUC = 0.801) was comparable to that of the DCC counts model (AUC = 0.945; Extended Data Fig. 5a–c). Similar results were found for the prediction of progression to symptomatic AD and survival analyses, indicating that any of the three circRNA quantification tools could lead to robust and consistent results.
Validation of the circular RNA model in AD-related phenotypes in independent samples from the Knight-ADRC
To analyze the robustness of the model in biomarker status prediction, we analyzed the association of the circRNA model with amyloid-PET (n = 543) and plasma pTau217 (n = 602) status from the discovery dataset that was not used in training the biomarker-confirmed model. The circRNA predictive ability for AD biomarker status was similar in the subset of samples not used in training, including amyloid-PET (training AUC = 0.757; additional samples AUC = 0.726) and plasma pTau217 (training AUC = 0.747; additional samples AUC = 0.716; Extended Data Fig. 6a–c).
Replication in independent Knight-ADRC samples
To validate the blood circRNA model in independent samples, we used separate samples from the Knight-ADRC cohort as a first replication cohort (Table 1). These samples (n = 551) included biomarker-confirmed status (36 A+T+ AD, 174 A−T− CU) and amyloid-PET (147 A+, 303 A−) data. Of the participants with plasma pTau217, 61 progressed to symptomatic AD within 8 years. The circRNA model showed an AUC of 0.863 for biomarker-confirmed status, which increased to 0.955 when integrating with pTau217. The same model also showed high prediction for amyloid-PET positivity (AUC = 0.771; Extended Data Fig. 7a–c). The circRNA model also replicated for progression to symptomatic AD, showing an HR of 3.83 by itself and 13.95 when combined with pTau217 (Fig. 4e).
Replication in the independent A4 cohort
To replicate the blood circRNA model in a second independent AD cohort, we mined the existing A4 RNA-seq data (Table 1). As the RNA-seq data from this study were collected at baseline when all participants were CU, we were not able to perform a head-to-head comparison of the circRNA model for biomarker-confirmed status. Instead, we analyzed the model performance to identify biomarker-positive (amyloid-PET and plasma pTau217) individuals. The circRNA model showed robust prediction for biomarker status (AUC = 0.723; Supplementary Figs. 7 and 8), which was comparable with the AUC for amyloid positivity in the Knight-ADRC discovery (AUC = 0.755) and replication (AUC = 0.757) cohorts. The circRNA model also replicated for progression to symptomatic AD, showing an HR of 2.93, which was significantly higher than that of the pTau217 model (HR = 1.87), further increasing to 4.58 when pTau217 and the circRNA model were integrated. No significant differences were found when using DCC, CIRI2 or CIRI3 counts (Fig. 4e). Together, the prediction model created using the Knight-ADRC cohort was replicated in the A4 dataset for brain amyloidosis and progression to symptomatic AD models, highlighting model generalizability and applicability to AD diagnosis in blood.
