Comparing the age predictor by Weidner et al 2014 with the age predictor by Horvath 2013

In their article, Weidner et al (2014) present a predictor of age based on human blood methylation levels. Excitingly, this age predictor uses only three CpGs. Last year, I published an age predictor (referred to as the epigenetic clock) that works well in most human cells, tissues, and organs (Horvath 2013, PMID: 24138928). However, my age predictor makes use of 353 CpGs. Given that a sparse predictor has obvious practical advantages, readers may be interested in learning how the epigenetic clock compares to the predictor by Weidner et al.

To provide a fair comparison, I applied both predictors to the test sets mentioned in Figure 2 of (Horvath 2013). The comparison is fair because the epigenetic clock was not constructed (trained) on these test data sets.

A Figure that shows the results of the comparison can be found here

FigureComparison.pdf

or on my webpage

http://labs.genetics.ucla.edu/horvath/htdocs/dnamage/weidener2014

The Figure shows that the predictor by Weidner et al leads to a good correlation with chronological age (r=0.72 across all test sets, panel A) but to an unacceptably high median absolute error (13 years). When restricting the analysis to whole blood, the correlation is slightly higher (r=0.76, panel D) but the median error of 11 years remains high.

By comparison, the epigenetic clock leads to a very high correlation with age across the test data sets (r=0.96, panel M) and a very low median error (3.6 years). In all considered tissues and cell types, the epigenetic clock greatly outperforms the predictor by Weidner et al in terms of age correlation and median error (Figure).
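
The two accuracy measures reported here capture different things: a predictor can correlate well with age and still be far off in absolute terms. The following is a minimal Python sketch of how the two measures could be computed, assuming that the median error is the median of the absolute differences between predicted and chronological age and that the correlation is the Pearson correlation. The function names and the four data points are hypothetical and serve only as illustration; they are not taken from the test data sets.

```python
import math
import statistics


def median_absolute_error(predicted, chronological):
    """Median of |predicted age - chronological age| across samples."""
    if len(predicted) != len(chronological):
        raise ValueError("inputs must have the same length")
    return statistics.median(abs(p - c) for p, c in zip(predicted, chronological))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical ages (years), for illustration only: every prediction is
# exactly 11 years too high, so the ranking of samples is preserved.
chronological_age = [20, 35, 50, 65]
predicted_age = [31, 46, 61, 76]

print("correlation:", pearson_r(predicted_age, chronological_age))
print("median absolute error:", median_absolute_error(predicted_age, chronological_age))
```

In this toy example the correlation is essentially perfect while the median error is 11 years, which is why both measures are reported side by side.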

My analysis has two limitations. First, Weidner et al mention that their predictor works best on pyrosequencing data, whereas I could only evaluate the two predictors on Illumina array data. A future study could compare both predictors on pyrosequencing data. Second, Weidner et al use a CpG site upstream of cg17861230. Since I only had Illumina array data, I had to use cg17861230 in my implementation of their predictor.

Specifically, I implemented the predictor by Weidner et al as follows:

Predicted age=38.0-26.4*cg02228185-23.7*cg25809905+164.7*cg17861230
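
As a minimal sketch, the formula above could be written in Python as follows. The coefficients and CpG identifiers are taken from the formula; the function name, the dictionary layout, and the example methylation values are hypothetical. The sketch also assumes that methylation levels are supplied as proportions between 0 and 1 (Illumina beta values).

```python
# Intercept and per-CpG weights copied from the formula above.
WEIDNER_INTERCEPT = 38.0
WEIDNER_WEIGHTS = {
    "cg02228185": -26.4,
    "cg25809905": -23.7,
    "cg17861230": 164.7,
}


def weidner_predicted_age(betas):
    """Return the predicted age in years.

    `betas` maps CpG identifiers to methylation levels, assumed to be
    proportions between 0 and 1.
    """
    missing = [cpg for cpg in WEIDNER_WEIGHTS if cpg not in betas]
    if missing:
        raise KeyError(f"missing methylation values for: {missing}")
    return WEIDNER_INTERCEPT + sum(
        weight * betas[cpg] for cpg, weight in WEIDNER_WEIGHTS.items()
    )


# Hypothetical methylation values, for illustration only (not real data).
example_sample = {"cg02228185": 0.5, "cg25809905": 0.5, "cg17861230": 0.1}
print(round(weidner_predicted_age(example_sample), 2))
```

Storing the weights in a dictionary keyed by CpG identifier keeps the mapping between probe and coefficient explicit and makes a missing probe fail loudly rather than silently.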

Despite these limitations, the results strongly suggest that sparsity comes at a cost in both accuracy and applicability to other tissues.

References:

· Horvath S (2013) DNA methylation age of human tissues and cell types. Genome Biol. 14:R115. DOI: 10.1186/gb-2013-14-10-r115. PMID: 24138928

· Weidner CI, Lin Q, Koch CM, Eisele L, Beier F, Ziegler P, Bauerschlag DO, Jöckel KH, Erbel R, Mühleisen TW, Zenke M, Brümmendorf TH, Wagner W (2014) Aging of blood can be tracked by DNA methylation changes at just three CpG sites. Genome Biol. 15(2):R24. PMID: 24490752

Figure: Evaluation of the predictor by Weidner et al 2014 and the epigenetic clock in test data sets

Each scatter plot shows how chronological age (y-axis) relates to predicted age (x-axis). The caption of each plot reports the median absolute error and the correlation between predicted and chronological age. Points are labelled and colored by data set as described in Horvath 2013. A-L) The first two rows show how the age predictor by Weidner et al performs in different tissues/fluids/cell types. Note that the predictor leads to a moderate correlation but an unacceptably high median error even in blood. M-V) The last two rows show how the epigenetic clock (Horvath 2013) performs in the same data sets. Note that it consistently outperforms the predictor by Weidner et al in terms of age correlation and median error.