Somatic mutation as an explanation for epigenetic aging

ABSTRACT DNA methylation marks have recently been used to build models known as epigenetic clocks, which predict calendar age. As methylation of cytosine promotes C-to-T mutations, we


hypothesized that the methylation changes observed with age should reflect the accrual of somatic mutations, and the two should yield analogous aging estimates. In an analysis of multimodal


data from 9,331 human individuals, we found that CpG mutations indeed coincide with changes in methylation, not only at the mutated site but with pervasive remodeling of the methylome out to


±10 kilobases. This one-to-many mapping allows mutation-based predictions of age that agree with epigenetic clocks, including which individuals are aging more rapidly or slowly than


expected. Moreover, genomic loci where mutations accumulate with age also tend to have methylation patterns that are especially predictive of age. These results suggest a close coupling


between the accumulation of sporadic somatic mutations and the widespread changes in methylation observed over the course of life.

DATA AVAILABILITY All data analyzed were from The Cancer Genome Atlas Pan-Can cohort34,35,36 (http://xena.ucsc.edu/) and the Pan-Cancer Analysis of


Whole Genomes37 (https://xenabrowser.net/datapages/?hub=https://pcawg.xenahubs.net:443). Data can be accessed from the provided links and are described further in the respective


publications (https://doi.org/10.1038/ng.2764, https://doi.org/10.1038/s41586-020-1969-6)35,37. Data to replicate the figures in this manuscript can be found on figshare (‘Somatic mutation


as an explanation for epigenetic aging (Koch et al. 2024)’, https://figshare.com/projects/Somatic_mutation_as_an_explanation_for_epigenetic_aging_Koch_et_al_2024_/224232)75. The panel of


normal and gnomAD resources used for filtering the somatic mutation calls can be accessed by downloading Mutect2 (https://gatk.broadinstitute.org/hc/en-us/articles/360037593851-Mutect2). A


file containing Illumina 450k array CpG locations and characteristics can be accessed on the Illumina website


(https://webdata.illumina.com/downloads/productfiles/humanmethylation450/humanmethylation450_15017482_v1-2.csv). The hg19 genome annotation can be accessed through the University of


California, Santa Cruz, website (https://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/cpgIslandExt.txt.gz). CODE AVAILABILITY All custom algorithms and analysis code are in the GitHub


repository at https://github.com/zanekoch/MutationsAndMethylationAging/.

REFERENCES

1. Szilard, L. On the nature of the aging process. _Proc. Natl Acad. Sci. USA_ 45, 30–45 (1959).
2. Cagan, A. et al. Somatic mutation rates scale with lifespan across mammals. _Nature_ 604, 517–524 (2022).
3. Alexandrov, L. B. et al. Clock-like mutational processes in human somatic cells. _Nat. Genet._ 47, 1402–1407 (2015).
4. Moore, L. et al. The mutational landscape of human somatic and germline cells. _Nature_ 597, 381–386 (2021).
5. Jaiswal, S. & Ebert, B. L. Clonal hematopoiesis in human aging and disease. _Science_ 366, eaan4673 (2019).
6. Lodato, M. A. et al. Aging and neurodegeneration are associated with increased mutations in single human neurons. _Science_ 359, 555–559 (2018).
7. Bae, T. et al. Analysis of somatic mutations in 131 human brains reveals aging-associated hypermutability. _Science_ 377, 511–517 (2022).
8. Stratton, M. R., Campbell, P. J. & Futreal, P. A. The cancer genome. _Nature_ 458, 719–724 (2009).
9. Blagosklonny, M. V. DNA- and telomere-damage does not limit lifespan: evidence from rapamycin. _Aging (Albany NY)_ 13, 3167–3175 (2021).
10. López-Otín, C., Blasco, M. A., Partridge, L., Serrano, M. & Kroemer, G. The hallmarks of aging. _Cell_ 153, 1194–1217 (2013).
11. Moore, L. D., Le, T. & Fan, G. DNA methylation and its basic function. _Neuropsychopharmacology_ 38, 23–38 (2013).
12. Li, E., Beard, C. & Jaenisch, R. Role for DNA methylation in genomic imprinting. _Nature_ 366, 362–365 (1993).
13. Deaton, A. M. & Bird, A. CpG islands and the regulation of transcription. _Genes Dev._ 25, 1010–1022 (2011).
14. Ehrlich, M. et al. Amount and distribution of 5-methylcytosine in human DNA from different types of tissues of cells. _Nucleic Acids Res._ 10, 2709–2721 (1982).
15. Jabbari, K. & Bernardi, G. Cytosine methylation and CpG, TpG (CpA) and TpA frequencies. _Gene_ 333, 143–149 (2004).
16. Meaney, M. J. & Szyf, M. Environmental programming of stress responses through DNA methylation: life at the interface between a dynamic environment and a fixed genome. _Dialogues Clin. Neurosci._ 7, 103–123 (2005).
17. Hannum, G. et al. Genome-wide methylation profiles reveal quantitative views of human aging rates. _Mol. Cell_ 49, 359–367 (2013).
18. Horvath, S. DNA methylation age of human tissues and cell types. _Genome Biol._ 14, R115 (2013).
19. McCrory, C. et al. GrimAge outperforms other epigenetic clocks in the prediction of age-related clinical phenotypes and all-cause mortality. _J. Gerontol. A Biol. Sci. Med. Sci._ 76, 741–749 (2021).
20. Lu, A. T. et al. DNA methylation GrimAge strongly predicts lifespan and healthspan. _Aging (Albany NY)_ 11, 303–327 (2019).
21. Levine, M. E. et al. An epigenetic biomarker of aging for lifespan and healthspan. _Aging (Albany NY)_ 10, 573–591 (2018).
22. Li, A., Koch, Z. & Ideker, T. Epigenetic aging: biological age prediction and informing a mechanistic theory of aging. _J. Intern. Med._ 292, 733–744 (2022).
23. Yang, J.-H. et al. Loss of epigenetic information as a cause of mammalian aging. _Cell_ 186, 305–326 (2023).
24. de Magalhães, J. P. Ageing as a software design flaw. _Genome Biol._ 24, 51 (2023).
25. López-León, M. & Goya, R. G. The emerging view of aging as a reversible epigenetic process. _Gerontology_ 63, 426–431 (2017).
26. Ito, S. et al. Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. _Science_ 333, 1300–1303 (2011).
27. Wang, M. et al. Identification of DNA motifs that regulate DNA methylation. _Nucleic Acids Res._ 47, 6753–6768 (2019).
28. Nachun, D. et al. Clonal hematopoiesis associated with epigenetic aging and clinical outcomes. _Aging Cell_ 20, e13366 (2021).
29. Gibbs, J. R. et al. Abundant quantitative trait loci exist for DNA methylation and gene expression in human brain. _PLoS Genet._ 6, e1000952 (2010).
30. McCartney, D. L. et al. Genome-wide association studies identify 137 genetic loci for DNA methylation biomarkers of aging. _Genome Biol._ 22, 194 (2021).
31. Youk, J., An, Y., Park, S., Lee, J.-K. & Ju, Y. S. The genome-wide landscape of C:G > T:A polymorphism at the CpG contexts in the human population. _BMC Genomics_ 21, 270 (2020).
32. Ju, Y. S. et al. Somatic mutations reveal asymmetric cellular dynamics in the early human embryo. _Nature_ 543, 714–718 (2017).
33. Duncan, B. K. & Miller, J. H. Mutagenic deamination of cytosine residues in DNA. _Nature_ 287, 560–561 (1980).
34. Ellrott, K. et al. Scalable open science approach for mutation calling of tumor exomes using multiple genomic pipelines. _Cell Syst._ 6, 271–281 (2018).
35. Cancer Genome Atlas Research Network et al. The Cancer Genome Atlas Pan-Cancer analysis project. _Nat. Genet._ 45, 1113–1120 (2013).
36. Liu, J. et al. An integrated TCGA pan-cancer clinical data resource to drive high-quality survival outcome analytics. _Cell_ 173, 400–416 (2018).
37. ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. Pan-cancer analysis of whole genomes. _Nature_ 578, 82–93 (2020).
38. Bibikova, M. et al. High density DNA methylation array with single CpG site resolution. _Genomics_ 98, 288–295 (2011).
39. Liu, X. et al. Metallothionein 2A (MT2A) controls cell proliferation and liver metastasis by controlling the MST1/LATS2/YAP1 signaling pathway in colorectal cancer. _Cancer Cell Int._ 22, 205 (2022).
40. Si, M. & Lang, J. The roles of metallothioneins in carcinogenesis. _J. Hematol. Oncol._ 11, 107 (2018).
41. Fu, J. et al. Metallothionein 1G functions as a tumor suppressor in thyroid cancer through modulating the PI3K/Akt signaling pathway. _BMC Cancer_ 13, 462 (2013).
42. Tong, M. et al. Evaluation of MT family isoforms as potential biomarker for predicting progression and prognosis in gastric cancer. _Biomed Res. Int._ 2019, 2957821 (2019).
43. Pinney, S. E. Mammalian non-CpG methylation: stem cells and beyond. _Biology (Basel)_ 3, 739–751 (2014).
44. Mathelier, A. et al. Cis-regulatory somatic mutations and gene-expression alteration in B-cell lymphomas. _Genome Biol._ 16, 84 (2015).
45. Luo, X. et al. Effects of DNA methylation on TFs in human embryonic stem cells. _Front. Genet._ 12, 639461 (2021).
46. Wang, M., Ngo, V. & Wang, W. Deciphering the genetic code of DNA methylation. _Brief. Bioinform._ 22, bbaa424 (2021).
47. Villicaña, S. & Bell, J. T. Genetic impacts on DNA methylation: research findings and future perspectives. _Genome Biol._ 22, 127 (2021).
48. Russo, G. et al. DNA damage and repair modify DNA methylation and chromatin domain of the targeted locus: mechanism of allele methylation polymorphism. _Sci. Rep._ 6, 33222 (2016).
49. Morano, A. et al. Targeted DNA methylation by homology-directed repair in mammalian cells. Transcription reshapes methylation on the repaired gene. _Nucleic Acids Res._ 42, 804–821 (2014).
50. Allen, B., Pezone, A., Porcellini, A., Muller, M. T. & Masternak, M. M. Non-homologous end joining induced alterations in DNA methylation: a source of permanent epigenetic change. _Oncotarget_ 8, 40359–40372 (2017).
51. Pagès-Gallego, M. et al. Direct detection of 8-oxo-dG using nanopore sequencing. Preprint at _bioRxiv_ https://doi.org/10.1101/2024.05.17.594638 (2024).
52. Takahashi, K. & Yamanaka, S. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. _Cell_ 126, 663–676 (2006).
53. Gill, D. et al. Multi-omic rejuvenation of human cells by maturation phase transient reprogramming. _eLife_ 11, e71624 (2022).
54. Ocampo, A. et al. In vivo amelioration of age-associated hallmarks by partial reprogramming. _Cell_ 167, 1719–1733 (2016).
55. Martincorena, I. et al. Tumor evolution. High burden and pervasive positive selection of somatic mutations in normal human skin. _Science_ 348, 880–886 (2015).
56. Martincorena, I. & Campbell, P. J. Somatic mutation in cancer and normal cells. _Science_ 349, 1483–1489 (2015).
57. Li, R. et al. A body map of somatic mutagenesis in morphologically normal human tissues. _Nature_ 597, 398–403 (2021).
58. Chen, Y. et al. Discovery of cross-reactive probes and polymorphic CpGs in the Illumina Infinium HumanMethylation450 microarray. _Epigenetics_ 8, 203–209 (2013).
59. Jaffe, A. E. & Irizarry, R. A. Accounting for cellular heterogeneity is critical in epigenome-wide association studies. _Genome Biol._ 15, R31 (2014).
60. Tomusiak, A. et al. Development of an epigenetic clock resistant to changes in immune cell composition. _Commun. Biol._ 7, 934 (2024).
61. Wang, T. et al. Quantitative translation of dog-to-human aging by conserved remodeling of the DNA methylome. _Cell Syst._ 11, 176–185 (2020).
62. Lu, A. T. et al. Universal DNA methylation age across mammalian tissues. _Nat. Aging_ 3, 1144–1166 (2023).
63. Rozenblit, M. et al. Evidence of accelerated epigenetic aging of breast tissues in patients with breast cancer is driven by CpGs associated with polycomb-related genes. _Clin. Epigenetics_ 14, 30 (2022).
64. Moqri, M. et al. PRC2-AgeIndex as a universal biomarker of aging and rejuvenation. _Nat. Commun._ 15, 5956 (2024).
65. Van Egeren, D. et al. Reconstructing the lineage histories and differentiation trajectories of individual cancer cells in myeloproliferative neoplasms. _Cell Stem Cell_ 28, 514–523 (2021).
66. Ferrall-Fairbanks, M. C. et al. Progenitor hierarchy of chronic myelomonocytic leukemia identifies inflammatory monocytic-biased trajectory linked to worse outcomes. _Blood Cancer Discov._ 3, 536–553 (2022).
67. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. _Genome Res._ 20, 1297–1303 (2010).
68. Nassar, L. R. et al. The UCSC Genome Browser database: 2023 update. _Nucleic Acids Res._ 51, D1188–D1195 (2023).
69. Raney, B. J. et al. Track data hubs enable visualization of user-defined genome-wide annotations on the UCSC Genome Browser. _Bioinformatics_ 30, 1003–1005 (2014).
70. Kent, W. J. et al. The human genome browser at UCSC. _Genome Res._ 12, 996–1006 (2002).
71. Tang, G., Cho, M. & Wang, X. OncoDB: an interactive online database for analysis of gene expression and viral infection in cancer. _Nucleic Acids Res._ 50, D1334–D1339 (2022).
72. Chen, T. & Guestrin, C. XGBoost: a scalable tree boosting system. In _Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining_ 785–794 (Association for Computing Machinery, 2016).
73. Alexandrov, L. B. et al. The repertoire of mutational signatures in human cancer. _Nature_ 578, 94–101 (2020).
74. Pedregosa, F. et al. Scikit-learn: machine learning in Python. _J. Mach. Learn. Res._ 12, 2825–2830 (2011).
75. Koch, Z. Zip of all data. _figshare_ https://doi.org/10.6084/m9.figshare.27270468.v1 (2024).

ACKNOWLEDGEMENTS This study was funded by the National Institutes of Health under awards U54 CA274502 (T.I.), P41 GM103504 (T.I.) and R01AG059416 (S.C.). S.C. and D.E. also


receive support from The Sequoia Center for Research on Aging, California Pacific Medical Center Research Institute. AUTHOR INFORMATION AUTHORS AND AFFILIATIONS * Program in Bioinformatics


and Systems Biology, University of California, San Diego, La Jolla, CA, USA Zane Koch, Adam Li & Trey Ideker * California Pacific Medical Center Research Institute, San Francisco, CA,


USA Daniel S. Evans & Steven Cummings * Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, USA Daniel S. Evans & Steven


Cummings * Department of Medicine, University of California, San Diego, La Jolla, CA, USA Trey Ideker

CONTRIBUTIONS Z.K. designed the study, carried out the primary data analyses and wrote the manuscript. A.L. and D.S.E. assisted with data


analysis and study design considerations. T.I. and S.C. designed the study and wrote the manuscript. CORRESPONDING AUTHORS Correspondence to Steven Cummings or Trey Ideker. ETHICS


DECLARATIONS COMPETING INTERESTS T.I. is a cofounder of Serinus and Data4Cure, is on their scientific advisory boards and has an equity interest in both companies. T.I. is on the scientific


advisory board of IDEAYA Biosciences and has an equity interest. The terms of these arrangements have been reviewed and approved by the University of California, San Diego, in accordance


with its conflict of interest policies. The other authors declare no competing interests. PEER REVIEW PEER REVIEW INFORMATION _Nature Aging_ thanks Wolfgang Wagner and the other, anonymous,


reviewer(s) for their contribution to the peer review of this work. ADDITIONAL INFORMATION PUBLISHER’S NOTE Springer Nature remains neutral with regard to jurisdictional claims in published


maps and institutional affiliations. EXTENDED DATA EXTENDED DATA FIG. 1 LINKS AMONG CPG MUTATIONS, METHYLOME REMODELING, AND AGING. A) Various mutational processes affect the genome. Here,


we show that some of these mutations associate with an aberrant DNA methylation pattern at both the mutated site and at numerous neighboring CpGs. B) An individual’s DNA mutation profile and


DNA methylation profile make similar predictions of their calendar age and rate of aging. Panel A created with BioRender.com. EXTENDED DATA FIG. 2 SUPPLEMENTAL CHARACTERIZATION OF CPG


MUTATIONS. A) The distribution of methylation fraction values of each CpG site in the TCGA and PCAWG datasets separately (TCGA = 273,202 and PCAWG = 326,749 CpG sites) in each sample (TCGA =


 8,680 and PCAWG = 651 samples). B) The CpG density (number of CpGs per base pair) in the 50 and 125 base pairs surrounding each of the CpG sites in (A). The central line of the inner


boxplot represents the median, the edges of the box the interquartile range (IQR), and the whiskers 1.5-times the IQR. C) Violin plots of the distribution of mean methylation fraction of


non-mutated individuals at the same mutated CpG sites as in Fig. 1d (n = 8,037 sites), stratified by CpG mutation type. D) As in (C), but the distribution of CpG density in the 125 bp


surrounding each CpG site. E) Pie chart showing the proportion of CpG mutations (n = 467,079 mutations) that result in specific mutated nucleotides. Note that 5’-CpG-3’ sites are


palindromic, corresponding to a 3’-GpC-5’ sequence on the opposite strand; thus, mutation of the C residue is equivalent to mutation of the complementary G residue. For simplicity, we refer


to all CpG mutations by the status of the C residue. F) Violin plot showing the mean methylation fraction across all PCAWG samples, considering CpG sites where a mutation has occurred in at


least one sample (left, n = 1,137 CpG sites), CpG sites where no mutation has occurred in any sample (middle, n = 325,614 CpG sites), and all measured CpG sites (right, n = 326,751).


Significant difference of distribution (p ≤ 3.03 × 10^−50) is marked with (***) and non-significant (p > 0.05) with (n.s.), based on a two-sided Mann-Whitney test. G) Methylation fraction


at the same mutated CpG sites as Fig. 1d (n = 8,037 sites). CpG sites are binned into five groups based on MAF, with violin plots summarizing the distribution of methylation fraction within


each group. Vertical bars inside each violin represent the interquartile range. Two-sided p value calculated based on the exact distribution of Pearson’s r modeled as a beta function.
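The CpG density used in panels B and D above (number of CpGs per base pair in a window around a CpG site) can be illustrated with a minimal sketch. The function name `cpg_density`, the `flank` parameter, and the toy sequence are hypothetical and are not taken from the authors' repository. How the caption's "50 and 125 base pairs surrounding" maps onto a flank size, and how window edges are handled, are assumptions made here for illustration only.

```python
def cpg_density(sequence: str, site: int, flank: int) -> float:
    """Return CpG dinucleotides per base pair around a CpG site.

    `site` is the 0-based index of the C of the CpG of interest.
    `flank` is the number of bases taken on each side of that C
    (an assumption; the caption does not state the exact window rule).
    The window is clipped at the ends of the sequence.
    """
    seq = sequence.upper()
    start = max(0, site - flank)
    end = min(len(seq), site + flank + 1)  # exclusive end index
    region = seq[start:end]
    if len(region) < 2:
        return 0.0
    # "CG" cannot overlap itself, so the non-overlapping str.count is exact.
    return region.count("CG") / len(region)


# Toy example on a made-up 10 bp sequence (not real genomic data).
toy = "ACGTCGCGAT"
print(cpg_density(toy, site=4, flank=2))   # window "GTCGC": 1 CpG in 5 bp
print(cpg_density(toy, site=4, flank=25))  # whole sequence: 3 CpGs in 10 bp
```

In practice the sequence would come from the hg19 reference around each Illumina 450k probe position; that lookup is omitted here because the exact procedure is not described in this caption.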


EXTENDED DATA FIG. 3 MAGNITUDE OF METHYLATION CHANGE NEAR SOMATIC MUTATIONS BY TISSUE AND GENOMIC CONTEXT. A) Boxplots of the distribution of ΔMF10kb values for mutated (red) versus random


control (n = 260,000, blue) sites for each tissue type separately (n = 813, 144, and 1,643 mutated sites from Pancreas, Brain, and Ovary tissues, respectively). P value shown for a two-sided


Mann-Whitney test for a difference in median methylation fraction between the mutated and non-mutated random control loci. P value shown for a two-sided Mann-Whitney test for a difference


in median absolute deviation (MAD) of ΔMF10kb between the mutated and non-mutated random control loci. The central line represents the median, the edges of the box the interquartile range


(IQR), and the whiskers 1.5-times the IQR. B) A histogram of the median methylation fraction across comparison sites within ±10 kb of mutated (n = 2,600, red) and random control sites (n = 


260,000, blue). Mutated sites are the same as Fig. 3b. Random control sites have been selected as before, with the additional criteria of having a methylation profile matched to that of the


matched samples at mutated sites (as measured by the median methylation fraction of comparison sites, Methods). P value shown for a two-sided Mann-Whitney test for a difference in median


methylation fraction between the mutated and random control loci. C) Probability distribution of ΔMF10kb values for mutated (red) versus random control (blue) sites. Mutated and random sites


are the same as (B). P value calculated as in (A). D) Line plot depicting the fold enrichment for mutated over non-mutated random control sites as a function of ΔMF10kb, for the same sites


as Fig. 3b. Sites are stratified depending on whether the site is a CpG and/or falls within a CpG island (n = 419 CpG-non-CGI, 21 CpG-CGI, 2,120 non-CpG-non-CGI, and 39 non-CpG-CGI sites).


Fold enrichment is the ratio of the probability of observing a given ΔMF10kb for mutated sites versus non-mutated random control sites. ΔMF10kb is divided into equally spaced bins from –0.4


to 0.4. E) Bar chart showing the fold enrichment of mutated sites with the most extreme methylation changes (absolute Z-score of ΔMF10kb > 1.96, n = 401 mutated sites) in various genomic


regions, compared to all other mutated sites (n = 2,199 mutated sites). P values were calculated using a two-sided Fisher exact test. The categories ‘Upstream gene’ and ‘Downstream gene’


refer to variants located within 1 kb of the 5’ transcription start site and the 3’ transcription stop site, respectively, but outside the gene itself. F) As in (E), but comparing the


mutated sites with the most extreme gains of methylation (Z-score of ΔMF10kb > 1) to those with the most extreme losses of methylation (Z-score of ΔMF10kb < –1). G) Boxplot of the


ΔMF10kb value as a function of the mutated allele frequency (MAF). Same sites and samples as Fig. 3e (n = 3,880 mutated loci). The Pearson correlation is shown for the association of MAF with


ΔMF10kb and the absolute value of ΔMF10kb. Two-sided p values were calculated based on the exact distribution of Pearson’s r modeled as a beta function. The central line represents the


median, the edges of the box the interquartile range (IQR), the whiskers 1.5-times the IQR, and the points all ΔMF10kb values outside of these ranges. EXTENDED DATA FIG. 4 MUTATION-ASSOCIATED


METHYLATION CHANGE IN NORMAL TISSUES. A) Probability distribution of ΔMF1kb values for mutated (red) versus random control (blue) sites. Includes n = 463 mutated sites (n = 146 samples)


with MAF ≤ 0.15, ≥10 matched individuals (individuals of same tissue type within ± 10 years of age), and ≥1 measured CpG within the window. Random control sites include n = 46,300


non-mutated sites (n = 146 samples, Methods). P value shown for a two-sided Mann-Whitney test for a difference in median absolute deviation (MAD) of ΔMF1kb between the mutated and


non-mutated random control loci. B) Line plot depicting the fold enrichment for mutated over non-mutated sites as a function of ΔMF1kb. Fold enrichment is the ratio of the probability of


observing a given ΔMF1kb for mutated sites versus the probability of that ΔMF1kb for non-mutated control sites. ΔMF1kb is divided into equally spaced bins from –0.45 to 0.45. C) Absolute


ΔMF1kb as the window center is moved away from the mutated site (n = 463, red). This quantity is also shown for non-mutated random control sites (n = 46,300, blue) (Methods). Points indicate


the mean value and error bars denote the 95% confidence interval. A significant difference in distribution of absolute ΔMF1kb values (two-sided t-test) is marked (**, p ≤ 0.01) or (*, p ≤ 0.05). Other comparisons are non-significant (n.s., p > 0.05). EXTENDED DATA FIG. 5 SUPPLEMENTAL AGE PREDICTION ACCURACY. A) Bar plot indicating the correlation of chronological age with


the age predictions of mutation clocks (left) or methylation clocks (right). Correlations are shown across all tumor tissues (n = 1,601) and in each of five TCGA tumor tissues individually:


LGG (Brain), GBM (Brain-2), SARC (Bone), KIRP (Kidney), and THCA (Thyroid). B) As in (A) but for age predictions using samples from normal (that is, non-cancerous) tissues (n = 40


individuals). C) Heatmap indicating the pairwise consistencies (Pearson correlation) among the mutation age in normal tissue, mutation age in tumor tissue, and chronological age. Data shown


for n = 22 individuals with mutations measured in both normal and tumor tissues (the same individuals as from panel B with the exception of 11 colon samples and 7 liver samples as these were


not available in the tumor samples). D) As in (C), but comparing predictions from methylation clocks. E) Scatter plot of human individuals, showing age predictions from the mutation model


versus their chronological age. Shaded area denotes the 95% confidence interval of the line of best fit. Includes 40 individuals from four normal tissues (Methods). A two-sided p value was


calculated based on the exact distribution of Pearson’s r modeled as a beta function. F) Similar to panel (B) but showing age predictions from the methylation rather than mutation model. G)


Violin plots of the methylation age residual versus mutation age residual (Methods). Plots include the same individuals as in panels (B,C). Pearson r refers to the correlation between


methylation age residual and mutation age residual, controlling for chronological age (that is, partial correlation, p = 1.76 × 10^−3). The central line of the inner boxplot represents the


median, the edges of the box the interquartile range (IQR), the whiskers 1.5-times the IQR, and the points all the methylation age residual values. Statistics calculated as in (E). EXTENDED


DATA FIG. 6 PERFORMANCE COMPARISON TO PREVIOUS EPIGENETIC CLOCKS. A) Pearson r between predicted and chronological age for Hannum, Horvath, and PhenoAge clocks across the same samples as


Fig. 4b (n = 1,601). Predictions were done using the subset of features from each clock that existed in our methylation data after quality control (66%, 63%, and 61% of CpG sites from the


Hannum, Horvath, and PhenoAge clocks, respectively). The performance of this study’s methylation clock is not shown as it is inherently fit to the TCGA dataset in 5-fold CV. B) Pearson r


between predicted and chronological age for Hannum, Horvath, and PhenoAge clocks after re-fitting (Methods). Same samples as (A). The performance of the methylation clock trained in this


study (‘This study’) is shown for reference. EXTENDED DATA FIG. 7 MUTATION AGE PREDICTION WITHOUT WHOLE-GENOME FEATURES. A) Correlation of chronological versus predicted age, shown for


mutation or methylation clocks built without whole-genome features (n = 1,601 individuals). Correlations are shown across all tissues and in each of five TCGA tissues individually: LGG


(Brain), GBM (Brain-2), SARC (Bone), KIRP (Kidney), and THCA (Thyroid). B) As in (A) but for age predictions using samples from normal (that is, non-cancerous) tissues (n = 40). C) The


methylation age residual is plotted versus the mutation age residual, using clocks without whole-genome features (Methods). Violin plots summarize the same samples as in panel (A). Pearson r


refers to the correlation between methylation age residual and mutation age residual, controlling for chronological age (that is, partial correlation, p = 6.66 × 10^−105). The central line


of the inner boxplot represents the median, the edges of the box the interquartile range (IQR), and the whiskers 1.5-times the IQR. A two-sided p value was calculated based on the exact


distribution of Pearson’s r modeled as a beta function. D) Similar to (C), but for the samples in (B). The central line of the inner boxplot represents the median, the edges of the box the


interquartile range (IQR), the whiskers 1.5-times the IQR, and the points all the methylation age residual values. Statistics calculated as in (C). SUPPLEMENTARY INFORMATION REPORTING


SUMMARY RIGHTS AND PERMISSIONS Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or


other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

ABOUT THIS ARTICLE CITE THIS ARTICLE Koch, Z., Li, A., Evans, D.S. _et al._ Somatic mutation as an explanation for epigenetic aging. _Nat Aging_ 5, 709–719 (2025).


https://doi.org/10.1038/s43587-024-00794-x * Received: 08 December 2023 * Accepted: 12 December 2024 * Published: 13 January 2025 * Issue Date: April 2025 * DOI: https://doi.org/10.1038/s43587-024-00794-x