ABSTRACT

In plants, epigenetic regulation is critical for silencing transposons and maintaining proper gene expression. However, its impact on the genome-wide transcription initiation
landscape remains elusive. By conducting a genome-wide analysis of transcription start sites (TSSs) using cap analysis of gene expression (CAGE) sequencing, we show that thousands of TSSs
are exclusively activated in various epigenetic mutants of _Arabidopsis thaliana_ and are referred to as cryptic TSSs. Many have not been identified in previous studies, of which up to 65% are
contributed by transposons. They possess similar genetic features to regular TSSs and their activation is strongly associated with the ectopic recruitment of RNAPII machinery. The activation
of cryptic TSSs significantly alters transcription of nearby TSSs, including those of genes important for development and stress responses. Our study, therefore, sheds light on the role of
epigenetic regulation in maintaining proper gene functions in plants by suppressing transcription from cryptic TSSs.

INTRODUCTION

Eukaryotic
genomes are composed in large part of mobile genetic sequences, so-called transposable elements (TEs)1. Due to their mobility, TEs induce various alterations to the host genome, ranging from
genetic mutations to large-scale genomic rearrangements, such as inversions and translocations2,3. Genetic variations caused by TEs can introduce novel regulatory elements and therefore be
a major driving force underlying genome evolution1,2. On the other hand, uncontrolled activities of TEs can severely damage gene expression and the integrity of the host genomes3. To
suppress negative impacts without losing potential benefits brought in by TEs, both plants and animals have evolved numerous epigenetic mechanisms involving DNA methylation, histone
modifications, and small non-coding RNAs, allowing TEs to remain silenced in their genomes4,5. Compared to mammals, plants are equipped with a different set of epigenetic mechanisms for greater
adaptability to dynamic environmental changes, partly due to their sessile nature. For example, in mammalian genomes, DNA sequences are mainly methylated at the cytosine in the CG
dinucleotides, while in plants cytosine methylation exists in both CG and non-CG contexts, which has different functional impacts on gene and TE regulation6. In the plant model _Arabidopsis
thaliana_ (_A. thaliana_), DNA methylation is established de novo by the RNA-directed DNA methylation (RdDM) pathway, which requires the functional activity of PolIV and PolV, two
plant-specific RNA polymerases5. After establishment, methylation patterns can be maintained by different factors depending on cytosine contexts. CG methylation is maintained by
METHYLTRANSFERASE 1 (MET1), a plant homologue of the mammalian DNA (cytosine-5)-methyltransferase 1 (DNMT1). Maintenance of DNA methylation in CHG context, on the other hand, is facilitated
by CHROMOMETHYLASE 3 (CMT3) in a positive feedback loop with the histone H3K9 methylase KRYPTONITE (KYP) (or SUPPRESSOR OF VARIEGATION 3-9 HOMOLOGUE 4 (SUVH4))7. Together with two of its
paralogues, SUVH5 and SUVH6, KYP regulates the genome-wide accumulation of H3K9me2 and consequently, CHG methylation6. CHH methylation can be maintained by either CMT2 or DOMAINS REARRANGED
METHYLASE 2 (DRM2) depending on the features of their targets, in which DRM2 often methylates short, euchromatic TEs, while CMT2 targets long TEs located in histone H1-containing
heterochromatic regions with the help of chromatin remodeler DECREASED DNA METHYLATION 1 (DDM1)8. These epigenetic pathways in plants, however, are highly interwoven. For example, MET1 and
CMT3 are involved in maintaining asymmetric methylation, while DRM2 and CMT2 may also affect DNA methylation in other contexts9. Epigenetic silencing of TEs inevitably confers regulatory
impacts on gene expression, especially when TEs are located close to transcription units2,4. In plants, repressive modifications triggered by TE insertions within introns or promoter regions
can attenuate or even turn off the expression of the associated genes10,11,12. At a global scale, genes harboring, or located close to, silenced TEs exhibit lower expression than their
counterparts13,14. Due to such unfavorable impacts, plants have evolved specific pathways to keep transcription units clear of repressive modifications, or to tolerate the presence of such
modifications when necessary. For example, in _A. thaliana_ the Jumonji C (jmjC) domain-containing histone demethylase INCREASE IN BONSAI METHYLATION 1 (IBM1) prevents repressive H3K9
methylation and consequently, CHG methylation, from accumulating at actively transcribed genes15. On the other hand, host factors, such as INCREASE IN BONSAI METHYLATION 2 (IBM2) and
Enhanced Downy Mildew 2 (EDM2) are required for proper transcription of genes containing heterochromatic domains16,17, likely due to the functional importance of these domains14. The
development of high resolution \({5}^{\prime}\) end-centered expression profiling techniques, such as oligo-capping methods18 or cap analysis of gene expression (CAGE)19, has greatly
advanced our understanding of gene regulation at a transcription initiation level. Studies employing these techniques have revealed both common and distinct features of the core promoters
and their origin and regulation, in many organisms20,21,22. In mammals, for example, CAGE sequencing (CAGE-seq) analyses revealed that a large fraction of cell-type specific transcripts in
stem and cancer cells originate from long terminal repeats (LTRs) of retroelements23,24. The loss of DNA methylation also causes spurious transcription within thousands of genes in mouse
embryonic stem cells25. In addition, modulating DNA methylation and histone deacetylation pathways pervasively activates cryptic transcription start sites (TSSs) normally silenced in human
cells26. These examples demonstrate the importance of epigenetic mechanisms in regulating transcription initiation in mammalian genomes. In plants, large-scale analyses have determined
thousands of TSSs, providing fundamental information about genetic structure and regulatory elements important for transcription in plant genomes27,28. Previous studies have also revealed
core promoter structures and sequence elements associated with plant TSSs29,30,31. However, these studies mainly focus on active TSSs in the wild-type background. The contribution of
epigenetic regulation to shaping the genome-wide transcription initiation landscape and its functional significance in plants, therefore, remains largely unexplored. To dissect the
functional impacts of epigenetic regulation in shaping the plant transcription initiation landscape, we employ CAGE-seq to generate the genome-wide profiles of TSSs at a high resolution for
various mutants of _A. thaliana_ that compromise epigenetic control. Our analysis identifies thousands of TSSs exclusively activated in the mutant backgrounds, demonstrating that epigenetic
regulation profoundly affects transcription initiation in _Arabidopsis_. These so-called cryptic TSSs are mainly located at heterochromatic regions, which hinder their accessibility to RNA
Polymerase II (RNAPII) transcription machinery. The alteration of DNA methylation maintenance in _met1_ activates the largest number of cryptic TSSs, which significantly overlap with the
targets regulated by other epigenetic pathways. A large fraction of cryptic TSSs originate from TEs of both retro and DNA-transposon families, suggesting that TEs are reservoirs of putative
TSSs in the _A. thaliana_ genome. Strikingly, the activation of cryptic TSSs significantly alters the regular transcription of nearby TSSs, which includes those of genes important for
development and stress responses in _Arabidopsis_. This study, therefore, sheds light on the role of epigenetic regulation in maintaining proper gene functions in plants by suppressing
transcription initiated from cryptic TSSs. In addition, the accompanying data are a valuable resource for studying the epigenetic control of the transcription of genes and TEs in plants.
RESULTS

MAPPING TSSS IN EPIGENETIC MUTANTS OF _A. THALIANA_ BY CAGE-SEQ

To gain a comprehensive view regarding the epigenetic regulation of transcription initiation in
plants, we performed CAGE-seq analyses of various _A. thaliana_ mutants, where epigenetic control is compromised, including mutants of maintenance DNA methyltransferase _met1_, the chromatin
remodeler _ddm1_, RdDM pathway components _nrpd1_ and _nrpe1_, histone H3K9 methyltransferases _suvh456_, histone H3K9 demethylase _ibm1_, and intragenic heterochromatin regulatory factors,
_ibm2_ and _edm2_. A total of 1,250,203,294 CAGE-seq reads were mapped to the _A. thaliana_ Col reference genome, achieving an average mapping efficiency of 97.53%. Of these, 402,814,394
reads were mapped uniquely, compiling a large collection of CAGE-seq data for this model plant (Supplementary Data 1). The expression of individual CAGE-based TSSs (CTSSs) was highly
correlated between replicates (the median of Pearson correlation coefficients was 0.95) (Supplementary Fig. 1a, b), confirming the reproducibility of our data. In total, 37,726 consensus tag
clusters representing single TSSs were identified across all samples (hereafter TSSs is used to refer to consensus tag clusters identified in this study, to distinguish from the
TAIR-annotated TSSs), of which about 30% were exclusively expressed in the mutant backgrounds (Supplementary Data 2). To confirm the relevance of our data, we analyzed the genome
distribution of 26,561 TSSs identified in the wild-type sample. A majority of them (18,634 or ~70%) were located in promoters and \({5}^{\prime}\) UTRs of 17,722 (~64%) annotated genes (Fig.
1a), and about one-fourth (~24%) were located in intragenic regions, of which exonic TSSs were more prevalent than the intronic counterparts. Although the mechanisms leading to the
prevalence of exonic TSSs in plant genomes are not yet clear21, some of them may represent \({5}^{\prime}\)-end capped products of post-transcriptional processing of mature mRNAs, as
described in human and vertebrate genomes32,33. Alternatively, some may correspond to cryptic promoters that trigger spurious transcription from gene bodies25,34, or to mis-annotated
TSSs22. Nevertheless, consistent with a previous study21, the expression of intragenic TSSs was significantly lower than that of their counterparts located in promoters and \({5}^{\prime}\)
UTRs (Fig. 1b). Moreover, the TSSs in promoters and \({5}^{\prime}\) UTRs were found in close proximity to the TAIR10-annotated TSSs (Supplementary Fig. 2a, b). A similar result was obtained
using the Araport11 genome annotations (Supplementary Fig. 3a–c), with a shift in the numbers of TSSs assigned to each genome feature (Fig. 1a, Supplementary Fig. 3a). Because of the
higher consistency with the TSSs identified by our CAGE-seq (Supplementary Figs. 2a, b, 3b, c), TAIR10 annotations were used in further downstream analysis. On the other hand, active genes
supported by CAGE and mRNA-seq largely overlapped (Supplementary Fig. 2c), suggesting that active transcription events in _A. thaliana_ can be efficiently captured by our CAGE-seq data.
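As an illustration of the proximity comparison described above, the distance from each CAGE-derived TSS to the nearest annotated TSS could be computed as in the following minimal Python sketch. This is not the pipeline used in this study: the function names, the toy coordinates, and the 100 bp cutoff are hypothetical, and chromosome and strand handling are omitted for brevity.

```python
from bisect import bisect_left


def nearest_distance(position, sorted_annotated):
    """Absolute distance from `position` to the closest coordinate in
    `sorted_annotated` (a non-empty list sorted in ascending order)."""
    i = bisect_left(sorted_annotated, position)
    # Only the neighbours immediately left and right of the insertion
    # point can be the closest annotated TSS.
    candidates = sorted_annotated[max(i - 1, 0):i + 1]
    return min(abs(position - c) for c in candidates)


def fraction_within(cage_tss, annotated_tss, cutoff_bp=100):
    """Fraction of CAGE TSSs lying within `cutoff_bp` of an annotated TSS.
    The cutoff is an assumption chosen for illustration only."""
    annotated = sorted(annotated_tss)
    hits = sum(nearest_distance(p, annotated) <= cutoff_bp for p in cage_tss)
    return hits / len(cage_tss)


# Toy coordinates on a single chromosome and strand (hypothetical values).
annotated = [1000, 5000, 9000]
cage = [1010, 4800, 12000]
distances = [nearest_distance(p, annotated) for p in cage]
print(distances, fraction_within(cage, annotated))
```

Running the same comparison against two annotation sets (for example TAIR10 and Araport11) would give one distance distribution per annotation, which is one simple way to compare their consistency with the CAGE-derived TSSs.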
We then compared wild-type TSSs identified by CAGE-seq with those reported by the paired end analysis of transcription start site (PEAT) method31. They were indeed consistent even though
the samples were prepared from different tissues (Supplementary Fig. 3d–f). At a local scale, the promoter architecture of two well-studied genes, _ALMT1_ (_AT1G08430_) and _sAPX_
(_AT4G08390_), was also reexamined. The former has three functional TSSs within its promoter and the latter has one upstream and one intragenic TSS21. Our data recapitulated these structures
(Supplementary Fig. 4a, b), confirming its consistency with previous studies21,31. It has been found that the loss of CG methylation at a SINE-related repeat in the promoter region
triggered the ectopic expression of the homeobox gene _FLOWERING WAGENINGEN_ (_FWA_), causing a late flowering phenotype of _Arabidopsis_11,35,36. CAGE-seq analysis identified a TSS encoded
within the SINE repeat, which was highly activated in _met1_ and _ddm1_ backgrounds (Fig. 1c). In addition, the ectopic activation of the TSS of the F-box gene _SUPPRESSOR OF drm1 drm2 cmt3_
(_SDC_), whose promoter contains a tandem repeat co-regulated by H3K9 methylation and the RdDM pathway8, was also detected by our data (Fig. 1c). Taken together, these results demonstrate
that our CAGE-seq data can be effectively exploited for the detection and analysis of both regular and cryptic TSSs under epigenetic control.

MODULATING EPIGENETIC REGULATION ACTIVATES MANY CRYPTIC TSSS

Next, we investigated the impact of epigenetic regulation on the transcription initiation landscape in the _A. thaliana_ genome in greater detail. Compromising epigenetic
controls significantly affected the transcription initiated from hundreds to thousands of TSSs, in which the defect of the maintenance DNA methylation pathway in _met1_ induced changes at
the largest number of targets (Fig. 2a), followed by _ibm1_, _ddm1_, _suvh456_, and _pol4_. To our surprise, _ibm2_ and _edm2_, which cause the transcriptional defect of _IBM1_16,17, had a
lower number of affected TSSs than _ibm1_, suggesting that the IBM1 function is partially maintained in these mutants. Of the altered TSSs, many were activated de novo in the mutant
backgrounds and were not associated with any TAIR10-annotated TSSs (Fig. 2a, Supplementary Fig. 2b, Supplementary Data 3). They were also largely distinct from the TSSs reported by
PEAT-seq31 and the TSSs identified in multiple tissues and light stress conditions in _A. thaliana_21 (Supplementary Fig. 5a, b), suggesting that they are cryptic TSSs suppressed by
epigenetic mechanisms (referred to herein as EPICATs, for EPigenetically Induced Consensus tAg clusTers). Our data showed that the EPICATs activated in _met1_ largely overlapped with the
EPICATs regulated by other mutants, confirming the profound regulatory impact of MET1 on the genome-wide transcription initiation in _A. thaliana_ (Fig. 2b). On the other hand, _ddm1_ and
RdDM-associated mutants (_pol4_ and _pol5_, i.e., _nrpd1_ and _nrpe1_) induced stronger activation of the EPICATs than _met1_ (Fig. 2c). Due to the small number of instances, targets of _ibm2_ and _edm2_ were
excluded from further analysis. Similar results were obtained using the Araport11 annotations (Supplementary Figs. 3c, 5c), confirming the robustness of our analysis. As the transcription
orientation at regulatory regions of eukaryotes can be either unidirectional20 or bidirectional37, we examined the directionality of transcription initiated at EPICATs. Our data showed that
transcription at the EPICATs in _met1_ was mainly uni-directional, similar to that of the TAIR10-annotated TSSs in _A. thaliana_ (Supplementary Fig. 6a20). Moreover, the expression levels of
EPICATs were not significantly different from those of the annotated TSSs activated de novo in epigenetic mutants (Supplementary Fig. 6b). We also found that tag clusters corresponding to
the EPICATs mainly had narrow peaks (NPs), especially those activated in _ddm1_, _met1_, and _pol5_ (Supplementary Fig. 6c), suggesting that they may have a well-defined underlying genetic
architecture31,38. To elucidate putative mechanisms regulating the activity of EPICATs, we first examined the genomic regions where they reside. EPICATs were mainly located at intergenic
regions, except the EPICATs in _ibm1_, of which a majority were intragenic (Fig. 3a, Supplementary Fig. 6d). These intragenic EPICATs, however, may not be directly regulated by the activity
of IBM1, because they were not associated with increased CHG methylation in the _ibm1_ background (Fig. 3b). In contrast, the EPICATs in other mutants were located in genomic regions
decorated with repressive chromatin modifications, such as DNA methylation, H3K9me2, and H3K27me1 (Supplementary Fig. 7a, b). Compared to the EPICATs in other mutants, those activated in
_pol4_ and _pol5_ were also associated with a higher level of CHH methylation and 24 nt siRNAs, the hallmarks of the RdDM pathway (Supplementary Fig. 7a, c). Moreover, DNA methylation at the
EPICATs in all mutants, except in _ibm1_, was significantly reduced, concomitantly with their activation (Fig. 3b, Supplementary Fig. 7d), suggesting that in wild-type plants transcription
initiation at EPICATs is directly suppressed by repressive epigenetic modifications. Since heterochromatic modifications, such as DNA methylation and H3K9me2, are often associated with
closed chromatin in plant genomes39, their loss may alter the access to genomic regions harboring EPICATs. We therefore examined how the accessibility of these loci changes in the mutant
backgrounds. For this purpose, the EPICATs activated in _ddm1_ were used as a proxy due to the large number of instances and the availability of public data characterizing chromatin openness
in _ddm1_40. Indeed, chromatin around the EPICATs became highly accessible in _ddm1_, compared to wild-type plants, as measured by the sensitivity to DNaseI (Fig. 3c). Furthermore, ChIP-seq
analysis showed that RNAPII phosphorylated at Ser5 (Ser5P) and Ser2 (Ser2P) in the C-terminal domain (CTD), the hallmarks of transcription initiation and elongation41 respectively, were
also highly accumulated at the EPICATs in most mutant backgrounds (Fig. 3d, Supplementary Fig. 7e). These data demonstrate that repressive chromatin suppresses the activity of EPICATs by
preventing the access of transcription machinery to genomic regions encompassing potential TSSs. Ectopic transcription initiation in mutants and the convergence of various epigenetic
pathways on a large number of EPICATs (Fig. 2b), together with the narrow shapes of tag clusters corresponding to most of the EPICATs (Supplementary Fig. 6c), suggest that these loci harbor
functional genetic features, such as promoter structure and/or regulatory sequences21, in addition to repressive chromatin modifications. Therefore, genetic sequences surrounding EPICATs
were analyzed. Interestingly, DNA elements and motifs enriched around EPICATs exhibited spatial architecture similar to that of regular plant promoters20,30, with a sharp accumulation of
TATA-box at 36 nt upstream and CA-rich/CT-rich (Y-patch) motifs around the TSSs (Fig. 3e, Supplementary Fig. 8). TATA-box, a core promoter motif conserved in both plants and animals30,38,
was especially enriched at the EPICATs in _met1_ and _ddm1_. The enrichment of the Telobox motif (AAACCCTA), which is known to recruit development-associated repressive modification H3K27me3
in _A. thaliana_42, was also found at the EPICATs in _met1_, _ddm1_, and _suvh456_. The presence of the Telobox sequence around EPICATs may partially explain the accumulation of H3K27me3 at
the heterochromatic regions upon the loss of DNA methylation and H3K9 methylation43. Taken together, we conclude that the _A. thaliana_ genome harbors hundreds of potential TSSs equipped
with functional core promoter architecture similar to that of regular TSSs. Their activities, however, are suppressed by repressive chromatin restricting their accessibility to transcription
machinery.

GENE BODY METHYLATION AND THE SUPPRESSION OF INTRAGENIC TSSS

In _A. thaliana_, about 20% of protein coding genes accumulate CG methylation in their bodies44. Moreover, gene body
methylation (gbM) is largely conserved across plant species, especially in angiosperms45, suggesting its functional importance. Although many hypotheses have been proposed regarding the
biological functions of gbM, such as suppressing spurious intragenic transcription25, impeding transcriptional elongation46, or reducing transcription noise47, so far its role in plants has
been largely elusive48. By exploiting the high resolution CAGE-seq data of genome-wide TSSs, we reexamined the relationship between gbM and intragenic transcription initiation in _A.
thaliana_. Our data showed that, in wild-type plants, similar fractions of body-methylated (BM) and non-body-methylated (non-BM) genes harbored intragenic TSSs, suggesting that the
methylation state of the gene body is not significantly associated with the occurrence of intragenic TSSs (Fig. 4a). Moreover, only a few BM genes activated intragenic EPICATs when gbM was
strongly lost in the _met1_ background (Fig. 4b, Supplementary Fig. 9a), whereas intragenic EPICATs could be activated at some loci without gbM (Fig. 4d). This evidence, which is consistent
with the conclusions of a previous study48, suggests that gbM alone is dispensable for suppressing intragenic transcription at a global scale in _A. thaliana_ (Supplementary Fig. 9b).
Although some BM genes harbored intragenic EPICATs in _met1_ (Fig. 4c, d), at this time, we do not know whether this is a direct or indirect effect of the _met1_ mutation. Future testing using targeted
demethylation could help resolve whether gbM is causal at these loci. The intragenic EPICATs in _met1_ may correspond to \({5}^{\prime}\)-end capped products of post-transcriptional processing of
mature mRNAs generated at the associated gene loci, a mechanism well-described in mammals32,33. Although we did not rule out this possibility, our data provided evidence that
some of these EPICATs are genuine TSSs. First, these loci exhibited a stronger accumulation of RNAPII in _met1_ (Supplementary Fig. 9c). Second, only 1/124 genes harboring intragenic EPICATs
also had upstream EPICATs (Supplementary Fig. 9d), suggesting that these intragenic EPICATs correspond to independent, de novo transcribed mRNAs. Third, promoter-associated DNA sequences
were also present at some of these intragenic loci (Fig. 4d). Besides _met1_, _ibm1_ also activated a comparable number of intragenic EPICATs (Supplementary Fig. 6d, Supplementary Data 4).
However, it is unlikely that they are directly regulated by the activity of IBM1 (Fig. 3b, Supplementary Fig. 10a). On the other hand, although the expression of _IBM1_ is significantly
reduced in _met1_ background49, the intragenic EPICATs activated in _ibm1_ and _met1_ were largely non-overlapping (Supplementary Fig. 10b). Moreover, the accumulation of RNAPII at these loci
was not significantly affected in _ibm1_ background (Supplementary Fig. 10c), suggesting that intragenic EPICATs in _ibm1_ and _met1_ are regulated differently. Given that none of the
associated genes simultaneously harbored upstream EPICATs, and that promoter-associated DNA sequences were present at some of these intragenic targets (Fig. 4e), we speculate that some of
them are genuine TSSs, while others could be derived from post-transcriptionally processed mRNAs.

RNAPII AND POLIV EXCLUSIVELY BIND TO RDDM-REGULATED EPICATS

It has been reported that,
although PolIV-dependent RNAs (P4RNAs) feature PolII-like TSSs, PolIV and PolII target distinct genomic territories50. Our data, however, showed that 24 nt siRNAs were highly enriched at
genomic loci harboring the EPICATs activated in the mutants of the RdDM pathway’s components, such as _pol4_ and _pol5_ (Supplementary Fig. 7c). The biogenesis of these 24 nt siRNAs was
indeed dependent on PolIV, which is responsible for the transcription of P4RNAs initiated from the corresponding EPICATs (Supplementary Fig. 11a–c). Moreover, in _pol4_ and _pol5_
backgrounds, RNAPII was highly recruited to these loci (Supplementary Fig. 7e). This evidence suggests that genomic regions harboring the EPICATs regulated by the RdDM pathway likely
possess distinct features compared to those of its regular targets, which allow PolII and PolIV to function exclusively at these loci (Supplementary Fig. 11d).

TES ARE A MAJOR SUPPLIER OF CRYPTIC TSSS IN _ARABIDOPSIS_

The existence of a large number of cryptic TSSs within a small and compact genome, like that of _A. thaliana_, has raised important questions regarding their
origin. Investigations involving mammalian genomes have shown that TEs are a major genetic element that can be exapted as TSSs in the host genomes51,52. Although less prevalent, several
lines of study have demonstrated a similar function of TEs in plant genomes53,54. Together with the evidence that EPICATs are mainly located at intergenic regions decorated with repressive
chromatin modifications (Fig. 3a, Supplementary Fig. 7a, b), we speculated that many cryptic TSSs in the _A. thaliana_ genome may have originated from TEs. The data indicated that TEs
contribute to up to 65% of the EPICATs activated in the mutant backgrounds (Supplementary Fig. 12a). Additionally, hundreds of TEs harboring active TSSs were identified in wild-type
background (Fig. 5a, Supplementary Data 5). TEs, therefore, may serve as a reservoir of potential functional TSSs in _A. thaliana_, similar to their role in animal genomes. There are
numerous types of TEs with different origins and mobility strategies1,2 which greatly affect their abilities to induce genetic variations to the host genomes. Therefore, the TSS-encoding
potential of each TE family in the _A. thaliana_ genome was examined. Although EPICATs were associated with various TE families (Fig. 5a), compared to the genome-wide average, LTR/Gypsy
members were enriched among TEs harboring the EPICATs in _ddm1_ and _met1_ (_p_ = 2.0e-52 and 6.0e-49, respectively, Hypergeometric test), while members of the LTR/Copia family were highly
represented among the TE targets of _ddm1_ and _suvh456_ (_p_ = 8.0e-10 and 2.0e-31, respectively, Hypergeometric test). In addition, the DNA/En-Spm family was highly associated with the
EPICATs in _met1_, _ddm1_, and _suvh456_ (Fig. 5a, _p_ < 1.6e-16 for all, Hypergeometric test). Due to the minor numbers of TE instances associated with the EPICATs in _ibm1_, _pol4_, and
_pol5_, they were excluded from the enrichment analysis. The data suggest that both retro- and DNA transposons are genetic suppliers of cryptic TSSs in the _A. thaliana_ genome. Since _ddm1_
affected the largest number of TEs harboring EPICATs, and these elements largely overlapped with TEs activated in other mutants (Fig. 5a, Supplementary Fig. 12b), we examined if they possess
any specific features that facilitate their ectopic activation in _ddm1_ background. Compared to their counterparts, which either contain active TSSs in wild-type plants or do not harbor
any EPICATs, TEs harboring EPICATs were more highly methylated in both CG and non-CG contexts (Fig. 5b). They were also substantially longer (Fig. 5b), suggesting that these TEs are likely
younger insertions that still maintain intact structures with transcription and transposition capacities, which may trigger greater accumulation of DNA methylation and other
repressive modifications at the associated loci. Analysis of the core promoter motifs identified at the _ddm1_-activated EPICATs (Supplementary Fig. 8) showed that they were more prevalent
among EPICAT-harboring TEs (Fig. 5c). However, there were still hundreds to thousands of inactive TEs associated with these motifs (Supplementary Fig. 12c). As a case study, the genetic
structure associated with the EPICATs located in the LTR regions of the Gypsy TEs was investigated in more detail. This was because the LTR/Gypsy family contributed a large number of
elements harboring the EPICATs in _ddm1_ and _met1_ (Fig. 5a), and its members still maintain transcription/transposition potential in the _Arabidopsis_ genome55. Although LTR sequences
surrounding the CAGE-seq peaks were largely diverged between and within Gypsy sub-families, they commonly shared putative TATA-box and TSS-associated YR motifs (Fig. 5d, Supplementary Fig.
12d). However, the conservation of sequences/motifs surrounding the LTR-encoded TSSs could not fully explain their activation in the mutant backgrounds. Moreover, although a significant loss
of repressive modifications (e.g., DNA methylation) was observed at many TEs regardless of their association with the EPICATs in _ddm1_ (Fig. 5e), only EPICAT-harboring elements became
highly accessible in the mutant, especially at their two ends (Fig. 5f). Concomitantly, RNAPII was highly recruited to these loci, together with an increased production of the associated
transcripts (Fig. 5g, Supplementary Fig. 12e). These data suggest that, in addition to the presence of core promoter sequences, factors regulating chromatin environment are required for
RNAPII recruitment and the ectopic activation of TE-encoded EPICATs.

REGULATORY IMPACT OF TRANSCRIPTION FROM CRYPTIC TSSS

In mammals, TE sequences frequently act as alternative promoters to
regulate development-associated gene expression programs51,52. While the contribution of TEs to plant transcriptomes has been much less clear56, this evidence suggests that regulatory
elements supplied by TEs can be co-opted for transcriptional regulation in plant genomes28. Using the EPICATs activated in _met1_ as a proxy, we therefore investigated the potential
alteration in the _A. thaliana_ transcriptome induced by cryptic TSSs. About 80% of the EPICATs in _met1_ were associated with the transcripts assembled from mRNA-seq data (Supplementary
Fig. 13a, Supplementary Data 6, see the “Methods” section for details). Moreover, the expression of EPICATs was positively correlated with that of the assembled gene units (Supplementary
Fig. 13a–c). Of the transcripts associated with _met1_-activated EPICATs, 73% had more than one exon, of which 112 (~9%) shared splicing junctions with 75 reference gene units (Fig. 6a).
Surprisingly, about half (50/112) of these spliced transcripts possessed at least one active TSS in wild-type background, suggesting that their regular transcription, and consequently
downstream functions, can potentially be affected by the ectopic activation of EPICATs. We selected and experimentally confirmed the production of novel cryptic fusion transcripts at some of
these loci in _met1_ and/or _ddm1_ backgrounds, which include _SQN (AT2G15790)_, a gene critical for vegetative shoot maturation57, _COQ3 (AT2G30920)_, a gene encoding a
mitochondria-localized methyltransferase important for ubiquinone biosynthesis and embryo development58,59, and a gene of unknown function (_AT2G16050_) (Fig. 6b, c, Supplementary Fig. 14a,
b). To complement the CAGE-seq data, transcripts with significant alteration in promoter usage were analyzed using mRNA-seq data (see Methods section for details). Of the resulting
transcripts, 10 were found associated with _met1_-activated EPICATs at three gene loci (Supplementary Data 7). We also experimentally confirmed the production of a read-through fusion
transcript from the annotated TSS at the _AT5G28442_ gene locus, which harbored an EPICAT in _met1_ and _ddm1_ backgrounds (Supplementary Fig. 14a, b). Although it has been suggested that
repressive chromatin associated with TE insertions potentially imposes negative impacts on the transcription of nearby genes13,14, direct consequences of TE-encoded TSS activation on the
surrounding transcriptional environment remain obscure. Inspection of the loci producing cryptic fusion transcripts revealed that some of them concurrently exhibited reduced transcription
from their regular TSSs in the mutant backgrounds (Fig. 6b, Supplementary Fig. 14a). This suggests that the activation of EPICATs may also quantitatively affect the transcription from
nearby regular TSSs. Therefore, wild-type active TSSs located in the vicinity (up to 3 kb) of EPICATs were examined to see how their expression is altered upon EPICAT activation. While some
showed increased expression, the majority were not significantly affected (Fig. 6d, e). Nevertheless, there were groups of TSSs whose expression was significantly suppressed concomitantly
with the activation of nearby EPICATs (Fig. 6e, Supplementary Data 8). Of the gene loci associated with the TSSs suppressed in _met1_, five were selected for validation by qPCR. Apart from
_AT5G28442_, which could not be amplified, significant decreases in expression were confirmed at three of the remaining four loci in _met1_ and _ddm1_, which is consistent with the observation
from the CAGE-seq data (Fig. 6f, Supplementary Fig. 14c). These include _AT1G23935_, _SUS5_ (_AT5G37180_), and _PRB1_ (_AT2G14580_), a gene involved in response to abiotic stress in
_Arabidopsis_60. Taken together, these data demonstrate that the activation of cryptic TSSs has critical impacts on the transcriptome of _A. thaliana_, both qualitatively and quantitatively.
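The vicinity search used above (wild-type active TSSs located up to 3 kb from an EPICAT) can be illustrated with a minimal Python sketch. It is not the authors' in-house code: the function name, the tuple layout, and the example coordinates are hypothetical, each TSS is assumed to be represented by the position of its dominant CTSS, and strand is ignored for simplicity.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def wt_tss_near_epicats(wt_tss, epicats, max_dist=3000):
    """Map each EPICAT id to the wild-type TSS ids lying within max_dist bp.

    wt_tss and epicats are iterables of (id, chromosome, position) tuples,
    where position is assumed to be the dominant CTSS coordinate.
    """
    # Build a sorted position index per chromosome for binary search.
    index = defaultdict(list)
    for tss_id, chrom, pos in wt_tss:
        index[chrom].append((pos, tss_id))
    sorted_index = {}
    for chrom, entries in index.items():
        entries.sort()
        sorted_index[chrom] = ([p for p, _ in entries], [t for _, t in entries])

    neighbours = {}
    for epi_id, chrom, pos in epicats:
        positions, ids = sorted_index.get(chrom, ([], []))
        lo = bisect_left(positions, pos - max_dist)   # first TSS at or after pos - max_dist
        hi = bisect_right(positions, pos + max_dist)  # one past the last TSS at or before pos + max_dist
        neighbours[epi_id] = ids[lo:hi]
    return neighbours


# Hypothetical coordinates, for illustration only.
wt_tss = [("tss_a", "Chr1", 1000), ("tss_b", "Chr1", 5000), ("tss_c", "Chr2", 1200)]
epicats = [("epicat_1", "Chr1", 3500), ("epicat_2", "Chr2", 9000)]
print(wt_tss_near_epicats(wt_tss, epicats))
```

The neighbouring TSSs returned for each EPICAT could then be looked up in the differential expression results to ask whether they are increased, unchanged, or suppressed in the mutant; the window is inclusive, so a TSS exactly 3 kb away is counted.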
DISCUSSION To understand how transcription initiation in plants is epigenetically regulated, we have generated comprehensive maps of TSSs in various epigenetic mutants of _A. thaliana_
using CAGE-seq. Compared to mammals, the epigenetic mechanisms regulating transcription initiation in plants are much less clear, mainly due to a lack of suitable resources that allow the
investigation of changes in transcription initiation under different conditions25,26,56. This study, therefore, provides valuable reference data for research communities to elucidate
the impact of epigenetic regulation on transcription initiation landscapes in plants. Our study showed that, in epigenetic mutant backgrounds, thousands of cryptic TSSs are activated, with
the maintenance DNA methylation mutant _met1_ activating the largest number of them (Fig. 2a). A large number of cryptic TSSs reside in TE sequences, predominantly
in members of the LTR/Gypsy, LTR/Copia, and DNA/En-Spm families (Fig. 5a). Interestingly, there is a clear difference in DNA methylation between TEs with and without EPICATs,
where the former accumulate higher DNA methylation (Fig. 5b, e). This suggests that the DNA methylation of TEs could be largely influenced by their potential to initiate transcription. On
the other hand, the analysis of LTR sequences indicated that the conservation of core promoter elements alone is not sufficient for transcription initiation (Fig. 5d, Supplementary Fig. 12d),
as transcription levels vary widely even among LTRs with nearly identical sequences. The ability of TE-encoded TSSs to initiate transcription may, therefore, also depend
on their relative positions within TEs (e.g., whether they are located at the 5′- or 3′-end of the TEs), and/or local chromatin environments, such as
higher-order chromatin conformation and long-range enhancer interactions61. In mammals, the loss of gene-body DNA methylation caused by _DNMT3b_ knockout triggers spurious RNAPII recruitment
and cryptic transcription initiation from intragenic regions25. The analysis of intragenic TSSs in the present study showed that a complete loss of gbM in the _met1_ mutant does not
profoundly activate intragenic transcription in the _Arabidopsis_ genome (Fig. 3a, 4, Supplementary Fig. 9a, b). Recruitment of DNMT3b to genic regions in mammals is dependent on histone
H3K36 methylation62. In yeast, H3K36 methylation (H3K36me) mediated by SET2 suppresses cryptic intragenic transcription initiation63. In plants, however, concurrent loss of both gbM and
H3K36me3 does not cause a significant difference in transcription between body-methylated (BM) and unmethylated (UM) loci48. On the other hand, regulation of cryptic transcription from intronic heterochromatin
by the RdDM pathway64, and the suppression of intragenic antisense transcripts by histone H1 and DNA methylation65 have also recently been reported. These results suggest that plants may
employ additional layers of epigenetic regulation to prevent spurious transcription initiation, especially in intragenic regions. The activation of spurious transcription from cryptic TSSs
would inevitably alter transcription from nearby regular TSSs (Fig. 6, Supplementary Fig. 14). The data showed that such alteration may occur in several different scenarios. First, an
activated cryptic TSS located upstream may function as the major initiation site facilitating the formation of a read-through transcript, which can suppress transcription from a downstream
regular TSS, as observed at the _AT2G16050_ and _SQN_ loci (Fig. 6b). This regulatory effect is likely facilitated by a poorly understood mechanism known as transcriptional interference66,67.
Secondly, the activation of a cryptic TSS located downstream may attenuate transcription initiated from an upstream regular TSS and trigger the production of spurious transcripts, as
observed at _AT2G14580_, _AT2G15042_, and _AT5G28442_ loci (Supplementary Fig. 14). Thirdly, when cryptic and regular TSSs are situated close to each other, but in divergent directions,
transcription from the regular TSS may also be suppressed (Fig. 6f). Such repressive impacts could be facilitated by competitive binding to regulatory sequences of transcription initiation
complexes associated with the two TSSs66, or by the mechanism suppressing transcription from divergent promoters68, or by the lack of a mechanism facilitating bi-directional transcription in
plants20 compared to mammals37. Whether the epigenetic regulation of cryptic TSSs brings any potential developmental and/or adaptive advantages or disadvantages to a plant species is of
great interest in plant research. As epigenetic information is relatively flexible and can be reprogrammed according to environmental stimuli, the mechanisms described here may provide
plants with a fast and efficient means of tuning, or even inverting the polarity of regulatory inputs on, gene expression. In addition, potential activation and co-option of cryptic TSSs can
provide alternative promoters to existing transcription units, as observed at the _AT2G16050_ and _COQ3_ loci (Fig. 6b, c, Supplementary Fig. 14a, b), which may help plants customize gene
functions during development51,52. Such events can also create opportunities for plants to innovate their transcriptome in response to environmental changes. However, the misregulation of
cryptic TSSs encoded in TEs may trigger developmental abnormalities in plants11,69. In addition, modulating the 3′ and/or 5′ UTRs of a transcript without changing its
coding potential can critically affect its function in response to pathogen attacks in _Arabidopsis_70. Epigenetic suppression of a cryptic TSS at the 5′ UTR of the LRR gene
_AT2G15042_ (Supplementary Fig. 14a) may, therefore, help maintain the proper response of _Arabidopsis_ to viral infection71. Importantly, activation of the cryptic TSS upstream of _SQN_
(_AT2G15790_), a gene important for vegetative shoot maturation in _Arabidopsis_57, leads to ectopic production of aberrant transcripts and a decreased accumulation of the normal one (Fig.
6b). Although the impacts of such transcriptional attenuation on plant development remain to be confirmed, it has been shown in _A. thaliana_ that light-induced regulation of alternative
promoters can generate proteins with differential localizations from the same genes, which helps alleviate the impact of changing light conditions on the plant72. Our data, therefore,
indicate that the epigenetic regulation of cryptic TSSs can profoundly affect the proper responses of plant species to ever-changing environmental conditions. Additionally,
as many protein-coding genes in _A. thaliana_ possess multiple active upstream as well as intragenic TSSs, it would be interesting to investigate whether cryptic TSSs are still in the
process of being co-opted to become functional in the _Arabidopsis_ genome. METHODS PLANT MATERIALS _ddm1-1_, _met1-3_, _ibm1-4_, _ibm2-2_, and _edm2-9_ mutants have been described
previously16,73,74,75. _suvh456_ and _nrpe1-7_ seeds were kindly provided by Dr. Kakutani and Dr. Kanno, respectively. The T-DNA insertion line of _nrpd1a-3_ (SALK_128428) was obtained from
the Arabidopsis Biological Resource Center (https://abrc.osu.edu). All the mutants are in the Columbia (Col) background. Second-generation homozygous _met1_, _ddm1_, _ibm1_, _ibm2_, and
_edm2_ plants were used for the RNA experiments described below. _nrpd1a_, _nrpe1_, and _suvh456_ were maintained as homozygotes for at least three generations before the experiments. Seeds were
germinated and grown on 1/2 Murashige and Skoog (MS) plates under long-day conditions (16-h light; 8-h dark) at 22 °C. RNA EXTRACTION AND CAGE For CAGE analysis, 10-to-12-day-old whole
seedlings of wild-type Col and mutant plants were pooled for RNA extraction. Total RNA was extracted using RNAiso (TAKARA), and DNA was digested with TURBO DNase (Thermo Fisher Scientific),
followed by purification with the RNeasy Plant Mini Kit (QIAGEN). Four technical replicates of WT Col and _met1_, and two technical replicates of the other samples, were prepared for CAGE. Single-end
75 bp CAGE libraries were prepared and sequenced at DNAFORM (Yokohama, Japan). RNA quality was assessed using a Bioanalyzer (Agilent) to ensure that the RIN (RNA integrity number) was over 7.0
and that the A260/280 and A260/230 ratios were over 1.7. CAGE SEQUENCING DATA ANALYSIS The CAGE sequencing (CAGE-seq) data were processed as follows: sequencing reads were trimmed using Trimmomatic
(v0.30)76 with the following parameters: HEADCROP:1, TRAILING:20, to remove nonspecific guanines38 and low quality bases at the read ends. These were then mapped to the
_Arabidopsis_ Col reference genome by HISAT2 (v2.0.0-beta)77, allowing up to ten alignments for a single read. Due to low mapping coverage, the met1.4 replicate was excluded
from further analysis. met1.3 was also discarded due to its low correlation with the two other replicates (met1.1 and met1.2). Then, uniquely mapped reads were used to identify TSSs at
single-base resolution (CTSSs) by CAGEr (v1.20.0)78 with the following parameters: sequencingQualityThreshold = 20, mappingQualityThreshold = 20. After being normalized to Tags Per Million (TPM),
CTSSs in each sample were grouped into tag clusters by the paraclu method, with threshold = 0.1, nrPassThreshold = 2, removeSingletons = TRUE, keepSingletonsAbove = 0.3, minStability = 2,
maxLength = 100. Finally, tag clusters from individual samples were merged into a common set of consensus tag clusters by the aggregateTagClusters function, with threshold = 0.3, qLow =
NULL, qUp = NULL, maxDist = 100, excludeSignalBelowThreshold = TRUE. Each consensus tag cluster was then considered a single reliable TSS, represented by its dominant CTSS, to distinguish it
from the TSSs annotated by TAIR10. Promoter width was defined as the distance between the 10th (qLow = 0.1) and 90th (qUp = 0.9) quantiles of the cumulative distribution of CAGE signal
along each tag cluster, as described in ref. 78. Raw tag counts were used to identify differentially expressed TSSs in the mutants compared to wild-type plants by DESeq2 (v1.22.2)79, with
a significance cut-off of _p_adj ≤ 0.1. ANNOTATING TSSS IDENTIFIED BY CAGE-SEQ TAIR10 genome annotations of 19,891 TEs and 27,600 protein-coding genes and non-coding RNAs in
_A. thaliana_ were obtained from ref. 14. The Araport11 version of the genome annotation was also downloaded from The Arabidopsis Information Resource (TAIR)
(https://www.arabidopsis.org/). Promoters were defined as the 1 kb regions upstream of the TAIR-annotated TSSs. A TSS identified by CAGE-seq was annotated based on the genomic location of its
dominant CTSS, in the following order: promoter, 5′ UTR, 3′ UTR, intron, exon, antisense, TE, intergenic. TSSs identified by the PEAT method were obtained from ref.
31. Then, the shortest distance between the dominant CTSS of each CAGE-seq tag cluster and the mode locations of PEAT TSSs in the same direction was calculated. PEAT TSSs that exactly
matched CAGE-seq TSSs (distance = 0 nt) were used as a proxy to estimate interquantile widths for each shape category defined in ref. 31, including narrow peak (NP), broad with peak (BP), and weak
peak (WP). MRNA SEQUENCING DATA ANALYSIS Paired-end mRNA sequencing (mRNA-seq) data were prepared following the method described in ref. 14 and processed as follows: reads were trimmed by
Trimmomatic to remove sequencing bias and adapter sequences, then mapped to the _Arabidopsis_ Col reference genome by HISAT2, allowing up to ten alignments for a read
pair. The featureCounts function in the package Rsubread (v1.14.2)80 was used to identify the number of read pairs uniquely mapped to genes and TEs. The outputs of mRNA-seq mapping were also
used for transcript assembly as follows: first, transcripts of each individual sample were assembled by Cufflinks (v2.2.1)81. Lowly expressed transcripts (below the 10th percentile of
expression of all the assembled transcripts) were then removed. The remaining transcripts from all samples were merged to create a unified set of transcripts. They were then compared to
reference transcripts in TAIR10 by the cuffcompare function to identify splicing patterns. Differential promoter usage was assessed by the cuffdiff function. To identify assembled
transcripts associated with EPICATs, overlap tests were conducted between the transcripts and genomic regions centered on the EPICATs’ dominant CTSSs (extended by 180 bp on both sides,
given that a TSS identified by CAGE-seq could be associated with a nearby transcript (Supplementary Fig. 2b)). The results are given in Supplementary Data 6. CHIP SEQUENCING DATA
ANALYSIS ChIP sequencing (ChIP-seq) data of histone modifications, including H3K27me1/3, H3K9me2, H3K36me3, and H3K4me3, in wild-type plants were retrieved from a previous study82.
Paired-end ChIP-seq data of RNAPII in wild-type plants and mutants were prepared as follows: two-week-old whole seedlings of wild-type Col, _met1_, and _ddm1_ were fixed in a fixation
buffer (10 mM Tris-HCl (pH 7.5), 50 mM NaCl, 0.1 M sucrose, 1% formaldehyde) for 20 min, followed by quenching with 125 mM glycine. Nuclei isolation was performed as previously described83.
PolII ChIP was performed for two replicates for each genotype (about 1 g tissue/IP) by SimpleChIP Plus Kit (Cell Signaling Technology) according to the manufacturer’s instructions. Anti-RNA
polymerase II CTD repeat YSPTSPS (phospho S2) (Abcam ab5095) and Anti-RNA polymerase II CTD repeat YSPTSPS (phospho S5) (Abcam ab5408) antibodies were used for IPs (4 μg/IP). Precipitated
DNA samples were sequenced on a HiSeq 4000 in 150 bp paired-end mode at the OIST SQC. Due to the large overlap between the two reads, only one read (read 1) of each pair was used for downstream
analysis. Reads were trimmed to remove sequencing bias and adapter sequences using Trimmomatic, then mapped to the _Arabidopsis_ Col reference genome by Bowtie
(v1.0.0)84. Reads mapped to an identical position were collapsed into a single read, and only the best alignment was kept for a read mapped to multiple locations. Mapping results are given
in Supplementary Data 9. ChIP-seq data of PolIV (NRPD1) and the list of NRPD1 binding loci were obtained from ref. 85. Genomic locations of NRPD1 binding loci were then converted from TAIR8
to TAIR9 coordinates using the _update_coordinates.pl_ script provided by TAIR. ChIP-seq data of RNAPII in _pol4_ and corresponding wild-type plants were obtained from ref. 50. These data
were processed as described above. Preprocessed RNAPII Ser5P ChIP-seq data (in bigwig format) in _pol5_ were downloaded from ref. 64 and directly used for visualization. BISULFITE SEQUENCING
DATA ANALYSIS Whole-genome bisulfite sequencing (WGBS) MethylC-Seq data of wild-type plants and epigenetic mutants were retrieved from ref. 9. High quality reads (_q_ ≥ 28), trimmed to
remove adapter effects and sequencing bias, were mapped to the _Arabidopsis_ Col reference genome using Bismark (v0.12.1)86 allowing up to two mismatches. Bases covered by fewer than 3 reads
were excluded, and only uniquely mapped reads were used for further analysis. Methylation levels were calculated using methylKit (v0.5.7)87. The lists of BM, intermediate methylated (IM),
and unmethylated (UM) genes were obtained from ref. 44. To exclude the potential impacts of non-CG methylation on the activation of intragenic EPICATs, only _met1_-activated intragenic
EPICATs with low (less than 10%) CHG methylation in the 101 bp regions centering around their dominant CTSSs were examined (Supplementary Data 4). SMALL RNA SEQUENCING DATA ANALYSIS
Sequencing data of 24 nt small interfering RNAs (siRNAs) in wild-type and _nrpd1_ mutant plants were obtained from ref. 85 and trimmed by Trim Galore (v0.4.5)88 with Cutadapt (v1.8.3)89,
using the following parameters: stringency:4, quality:20, length:15, max_length:30. PolIV-dependent small RNAs (P4RNAs) longer than 27 nt in _dcl2/3/4_ and corresponding wild-type plants
were obtained from ref. 50 and trimmed by Trimmomatic. These data were then mapped to the _Arabidopsis_ Col reference genome by Bowtie (v1.0.0), allowing up to two
mismatches. Only uniquely mapped reads were used for further analysis. SEQUENCE MOTIF ANALYSIS De novo motif analysis and search of motif instances were conducted using MEME suite (v4.11.2)
with default parameters90. GYPSY LTR ANALYSIS Gypsy family sequences were retrieved from the TAIR database and aligned to obtain the full-length sequence for each family. LTR regions were
then determined by comparing the 5′ and 3′ ends of TE sequences and also checked by LTR_FINDER (v1.0.2)91. Several copies from each family were used to obtain
consensus sequences of LTRs (Supplementary Data 10). Consensus sequences of Gypsy LTRs were used to search for LTR sequences in the _Arabidopsis_ genome (TAIR10) using BLAST (v2.0)92. BLAST
hits shorter than 100 bp were discarded. LTR sequences were then aligned using ClustalW (v2.1)93, and edited using Jalview (v2.11.0)94. DATA VISUALIZATION Figures were created using
deepTools (v3.3.0)95, Integrated Genome Browser (IGB) (v9.1.2)96 with the Araport11 version of genome annotations, Excel, and the R package ggplot2 (v2.3.1)97. DNA methylation files were
first converted from bedGraph into bigWig format by the bedGraphToBigWig utility (http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/), then used to generate heatmap and metaplot
figures using deepTools. mRNA-seq data were normalized to reads per million (RPM), and a single replicate was used to create the IGB track. ChIP-seq signals were normalized to log2(ChIP/input),
and a single replicate of RNAPII (both Ser5P and Ser2P) was used for visualization in IGB. Small RNA sequencing data and RNAPII ChIP-seq data with no input samples were normalized to counts
per million (CPM). 5′-RACE AND QUANTITATIVE PCR 5′-RACE was performed using the SMARTer RACE kit (TAKARA) according to the manufacturer’s instructions. Quantitative PCR (qPCR) was
performed following the method described in ref. 75. All primers used in this study are listed in Supplementary Data 11. REPORTING SUMMARY Further information on research design is available
in the Nature Research Reporting Summary linked to this article. DATA AVAILABILITY Sequencing data have been deposited to the DDBJ Sequence Read Archive under the accession codes DRA009134
and DRA009847. Processed CAGE-seq data are also accessible via the following web link: https://plantepigenetics.oist.jp/. The source data underlying Figs. 1b, 2c, 3b, 4d–e, 5b, and 6c, d, f
and Supplementary Figs. 6b–c, 7d, and 14b–c are provided as a Source Data file. CODE AVAILABILITY In-house R code and bash scripts customized for data analysis are available from the
authors upon request. REFERENCES * Fedoroff, N. V. Transposable elements, epigenetics, and genome evolution. _Science_ 338, 758–767 (2012). ADS CAS PubMed Google Scholar * Lisch, D. How
important are transposons for plant evolution? _Nat. Rev. Genet._ 14, 49 (2013). CAS PubMed Google Scholar * Chuong, E. B., Elde, N. C. & Feschotte, C. Regulatory activities of
transposable elements: from conflicts to benefits. _Nat. Rev. Genet._ 18, 71 (2017). CAS PubMed Google Scholar * Slotkin, R. K. & Martienssen, R. Transposable elements and the
epigenetic regulation of the genome. _Nat. Rev. Genet._ 8, 272–285 (2007). CAS PubMed Google Scholar * Law, J. A. & Jacobsen, S. E. Establishing, maintaining and modifying DNA
methylation patterns in plants and animals. _Nat. Rev. Genet._ 11, 204–220 (2010). CAS PubMed PubMed Central Google Scholar * Zhang, H., Lang, Z. & Zhu, J.-K. Dynamics and function
of DNA methylation in plants. _Nat. Rev. Mol. Cell Biol._ 19, 489–506 (2018). CAS PubMed Google Scholar * Du, J. et al. Dual binding of chromomethylase domains to H3K9ME2-containing
nucleosomes directs DNA methylation in plants. _Cell_ 151, 167–180 (2012). CAS PubMed PubMed Central Google Scholar * Zemach, A. et al. The _Arabidopsis_ nucleosome remodeler DDM1 allows
dna methyltransferases to access H1-containing heterochromatin. _Cell_ 153, 193–205 (2013). CAS PubMed PubMed Central Google Scholar * Stroud, H., Greenberg, M. V., Feng, S.,
Bernatavichute, Y. V. & Jacobsen, S. E. Comprehensive analysis of silencing mutants reveals complex regulation of the _Arabidopsis methylome_. _Cell_ 152, 352–364 (2013). CAS PubMed
PubMed Central Google Scholar * Liu, J., He, Y., Amasino, R. & Chen, X. siRNAs targeting an intronic transposon in the regulation of natural flowering behavior in _Arabidopsis_. _Genes
Dev._ 18, 2873–2878 (2004). CAS PubMed PubMed Central Google Scholar * Kinoshita, Y. et al. Control of FWA gene silencing in _Arabidopsis thaliana_ by sine-related direct repeats.
_Plant J._ 49, 38–45 (2007). CAS PubMed Google Scholar * Henderson, I. R. & Jacobsen, S. E. Tandem repeats upstream of the _Arabidopsis_ endogene SDC recruit non-cg DNA methylation
and initiate sirna spreading. _Genes Dev._ 22, 1597–1606 (2008). CAS PubMed PubMed Central Google Scholar * Hollister, J. D. & Gaut, B. S. Epigenetic silencing of transposable
elements: a trade-off between reduced transposition and deleterious effects on neighboring gene expression. _Genome Res._ 19, 1419–1428 (2009). CAS PubMed PubMed Central Google Scholar *
Le, T. N., Miyazaki, Y., Takuno, S. & Saze, H. Epigenetic regulation of intragenic transposable elements impacts gene transcription in _Arabidopsis thaliana_. _Nucleic Acids Res._ 43,
3911–3921 (2015). CAS PubMed PubMed Central Google Scholar * Saze, H., Shiraishi, A., Miura, A. & Kakutani, T. Control of genic DNA methylation by a JMJC domain-containing protein in
_Arabidopsis thaliana_. _Science_ 319, 462–465 (2008). ADS CAS PubMed Google Scholar * Saze, H. et al. Mechanism for full-length RNA processing of _Arabidopsis_ genes containing
intragenic heterochromatin. _Nat. Commun._ 4, 2301 (2013). ADS PubMed Google Scholar * Lei, M. et al. _Arabidopsis_ EDM2 promotes _IBM1_ distal polyadenylation and regulates genome DNA
methylation patterns. _Proc. Natl Acad. Sci. USA_ 111, 527–532 (2014). ADS CAS PubMed Google Scholar * Ni, T. et al. A paired-end sequencing strategy to map the complex landscape of
transcription initiation. _Nat. Methods_ 7, 521–527 (2010). CAS PubMed PubMed Central Google Scholar * Takahashi, H., Lassmann, T., Murata, M. & Carninci, P. 5’ end-centered
expression profiling using cap-analysis gene expression and next-generation sequencing. _Nat. Protoc._ 7, 542–561 (2012). CAS PubMed PubMed Central Google Scholar * Hetzel, J., Duttke,
S. H., Benner, C. & Chory, J. Nascent RNA sequencing reveals distinct features in plant transcription. _Proc. Natl Acad. Sci. USA_ 113, 12316–12321 (2016). CAS PubMed PubMed Central
Google Scholar * Tokizawa, M. et al. Identification of _Arabidopsis_ genic and non-genic promoters by paired-end sequencing of TSS tags. _Plant J._ 90, 587–605 (2017). CAS PubMed Google
Scholar * Lu, Z. & Lin, Z. Pervasive and dynamic transcription initiation in _Saccharomyces cerevisiae_. _Genome Res_. https://doi.org/10.1101/gr.245456.118 (2019). * Fort, A. et al.
Deep transcriptome profiling of mammalian stem cells supports a regulatory role for retrotransposons in pluripotency maintenance. _Nat. Genet._ 46, 558–566 (2014). CAS PubMed Google
Scholar * Hashimoto, K. et al. Cage profiling of ncrnas in hepatocellular carcinoma reveals widespread activation of retroviral LTR promoters in virus-induced tumors. _Genome Res._ 25,
1812–1824 (2015). CAS PubMed PubMed Central Google Scholar * Neri, F. et al. Intragenic DNA methylation prevents spurious transcription initiation. _Nature_ 543, 72–77 (2017). ADS CAS
PubMed Google Scholar * Brocks, D. et al. DNMT and HDAC inhibitors induce cryptic transcription start sites encoded in long terminal repeats. _Nat. Genet._ 49, 1052 (2017). CAS PubMed
PubMed Central Google Scholar * Yamamoto, Y. Y. et al. Differentiation of core promoter architecture between plants and mammals revealed by LDSS analysis. _Nucleic Acids Res._ 35,
6219–6226 (2007). CAS PubMed PubMed Central Google Scholar * Mejía-Guerra, M. K. et al. Core promoter plasticity between maize tissues and genotypes contrasts with predominance of sharp
transcription initiation sites. _Plant Cell._ 27, 3309–3320 (2015). PubMed PubMed Central Google Scholar * Yamamoto, Y. Y. et al. Identification of plant promoter constituents by analysis
of local distribution of short sequences. _BMC Genomics_ 8, 67 (2007). PubMed PubMed Central Google Scholar * Yamamoto, Y. Y. et al. Heterogeneity of _Arabidopsis_ core promoters
revealed by high-density TSS analysis. _Plant J._ 60, 350–362 (2009). CAS PubMed Google Scholar * Morton, T. et al. Paired-end analysis of transcription start sites in _Arabidopsis_
reveals plant-specific promoter signatures. _Plant Cell._ 26, 2746–2760 (2014). CAS PubMed PubMed Central Google Scholar * Fejes-Toth, K. et al. Post-transcriptional processing generates
a diversity of 5’-modified long and short rnas: affymetrix/cold spring harbor laboratory encode transcriptome project. _Nature_ 457, 1028–1032 (2009). ADS CAS PubMed Central Google
Scholar * Mercer, T. R. et al. Regulated post-transcriptional RNA cleavage diversifies the eukaryotic transcriptome. _Genome Res._ 20, 1639–1650 (2010). CAS PubMed PubMed Central Google
Scholar * Nielsen, M. et al. Transcription-driven chromatin repression of intragenic transcription start sites. _PLoS Genet._ 15, e1007969 (2019). CAS PubMed PubMed Central Google
Scholar * Soppe, W. J. et al. The late flowering phenotype of FWA mutants is caused by gain-of-function epigenetic alleles of a homeodomain gene. _Mol. Cell_ 6, 791–802 (2000). CAS PubMed
Google Scholar * Lippman, Z. & Martienssen, R. The role of RNA interference in heterochromatic silencing. _Nature_ 431, 364–370 (2004). ADS CAS PubMed Google Scholar * Seila, A.
C. et al. Divergent transcription from active promoters. _Science_ 322, 1849–1851 (2008). ADS CAS PubMed PubMed Central Google Scholar * Carninci, P. et al. Genome-wide analysis of
mammalian promoter architecture and evolution. _Nat. Genet._ 38, 626–635 (2006). CAS PubMed Google Scholar * Shu, H., Wildhaber, T., Siretskiy, A., Gruissem, W. & Hennig, L. Distinct
modes of DNA accessibility in plant chromatin. _Nat. Commun._ 3, 1281 (2012). ADS PubMed Google Scholar * Zhang, T., Marand, A. P. & Jiang, J. PlantDHS: a database for DNase I
hypersensitive sites in plants. _Nucleic Acids Res._ 44, D1148–D1153 (2015). PubMed PubMed Central Google Scholar * Eick, D. & Geyer, M. The rna polymerase II carboxy-terminal domain
(CTD) code. _Chem. Rev._ 113, 8456–8490 (2013). CAS PubMed Google Scholar * Xiao, J. et al. _Cis_ and _trans_ determinants of epigenetic silencing by polycomb repressive complex 2 in
_Arabidopsis_. _Nat. Genet._ 49, 1546–1552 (2017). CAS PubMed Google Scholar * Deleris, A. et al. Loss of the DNA methyltransferase MET1 induces H3K9 hypermethylation at PcG target genes
and redistribution of H3K27 trimethylation to transposons in _Arabidopsis thaliana_. _PLoS Genet._ 8, e1003062 (2012). CAS PubMed PubMed Central Google Scholar * Takuno, S. & Gaut,
B. S. Body-methylated genes in _Arabidopsis thaliana_ are functionally important and evolve slowly. _Mol. Biol. Evol._ 29, 219–227 (2011). PubMed Google Scholar * Bewick, A. J. &
Schmitz, R. J. Gene body DNA methylation in plants. _Curr. Opin. Plant Biol._ 36, 103–110 (2017). CAS PubMed PubMed Central Google Scholar * Zilberman, D., Gehring, M., Tran, R. K.,
Ballinger, T. & Henikoff, S. Genome-wide analysis of _Arabidopsis thaliana_ DNA methylation uncovers an interdependence between methylation and transcription. _Nat. Genet._ 39, 61–69
(2007). CAS PubMed Google Scholar * Horvath, R., Laenen, B., Takuno, S. & Slotte, T. Single-cell expression noise and gene-body methylation in _Arabidopsis thaliana_. _Heredity_ 123,
81–91 (2019). * Bewick, A. J. et al. On the origin and evolutionary consequences of gene body DNA methylation. _Proc. Natl Acad. Sci. USA_ 113, 9111–9116 (2016). CAS PubMed PubMed Central
Google Scholar * Rigal, M., Kevei, Z., Pélissier, T. & Mathieu, O. DNA methylation in an intron of the IBM1 histone demethylase gene stabilizes chromatin modification patterns. _EMBO
J._ 31, 2981–2993 (2012). CAS PubMed PubMed Central Google Scholar * Zhai, J. et al. A one precursor one siRNA model for Pol IV-dependent siRNA biogenesis. _Cell_ 163, 445–455 (2015).
CAS PubMed PubMed Central Google Scholar * Faulkner, G. J. et al. The regulated retrotransposon transcriptome of mammalian cells. _Nat. Genet._ 41, 563–571 (2009). CAS PubMed Google
Scholar * Batut, P., Dobin, A., Plessy, C., Carninci, P. & Gingeras, T. R. High-fidelity promoter profiling reveals widespread alternative promoter usage and transposon-driven
developmental gene expression. _Genome Res._ 23, 169–180 (2013). CAS PubMed PubMed Central Google Scholar * Settles, A. M., Baron, A., Barkan, A. & Martienssen, R. A. Duplication and
suppression of chloroplast protein translocation genes in maize. _Genetics_ 157, 349–360 (2001). CAS PubMed PubMed Central Google Scholar * Butelli, E. et al. Retrotransposons control
fruit-specific, cold-dependent accumulation of anthocyanins in blood oranges. _Plant Cell_ 24, 1242–1255 (2012). CAS PubMed PubMed Central Google Scholar * Tsukahara, S. et al. Bursts of
retrotransposition reproduced in _Arabidopsis_. _Nature_ 461, 423–426 (2009). ADS CAS PubMed Google Scholar * Hirsch, C. D. & Springer, N. M. Transposable element influences on gene
expression in plants. _Biochim Biophys. Acta Gene Regul. Mech._ 1860, 157–165 (2017). CAS PubMed Google Scholar * Prunet, N. et al. SQUINT promotes stem cell homeostasis and floral
meristem termination in _Arabidopsis_ through APETALA2 and CLAVATA signalling. _J. Exp. Bot._ 66, 6905–6916 (2015). CAS PubMed Google Scholar * Avelange-Macherel, M.-H. & Joyard, J.
Cloning and functional expression of ATCOQ3, the _Arabidopsis_ homologue of the yeast COQ3 gene, encoding a methyltransferase from plant mitochondria involved in ubiquinone biosynthesis.
_Plant J._ 14, 203–213 (1998). CAS PubMed Google Scholar * Meinke, D. W. Genome-wide identification of EMBRYO-DEFECTIVE (EMB) genes required for growth and development in _Arabidopsis_.
_New Phytol._ 14, 306–325 (2019).
* Santamaria, M., Thomson, C. J., Read, N. D. & Loake, G. J. The promoter of a basic PR1-like gene, ATPRB1, from _Arabidopsis_ establishes an organ-specific expression pattern and responsiveness to ethylene and methyl jasmonate. _Plant Mol. Biol._ 47, 641–652 (2001).
* Todd, C. D., Deniz, Ö., Taylor, D. & Branco, M. R. Functional evaluation of transposable elements as enhancers in mouse embryonic and trophoblast stem cells. _eLife_ 8, e44344 (2019).
* Teissandier, A. & Bourc’his, D. Gene body DNA methylation conspires with H3K36me3 to preclude aberrant transcription. _EMBO J._ 36, 1471–1473 (2017).
* Carrozza, M. J. et al. Histone H3 methylation by Set2 directs deacetylation of coding regions by Rpd3S to suppress spurious intragenic transcription. _Cell_ 123, 581–592 (2005).
* Zhou, J. et al. Intronic heterochromatin prevents cryptic transcription initiation in _Arabidopsis_. _Plant J._ 101, 1185–1197 (2019).
* Choi, J., Lyons, D. B., Kim, M. Y., Moore, J. D. & Zilberman, D. DNA methylation and histone H1 jointly repress transposable elements and aberrant intragenic transcripts. _Mol. Cell._ 77, 310–323 (2020).
* Shearwin, K. E., Callen, B. P. & Egan, J. B. Transcriptional interference–a crash course. _Trends Genet._ 21, 339–345 (2005).
* Palmer, A. C., Egan, J. B. & Shearwin, K. E. Transcriptional interference by RNA polymerase pausing and dislodgement of transcription factors. _Transcription_ 2, 9–14 (2011).
* Wu, A. C. et al. Repression of divergent noncoding transcription by a sequence-specific
transcription factor. _Mol. Cell._ 72, 942–954 (2018).
* Hedtke, B. & Grimm, B. Silencing of a plant gene by transcriptional interference. _Nucleic Acids Res._ 37, 3739–3746 (2009).
* Wang, Y.-H. & Warren Jr, J. T. Mutations in retrotransposon AtCOPIA4 compromises resistance to Hyaloperonospora parasitica in _Arabidopsis thaliana_. _Genet. Mol. Biol._ 33, 135–140 (2010).
* Diezma-Navas, L. et al. Crosstalk between epigenetic silencing and infection by tobacco rattle virus in _Arabidopsis_. _Mol. Plant Pathol._ 20, 1439–1452 (2019).
* Ushijima, T. et al. Light controls protein localization through phytochrome-mediated alternative promoter selection. _Cell_ 171, 1316–1325 (2017).
* Vongs, A., Kakutani, T., Martienssen, R. A. & Richards, E. J. _Arabidopsis thaliana_ DNA methylation mutants. _Science_ 260, 1926–1928 (1993).
* Saze, H., Scheid, O. M. & Paszkowski, J. Maintenance of CpG methylation is essential for epigenetic inheritance during plant gametogenesis. _Nat. Genet._ 34, 65–69 (2003).
* Osabe, K., Harukawa, Y., Miura, S. & Saze, H. Epigenetic regulation of intronic transgenes in _Arabidopsis_. _Sci. Rep._ 7, 45166 (2017).
* Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. _Bioinformatics_ 30, 2114–2120 (2014).
* Kim, D., Langmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. _Nat. Methods_ 12, 357–360 (2015).
* Haberle, V., Forrest, A. R., Hayashizaki, Y.,
Carninci, P. & Lenhard, B. CAGEr: precise TSS data retrieval and high-resolution promoterome mining for integrative analyses. _Nucleic Acids Res._ 43, e51 (2015).
* Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. _Genome Biol._ 15, 550 (2014).
* Liao, Y., Smyth, G. K. & Shi, W. The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads. _Nucleic Acids Res._ 47, e47 (2019).
* Trapnell, C. et al. Transcript assembly and quantification by RNA-seq reveals unannotated transcripts and isoform switching during cell differentiation. _Nat. Biotechnol._ 28, 511–515 (2010).
* Luo, C. et al. Integrative analysis of chromatin states in _Arabidopsis_ identified potential regulatory mechanisms for natural antisense transcript production. _Plant J._ 73, 77–90 (2013).
* Saleh, A., Alvarez-Venegas, R. & Avramova, Z. An efficient chromatin immunoprecipitation (ChIP) protocol for studying histone modifications in _Arabidopsis_ plants. _Nat. Protoc._ 3, 1018 (2008).
* Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. _Genome Biol._ 10, R25 (2009).
* Law, J. A. et al. Polymerase IV occupancy at RNA-directed DNA methylation sites requires SHH1. _Nature_ 498, 385–389 (2013).
* Krueger, F. & Andrews, S. R. Bismark: a flexible aligner and methylation caller for bisulfite-seq applications. _Bioinformatics_ 27,
1571–1572 (2011).
* Akalin, A. et al. methylKit: a comprehensive R package for the analysis of genome-wide DNA methylation profiles. _Genome Biol._ 13, R87 (2012).
* Krueger, F. Trim Galore (Babraham Bioinformatics, 2015).
* Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. _EMBnet. J._ 17, 10–12 (2011).
* Bailey, T. L. et al. MEME Suite: tools for motif discovery and searching. _Nucleic Acids Res._ 37, W202–W208 (2009).
* Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. _Nucleic Acids Res._ 35, W265–W268 (2007).
* Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. _J. Mol. Biol._ 215, 403–410 (1990).
* Thompson, J. D., Higgins, D. G. & Gibson, T. J. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. _Nucleic Acids Res._ 22, 4673–4680 (1994).
* Clamp, M., Cuff, J., Searle, S. M. & Barton, G. J. The Jalview Java alignment editor. _Bioinformatics_ 20, 426–427 (2004).
* Ramírez, F., Dündar, F., Diehl, S., Grüning, B. A. & Manke, T. deepTools: a flexible platform for exploring deep-sequencing data. _Nucleic Acids Res._ 42, W187–W191 (2014).
* Freese, N. H., Norris, D. C. & Loraine, A. E. Integrated Genome Browser: visual analytics platform for genomics. _Bioinformatics_ 32, 2089–2095 (2016).
* Wickham, H. _ggplot2:
Elegant Graphics for Data Analysis_ (Springer-Verlag, New York, 2016).

ACKNOWLEDGEMENTS This work was supported by JSPS KAKENHI Grant Number 19K06619 to H.S., and by Okinawa Institute of Science and Technology Graduate University. We thank the Arabidopsis Biological Resource Center and the Salk Institute Genomic Analysis Laboratory for providing _Arabidopsis_ T-DNA insertion mutants, OIST SQC for RNA-seq, ChIP-seq, and BS-seq sequencing services, Dr. Tetsuji Kakutani and Dr. Tatsuo Kanno for providing mutant seeds, Dr. Shohei Takuno for kindly sharing the list of BM genes in _A. thaliana_, OIST Infrastructure Section for technical support in building the web interface to access data, and OIST English editing service for proofreading the manuscript.

AUTHOR INFORMATION AUTHORS AND AFFILIATIONS
* Plant Epigenetics Unit, Okinawa Institute of Science and Technology (OIST), 1919-1
Tancha, Onna-son, Kunigami-gun, Okinawa, 904-0495, Japan: Ngoc Tu Le, Yoshiko Harukawa, Saori Miura & Hidetoshi Saze
* Wageningen University & Research, Droevendaalsesteeg 4, 6708 PB Wageningen, Netherlands: Damian Boer
* Faculty of Life Sciences, Kyoto Sangyo University, Kyoto, 603-8555, Japan: Akira Kawabe

CONTRIBUTIONS Experiments were designed by N.T.L. and H.S., and performed by Y.H., S.M., and H.S. Data analysis was performed by N.T.L., with the support of D.B. for gene expression analysis using mRNA-seq data. LTR sequences were analyzed by A.K. The manuscript was prepared by N.T.L. and H.S.

CORRESPONDING AUTHOR Correspondence to Hidetoshi Saze.

ETHICS DECLARATIONS COMPETING INTERESTS The authors declare no competing interests.

ADDITIONAL INFORMATION PEER REVIEW INFORMATION _Nature Communications_ thanks the anonymous reviewers for their contribution to the peer
review of this work. Peer review reports are available.

PUBLISHER’S NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

SUPPLEMENTARY INFORMATION Supplementary Information, Peer Review File, Description of Additional Supplementary Files, Supplementary Data 1–11, Reporting Summary, Source Data.

RIGHTS AND PERMISSIONS OPEN ACCESS This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation,
distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and
indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to
the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will
need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

ABOUT THIS ARTICLE CITE THIS ARTICLE Le, N.T., Harukawa, Y., Miura, S. _et al._ Epigenetic regulation of spurious transcription initiation in _Arabidopsis_. _Nat Commun_ 11, 3224 (2020). https://doi.org/10.1038/s41467-020-16951-w
* Received: 09 December 2019
* Accepted: 01 June 2020
* Published: 26 June 2020
* DOI: https://doi.org/10.1038/s41467-020-16951-w