ABSTRACT The extent to which sequence variation impacts plant fitness is poorly understood. High-resolution maps detailing the constraint acting on the genome, especially in regulatory
sites, would be beneficial as functional annotation of noncoding sequences remains sparse. Here, we present a fitness consequence (fitCons) map for rice (_Oryza sativa_). We inferred fitCons
scores (_ρ_) for 246 inferred genome classes derived from nine functional genomic and epigenomic datasets, including chromatin accessibility, messenger RNA/small RNA transcription, DNA
methylation, histone modifications and engaged RNA polymerase activity. These were integrated with genome-wide polymorphism and divergence data from 1,477 rice accessions and 11 reference
genome sequences in the Oryzeae. We found _ρ_ to be multimodal, with ~9% of the rice genome falling into classes where more than half of the bases would probably have a fitness consequence
if mutated. Around 2% of the rice genome showed evidence of weak negative selection, frequently at candidate regulatory sites, including a novel set of 1,000 potentially active enhancer
elements. This fitCons map provides perspective on the evolutionary forces associated with genome diversity, aids in genome annotation and can guide crop breeding programs.
DATA AVAILABILITY The read data used to generate the ChromHMM model
and genomic classes have been deposited at the NCBI SRA (https://www.ncbi.nlm.nih.gov/sra) and can be accessed through BioProject ID PRJNA586887. Genome assemblies of _O. officinalis_ and
_O. australiensis_ are available from the CoGe CyVerse website (https://genomevolution.org/coge/) with genome IDs id56031 and id56030, respectively. Access to genomic class annotation and
INSIGHT scoring of the rice genome is available via a genome browser linked from the project’s website (http://purugganan-genomebrowser.bio.nyu.edu/insightJuly2018/greenInsight.html). All
epigenomic data tracks, genome annotations, multiple alignments, conservation scores, fitCons scores and site classes are available for visualization and download on a local installation of
the UCSC Genome Browser at http://purugganan-genomebrowser.bio.nyu.edu/cgi-bin/hgTracks?db=Osaj&position=Osaj.1%3A166356–178595, and are also available for download from the NCBI SRA
(PRJNA586887). The greenINSIGHT-specific data used to generate the greenINSIGHT online tool are available in the “Additional information, scripts & data” section at
http://purugganan-genomebrowser.bio.nyu.edu/insightJuly2018/greenInsight.html. CODE AVAILABILITY The greenINSIGHT-specific code used to generate the greenINSIGHT online tool, as well as the code described in
the Methods, are available in the “Additional information, scripts & data” section at http://purugganan-genomebrowser.bio.nyu.edu/insightJuly2018/greenInsight.html.
ACKNOWLEDGEMENTS We thank the New York University Center for Genomics and Systems Biology GenCore Facility and the Next Generation Sequencing core at Cold Spring Harbor Laboratory for
sequencing support. We thank O. Wilkins and C. Danko for valuable suggestions relating to the ATAC and PRO-Seq protocols, respectively. This work was supported primarily by a grant from the
Zegar Family Foundation (no. A16-0051-004), as well as some support from the National Science Foundation Plant Genome Research Program (no. IOS-1546218) and NYU Abu Dhabi Research Institute
to M.D.P., the National Science Foundation CAREER award (no. MCB-1552455), the US National Institutes of Health (no. R35GM124806) and US Department of Agriculture Hatch Program (no. 1012915)
to X.Z., the US National Institutes of Health (no. R35GM127070) to A.S., and fellowships from the Gordon and Betty Moore Foundation and Life Sciences Research Foundation (no. GBMF2550.06)
to S.C.G. and from the Natural Sciences and Engineering Research Council of Canada (no. PDF-502464-2017) to Z.J.-L. AUTHOR INFORMATION AUTHORS AND AFFILIATIONS * Center for Genomics and
Systems Biology, Department of Biology, New York University, New York, NY, USA Zoé Joly-Lopez, Adrian E. Platts, Jae Young Choi, Simon C. Groen & Michael D. Purugganan * Simons Center
for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA Adrian E. Platts, Brad Gulko & Adam Siepel * Laboratory of Genetics and Wisconsin Institute for
Discovery, University of Wisconsin-Madison, Madison, WI, USA Xuehua Zhong * Center for Genomics and Systems Biology, NYU Abu Dhabi Research Institute, NYU Abu Dhabi, Abu Dhabi, United Arab
Emirates Michael D. Purugganan CONTRIBUTIONS M.D.P. conceived of the study idea. M.D.P., Z.J.-L., A.E.P. and A.S.
designed the study. M.D.P. directed the study. Z.J.-L. and X.Z. collected the data. A.E.P., Z.J.-L., J.Y.C., B.G., S.C.G. and M.D.P. analysed the data. Z.J.-L., A.E.P., A.S. and M.D.P. wrote
the paper. CORRESPONDING AUTHOR Correspondence to Michael D. Purugganan. ETHICS DECLARATIONS COMPETING INTERESTS The authors declare no competing interests. ADDITIONAL INFORMATION PEER
REVIEW INFORMATION _Nature Plants_ thanks Robin Allaby, Peter Civan and Peter Morrell for their contribution to the peer review of this work. PUBLISHER’S NOTE Springer Nature remains neutral
with regard to jurisdictional claims in published maps and institutional affiliations. SUPPLEMENTARY INFORMATION Supplementary Figs. 1–12. REPORTING SUMMARY
SUPPLEMENTARY TABLES Supplementary Tables 1–11. ABOUT THIS ARTICLE CITE THIS ARTICLE Joly-Lopez, Z., Platts, A.E., Gulko, B. _et al._ An
inferred fitness consequence map of the rice genome. _Nat. Plants_ 6, 119–130 (2020). https://doi.org/10.1038/s41477-019-0589-3 * Received: 18 July 2019 * Accepted: 20
December 2019 * Published: 10 February 2020 * Issue Date: February 2020 * DOI: https://doi.org/10.1038/s41477-019-0589-3