Disentangling the genetic basis of rhizosphere microbiome assembly in tomato

Disentangling the genetic basis of rhizosphere microbiome assembly in tomato

Play all audios:

Loading...

ABSTRACT Microbiomes play a pivotal role in plant growth and health, but the genetic factors involved in microbiome assembly remain largely elusive. Here, we map the molecular features of


the rhizosphere microbiome as quantitative traits of a diverse hybrid population of wild and domesticated tomato. Gene content analysis of prioritized tomato quantitative trait loci suggests


a genetic basis for differential recruitment of various rhizobacterial lineages, including a _Streptomyces_-associated 6.31 Mbp region harboring tomato domestication sweeps and encoding,


among others, the iron regulator FIT and the water channel aquaporin SlTIP2.3. Within metagenome-assembled genomes of root-associated _Streptomyces_ and _Cellvibrio_, we identify bacterial


genes involved in metabolism of plant polysaccharides, iron, sulfur, trehalose, and vitamins, whose genetic variation associates with specific tomato QTLs. By integrating ‘microbiomics’ and


quantitative plant genetics, we pinpoint putative plant and reciprocal rhizobacterial traits underlying microbiome assembly, thereby providing a first step towards plant-microbiome breeding


programs. SIMILAR CONTENT BEING VIEWED BY OTHERS IDENTIFYING PLANT GENES SHAPING MICROBIOTA COMPOSITION IN THE BARLEY RHIZOSPHERE Article Open access 16 June 2022 AN INTEGRATED


TRANSCRIPTOME, METABOLOME, AND MICROBIOME DATASET OF _POPULUS_ UNDER NUTRIENT-POOR CONDITIONS Article Open access 30 April 2025 HERITABLE MICROBIOME VARIATION IS CORRELATED WITH SOURCE


ENVIRONMENT IN LOCALLY ADAPTED MAIZE VARIETIES Article 21 March 2024 INTRODUCTION Root and shoot microbiomes are fundamental to plant growth and plant tolerance to (a)biotic stress factors.


The outcome of these beneficial interactions is the emergence of specific microbiome-associated phenotypes (MAPs)1, such as drought resilience2, disease resistance3, development4, and


heterosis (i.e., hybrid vigor)5. The microbes inhabiting the surface or internal tissues of plant roots are selectively nurtured by diverse plant-derived compounds in the form of primary and


secondary metabolites6,7. Microbes reciprocate by supporting plant growth and producing metabolites that mediate processes such as nutrient acquisition and pathogen suppression8,9.


Developing a blueprint of the genetic architecture for this ‘chemical dialog’ and how these interactions lead to specific MAPs is one of the key focal points in current plant microbiome


research. The promise is that these genomic and chemical blueprints can be integrated into crop breeding programs for a new generation of ‘microbiome-assisted’ crops that can rely, at least


in part, on specific members of the microbiome for stress protection, enhanced growth, and higher yields10. Selective breeding for yield-related traits has left a considerable impact on the


taxonomic and functional composition of modern crop microbiomes11,12. Wild plant relatives represent a ‘living library’ of diverse genetic traits that may have been lost during


domestication13. For example, recombinant inbred lines (RILs) of crosses between wild tomato relatives and modern tomato cultivars have been used to identify genetic loci controlling


important agronomic traits, including tolerance to abiotic14 and biotic stress15, as well as nutritional quality and flavor profiles16. To date, microbiome traits are not yet considered for


breeding purposes, except for specific quantitative MAPs such as the number of nodules in legume-rhizobia symbioses17. However, technological advances in sequencing now make it feasible to


treat microbiomes as quantitative traits for selection. Quantitative approaches to map the microbiome as a phenotype have been adopted to investigate the phyllosphere microbiome and,


recently, for the _Arabidopsis_ and sorghum rhizosphere microbiomes18,19. However, actualizing microbiome features into breeding programs at a scale for crop improvement has not yet been


realized. In fact, for most plant species, investigations leveraging diverse plant populations to map microbiome-associated quantitative trait loci (QTL) are still in their infancy18,19,20.


In these recent studies, the microbiomes were characterized by amplicon sequencing to detect loci involved in alpha and beta diversity as well as individual OTU abundances21. These studies


provide strong evidence that microbiome recruitment has a genetic component, but the functional nature of the corresponding plant–microbe interactions cannot be reliably elucidated from


amplicon data. Hence, functional genomic features of the microbiome, as well as intraspecific diversity within microbial species, have not yet been taken into account in QTL analyses22.


Here, we use both amplicon and shotgun metagenome sequencing to generate taxonomic as well as functional microbiome features as quantitative traits. Using an extensive RIL population of a


cross between modern _Solanum lycopersicum_ var. Moneymaker and wild _Solanum pimpinellifolium_23, we identify reciprocal associations between specific plant and microbiome traits and infer


putative mechanisms for rhizosphere microbiome assembly. Using the modern allele as a reference, we find QTLs for numerous taxonomic and metagenomic features of the microbiome with both


positive and negative effects. We observe more positive effects related to increases in microbiome feature abundance for the modern reference allele compared to the wild reference allele,


suggesting that domestication has had a significant impact on rhizosphere microbiome assembly. We identify plant traits related to growth, stress, amino acid metabolism, iron and water


acquisition, hormonal responses, and terpene biosynthesis, whereas the microbial traits we identify are related to the metabolism of plant cell wall polysaccharides, vitamins, sulfur, and


iron. Furthermore, we show that amplicon-based approaches allow detection of QTLs for rarer microbial taxa, whereas shotgun metagenomics allowed mapping to smaller and thus more defined


plant genomic regions. Together, these results demonstrate the power of an integrated approach to disentangle and prioritize specific genomic regions and genes in both plants and microbes


associated with microbiome assembly. RESULTS BASELINE ANALYSES OF THE TOMATO RECOMBINANT INBRED LINE POPULATION Prior to detailed metagenome analyses of the microbiome of the tomato RIL


population, we first investigated whether QTLs previously identified in the same RIL population under sterile in vitro conditions could be replicated in our experiment conducted under


greenhouse conditions with a commercial tomato greenhouse soil (Fig. 1a, b and Supplementary Data 1)24. We identified QTLs for shoot dry weight (SDW) coinciding with a QTL identified


previously on chromosome 924. Similarly, we identified QTLs for rhizosphere mass (RM), defined here as a the total mass of the roots with tightly adhering soil, which coincides with root


trait QTLs previously identified for lateral root number, fresh and dry shoot weight, lateral root density per branched zone and total root size (Fig. 1b)24. An analysis of variance (ANOVA)


yielded significant variation in SDW based on the additivity of alleles linked to SDW (zero, one, or two alleles) (F(2, 186) = 16.02, _p_ = 3.76 e–07) (Fig. 1c, d). A post hoc Tukey test


further demonstrated significant differences between all pairwise comparisons (_p_ < 0.05). For RM, an ANOVA yielded a significant difference (F(2, 186) = 16.02, _p_ = 3.76 e–07); a post


hoc Tukey test demonstrated a statistically significant difference only between the presence of either one or two alleles (_p_ < 0.05), but did not support additivity (_p_ = 0.15) (Fig. 


1e, f). Collectively, our results confirm and extend earlier work conducted on the same tomato RIL population in vitro24, providing a solid basis for QTL mapping of taxonomic and genomic


features of the rhizosphere microbiome TAXONOMIC MICROBIOME FEATURES AS QUANTITATIVE TRAITS To investigate molecular features of the microbiome as quantitative traits, we conducted 16S rRNA


gene amplicon sequencing of 225 rhizosphere samples, including unplanted bulk soil, parental tomato genotypes, and all 96 RIL accessions in duplicate (BioProject ID PRJNA787039). We observed


separation between the microbiomes of rhizosphere and bulk soil, between the microbiomes of the two parental tomato genotypes, and the RIL accession microbiomes (Fig. 2a). To limit multiple


testing and to focus on common microbiome features with sufficient coverage across all accessions, we prioritized the rhizosphere-enriched amplicon sequence variants (ASVs) to those present


in 50% or more of the RIL accessions (Fig. 2b). A QTL analysis with these prioritized ASVs was run with R/qtl225 using a high-density tomato genotype map26, harvest date, post-harvest total


bulk soil mass, RM, number of leaves at harvest, and SDW as covariates. We identified 48 QTL peaks, across 45 distinct loci, significantly associated with 33 ASVs (Supplementary Data 6).


Our logarithm of the odds (LOD) thresholds for significance had been determined by pooled permutations from all ASVs to attain a genome-wide threshold of _P_ 0.05 (LOD 3.35) and _P_ 0.2 (LOD


2.64). The modern allele was set at reference, such that negative effects were relatively more associated with the wild allele and positive effects with the modern allele. Of the


significant QTLs, 16 were microbiome features less abundant compared to the reference allele, whereas 32 were microbiome features more abundant in presence of the modern reference allele.


The QTLs on chromosomes 11, 10, 8, and 2 were associated with increases in abundance in presence of the modern reference allele. In contrast, the sole QTL on chromosome 7 was negative


relative to the reference. All other chromosomes contained a mix of QTLs with positive and negative effects on ASV abundance relative to the reference allele (Fig. 3a). While many


rhizobacterial lineages were linked to a single QTL (14 out of 25 unique taxonomies), others were linked to two or more QTLs (7 and 4 taxa, respectively) (Fig. 3b). Of the lineages with


multiple QTLs, most were positive relative to the reference allele. One salient exception was _Methylophilaceae_, with a total of 9 QTLs that were both positive and negative relative to the


reference and distributed across chromosomes 3 (positive, x2), 4 (positive), 7 (negative), 11 (positive x2), and 12 (negative x3) (Fig. 3c). Another salient feature of the QTL analysis was


the hotspot for microbiome assembly identified on chromosome 11, including a significant linkage with ASVs from _Adhaeribacter, Caulobacter, Devosia_, Rhizobiaceae_, Massilia,_ and


_Methylophilaceae_ (Fig. 3c). In addition to individual ASVs, we investigated diversity metrics as quantitative traits using Shannon index and principal coordinate analysis (PCoA) with


Bray–Curtis dissimilarity. For each approach, we calculated diversity statistics first using all ASVs with a relative abundance greater than the effective samples size27, and second using


the rhizosphere-enriched ASVs present in 50% or more of the RIL accessions. For the Shannon index, LOD thresholds for significance were determined by permutations to attain a genome-wide


threshold of _P_ 0.05 (LOD 3.27) and _P_ 0.2 (LOD 2.63). Two QTLs were identified on chromosomes 1 and 3 (Supplementary Figs. 1 and 2) using all, and prioritized, ASVs to calculate Shannon


Diversity respectively. Of note, the QTL on chromosome 1 overlaps with the confidence interval of the _Cellvibrio_ QTL highlighted later in the results section. For the PCoA, the first two


components were mapped as quantitative traits. A LOD threshold for significance was determined by permutations to attain a genome-wide threshold of _P_ 0.05 (LOD 3.41) and _P_ 0.2 (LOD


2.71). A single QTL was identified on chromosome 6 in the same position as the QTL identified previously for _Streptomyces_ ASV 5 (Supplementary Fig. 3). Of further interest is that all


diversity metric QTLs were negative relative to the reference. Thus, while genetic changes during domestication may have made some ASVs more or less abundant, these genetic changes also


impacted overall diversity. Given the non-independence of sequencing-based microbiome features, we suggest caution in interpreting the results of using diversity metrics as microbiome


features. Effect size is an important factor when mapping the genetic architecture of quantitative traits. While some QTLs have large effect sizes, many small effect QTLs may explain a large


proportion of trait variation28. To date, there is little understanding of the distribution of the effect sizes of QTLs for microbiome features. Here we show that the absolute values of the


effect sizes of the 48 QTLs on ASV relative abundance ranged from 1.3 to 17%, with an average effect size of approximately 5%, comparable to the effects seen for SDW and RM (Fig. 1c, e).


The largest QTL effects were positive for an ASV in the genus _Qipengyuania_ (17%), and an ASV in _Edaphobaculum_ (10%). However, no statistical difference was found between the absolute


value of positive and negative effect sizes (_p_ = 0.78, two-tailed _t_-test). Furthermore, for those lineages with sufficient representation at the class level (Bacteroidia,


Alphaproteobacteria, and Gammaproteobacteria), there was no statistically significant difference between effect size (F(3, 16) = 0.072, _p_ = 0.974). However, an ANOVA on the positive effect


size at genus level demonstrated significant differences between lineages (F(3, 16) = 12.94, _p_ = 1.15 e−04). A post hoc Tukey test demonstrated QTLs for _Massilia_ with a larger positive


effect size than other lineages with sufficient sample size for comparison (Fig. 3d). Collectively, our amplicon analysis provided a broad picture, suggesting that the assembly of bacteria


in the tomato rhizosphere is a complex trait governed by a combination of multiple loci, some being ASV specific, some being pleiotropic for different ASVs, and with heterogenous effect


sizes on ASV abundance (Fig. 3d). While QTLs were identified with both positive and negative effects relative to the reference modern allele, the large number of positive effects suggests


domestication impacted rhizosphere microbiome assembly. FUNCTIONAL MICROBIOME FEATURES AS QUANTITATIVE TRAITS To understand the functional traits associated with rhizosphere microbiome


assembly, we generated shotgun metagenomes for the rhizosphere microbiome of each accession in the tomato RIL population (96 total), as well as six samples of the modern tomato parent, five


samples of the wild tomato parent and seven bulk soil samples (BioProject ID PRJNA789467). After pre-processing, a co-assembly strategy using all metagenomes was implemented (see


Supplementary Methods section 4.2.2 for more detail). Subsequently, bin and contig abundances were determined by read depth using CSS normalization, a computational method to adjust for


compositional bias27. QTL mapping was conducted for the rhizosphere-enriched contig and bin abundances. A PCoA analysis of the contigs demonstrated separation between the bulk soil and RIL


rhizosphere microbiomes (Supplementary Fig. 9). Binning was done using Metabat2 (version 2:2.15)29 and genomic quality of the output was evaluated by CheckM30 (Supplementary Data 7). The


bins and assembled contigs larger than 10 kb are publicly available (https://doi.org/10.5281/zenodo.6561541). All contigs of 10 kb and larger were taxonomically assigned using Kraken31


(Supplementary Data 8). With nearly 40 million contigs being assembled, the effects of multiple testing were reduced by prioritizing rhizosphere-enriched contigs (relative to the bulk soil)


which were larger than 10 kb and with an enrichment greater than 4-fold. After these stringent prioritization steps, 1249 contigs were remaining. The functional potential of these


rhizosphere-enriched contigs represented 8.3% of protein clusters identified in all contigs greater than 10 kb by MMseqs2 using a 50% protein identity threshold32. Approximately 25% of all


proteins were contained within these clusters, suggesting that a considerable fraction of functional diversity was maintained during the prioritization. Only bins with greater than 90%


completion and less than 5% contamination were mapped (33 out of 588 bins). As with the ASVs, harvest date, bulk soil mass, RM, number of leaves at harvest, and SDW were used as covariates


in QTL mapping. We identified 7 significant bin QTLs (LOD > 3.40, _P_ < 0.05) (Supplementary Data 9) including _Streptomyces_ bin 72 with a positive effect on tomato chromosomes 6 and


11. For the contigs, a total of 717 QTLs at 26 unique positions on tomato chromosomes 1, 4, 5, 6, 9, and 11 were identified (Supplementary Data 10), corresponding to 476 metagenomic contigs


from 10 different genera (LOD > 3.47, _P_ < 0.05). The largest number of contig QTLs were linked to the _Streptomyces_, _Cellvibrio,_ and _Sphingopyxis_ lineages (Fig. 4a). The


_Streptomyces_ contigs mapped to QTLs on tomato chromosomes 4 (46 contigs, negative), 6 (190 contigs, positive), and 11 (257 contigs, positive), with a subset of contigs mapping to two or


all three of these positions (Fig. 4b). These findings corroborate and expand upon the _Streptomyces_ QTL identified on chromosome 6 using our 16S rRNA gene amplicon data, as well as that of


the bin QTLs identified on chromosomes 6 and 11. The _Cellvibrio_ contigs mapped to chromosome 1 (42 contigs, negative) and chromosome 9 (94 contigs, negative), again corroborating the


findings from our 16S rRNA gene amplicon analysis described above. In contrast, the _Sphingopyxis_ QTLs identified on chromosome 5 (24 contigs, negative) and 9 (49 contigs, positive) did not


correspond to the QTLs identified on chromosomes 8 and 3 in the 16S rRNA gene amplicon analysis. Four contigs for _Devosia_ also corroborated the results of the 16S QTL analysis. The effect


sizes ranged from 9 to 21% and were significantly different (F(14, 702) = 530.9 _p_ < 2e−16) between QTL and lineages (Fig. 4c). As with the 16S rRNA amplicon analysis, some of the


highest LOD scores were for _Devosia_. Also, the effect size of the _Sphingopyxis_ contigs was large (±20% on average), above 15% for _Cellvibrio_, and approximately 10% for _Streptomyces_.


The average QTL region was 51.59 Mbps for the 16S rRNA gene amplicon sequences and 26.64 Mbps for the metagenomic contigs (two-sided _t_-test, _p_ = 3.32E−09) (Fig. 4e). A more striking


contrast was observed in the difference between the median size of amplicon and contig QTL regions which were 58.56 Mbp and only 6.47 Mbp, respectively. In summary, while many more taxa were


identified in the amplicon-based QTL analysis, the metagenome-based QTL analysis provided QTLs with much smaller confidence intervals (Fig. 4e). AMPLICON-BASED BULK SEGREGANT ANALYSIS OF


_STREPTOMYCES_ AND _CELLVIBRIO_ ABUNDANCE The two most abundant rhizosphere taxa with replicated patterns for amplicon and metagenome-based QTLs were _Streptomyces_ and _Cellvibrio_.


Therefore, we sought to provide additional independent support for these QTLs using a bulk segregant analysis of an independent population of parental and RIL genotypes (Supplementary Data 


11). In particular, we tested the previously identified amplicon-based QTLs associated with higher _Cellvibrio_ abundance at markers 464 and 3142 on chromosomes 1 and 9, respectively with


higher _Streptomyces_ abundance at marker 2274 on chromosome 6 (Fig. 5). In each case, ANOVA showed a statistical difference between genotypes and bulk soil, respectively (F(4, 396) = 21.56,


_p_ = 4.16 e−16), (F(4, 396) = 18.43, _p_ = 6.68 e−14), (F(4, 396) = 8.423, _p_ = 1.57 e−06). A post hoc Tukey HSD test supported the conclusion that wild allele at markers 464 and 3142 on


chromosomes 1 and 9, respectively, are indeed associated with increased abundance _Cellvibrio_ (_p_ = 3.913 e−04, and _p_ = 0.08, respectively), while the modern allele at markers 2274 on


chromosome 6 was significantly associated with increased abundance of _Streptomyces_ (_p_ = 1.152 e−04). HOST GENETICS AND RHIZOSPHERE MICROBIOME ASSEMBLY A subset of 5 regions consistent


across both the amplicon and metagenome-based analyses were prioritized with an average size of 2.68 Mbps (Supplementary Data 12). These included positions on chromosome 1 (positions


87.36–90.49 Mbps), chromosome 9 (pos 62.03–63.32 Mbps), chromosome 5 (pos 61.54–63.38), chromosome 6 (pos 33.99–40.3 Mbps), and chromosome 11 (pos 53.06–53.89 Mbps). In total, 1359 genes


were identified in these regions. Potential candidate genes with root-specific transcriptional patterns, defined as a 4 fold increase in the roots compared to leaf samples, were further


prioritized using a publicly available RNA-seq dataset33. Based on this analysis, a subset of 192 root specific plant genes were identified (Supplementary Data 13). A total of 98 root


specific plant genes were linked to _Streptomyces_ on chromosome 6 (84 genes) and 11 (14 genes) (Fig. 6). Intriguingly, 61 of these genes were found in regions previously identified to be


subjected to selective sweeps, regions of fixed low genetic diversity, related to tomato domestication as well as to subsequent sweeps related to improvements in fruit quality34


(Supplementary Fig. 4). While it remains unclear whether the relationship between selective sweeps and changes in microbial feature abundance is causal or coincidental; here we reveal a


genomic signature that the domestication process impacted alleles involved in microbiome assembly. Two of the most salient genes in this list included genes with high transcription in the


roots; an aquaporin and a Fer-like iron deficiency-induced transcription factor (FIT). The aquaporin (SlTIP2.3) has the highest fold change of all tonoplast intrinsic proteins in tomato


roots as compared to all other organs32,33, while the FIT gene is a bHLH transcriptional regulator controlling iron homeostasis in tomato34,35. Other genes within this region on chromosome 6


include a glycine rich protein, a receptor-like kinase known to be upregulated during drought36, alcohol dehydrogenase, numerous phosphatases, expansins, ethylene-responsive transcription


factors, gibberellin receptors, aminocyclopropane-1-carboxylate oxidase (ACO), an enzyme involved in the last step of ethylene biosynthesis, and finally, alpha-humulene and


(-)-(E)-beta-caryophyllene, a known tomato terpene and signaling molecule in tomato37,38 and also acting as a volatile in microbiome assembly39. Root specific genes involved in carbohydrate,


protein, and amino metabolism were also identified, including trypsin-alpha amylase inhibitor, prolyl 4-hydroxylase, polygalacturonase, trehalose phosphatase, glycogenin, xyloglucan


fucosyltransferase, and a metallocarboxypeptidase inhibitor, spermidine synthase, acetolactate synthases, alanine aminotransferase, and an amino acid permease. On chromosome 11, a


ferrodoxin, an aluminum-activated malate transporter40, and a cluster of various acetyltransferases and a sulfotransferase were identified. An aluminum-activated malate transporter was also


identified in the QTL region on chromosome 6, which has been linked to increased malate accumulation in both fruit and roots41. A total of 57 root specific genes were identified in the QTL


regions on chromosome 1 and 9 linked to _Cellvibrio_. These include a cytochrome p450 involved in coumarin synthesis, numerous extensins, phosphatases, respiratory burst oxidase-like


protein, iron chelator nicotianamine synthase42,43, and on chromosome 11 phenazine biosynthesis. On chromosome 5, 37 root specific genes were identified including multiple peroxidases,


glutamine synthetase, rhamnogalacturonate lyase, pectinesterase, metacaspase, and trehalose-phosphatase. Furthermore, numerous ethylene responsive transcription factors and receptor-like


kinases were observed. The QTL on chromosome 1 contains genome-wide sweeps associated with the initial tomato domestication and subsequent improvements of fruit quality traits, suggesting


that one or both of these events were connected to or act as a ‘side effect’ on the decreased abundance of _Cellvibrio_ in the tomato rhizosphere. ILLUMINATING METAGENOMIC TRAITS IN


_CELLVIBRIO_ AND _STREPTOMYCES_ To further investigate the potential functional importance of the 476 rhizosphere-enriched metagenomic contigs mapped as QTLs, we performed a deeper analysis


into their functional gene content (Supplementary Data 14, 15, and 16). An antiSMASH44 analysis identified 30 biosynthetic gene clusters (BGCs) across these contigs. These BGCs largely


originated from contigs taxonomically assigned to _Cellvibrio_ and _Streptomyces_. They included several gene clusters potentially associated with root colonization, such as two melanin BGCs


(c00216, NODE_5919; c00255, NODE_7250) from _Streptomyces_ (which have been positively associated with colonization45) and a _Cellvibrio_ aryl polyene BGC (c00185, NODE_4941), which is


thought to protect bacteria against reactive oxygen species generated during immune responses of the host plant46. The contigs also contained gene clusters potentially beneficial to the


host, such as BGCs encoding iron-scavenging siderophores, which have been associated with disease suppression in tomato47; specifically, homologs of coelichelin and desferrioxamine BGCs from


streptomycetes were found (c00269, NODE_7969, and c00122, NODE_3362), three IucA/IucC-like putative siderophore synthetase gene clusters (c00106, NODE_2973; c00041, NODE_1131; c00238,


NODE_6661), as well as a _Cellvibrio_ NRPS-PKS gene cluster (c00001, NODE_101) most likely encoding the production of a siderophore based on the presence of a TonB-dependent siderophore


receptor-encoding gene as well as a putative _tauD_-like siderophore amino acid β-hydroxylase-encoding gene48. The _Cellvibrio_ contigs also contain several genes relevant for carbohydrate


catabolism. For example, homologs of _xyl31a_ (B2R_23365) and _bgl35a_ (B2R_06825-06826) were detected (with 78%, 79 and 65% amino acid identity, respectively), genes that have been shown to


be responsible for utilization of the abundant plant cell wall polysaccharide xyloglucan in _Cellvibrio japonicus_49. In addition, a possible homolog of the β-glucosidase gene _bgl3D_50


(B2R_26663), involved in xyloglucan utilization, was also identified, having high similarity to _bgl3D_ from _Cellvibrio japonicus_ (64% amino acid identity). Also, putative


cellulose-hydrolizing enzymes were detected, such as a homolog (B2R_21082) of the cellobiohydrolase _cel6A_ from _Cellvibrio japonicus_51 encoded in a complex locus of nine


carbohydrate-acting enzymes annotated on this contig (NODE_5090) by DBCAN52 (Supplementary Data 14). Collectively, these results point to a possible role of microbial traits related to iron


acquisition and metabolism of plant polysaccharides in tomato rhizosphere microbiome assembly. Contigs of the metagenome-assembled genome (MAG) associated with _Streptomyces_ ASV5 (the key


taxon associated with tomato QTLs described above) contained a multitude of functional genes potentially relevant for host-microbe interactions. Taxonomically, the ASV5 MAG was most closely


related to a clade of streptomycetes that includes type strains of species such as _arenae_, _flavovariabilis_, _variegatus_, and _chartreusis_. To understand how tomato might differentially


recruit ASV5 streptomycetes, we analyzed the MAG for genes and gene clusters potentially involved in colonization. Intriguingly, we found contigs to be rich in genes associated with plant


cell wall degradation. In particular, we identified a family 6 glycosyl hydrolases (B2R_10154) of which the glycosyl hydrolase domain has 84% amino acid identity to that of the SACTE_0237


protein that was recently shown to be essential for the high cellulolytic activity of _Streptomyces_ sp. SirexAA-E31. Additionally, we detected a homolog (82% amino acid identity) of


_Streptomyces reticuli_ avicelase, a well-studied cellulase enzyme that degrades cellulose into cellobiose53 (B2R_29198). Larger gene clusters associated with degradation of plant cell wall


materials were also found. These included an 8 kb gene cluster coding for multiple pectate lyases and pectinesterases (B2R_31553-31558), and an 8 kb gene cluster encoding a family 43


glycosyl hydrolase, a pectate lyase L, a rhamnogalacturonan acetylesterase RhgT, a GDSL-like lipase/acylhydrolase, a family 53 glycosyl hydrolase, and an endoglucanase A (B2R_15915-15920).


Together, these findings suggest that ASV5 _Streptomyces_ has the capacity to effectively process complex organic materials shed by plant roots during growth. These results are in line with


a recent study on plant-associated streptomycetes that indicated that their colonization success appears to be associated with the ability to utilize complex organic material of plant


roots54. Root exudates also play a key role in the recruitment of microbes. Prominent sugar components of tomato root exudates are glucose, but also xylose and fructose55. The _Streptomyces_


MAG contains _xylA_ and _xylB_ genes (B2R_19014, B2R_19013) and a putative _xylFGH_ import system (B2R_29274, B2R_23438, B2R_23439) facilitating xylose metabolism. Similarly, a _frcBCA_


import system was identified in the genome (B2R_17966- B2R_17968) as well as a glucose permease (B2R_32780) with 91,5% amino acid identity to _glcP1_ SCO5578 of _Streptomyces coelicolor_


A3(2)56. Other genes putatively involved in root exudate catabolism were also found in the ASV5 MAG, such as sarcosine oxidase (_soxBAG_, B2R_20550-20551, and B2R_21105), which has been


shown to be upregulated in the presence of root exudates of various plants57,58. In summary, the _Cellvibrio_ and _Streptomyces_ contigs encoded a range of functions that likely allow them


to profit from tomato root exudates as well as complex organic material shed from growing tomato roots. How these plant traits differ between wild and domesticated tomatoes and if/how these


influence differential colonization of roots of wild and domesticated tomato lines by these two bacterial lineages will require detailed comparative metabolomic analyses of the root exudates


of both tomato lines as well as isolation of the corresponding _Cellvibrio_ and _Streptomyces_ ASVs, analysis of their substrate utilization spectrum followed by site-directed mutagenesis


of the candidate genes, root colonization assays and in situ localization studies. GENOMIC STRUCTURE IN _CELLVIBRIO_ AND _STREPTOMYCES_ PROVIDES INSIGHTS INTO ADAPTATIONS FOR DIFFERENTIAL


RECRUITMENT Bacterial populations often contain significant genomic heterogeneity. This heterogeneity may be associated with differential recruitment through altered nutrient preferences or


host colonization mechanisms. The use of metagenomics enabled us to investigate the population structure within each rhizobacterial lineage and identify intraspecific differences. To do so,


we first identified a unique set of 697,731 microbiome Single Nucleotide Variants (SNVs) in a subset of parental and bulk metagenomes using InStrain22. A set of 15,026 SNVs enriched in


either the wild or modern tomato rhizosphere were selected and the abundance of each allele at each SNV was calculated. Using these abundances, QTL mapping was performed using R/qtl2 as


described in the methods. A total of 3,357 QTL peaks were identified (LOD > 3.01, _P_ < 0.05), to 1229 independent loci. A total of 1354 QTL with positive effects and 2,001 QTL with


negative effects were identified, derived from 2,898 unique SNVs, and corresponding to 810 and 1068 unique rhizobacterial genes respectively (Supplementary Data 17). We investigated the 103


_Streptomyces_ SNV QTLs at 94 unique positions within annotated genes whose mapping coincided with the previously identified QTLs for _Streptomyces_ contigs to tomato chromosomes 4, 6, and


11 (Supplementary Data 17). Numerous _Streptomyces_ SNVs were associated positively with the reference tomato alleles on chromosomes 6 and 11. In particular, alpha-galactosidase (B2R_16136)


and arabinose import (B2R_29105) had the highest LOD and smallest overlapping confidence intervals with chromosomes 6 and 11 (Fig. 7). Indeed, many SNVs in genes involved in the degradation


of xylan59, one of the most dominant non-cellulosic polysaccharides in plant cell-walls60, as well as carbohydrate and protein metabolism were associated positively to QTL on chromosomes 6


and 11, including xyloglucanase Xgh74A (B2R_10589), alpha-xylosidase (B2R_23763), endo-1,4-beta-xylanase (B2R_20609), extracellular exo-alpha-L-arabinofuranosidase (B2R_20608), multiple


protease HtpX (B2R_19218), cutinase (B2R_19356), and putative ABC transporter substrate-binding protein YesO (B2R_09821) which has been implicated in the transport of plant cell wall


pectin-derived oligosaccharides61. A _Streptomyces_ SNV in acetolactate synthase (B2R_28001) was associated positively to QTL on tomato chromosome 6 where a plant acetolactate synthase was


located. Similarly, multiple SNVs in _Streptomyces_ genes involved in putrescine transportation (B2R_25489) were associated positively to QTL on tomato chromosomes 6 and 11, which contain


genes for spermine synthase, suggesting a possible metabolic cross-feeding from plant to microbe. A majority of these SNVs were synonymous having no effect on the produced amino acid


sequence. However, some were non-synonymous, resulting in an altered amino acid sequence, including the histidine decarboxylase SNV (B2R_16511) mapping to both tomato chromosomes 6 and 11


(Fig. 7). _Streptomyces_ SNVs that were associated negatively with the QTL on tomato chromosome 4 included an antibiotic resistance gene (daunorubicin/doxorubicin, B2R_28992) and


maltooligosyl trehalose synthase (B2R_07820) among others. Similarly, we investigated the 324 _Cellvibrio_ SNV QTLs within annotated genes whose mapping coincided with the previously


identified _Cellvibrio_ contig QTLs to chromosomes 1 and 9. Again, numerous SNV QTLs were identified in genes were related to sugar catabolism, including a gene encoding an extracellular


exo-alpha-(1->5)-L-arabinofuranosidase (B2R_16093), fructose import FruK (B2R_22268), a cellulase/esterase-encoding _celE_ homolog (B2R_11067), and genes involved in malate (B2R_18213),


mannonate (B2R_14081), xyloglucan (B2R_10668) and xylulose (B2R_22179) metabolism. Furthermore, many additional SNV QTL were identified in genes related to vitamin and cofactor metabolism as


well as sulfur and iron metabolism. In particular, these included genes for a phosphoadenosine phosphosulfate reductase (B2R_15720), vitamin B12 transporter BtuB (10 different genes, see


Supplementary Data 17), a siroheme synthase (B2R_24033), a pyridoxal phosphate homeostasis protein (B2R_17481), a heme chaperone HemW (B2R_12751), a hemin transport system permease protein


HmuU (B2R_09175), a Fe(2+) transporter FeoB (B2R_19968), a biotin synthase (B2R_30007), a catecholate siderophore receptor Fiu (B2R_17486), and a Fe(3+) dicitrate transport ATP-binding


protein Fec (B2R_09176) (Supplementary Data 17). Taken together, this analysis suggests that a shotgun metagenomic approach integrated with quantitative plant genetics can be instrumental in


a high-throughput manner to discover putative reciprocal genetic links between plant and microbial metabolisms, such as those identified here for polysaccharides, trehalose, iron, vitamin,


amino acid, and polyamine metabolism. DISCUSSION Breeding for microbiome-assisted crops is a daunting task, encompassing ecological, evolutionary, and cultural processes. What constitutes a


desirable trait for selection is context-dependent and differs between societies, crops, and locations62. As society grapples with modern challenges such as a rapidly changing environment,


water scarcity and land degradation, it is becoming increasingly clear that a new era of trait selection is needed with increased focus on sustainability and microbiome


interactions63,64,65,66. In this regard, it is also time to reckon with the consequences of historic yield-centric trait selection and accompanying genomic sweeps34, especially with regards


to plant–microbe interactions (Fig. 8a, b). Current approaches to investigating the genomic architecture determining microbiome assembly rely primarily on mutational studies in known genes


and pathways. More recently, studies leveraging the natural variation within plant populations have been used to conduct GWA and QTL of the leaf20,67 and rhizosphere18. To date, the


microbiome has been primarily characterized through amplicon sequencing, thereby providing limited functional resolution of microbiome structure. Increasing the resolution of phenotyping of


quantitative traits has been shown to improve the precision and detection of QTLs68. Thus, integrating microbial genomics into microbiome QTL analysis plays a dual purpose; increasing the


ecological resolution with which microbial traits may be mapped (e.g., at a community and population level, Fig. 8c), and second, affording the identification of the reciprocal microbial


adaptations that drive plant–microbe interactions (e.g., by using SNVs a microbiome features). In this investigation, we addressed these challenges by integrating amplicon and shotgun


metagenome sequencing to identify microbiome QTLs for the tomato rhizosphere. One major difference between the amplicon and contig QTL analysis is the number of lineages for which QTLs were


identified. Amplicon-based sequencing, which captures more rare taxa per unit sequencing, provided a broader taxonomic picture and was able to capture QTLs of both abundant and relatively


rare rhizobacterial lineages. In contrast, the majority of contig QTLs mapped to the most predominant lineages yet failed to identify QTLs for more rare lineages. Nevertheless, besides the


fact that the shotgun-based approach provided functional insights into the associated bacterial taxa, the size of the 95% confidence interval of the QTL region was significantly smaller


using contig QTLs, with a median size of just 6.47 Mbp compared to 58.56 Mbp for the amplicon-based QTL regions. Furthermore, for _Streptomyces_, the number of unique QTLs identified was


greater in the contig-based approach. Thus, we identified a trade-off between amplicon and shotgun-based technologies, whereby amplicon sequencing provides a deeper view into broad community


structure, whereas shotgun-based approaches provided a more nuanced picture. In particular, the smaller regions identified by our contig-based metagenome mapping provided considerably more


functional insights as it enabled us to analyze the genomic content contained in the regions linked to _Cellvibrio_ and _Streptomyces_. It is possible that less stringent prioritization


steps could be used to increase the number of metagenomic features identified, but this may also increase the false discovery rate. It should be noted that a limitation of the approaches


taken is that both amplicon and shotgun-based approaches produce non-independent measurements. Here we use CSS normalization, one of the top performing computational approaches to address


compositional bias69. Nevertheless, future approaches that provide community level absolute ASV abundances will further minimize compositionality of the microbiome data and likely perform


better when mapping microbiome features as QTLs. Extending these studies to the endophytic compartment and including metatranscriptome analyses may also further improve the identification of


microbiome features, provided that the endophytic microbiome can be separated well from the plant cells to obtain sufficient sequencing depth. The increased QTL mapping resolution provided


by shotgun-based phenotyping of the microbiome combined with SNV analysis provided an approach to leverage both the host diversity of the RIL and the natural microbiome population diversity


to disentangle the reciprocal genomic adaptions between plants and natural microbiomes (Fig. 8d). For example, understanding the forces driving the abundances of rhizospheric _Streptomyces_


is of increasing interest and has been linked to both iron70 and water limitations54. Here, we pinpointed the genetic basis for these interactions among the short list of highly expressed


root-specific tomato genes linked positively to _Streptomyces_ abundance including both aquaporin and FIT. More specifically, the aquaporin (SlTIP2.3) has the highest fold change of all


tonoplast intrinsic proteins in the tomato genome in the roots when compared to all other organs71,72, while the FIT gene has been shown to largely control iron homeostasis in tomato35,73.


Future experiments will focus on functional validation by, among others, transcriptome analyses and site-directed mutagenesis of the microbial and plant genes identified. In addition to


these high priority genes, many other key genes were identified in these regions. Those previously shown to contribute to microbiome assembly included 1-aminocyclopropane-1-carboxylate


oxidase, which plays a central role in plant regulation of various processes including bacterial colonization and root elongation74 and alpha-humulene/(-)-(E)-beta-caryophyllene synthase, a


terpene known to modify microbiome structure39. In addition, numerous genes related to growth, development, and cell wall loosening75 known to be involved in microbial colonization76 and


aluminum-activated malate transporter, which has been linked to microbiome-mediated abiotic stress tolerance40 and selected during tomato domestication resulting in high malate content in


both fruit and roots41. Both low-malate and high-malate haplotypes have been identified in tomato41, which may form the basis of future studies investigating the role of malate exudation in


microbiome assembly. The historic impact of domestication on genomic regions linked to microbiome assembly is also apparent (Fig. 6, Supplementary Data 14, and Supplementary Fig. 4).


However, the processes and consequences of these sweeps, and possible subsequent recombination events on microbiome assembly remain unclear. In particular, the discontinuity of sweeps in


microbiome QTL regions suggests that evolutionary pressure for recombination of key (microbiome associated) traits, such as iron homeostasis and water transport, may have acted against


selective sweeps. The approach developed here provides the means to illuminate such complex eco-evolutionary questions, forming the basis of integrating the microbiome into the classic


genotype by environment model of host phenotype10. From the microbial perspective, the increased resolution in QTL analysis afforded by our shotgun-based approach also provided a window into


the host-specific bacterial adaptations to wild and modern alleles. In particular, the SNV QTL analysis demonstrated that genes related to the degradation of various plant-associated


polysaccharides in _Streptomyces_ were associated positively with the modern reference allele. Many other functions were identified in both plant and microbe, such as trehalose metabolism,


polyamine metabolism, and acetolactate synthase, suggesting either a direct link through cross-feeding77 or signaling78, or perhaps shared ecological pressures. While the microbial


adaptations related to polysaccharides79, vitamins80 and iron metabolism47,70 are well documented in relation to plant colonization, here we demonstrate that the reciprocal adaptations that


drive plant–microbe interactions can be investigated simultaneously to uncover their genetic architecture in both host and microbiome (Fig. 8d). From a societal context, linking quantitative


genetics with community level microbiome data provides us a tool to understand the complex genotype, environment, microbiome, and management interactions that shape our agroecosystems


structure and function. Armed with these tools and molecular insights, we can begin to re-envision the agroecosystem; targeting QTLs for improved plant–microbe interactions, identifying


‘missing microbes’ or functions lost during the domestication process, or pinpointing the molecules that drive these interactions. METHODS RECOMBINANT INBRED LINE POPULATION An F8 RIL


population derived from the parental lines _Solanum lycopersicum_ cv. Moneymaker (modern) and _Solanum pimpinellifolium_ L. accession CGN14498 (wild) consisting of 100 lines were used for


this study23. A high density map produced from this population was used to map QTLs26. GROWTH CONDITIONS FOR RIL The natural soil was collected in June 2017 from a tomato greenhouse in


South-Holland, The Netherlands (51°57’47”N 4°12’16”E). The soil was sieved, air dried, and stored at room temperature until use in 2019. Before the beginning of the experiment, soil moisture


was adjusted to 20% water by volume using deionized water. All soil was homogenized by thorough mixing and allowed to sit, covered by a breathable cloth, in the greenhouse for one week


prior to potting. The soil was then homogenized once again and then potted. Each pot was weighed to ensure all pots were 175 g ± 0.5 (wet weight). Duplicate pots for each accession were


planted, as well as six replicates of each modern and wild parental accession, and 8 bulk soil pots that were left unseeded. Each replicate was prepared simultaneously. Planting was done


separately representing biological replicates. In each pot, 3 seeds were planted in a triangular pattern to ensure the germination success for all pots. The first seedling to emerge in each


pot was retained and others were removed after germination. All pots were randomly distributed in trays containing approximately 10 plants. Throughout growth, careful attention was given to


randomize the distribution of plants. First, tray location and orientation with relation to each other were randomized on a nearly daily basis. In addition, the distribution of plants within


trays was randomized three times during growth. All pots were kept covered with a transparent lid until germination, which was scored daily. After germination, plants were visually


monitored and watered at the same rates. To minimize the impact of environmental differences between pots on microbiome composition, the watering regime for all plants was standardized and


leaks from the bottom of the pot and overflows were completely prevented. To achieve this, a minimal volume (2.5–5.0 mL) of water was used at each watering. This strategy was successful as


washout was never observed. Moisture content was measured by weighing the pots at the middle and end of the experiment to ensure all pots had similar moisture contents. HARVESTING AND


PROCESSING OF PLANT MATERIALS All plants had between 5 and 7 true leaves at harvest (Supplementary Data 1). Plants were gently removed from the pot and roots and were vigorously shaken. Soil


that remained attached to the roots after this stage was considered the rhizosphere. The remaining bulk soil and rhizosphere (plus roots) fractions were weighed. The root and attached


rhizosphere fraction were treated with 4 mL of lifeguard, vortexed, and sonicated. Roots were then removed. The remaining rhizosphere sample was then stored in LifeGuard Soil Preservation


Solution (Qiagen) at −20 °C until DNA extraction. The dry weight of shoots was measured after drying at 60 °C. The dry weight of the bulk soil was measured after storing at room temperature


in open paper bags for 1 month. The DNA was extracted using the DNeasy PowerSoil extraction kit (Qiagen). The protocol was optimized for the soil in the following manner: each sample was


vortexed and then a volume of approximately 1.5 mL was transferred into 2 mL tubes. This subsample was centrifuged at 10,000 × _g_ for 30 s such that a pellet was formed. The supernatant was


removed, and a new subsample was transferred, and centrifuged until the total volume of the original sample, without sand, had been transferred to the 2 mL tubes. The resulting pellet was


recalcitrant to disruption through bead beating, and therefore was physically disrupted by a pipette tip before proceeding with DNA extraction protocol. In test samples, DNA extractions from


the sand fraction yielding no, or marginal levels of DNA. RRNA AMPLICON SEQUENCE PROCESSING All DNA was sent to BaseClear (Leiden, The Netherlands) for 16S rRNA gene 300 bp paired-end


amplicon sequencing (MiSeq platform). MiSeq primers targeted the V3-V4 region of Bacteria: 341FCCTACGGGNGGCWGCAG, 805RGACTACHVGGGTATCTAATCC. In total, 20,542,135 16S rRNA gene amplicon read


pairs over 225 samples were generated. The raw reads were processed using the DADA2 workflow (v1.14.1) to produce amplicon sequence variants (ASV) and to assign taxonomy based on the Silva


database version 13881,82 (Supplementary Data 2). ASVs tagged as non-bacterial, chloroplast, or mitochondria were removed. Next, ASV counts were normalized using the cumulative sum scaling


(CSS) (Supplementary Data 3), which has been shown to be one of the most effective computational transformation techniques69, and filtered based on the effective sample size using the


metagenomeSeq package (v1.28.2)27. Differential abundances between rhizosphere and bulk soil were determined using the eBayes function from the limma package. Enriched rhizosphere ASVs with


a greater than log(2) fold change in abundance were analyzed based on their presence and absence, standard deviation and mean values. Using these statistics, stochastic ASVs (<50% of


samples) were removed from further analysis (Supplementary Data 4). All ASV sequences may be found in Supplementary Data 5. The remaining microbiome features were then mapped as QTLs as


described subsequently. To investigate diversity metrics as quantitative traits, the Shannon diversity of each sample was calculated using all ASV after filtering based on the effective


sample size using the metagenomeSeq package (v1.28.2)27, and using all ASV in greater than 50% of samples (Supplementary Data 21). Similarly, a PCoA analysis using Bray Curtis distances was


conducted, and the values for principle components axis 1 and 2 were extracted (Supplementary Data 22). Both calculations were done in phyloseq version 1.34.083. These diversity-based


microbiome features were then mapped as QTLs as described subsequently. METAGENOMICS ANALYSIS For the one set of replicates for each accession, paired-end sequence read libraries were


generated in the length of 150 bp per read on NovaSeq paired-end platform by BaseClear B.V. Demultiplexing was performed before the following analysis. It is computationally expensive to


assemble the 114 read libraries all at once. Therefore, a strategy of (merging) partial assemblies was undertaken. Two assemblers were used to create the assembled contigs, namely SPAdes


(version 3.13.2)84 and MEGAHIT (version 1.2.9)85. Assembly quality was assessed by running MultiQC (version 1.8)86 with Quast Module87 (Supplementary Figure 5). First, 6 modern parents, 5


wild parents, and 1 bulk soil sample were co-assembled via SPAdes with the metagenomic mode and parameter of -k 21,33,55,99, generating the first assembly (A1). Subsequently, a second


assembly (A2) was done using the unmapped reads from the remaining metagenomes using MEGAHIT with the parameter of --k-list 27,33,55,77,99. The third assembly (A3) was performed similarly as


A2, however, included the unmapped reads, ambiguously mapped reads, and mapped reads with a low mapping quality score (MapQ < 20) (Supplementary Data 18). Read mapping was done with


BWA-MEM with default settings88 and SAMtools was used to convert the resulting SAM files into sorted and indexed BAM files (version 1.10). Extraction of these reads was conducted by samtools


bam2fq. Redundancy between assemblies was evaluated by alignment to A1 via nucmer package of MUMmer with --maxmatch option (version:4.0.0)89. Firstly, 111.5 Gbp of reads from the parental


samples were assembled, labeled as A1, and yielded a total assembly length of 8.6 Gbp with the largest contig of 933.0 kilobase pairs (Kbp). After aligning the reads from RIL samples to A1,


unmapped reads, ambiguously mapped reads, and mapped reads with a low mapping quality score (MapQ < 20) were retrieved and assembled, yielding the second and third assembly (A2 and A3).


Specifically, A2 stemmed from solely the unmapped reads while A3 included the ambiguously mapped reads and mapped reads with MapQ < 20 in addition to the unmapped reads. A2 and A3


produced a total assembly length of 9.6 Gbp and 14.0 Gbp, with the largest contig of 56.2 and 86.3 Kbp respectively. There were 1.2, 2.0, and 2.8 million contigs with the length over 1 Kb


for A1, A2, and A3 respectively. In particular, 912 contigs in A1 were greater or equal to 50 Kbp whereas 1 or 2 such large contigs were successfully assembled in A2 or A3. The detailed


assembly statistics is given in Supplementary Data 18 and the numbers of contigs with different ranges of length for each assembly are presented in Supplementary Fig. 5. The sequence


similarities of the contigs in each assembly (≥1 Kbp) were compared using the nucmer package in MUMer. No contigs in A2 were reported to share an overlapped region with A1, therefore contigs


in A1 and A2 could be merged directly. When A3 was aligned to A1, 1.1% of the total length (≥1 Kbp) of A3 was reported to be overlapped with A1, however, only 18 contigs from A3 were 100%


identical to regions in larger contigs in A1. The sensitivity of filtering the overlapping contigs was evaluated by a benchmarking test using a random RIL sample to calculate the mapping


rates (Supplementary Fig. 6). 83.4% reads were mapped to A1 + A3 at MapQ ≥ 20 without filtering. Excluding the contigs from A3 that were completely and identically covered by A1, the mapping


rate was nearly the same as the one without filtering. Nevertheless, the removal of all aligned contigs in A3 resulted in a slight drop of mapping rate to 82.6%. To conclude, the final


assembly was determined as A1 + A3 with the 18 redundant contigs from A3 removed. To assess the overall assembly quality and quantify the abundance of contigs among all samples, metagenomic


reads were mapped to A1, A1 + A2, and A1 + A3 (deduplicated) respectively. Afterwards, the mapping rates were calculated for the mapped reads with MapQ > 20 in each sample. As shown in


Supplementary Fig. 7, approximately 70% reads among rhizosphere samples could be mapped to A1, while the mapping rates were 55 to 65% in the bulk soil samples. With the unmapped reads


assembled and added to A1, the mapping rates for A1 + A2 increased by 10%. The read recruitment was further improved by assembling and adding ambiguously mapped reads and mapped reads with


low MapQ in the final assembly (A1 + A3). A1, as well as de-replicated A3, were merged to acquire the final assembly. All the ‘contigs’ mentioned below are referring to the contigs in this


final assembly. BINNING OF METAGENOMIC CONTIGS Metabat2 (version 2:2.15)90 was used for assigning the contigs into genomic bins. Based on tetra-nucleotide frequency and abundance scores, 588


genomic bins were generated. Afterwards, genomic quality of those genomes was evaluated by CheckM (version: 1.1.1)30 with the command “checkm linage_wf” (Supplementary Data 8). The 33


genomes displaying the completeness larger than 90% and contamination smaller than 5% were used for further study as quantitative traits. MAKING PHENOTYPE FILES BASED ON CONTIG DEPTH Read


counts for each position on the assembled contigs were acquired using bedtools genomecov (version: 2.29.2)91. A custom Python script was applied to calculate the average depth (defined as


the number of total mapped reads divided by contig length) and coverage (defined as the number of covered base pairs divided by contig length) of every contig. Furthermore, the average


abundance of contigs assigned into a bin was calculated for the high-quality genomic bins detected by CheckM30. FEATURE SELECTION Average depths of the contigs were first normalized using


the CSS and filtered based on the effective sample size using metagenomeSeq package (v1.28.2)27. Differential abundance analysis was performed by moderated _t_-tests between groups using the


makeContrasts and eBayes commands retrieved from the R package Limma (v.3.22.7)92. Obtained _P_-values were adjusted using the Benjamini–Hochberg correction method. Differences in the


abundance of contigs between groups were considered significant when adjusted _P_-values were lower than 0.01 (Supplementary Data 19). In either comparison, the contigs that were


significantly enriched in the rhizosphere were gathered and regarded as the statistically rhizosphere-enriched contigs after removing the replicated ones. To perform QTL analysis for the


abundance of these enriched rhizosphere contigs, only the contigs with biological meanings were kept, i.e., the log (2) fold-change of mean values for the normalized abundances of RIL and


bulk samples should be greater than 2, and the contig should be in enough depth with at least the mean value of a group larger than 1. This selection step resulted in 1249


rhizosphere-enriched contigs. The statistics of the filtered normalized abundance were further inspected based on the presence and absence of contigs, standard deviation, and mean values of


the counts. TAXONOMIC AND FUNCTIONAL ANNOTATION OF THE METAGENOME Taxonomic classifications were assigned to the contigs in the final assembly using Kraken2 (version: 2.0.8)31 based on exact


_k_-mer matches. A custom Kraken2 database was built to contain _RefSeq_ complete genomes/proteins of archaea, bacteria, viral, fungi, and protozoa. Univec_Core was also included in the


custom database (20200308). Using the Kraken2 standard output, a python script based on TaxonKit93 was utilized to add full taxonomic names to each contig in the format of tab-delimited


table. 76.22% of the contigs > 1 kb were classified. Among the contigs >10 kb, up to 99.44% contigs were classified. Prokaryotic microbial genes were predicted by Prodigal (version:


2.6.3)94 with metagenomics mode. 10,246,55 genes were predicted from contigs > 1 kb. Open reading frames (ORFs) on contigs >10 kb were annotated by prokka (v1.14.5) and the


_Streptomyces_ ASV5 bin (MAG.72) was further annotated by DRAM (v1.2.0) integrating UniRef, Pfam, dbCAN and KEGG databases95. To assess the impact of the prioritization on the functional


representation of the metagenome, we identified the fraction of protein clusters represented in the rhizosphere-enriched contigs compared to the rest of the contigs greater than 10 kb.


First, Prodigal was used in metagenomics mode to predict genes in the metagenomic assembly with contigs longer than 10 kbp. Next, MMSeqs2 was used to cluster the protein sequences based on


70% similarity and based on 50% similarity, and with or without partial predicted genes32. To calculate the number of clusters that contained proteins encoded in rhizosphere-enriched


contigs, the clusters were searched for the presence of protein IDs of the 1249 rhizosphere-enriched contigs. In total, approximately 8.3% of protein clusters contained genes from the


rhizosphere-enriched contigs. In addition to proteins contained on rhizosphere-enriched contigs, these clusters contained approximately 25% of all proteins encoded in contigs larger than 10 


kb (Supplementary Data 20). SINGLE NUCLEOTIDE VARIANT ANALYSIS To investigate strain level QTLs, we mapped single nucleotide variants (SNVs) identified using inStrain on the 1249


rhizosphere-enriched contigs. A total of 555, 382, and 535,432 SNVs were identified in the modern and wild parental metagenomes respectively. Of these, 162,299 and 142,349 SNVs were unique


to each dataset respectively, as they either contained only reference alleles or did not exceed the inStrain SNV calling thresholds. For each unique SNV locus, coverage in the other dataset


was determined using SAMtools depth after read filtering with settings comparable to inStrain and was considered identical to the reference allele frequency. Including the unique SNVs, this


resulted in a final set of 697,731 SNVs. To select SNVs that showed differential reference allele frequencies between MM and P, first the difference in reference allele frequency (MM–P) was


calculated per SNV. From the distribution of all SNVs, the 95% confidence interval (CI) was determined to select the 5% (30,911) most different SNVs (Supplementary Fig. 8). SNVs were further


selected using a Fisher’s exact test based on the allele read count differences between MM and P. _P_-values were sorted, and a final selection of 15,026 differentially abundant SNVs


distributed over 1037 contigs was obtained using a Benjamini-Hochberg false discovery rate (FDR) correction of 0.01. SNV allele read counts were extracted from the RIL dataset using the


pysam Python package after filtering with settings comparable to inStrain. QUANTITATIVE TRAIT LOCUS ANALYSIS The QTL analysis linking selected amplicon, contig, bin, and SNV features with


plant loci was performed using the R package R/qtl225. Pseudomarkers were added to the genetic map to increase resolution, with a step distance of 1 Mbp between the markers and


pseudomarkers. Plant genome probabilities were calculated using the genetic map with pseudomarkers, plant loci cross data, and error probability of 1E-4. Plant locus kinship matrix was


calculated as proportion of shared alleles using conditional allele probabilities of all plant chromosomes, which were calculated from the plant genome probabilities. A genome scan using a


single-QTL model using a linear mixed model was performed on the SNV allele read counts as phenotypes, plant genotype probabilities as input variables and as covariates the number of leaves,


harvest day, rhizosphere soil weight (g), soil starting weight (g) and plant dry weight (g). The LOD score was determined for each plant locus SNV allele combination. A permutation test


using randomized data was performed with 1000 permutations to assess the distribution of the LOD scores. The 95% quantile was used as threshold for the selection of LOD peaks, as well as a


_P_ = 0.95 Bayes credible interval probability. INDEPENDENT VALIDATION OF QTLS THROUGH BULK SEGREGANT ANALYSIS To validate the QTLs, 33 _Solanum lycopersicum_ cv. Moneymaker (modern), 30


_Solanum pimpinellifolium_ L. accession CGN14498, and 77 RIL accessions (with replicates of 4 each) were grown and their microbiomes characterized through 16S rRNA gene amplicon sequencing.


Parental lines and RIL accessions were germinated in pots filled with 300 g agricultural soil. For each accession, were planted with six plants per replicate pot. The plants were arranged


randomly in the growth chamber (25 °C, 16 h daylight) and watered every day. Bulk soil samples without plants were used as controls (_N_ = 31). Rhizospheric soil was collected according to


standard methods96. In order to synchronize the developmental stage, the plants were harvested after 21 days, or when the 3rd trifoliate leaf was reached. The soil loosely attached to the


roots was removed and the entire root system was transferred to a 15 mL tube containing 5 mL LifeGuard Soil Preservation Solution (MoBio Laboratories). The tubes were vigorously vortexed and


sonicated. Subsequently, the roots were removed and at least 1 g (wet weight) of rhizospheric soil was recovered per sample for DNA extraction. For the bulk soil samples, approximately 1 g


of soil was collected and mixed with 5 mL of LifeGuard solution. To extract rhizospheric DNA, PowerSoil Total DNA/RNA Isolation Kit (MoBio Laboratories, Inc., USA) was used in accordance


with the manufacturer’s instruction. Rhizospheric DNA was obtained using RNA PoweSoil DNA Elution Accessory Kit (MoBio Laboratories, Inc. USA). The quantity and quality of the obtained DNA


was checked by ND1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA) and Qubit 2.0 fluorometer (ThermoFisher Scientific, USA). DNA samples were stored at −20 °C until further


use. The extracted samples were used for amplification and sequencing of the 16S rRNA gene, targeting the variable V3–V4 (Forward Primer: 5′-CCTACGGGNGGCWGCAG-3′ Reverse Primer:


5′-GACTACHVGGGTATACTAATCC-3′) resulting in amplicons of approximately ~460 bp. Dual indices and Illumina sequencing adapters using the Nextera XT Index Kit were attached to the V3–V4


amplicons. Subsequently, library quantification, normalization, and pooling were performed and MiSeq v3 reagent kits were used to finally load the samples for MiSeq sequencing. For more info


please refer to the guidelines of Illumina MiSeq System. The RDP extension to PANDASeq97, named Assembler98, was used to merge paired-end reads with a minimum overlap of 10 bp and at least


a Phred score of 25. Primer sequences were removed from the per sample FASTQ files using Flexbar version 2.599. Reads were processed as before except the Silva version 132 was used for


taxonomic classification82. REPORTING SUMMARY Further information on research design is available in the Nature Research Reporting Summary linked to this article. DATA AVAILABILITY The 16S


amplicons and shotgun metagenomics sequencing data have been deposited in the NCBI database under BioProject ID PRJNA787039 and PRJNA789467, respectively. Metagenome assembled genomes are


available at Zenodo [https://doi.org/10.5281/zenodo.6561541]. The Silva database was used to assign taxonomy to 16S rRNA amplicon sequences [https://www.arb-silva.de/download/archive/]. A


custom database was used to assign taxonomy for Kraken. Due to size limitation, this database is available upon request (please contact J.M.R. at [email protected] and expect 2


weeks of processing time). Source data are provided with this paper. CODE AVAILABILITY The code used in the analysis can be found at Zenodo [https://doi.org/10.5281/zenodo.6561541].


REFERENCES * Oyserman, B. O., Medema, M. H. & Raaijmakers, J. M. Road MAPs to engineer host microbiomes. _Curr. Opin. Microbiol._ 43, 46–54 (2018). Article  CAS  PubMed  Google Scholar 


* Marasco, R. et al. A drought resistance-promoting microbiome is selected by root system under desert farming. _PLoS One_ 7, e48479 (2012). Article  CAS  ADS  PubMed  PubMed Central  Google


Scholar  * Carrión, V. J. et al. Pathogen-induced activation of disease-suppressive functions in the endophytic root microbiome. _Science_ 366, 606–612 (2019). Article  ADS  CAS  PubMed 


Google Scholar  * Finkel, O. M. et al. A single bacterial genus maintains root growth in a complex microbiome. _Nature_ https://doi.org/10.1038/s41586-020-2778-7 (2020). Article  PubMed 


Google Scholar  * Wagner, M. R. et al. Microbe-dependent heterosis in maize. _Proc. Natl Acad. Sci. USA_ 118, e2021965118 (2021). Article  CAS  PubMed  PubMed Central  Google Scholar  *


Sasse, J., Martinoia, E. & Northen, T. Feed your friends: Do plant exudates shape the root microbiome? _Trends Plant Sci._ 23, 25–41 (2018). Article  CAS  PubMed  Google Scholar  *


Canarini, A., Kaiser, C., Merchant, A., Richter, A. & Wanek, W. Root exudation of primary metabolites: Mechanisms and their roles in plant responses to environmental stimuli. _Front.


Plant Sci._ 10, 157 (2019). Article  PubMed  PubMed Central  Google Scholar  * Tracanna, V. et al. Dissecting disease-suppressive rhizosphere microbiomes by functional amplicon sequencing


and 10× metagenomics. _mSystems_ 6, e0111620 (2021). * Crowley, D. E. _Iron Nutrition in Plants and Rhizospheric Microorganisms_ (eds Barton, L. L. & Abadia, J.) 169–198 (Springer


Netherlands, 2006). * Oyserman, B. O. et al. Extracting the GEMs: Genotype, environment, and microbiome interactions shaping host phenotypes. _Front. Microbiol._ 11, 574053 (2021). Article 


PubMed  PubMed Central  Google Scholar  * Pérez-Jaramillo, J. E., Carrión, V. J., de Hollander, M. & Raaijmakers, J. M. The wild side of plant microbiomes. _Microbiome_ 6, 143 (2018).


Article  PubMed  PubMed Central  Google Scholar  * Favela, A., O. Bohn, M. & D. Kent, A. Maize germplasm chronosequence shows crop breeding history impacts recruitment of the rhizosphere


microbiome. _ISME J._ https://doi.org/10.1038/s41396-021-00923-z (2021). * Gruber, K. Agrobiodiversity: The living library. _Nature_ 544, S8–S10 (2017). Article  CAS  ADS  PubMed  Google


Scholar  * Lopez-Delacalle, M. et al. Using tomato recombinant lines to improve plant tolerance to stress combination through a more efficient nitrogen metabolism. _Front. Plant Sci._ 10,


1702 (2019). Article  PubMed  Google Scholar  * Vosman, B. et al. QTL mapping of insect resistance components of Solanum galapagense. _Theor. Appl Genet._ 132, 531–541 (2019). Article  CAS 


PubMed  Google Scholar  * Liu, Z. et al. Identification of a Solanum pennellii chromosome 4 fruit flavor and nutritional quality-associated metabolite QTL. _Front. Plant Sci._ 7, 1671


(2016). Article  PubMed  PubMed Central  Google Scholar  * Pereira, P. A. A., Miranda, B. D., Attewell, J. R., Kmiecik, K. A. & Bliss, F. A. Selection for increased nodule number in


common bean (Phaseolus vulgaris L.). _Plant Soil_ 148, 203–209 (1993). Article  Google Scholar  * Deng, S. et al. Genome wide association study reveals plant loci controlling heritability of


the rhizosphere microbiome. _ISME J._ https://doi.org/10.1038/s41396-021-00993-z (2021). Article  PubMed  PubMed Central  Google Scholar  * Bergelson, J., Mittelstrass, J. & Horton, M.


W. Characterizing both bacteria and fungi improves understanding of the Arabidopsis root microbiome. _Sci. Rep._ 9, 24 (2019). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  *


Wallace, J. G., Kremling, K. A., Kovar, L. L. & Buckler, E. S. Quantitative genetics of the maize leaf microbiome. _Phytobiomes J._ 2, 208–224 (2018). Article  Google Scholar  *


Bergelson, J., Brachi, B., Roux, F. & Vailleau, F. Assessing the potential to harness the microbiome through plant genetics. _Curr. Opin. Biotechnol._ 70, 167–173 (2021). Article  CAS 


PubMed  Google Scholar  * Olm, M. R. et al. inStrain profiles population microdiversity from metagenomic data and sensitively detects shared microbial strains. _Nat. Biotechnol._ 39, 727–736


(2021). Article  CAS  PubMed  Google Scholar  * Voorrips, R. E., Verkerke, W., Finkers, R., Jongerius, R. & Kanne, J. Inheritance of taste components in tomato. _Acta Physiol. Plant_


22, 259–261 (2000). Article  CAS  Google Scholar  * Khan, N. et al. Exploring the natural variation for seedling traits and their link with seed dimensions in tomato. _PLoS One_ 7, e43991


(2012). Article  CAS  ADS  PubMed  PubMed Central  Google Scholar  * Broman, K. W. et al. R/qtl2: Software for mapping quantitative trait loci with high-dimensional data and multiparent


populations. _Genetics_ 211, 495–502 (2019). Article  CAS  PubMed  PubMed Central  Google Scholar  * Sterken, M. G. et al. Plasticity of maternal environment dependent expression-QTLs of


tomato seeds. Preprint at _bioRxiv_ https://doi.org/10.1101/2021.03.29.437558 (2021). * Paulson, J. N., Stine, O. C., Bravo, H. C. & Pop, M. Differential abundance analysis for microbial


marker-gene surveys. _Nat. Methods_ 10, 1200–1202 (2013). Article  CAS  PubMed  PubMed Central  Google Scholar  * Lorenz, K. & Cohen, B. A. Small- and large-effect quantitative trait


locus interactions underlie variation in yeast sporulation efficiency. _Genetics_ 192, 1123–1132 (2012). Article  CAS  PubMed  PubMed Central  Google Scholar  * Kang, D. D., Froula, J.,


Egan, R. & Wang, Z. MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities. _PeerJ_ 3, e1165–e1165 (2015). Article  CAS  PubMed 


PubMed Central  Google Scholar  * Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. CheckM: Assessing the quality of microbial genomes recovered from isolates,


single cells, and metagenomes. _Genome Res._ 25, 1043–1055 (2015). Article  CAS  PubMed  PubMed Central  Google Scholar  * Wood, D. E., Lu, J. & Langmead, B. Improved metagenomic


analysis with Kraken 2. _Genome Biol._ 20, 257 (2019). Article  CAS  PubMed  PubMed Central  Google Scholar  * Steinegger, M. & Söding, J. MMseqs2 enables sensitive protein sequence


searching for the analysis of massive data sets. _Nat. Biotechnol._ 35, 1026–1028 (2017). Article  CAS  PubMed  Google Scholar  * The Tomato Genome Consortium. The tomato genome sequence


provides insights into fleshy fruit evolution. _Nature_ 485, 635–641 (2012). Article  ADS  CAS  Google Scholar  * Lin, T. et al. Genomic analyses provide insights into the history of tomato


breeding. _Nat. Genet._ 46, 1220–1226 (2014). Article  CAS  PubMed  Google Scholar  * Ling, H.-Q., Bauer, P., Bereczky, Z., Keller, B. & Ganal, M. The tomato fer gene encoding a bHLH


protein controls iron-uptake responses in roots. _Proc. Natl Acad. Sci. USA_ 99, 13938–13943 (2002). Article  CAS  ADS  PubMed  PubMed Central  Google Scholar  * Morcillo, R. et al. Plant


transcriptome reprograming and bacterial extracellular metabolites underlying tomato drought resistance triggered by a beneficial soil bacteria. _Metabolites_ 11, 369 (2021). Article  CAS 


PubMed  PubMed Central  Google Scholar  * Zhou, F. & Pichersky, E. The complete functional characterisation of the terpene synthase family in tomato. _N. Phytol._ 226, 1341–1360 (2020).


Article  CAS  Google Scholar  * Kong, H. G., Song, G. C., Sim, H.-J. & Ryu, C.-M. Achieving similar root microbiota composition in neighbouring plants through airborne signalling. _ISME


J._ 15, 397–408 (2021). Article  CAS  PubMed  Google Scholar  * Huang, M. et al. The major volatile organic compound emitted from _Arabidopsis thaliana_ flowers, the sesquiterpene


(_E_)‐β‐caryophyllene, is a defense against a bacterial pathogen. _N. Phytologist_ 193, 997–1008 (2012). Article  CAS  Google Scholar  * Sweeney, C., Lakshmanan, V. & Bais, H. P.


Interplant aboveground signaling prompts upregulation of auxin promoter and malate transporter as part of defensive response in the neighboring plants. _Front. Plant Sci_. 8, 595 (2017). *


Ye, J. et al. An InDel in the promoter of _Al-ACTIVATED MALATE TRANSPORTER9_ selected during tomato domestication determines fruit malate contents and aluminum tolerance. _Plant Cell_ 29,


2249–2268 (2017). Article  CAS  PubMed  PubMed Central  Google Scholar  * Safdarian, M., Askari, H., Shariati, J. V. & Nematzadeh, G. Transcriptional responses of wheat roots inoculated


with Arthrobacter nitroguajacolicus to salt stress. _Sci. Rep._ 9, 1792 (2019). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  * Nozoye, T. The nicotianamine synthase gene is a


useful candidate for improving the nutritional qualities and Fe-deficiency tolerance of various crops. _Front. Plant Sci._ 9, 340 (2018). Article  PubMed  PubMed Central  Google Scholar  *


Blin, K. et al. antiSMASH 6.0: Improving cluster detection and comparison capabilities. _Nucleic Acids Res._ 49, W29–W35 (2021). Article  CAS  PubMed  PubMed Central  Google Scholar  *


Chewning, S. S. et al. Root-associated streptomyces isolates harboring melC genes demonstrate enhanced plant colonization. _Phytobiomes J._ 3, 165–176 (2019). Article  Google Scholar  *


Schöner, T. A. et al. Aryl polyenes, a highly abundant class of bacterial natural products, are functionally related to antioxidative carotenoids. _ChemBioChem_ 17, 247–253 (2016). Article 


CAS  PubMed  Google Scholar  * Gu, S. et al. Competition for iron drives phytopathogen control by natural rhizosphere microbiomes. _Nat. Microbiol._ 5, 1002–1010 (2020). Article  CAS  PubMed


  PubMed Central  Google Scholar  * Crits-Christoph, A., Bhattacharya, N., Olm, M. R., Song, Y. S. & Banfield, J. F. Transporter genes in biosynthetic gene clusters predict metabolite


characteristics and siderophore activity. _Genome Res._ 31, 239–250 (2021). Article  PubMed Central  Google Scholar  * Attia, M. A. et al. In vitro and in vivo characterization of three


Cellvibrio japonicus glycoside hydrolase family 5 members reveals potent xyloglucan backbone-cleaving functions. _Biotechnol. Biofuels_ 11, 45 (2018). Article  CAS  PubMed  PubMed Central 


Google Scholar  * Nelson, C. E. et al. Comprehensive functional characterization of the glycoside hydrolase family 3 enzymes from _Cellvibrio japonicus_ reveals unique metabolic roles in


biomass saccharification: Complex glucan utilization in _C. japonicus_. _Environ. Microbiol._ 19, 5025–5039 (2017). Article  CAS  PubMed  PubMed Central  Google Scholar  * Gardner, J. G. et


al. Systems biology defines the biological significance of redox‐active proteins during cellulose degradation in an aerobic bacterium. _Mol. Microbiol._ 94, 1121–1133 (2014). Article  CAS 


Google Scholar  * Yin, Y. et al. dbCAN: a web resource for automated carbohydrate-active enzyme annotation. _Nucleic Acids Res._ 40, W445–W451 (2012). Article  CAS  PubMed  PubMed Central 


Google Scholar  * Schrempf, H. & Walter, S. The cellulolytic system of Streptomyces reticuli. _Int. J. Biol. Macromolecules_ 17, 353–355 (1995). Article  CAS  Google Scholar  * Worsley,


S. F. et al. Investigating the role of root exudates in recruiting streptomyces bacteria to the Arabidopsis thaliana microbiome. _Front. Mol. Biosci._ 8, 686110 (2021). Article  CAS  PubMed


  PubMed Central  Google Scholar  * Kamilova, F. et al. Organic acids, sugars, and l -tryptophane in exudates of vegetables growing on stonewool and their effects on activities of


rhizosphere bacteria. _MPMI_ 19, 250–256 (2006). Article  CAS  PubMed  Google Scholar  * Bentley, S. D. et al. Complete genome sequence of the model actinomycete Streptomyces coelicolor


A3(2). _Nature_ 417, 141–147 (2002). Article  ADS  PubMed  Google Scholar  * Matilla, M. A., Espinosa-Urgel, M., Rodríguez-Herva, J. J., Ramos, J. L. & Ramos-González, M. I. Genomic


analysis reveals the major driving forces of bacterial life in the rhizosphere. _Genome Biol._ 8, R179 (2007). Article  CAS  PubMed  PubMed Central  Google Scholar  * Chaparro, J. M. et al.


Root exudation of phytochemicals in Arabidopsis follows specific patterns that are developmentally programmed and correlate with soil microbial functions. _PLoS One_ 8, e55731 (2013).


Article  CAS  ADS  PubMed  PubMed Central  Google Scholar  * Polizeli, M. L. T. M. et al. Xylanases from fungi: Properties and industrial applications. _Appl Microbiol. Biotechnol._ 67,


577–591 (2005). Article  CAS  PubMed  Google Scholar  * Mellerowicz, E. J. & Gorshkova, T. A. Tensional stress generation in gelatinous fibres: A review and possible mechanism based on


cell-wall structure and composition. _J. Exp. Bot._ 63, 551–565 (2012). Article  CAS  PubMed  Google Scholar  * Sugiura, H. et al. Bacterial inducible expression of plant cell wall-binding


protein YesO through conflict between Glycine max and saprophytic Bacillus subtilis. _Sci. Rep._ 10, 18691 (2020). Article  CAS  ADS  PubMed  PubMed Central  Google Scholar  * Meyer, R. S.,


DuVal, A. E. & Jensen, H. R. Patterns and processes in crop domestication: An historical review and quantitative analysis of 203 global food crops: Tansley review. _N. Phytologist_ 196,


29–48 (2012). Article  Google Scholar  * Gopal, M. & Gupta, A. Microbiome selection could spur next-generation plant breeding strategies. _Front. Microbiol_. 7, 1971 (2016). * Busby, P.


E. et al. Research priorities for harnessing plant microbiomes in sustainable agriculture. _PLoS Biol._ 15, e2001793 (2017). Article  CAS  PubMed  PubMed Central  Google Scholar  *


Beilsmith, K. et al. Genome-wide association studies on the phyllosphere microbiome: Embracing complexity in host-microbe interactions. _Plant J._ 97, 164–181 (2019). Article  CAS  PubMed 


Google Scholar  * Wille, L., Messmer, M. M., Studer, B. & Hohmann, P. Insights to plant–microbe interactions provide opportunities to improve resistance breeding against root diseases in


grain legumes. _Plant, Cell Environ._ 42, 20–40 (2019). Article  CAS  Google Scholar  * Horton, M. W. et al. Genome-wide association study of Arabidopsis thaliana leaf microbial community.


_Nat. Commun._ 5, 5320 (2014). Article  ADS  PubMed  Google Scholar  * Sideli, G. M. et al. Quantitative phenotyping of shell suture strength in walnut (Juglans regia L.) enhances precision


for detection of QTL and genome-wide association mapping. _PLoS One_ 15, e0231144 (2020). Article  CAS  PubMed  PubMed Central  Google Scholar  * Lloréns-Rico, V., Vieira-Silva, S.,


Gonçalves, P. J., Falony, G. & Raes, J. Benchmarking microbiome transformations favors experimental quantitative approaches to address compositionality and sampling depth biases. _Nat.


Commun._ 12, 3562 (2021). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  * Xu, L. et al. Genome-resolved metagenomics reveals role of iron metabolism in drought-induced


rhizosphere microbiome dynamics. _Nat. Commun._ 12, 3209 (2021). Article  CAS  ADS  PubMed  PubMed Central  Google Scholar  * Sade, N. et al. Improving plant stress tolerance and yield


production: Is the tonoplast aquaporin SlTIP2; 2 a key to isohydric to anisohydric conversion? _N. Phytologist_ 181, 651–661 (2009). Article  CAS  Google Scholar  * Reuscher, S. et al.


Genome-wide identification and expression analysis of aquaporins in tomato. _PLoS One_ 8, e79052 (2013). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  * Schwarz, B. & Bauer,


P. FIT, a regulatory hub for iron deficiency and stress signaling in roots, and FIT-dependent and -independent gene signatures. _J. Exp. Bot._ 71, 1694–1705 (2020). Article  CAS  PubMed 


PubMed Central  Google Scholar  * Nascimento, F. X., Rossi, M. J. & Glick, B. R. Ethylene and 1-aminocyclopropane-1-carboxylate (ACC) in plant–bacterial interactions. _Front. Plant Sci._


9, 114 (2018). Article  PubMed  PubMed Central  Google Scholar  * Cosgrove, D. J. Catalysts of plant cell wall loosening. _F1000Res_ 5, 119 (2016). Article  Google Scholar  * Cosgrove, D.


J. Microbial expansins. _Annu. Rev. Microbiol._ 71, 479–497 (2017). Article  CAS  PubMed  Google Scholar  * Smith, N. W., Shorten, P. R., Altermann, E., Roy, N. C. & McNabb, W. C. The


classification and evolution of bacterial cross-feeding. _Front. Ecol. Evol._ 7, 153 (2019). Article  Google Scholar  * Lunn, J. E., Delorge, I., Figueroa, C. M., Van Dijck, P. & Stitt,


M. Trehalose metabolism in plants. _Plant J._ 79, 544–567 (2014). Article  CAS  PubMed  Google Scholar  * Beauregard, P. B., Chai, Y., Vlamakis, H., Losick, R. & Kolter, R. Bacillus


subtilis biofilm induction by plant polysaccharides. _Proc. Natl Acad. Sci. USA_ 110, E1621–E1630 (2013). Article  CAS  ADS  PubMed  PubMed Central  Google Scholar  * Streit, W. R. Biotin


and other water-soluble vitamins are key growth factors for alfalfa root colonization by _Rhizobium meliioti_ 1021. _MPMI_ 9, 330 (1996). Article  CAS  PubMed  Google Scholar  * Callahan, B.


J. et al. DADA2: High-resolution sample inference from Illumina amplicon data. _Nat. Methods_ 13, 581–583 (2016). Article  CAS  PubMed  PubMed Central  Google Scholar  * Quast, C. et al.


The SILVA ribosomal RNA gene database project: Improved data processing and web-based tools. _Nucleic Acids Res._ 41, D590–D596 (2012). Article  CAS  PubMed  PubMed Central  Google Scholar 


* McMurdie, P. J. & Holmes, S. phyloseq: An R package for reproducible interactive analysis and graphics of microbiome census data. _PLoS One_ 8, e61217 (2013). Article  CAS  ADS  PubMed


  PubMed Central  Google Scholar  * Bankevich, A. et al. SPAdes: A new genome assembly algorithm and its applications to single-cell sequencing. _J. Comput. Biol.: J. Comput. Mol. Cell


Biol._ 19, 455–477 (2012). Article  MathSciNet  CAS  Google Scholar  * Li, D. et al. MEGAHIT v1.0: A fast and scalable metagenome assembler driven by advanced methodologies and community


practices. _Methods_ 102, 3–11 (2016). Article  CAS  PubMed  Google Scholar  * Ewels, P., Magnusson, M., Lundin, S. & Käller, M. MultiQC: Summarize analysis results for multiple tools


and samples in a single report. _Bioinformatics_ 32, 3047–3048 (2016). Article  CAS  PubMed  PubMed Central  Google Scholar  * Mikheenko, A., Prjibelski, A., Saveliev, V., Antipov, D. &


Gurevich, A. Versatile genome assembly evaluation with QUAST-LG. _Bioinformatics_ 34, i142–i150 (2018). Article  CAS  PubMed  PubMed Central  Google Scholar  * Li, H. Aligning sequence


reads, clone sequences and assembly contigs with BWA-MEM. arXiv preprint arXiv:1303.3997 (2013). * Marçais, G. et al. MUMmer4: A fast and versatile genome alignment system. _PLoS Comput


Biol._ 14, e1005944 (2018). Article  CAS  PubMed  PubMed Central  Google Scholar  * Kang, D. D. et al. MetaBAT 2: An adaptive binning algorithm for robust and efficient genome reconstruction


from metagenome assemblies. _PeerJ_ 7, e7359 (2019). Article  PubMed  PubMed Central  Google Scholar  * Quinlan, A. R. & Hall, I. M. BEDTools: A flexible suite of utilities for


comparing genomic features. _Bioinformatics_ 26, 841–842 (2010). Article  CAS  PubMed  PubMed Central  Google Scholar  * Ritchie, M. E. et al. limma powers differential expression analyses


for RNA-sequencing and microarray studies. _Nucleic Acids Res._ 43, e47–e47 (2015). Article  CAS  PubMed  PubMed Central  Google Scholar  * Shen, W. & Ren, H. TaxonKit: A practical and


efficient NCBI taxonomy toolkit. _J. Genet. Genomics_ 48, 844–850 (2021). Article  PubMed  Google Scholar  * Hyatt, D. et al. Prodigal: Prokaryotic gene recognition and translation


initiation site identification. _BMC Bioinform._ 11, 119–119 (2010). Article  CAS  Google Scholar  * Shaffer, M. et al. DRAM for distilling microbial metabolism to automate the curation of


microbiome function. _Nucleic Acids Res._ 48, 8883–8900 (2020). Article  CAS  PubMed  PubMed Central  Google Scholar  * Lundberg, D. S. et al. Defining the core Arabidopsis thaliana root


microbiome. _Nature_ 488, 86–90 (2012). * Masella, A. P., Bartram, A. K., Truszkowski, J. M., Brown, D. G. & Neufeld, J. D. PANDAseq: Paired-end assembler for illumina sequences. _BMC


Bioinform._ 13, 1–7 (2012). Article  CAS  Google Scholar  * Cole, J. R. et al. Ribosomal Database Project: Data and tools for high throughput rRNA analysis. _Nucleic Acids Res._ 42, 633–642


(2014). Article  CAS  Google Scholar  * Dodt, M., Roehr, J., Ahmed, R. & Dieterich, C. FLEXBAR—Flexible barcode and adapter processing for next-generation sequencing platforms. _Biology_


1, 895–905 (2012). Article  PubMed  PubMed Central  Google Scholar  Download references ACKNOWLEDGEMENTS The project was financially supported, in part, by the NWO-TTW Perspective program


BackToRoots (TTW-project 14218 to J.M.R., V.J.C., V.C., and B.O.O.), by the NWO-Gravitation program MICRop (to J.M.R., M.H.M.), a National Institutes of Health (NIH) Genome to Natural


Products Network supplementary award (no. U01GM110706 to M.H.M.), a ZonMW Enabling Technologies Hotel project (no. 40-43500-98-210 to M.H.M.), a Senescyt fellowship awarded to S.S.F., and by


internal funding from the Netherlands Institute of Ecology. AUTHOR INFORMATION AUTHORS AND AFFILIATIONS * Department of Microbial Ecology, Netherlands Institute of Ecology, Wageningen, The


Netherlands Ben O. Oyserman, Stalin Sarango Flores, Thom Griffioen, Xinya Pan, Wouter Lokhorst, Azkia Nurfikari, Nejc Stopnisek, Viviane Cordovez, Víctor J. Carrión & Jos M. Raaijmakers


* Bioinformatics Group, Wageningen University, Wageningen, The Netherlands Ben O. Oyserman, Elmar van der Wijk, Lotte Pronk, Anne Kupczok & Marnix H. Medema * Institute of Biology,


Leiden University, Leiden, The Netherlands Stalin Sarango Flores, Víctor J. Carrión, Marnix H. Medema & Jos M. Raaijmakers * Department of Data Sciences, Genentech, Inc. South San


Francisco, South San Francisco, CA, USA Joseph N. Paulson * Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA Mercedeh Movassagh * Department of Data


Sciences Dana Farber Cancer Institute, Harvard T.H. Chan School of Public Health, Boston, MA, USA Mercedeh Movassagh * Wageningen Seed Lab, Laboratory of Plant Physiology, Wageningen


University, Wageningen, The Netherlands Wilco Ligterink * Theoretical Biology and Bioinformatics, Utrecht University, Utrecht, The Netherlands Basten L. Snoek Authors * Ben O. Oyserman View


author publications You can also search for this author inPubMed Google Scholar * Stalin Sarango Flores View author publications You can also search for this author inPubMed Google Scholar *


Thom Griffioen View author publications You can also search for this author inPubMed Google Scholar * Xinya Pan View author publications You can also search for this author inPubMed Google


Scholar * Elmar van der Wijk View author publications You can also search for this author inPubMed Google Scholar * Lotte Pronk View author publications You can also search for this author


inPubMed Google Scholar * Wouter Lokhorst View author publications You can also search for this author inPubMed Google Scholar * Azkia Nurfikari View author publications You can also search


for this author inPubMed Google Scholar * Joseph N. Paulson View author publications You can also search for this author inPubMed Google Scholar * Mercedeh Movassagh View author publications


You can also search for this author inPubMed Google Scholar * Nejc Stopnisek View author publications You can also search for this author inPubMed Google Scholar * Anne Kupczok View author


publications You can also search for this author inPubMed Google Scholar * Viviane Cordovez View author publications You can also search for this author inPubMed Google Scholar * Víctor J.


Carrión View author publications You can also search for this author inPubMed Google Scholar * Wilco Ligterink View author publications You can also search for this author inPubMed Google


Scholar * Basten L. Snoek View author publications You can also search for this author inPubMed Google Scholar * Marnix H. Medema View author publications You can also search for this author


inPubMed Google Scholar * Jos M. Raaijmakers View author publications You can also search for this author inPubMed Google Scholar CONTRIBUTIONS The study was conceived and designed by


B.O.O., V.J.C., W.Li, M.H.M., and J.M.R. The greenhouse experimentation and lab work were conducted by B.O.O., S.S.F., V.C., V.J.C., and A.N. Contributions to data analysis came from B.O.O.,


T.G., X.P., E.v.d.W., W.Lo, L.P., N.S., A.K., V.C., V.J.C., B.L.S., M.H.M., J.N.P., and M.M. The manuscript was drafted by B.O.O., B.L.S., M.H.M., and J.M.R. All authors contributed to the


revision and agreed upon the final draft. CORRESPONDING AUTHORS Correspondence to Ben O. Oyserman or Jos M. Raaijmakers. ETHICS DECLARATIONS COMPETING INTERESTS The authors declare no


competing interests. PEER REVIEW PEER REVIEW INFORMATION _Nature Communications_ thanks Joëlle Schläpfer, Maggie Wagner, and the other, anonymous, reviewer(s) for their contribution to the


peer review of this work. Peer reviewer reports are available. ADDITIONAL INFORMATION PUBLISHER’S NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps


and institutional affiliations. SUPPLEMENTARY INFORMATION SUPPLEMENTARY INFORMATION PEER REVIEW FILE DESCRIPTION OF ADDITIONAL SUPPLEMENTARY FILES SUPPLEMENTARY DATA 1 SUPPLEMENTARY DATA 2


SUPPLEMENTARY DATA 3 SUPPLEMENTARY DATA 4 SUPPLEMENTARY DATA 5 SUPPLEMENTARY DATA 6 SUPPLEMENTARY DATA 7 SUPPLEMENTARY DATA 8 SUPPLEMENTARY DATA 9 SUPPLEMENTARY DATA 10 SUPPLEMENTARY DATA 11


SUPPLEMENTARY DATA 12 SUPPLEMENTARY DATA 13 SUPPLEMENTARY DATA 14 SUPPLEMENTARY DATA 15 SUPPLEMENTARY DATA 16 SUPPLEMENTARY DATA 17 SUPPLEMENTARY DATA 18 SUPPLEMENTARY DATA 19 SUPPLEMENTARY


DATA20 SUPPLEMENTARY DATA 21 SUPPLEMENTARY DATA 22 REPORTING SUMMARY SOURCE DATA SOURCE DATA RIGHTS AND PERMISSIONS OPEN ACCESS This article is licensed under a Creative Commons Attribution


4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s)


and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s


Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not


permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit


http://creativecommons.org/licenses/by/4.0/. Reprints and permissions ABOUT THIS ARTICLE CITE THIS ARTICLE Oyserman, B.O., Flores, S.S., Griffioen, T. _et al._ Disentangling the genetic


basis of rhizosphere microbiome assembly in tomato. _Nat Commun_ 13, 3228 (2022). https://doi.org/10.1038/s41467-022-30849-9 Download citation * Received: 20 December 2021 * Accepted: 19 May


2022 * Published: 16 June 2022 * DOI: https://doi.org/10.1038/s41467-022-30849-9 SHARE THIS ARTICLE Anyone you share the following link with will be able to read this content: Get shareable


link Sorry, a shareable link is not currently available for this article. Copy to clipboard Provided by the Springer Nature SharedIt content-sharing initiative