Play all audios:
ABSTRACT While much prior work has explored the constraints on protein sequence and evolution induced by physical protein-protein interactions, the sequence-level constraints emerging from
non-binding functional interactions in metabolism remain unclear. To quantify how variation in the activity of one enzyme constrains the biochemical parameters and sequence of another, we
focus on dihydrofolate reductase (DHFR) and thymidylate synthase (TYMS), a pair of enzymes catalyzing consecutive reactions in folate metabolism. We use deep mutational scanning to quantify
the growth rate effect of 2696 DHFR single mutations in 3 TYMS backgrounds under conditions selected to emphasize biochemical epistasis. Our data are well-described by a relatively simple
enzyme velocity to growth rate model that quantifies how metabolic context tunes enzyme mutational tolerance. Together our results reveal the structural distribution of epistasis in a
metabolic enzyme and establish a foundation for the design of multi-enzyme systems. SIMILAR CONTENT BEING VIEWED BY OTHERS PERVASIVE COOPERATIVE MUTATIONAL EFFECTS ON MULTIPLE CATALYTIC
ENZYME TRAITS EMERGE VIA LONG-RANGE CONFORMATIONAL DYNAMICS Article Open access 12 March 2021 ENRICHING PRODUCTIVE MUTATIONAL PATHS ACCELERATES ENZYME EVOLUTION Article Open access 11
September 2024 PERVASIVE EPISTASIS EXPOSES INTRAMOLECULAR NETWORKS IN ADAPTIVE ENZYME EVOLUTION Article Open access 21 December 2023 INTRODUCTION Enzymes function within biochemical
pathways, exchanging substrates and products to generate useful metabolites. This metabolic context places constraints on enzyme velocity—the product of catalytic activity and enzyme
abundance. For example, the relative velocities of some enzymes must be coordinated to avoid accumulation of deleterious metabolic intermediates1,2,3. In other instances, optimal enzyme
abundance is set by a tradeoff between the cost of protein synthesis and the benefit of efficient nutrient utilization4,5,6. Considered at the pathway scale, metabolic enzymes are often
produced in evolutionarily conserved stoichiometric ratios across species7, providing further indication that relative—not just absolute—enzyme velocity is under selection. More generally,
the relationships amongst the velocity of a given enzyme, metabolic flux through a pathway, and cellular growth rate are non-linear and shaped by interactions between pathway enzymes (Fig.
1a). Indeed, a key result of metabolic control theory is that the control coefficient of an enzyme—defined as the fractional change in pathway-level flux given a fractional change in enzyme
velocity—depends on the starting (native) velocity of the enzyme, but _also_ on the velocity of all other enzymes in the pathway8,9. That is to say, given that enzymes act sequentially to
produce metabolites, the effects of mutations on cellular phenotype can be buffered or amplified depending on which enzymatic reactions control metabolic flux. As a consequence, enzyme
mutations that are neutral in one context may have profound consequences for metabolic flux and growth rate in the background of variation in another10,11,12,13,14. This context-dependence,
or epistasis, amongst metabolic enzymes need not be mediated by direct physical binding, but emerges indirectly through shared metabolite pools and a need to maintain flux while avoiding the
accumulation of deleterious intermediates6,11,15. While prior work has explored how physical protein-protein interactions (binding) constrain protein sequence, the constraints on sequence
and enzymatic activity emerging from these sorts of non-binding functional interactions in metabolism remain unclear. How is this biochemically-mediated epistasis organized in the protein
structure and reflected in the sequence? A quantitative understanding of how pathway context shapes sequence and activity would assist in the interpretation of disease-associated mutations,
the design of new enzymes, and directing the laboratory evolution of metabolic pathways. In this work, we examine the residue-level epistatic interactions between a pair of enzymes that
catalyze consecutive reactions in folate metabolism: dihdyrofolate reductase (DHFR) and thymidylate synthase (TYMS). The activity of these enzymes is strongly linked to _E. coli_ growth
rate, they are frequent targets of antibiotics and chemotherapeutics, and they co-evolve as a module both in the laboratory and across thousands of bacterial genomes1. Taking this enzyme
pair as a simplified model system in which to examine a biochemically-mediated epistatic interaction, we create a mathematical model relating variation in DHFR and TYMS catalytic parameters
to growth rate using a focused set of well-characterized point mutants. Then, to more deeply test this model and comprehensively map the pattern of epistasis between these two enzymes, we
measure the effect of nearly all possible DHFR single mutations (2696 in total) in the context of three TYMS variants selected to span a range of catalytic activities. The model predicts—and
the data shows—that TYMS background profoundly changes both the sign (buffering vs. aggravating) and magnitude of DHFR epistasis. Mapping the epistatic effects of mutation to the DHFR
tertiary structure reveals that they are organized into distinct clusters based on epistatic sign. Additionally, mutations with the largest magnitude epistatic effect to TYMS center around
the DHFR active site, while more weakly epistatic positions radiate outwards. Finally, we infer approximate values for DHFR catalytic power (_k__cat_/Km) across all 2696 mutations by using
growth rate measurements across TYMS backgrounds to constrain the enzyme velocity to growth rate model. The residues linked to catalysis form a structurally distributed network inside the
enzyme and are highly evolutionarily conserved. Taken together, our data demonstrates at single-residue resolution how epistasis mediated through a biochemical interaction reshapes a
mutational landscape. Our results indicate that metabolic context can strongly modulate enzyme evolution in both the clinic and the lab by facilitating or frustrating available mutational
paths. More generally, our results invite one to consider new ideas for the joint design of multi-enzyme systems that take into account shared constraints on relative activity and sequence.
RESULTS AN ENZYME VELOCITY TO GROWTH RATE MODEL FOR DHFR AND TYMS We constructed a mathematical model relating changes in DHFR and TYMS catalytic parameters to growth rate phenotype. Our
goals were to (1) formalize our previous empirical findings describing DHFR/TYMS biochemical coupling1, (2) quantify the absolute and relative constraints on DHFR and TYMS catalytic
activity, and (3) define the relationship between biochemical activity and epistasis. DHFR and TYMS play a central role in folate metabolism, a well-conserved biochemical pathway involved in
the synthesis of purine nucleotides, thymidine, glycine, and methionine16 (Fig. 1b). Consequently, this pathway is strongly linked to cell growth and a frequent target of antibiotics and
chemotherapeutics. DHFR is a 159 amino acid enzyme that catalyzes the reduction of dihydrofolate (DHF) to tetrahydrofolate (THF) using NADPH as a cofactor. The reduced THF then serves as a
carrier for activated one-carbon units in downstream metabolic processes. TYMS catalyzes the oxidation of THF back to DHF during deoxythymidine synthesis and is the sole enzyme responsible
for recycling the DHF pool17,18. Prior work by ourselves and others indicates that these two enzymes are strongly functionally coupled to each other and less coupled to the remainder of the
pathway: they co-evolve with each other in terms of synteny and gene co-occurrence across bacterial species (while far less so with the remainder of the pathway)1, suppressor mutations in
TYMS are sufficient to rescue inhibition of DHFR with trimethoprim in both the lab and the clinic1,19, and reduced expression of DHFR is uniquely rescued by reduced expression of TYMS (and
no other folate metabolic enzyme)1,20,21. Metabolomics data indicated that loss of DHFR function resulted in accumulation of DHF and depletion of reduced folates, and that compensatory loss
of function mutations in TYMS help to restore DHF and THF pools to more native-like levels1,22,23. Thus, DHFR and TYMS are a growth-linked two-enzyme system where epistasis is driven by a
biochemical interaction, with the added simplification that they are relatively functionally decoupled from surrounding metabolic context. With this information in mind we defined a
mathematical model comprised of two parts: the relationship between intracellular THF abundance and growth rate, and the relationship between enzymatic activity and intracellular THF. First,
we considered the relationship between intracellular THF abundance and growth rate. THF limitation due to DHFR loss of function restricts the production of several growth-linked factors,
including thymidine, methionine, glycine, and the purine precursors inosine and AICAR. Under the experimental conditions of our growth rate assays—M9 minimal media with 0.4% glucose, 0.2%
amicase, and 50 µg/ml thymidine—thymidine is not growth limiting (TYMS R166Q is rescued to WT-like growth) and amicase provides a source of free amino acids. These conditions—which remove
selection pressure on TYMS due to thymidine production—emphasize coupling between DHFR and TYMS through the shared THF pool which must be used to produce purine nucleotides. We previously
observed a hyperbolic dependence of growth rate on reduced folate abundance for many THF species in these experimental conditions1. We selected 10-formyl THF with three glutamates as a
representative growth-linked reduced folate for parameterizing the model given it’s clear relationship to growth and proximity to purine biosynthesis. Following a similar approach as
Rodrigues et al, we used a single four-parameter sigmoidal function to relate growth rate to the experimental measurements of intracellular THF concentration24. $$g=\frac{{g}_{\max
}-{g}_{\min }}{1+{({K}/[{{{{\rm{THF}}}}}])}^{n}}+{g}_{\min }\,$$ (1) Here, \({g}_{\max }\) represents the maximal growth rate, _g_min is the minimal growth rate, \(K\) is a constant that
captures the concentration of THF that yields 50% growth, and \(n\) is a Hill coefficient (Supplementary Table 1). The second piece of the model connects variation in DHFR and TYMS enzyme
velocity to intracellular THF concentrations. To simplify our model, we reduced the pathway to a cycle in which DHFR and TYMS catalyze opposing oxidation and reduction reactions (Fig. 1c).
This abstraction assumes that DHFR and TYMS dominate turnover of the DHF and THF pools, and that the reduced folates are considered as a single THF pool. We omitted downstream reactions of
folate metabolism that use one-carbon derivatives of THF in the production of purine precursors, glycine, and methionine (dashed lines in Fig. 1b), as none of these other reactions oxidize
THF back to DHF—they instead add or subtract one-carbon units from the reduced THF. We thus treated the intracellular concentration of total folate ([DHF] + [THF]) as a constant, with DHFR
and TYMS activity setting the balance between the reduced and oxidized forms. This simplification formalizes the notion that DHFR and TYMS are a two-enzyme module tightly coupled to each
other but less so to the remainder of the pathway as indicated by our prior comparative genomics and laboratory evolution experiments1. Given this abstraction, we write a rate equation that
isolates the recycling of THF in terms of a small number of measurable biochemical parameters:
$$\frac{d[{{{{\rm{THF}}}}}]}{{dt}}=\frac{\left[{{{{\rm{DHFR}}}}}\right]*{k}_{{{{{{\rm{cat}}}}}}}^{{{{{\rm{DH}}}}}}}{1+{{{{{\rm{K}}}}}}_{{{{{{\rm{m}}}}}}}^{{{{{\rm{DH}}}}}}/([{{{{{\rm{fol}}}}}}_{{{{{\rm{tot}}}}}}]-\left[{{{{\rm{THF}}}}}\right])}-\frac{\left[{{{{\rm{TYMS}}}}}\right]*{k}_{{{{{{\rm{cat}}}}}}}^{{{{{\rm{TS}}}}}}}{1+{{{{{\rm{K}}}}}}_{{{{{{\rm{m}}}}}}}^{{{{{\rm{TS}}}}}}/(\left[{{{{\rm{THF}}}}}\right])}\,$$
(2) In this equation, DHFR and TYMS are treated as catalyzing opposing reactions with Michaelis Menten kinetics, providing a relationship between steady state kinetics parameters
(\({k}_{{{{{{\rm{cat}}}}}}}^{{{{{{\rm{DH}}}}}}}\), \({{{{{{\rm{K}}}}}}}_{{{{{{\rm{m}}}}}}}^{{{{{{\rm{DH}}}}}}}\), \({k}_{{{{{{\rm{cat}}}}}}}^{{{{{{\rm{TS}}}}}}}\),
\({{{{{{\rm{K}}}}}}}_{{{{{{\rm{m}}}}}}}^{{{{{{\rm{TS}}}}}}}\)) and intracellular THF abundance. From this equation one can find an analytical solution for the steady state concentration of
THF in the form of the Goldbeter-Koshland equation25,26. $$ \frac{[{{{{{\rm{THF}}}}}}_{{ss}}]}{[{{{{{\rm{fol}}}}}}_{{{{{\rm{tot}}}}}}]} \\
=\frac{\frac{{V}_{1}}{{V}_{2}}*\left(1-{\hat{K}}_{{{{{{\rm{m}}}}}}2}\right)
-{\hat{K}}_{{{{{{\rm{m}}}}}}1}-1+\sqrt{4{\hat{K}}_{{{{{{\rm{m}}}}}}2}\frac{{V}_{1}}{{V}_{2}}\left(\frac{{V}_{1}}{{V}_{2}}-1\right)+{\left(\frac{{V}_{1}}{{V}_{2}}\left({\hat{K}}_{{{{{{\rm{m}}}}}}2}-1\right)+{\hat{K}}_{{{{{{\rm{m}}}}}}1}+1\right)}^{2}}}{2*\left(\frac{{{{{{\rm{V}}}}}}_{1}}{{{{{{\rm{V}}}}}}_{2}}-1\right)}\,$$
(3) Where:
$${V}_{1}=\left[{{{{\rm{DHFR}}}}}\right]{k}_{{{{{{\rm{cat}}}}}}}^{{{{{\rm{DH}}}}}}\;\;\;{\hat{K}}_{{{{{{\rm{m}}}}}}1}={{{{{\rm{K}}}}}}_{{{{{{\rm{m}}}}}}}^{{{{{\rm{DH}}}}}}/{{{{{\rm{fol}}}}}}_{{{{{\rm{tot}}}}}}\;\;\;{V}_{2}=\left[{{{{\rm{TYMS}}}}}\right]{k}_{{{{{{\rm{cat}}}}}}}^{{{{{\rm{TS}}}}}}\;\;\;{\hat{K}}_{{{{{{\rm{m}}}}}}2}={{{{{\rm{K}}}}}}_{{{{{{\rm{m}}}}}}}^{{{{{\rm{TS}}}}}}/{{{{{\rm{fol}}}}}}_{{{{{\rm{tot}}}}}}\,$$
(4) To parameterize the complete model, we used a previously collected set of metabolomics and growth rate data for five DHFR point mutants in the background of both WT TYMS and TYMS R166Q
(Supplementary Table 2, Supplementary Table 3)1. The five DHFR point mutations were selected to span a wide range of catalytic activities. TYMS R166Q is an active site mutation with near
complete loss of catalytic activity27. The steady state catalytic parameters (\({k}_{{{{{{\rm{cat}}}}}}}^{{{{{{\rm{DH}}}}}}}\), \({{{{{{\rm{K}}}}}}}_{{{{{{\rm{m}}}}}}}^{{{{{{\rm{DH}}}}}}}\),
\({k}_{{{{{{\rm{cat}}}}}}}^{{{{{{\rm{TS}}}}}}}\), \({{{{{{\rm{K}}}}}}}_{{{{{{\rm{m}}}}}}}^{{{{{{\rm{TS}}}}}}}\)) were experimentally measured in vitro using purified samples of all DHFR and
TYMS variants, with the exception of TYMS R166Q which is near-inactive and assigned an arbitrarily low _k_cat and high Km (Supplementary Table 2, Supplementary Table 3). Additionally, the
relative abundance of 10-formyl-THF was previously measured by liquid chromatography mass spectrometry for all ten DHFR/TYMS combinations1. Given these data, we first fit the relationship
between relative 10-formyl-THF abundance and growth rate for these 10 genotypes (Eq. 1), yielding four parameter values for \({g}_{\max }\), \({{g}}_{\min }\), \(K\), and \({n}\) (Fig. 1d,
Supplementary Table 1). Four more fit parameters then remain in Eq. 3: (1) the concentration of the total folate pool (\([{{{{{{\rm{fol}}}}}}}_{{{{{{\rm{tot}}}}}}}]\)) (2) the intracellular
concentration of DHFR (\(\left[{{{{{\rm{DHFR}}}}}}\right]\)), (3) the intracellular concentration of WT TYMS (\([{{{{{{\rm{TYMS}}}}}}}_{{{{{{\rm{WT}}}}}}}]\)), and (4) the intracellular
concentration of TYMS R166Q (\([{{{{{{\rm{TYMS}}}}}}}_{{{{{{\rm{R}}}}}}166{{{{{\rm{Q}}}}}}}]\)). We chose to fit the concentration of each TYMS variant separately, while defining a single
global parameter for the concentration of all DHFR variants (which was held constant regardless of variant identity). This means that our model does not capture the differential impact of
mutations on DHFR intracellular abundance. Our logic was that TYMS variant concentration should be better constrained by our data, because our experiments quantify the effect of relatively
few TYMS variants in many DHFR genetic backgrounds. In contrast, DHFR variant concentration should be less well constrained as our dataset includes the effect of each DHFR variant in only a
few TYMS genetic backgrounds. This relatively simplified model showed good correspondence to the data when fit (R2 = 0.96, Fig. 1e, Supplementary Table 1). Equations 1 and 3 were then
combined to estimate growth rate as a function of both DHFR and TYMS activity, by linking catalytic activity to THF abundance, and then THF abundance to growth rate. The complete model
worked well to predict growth rate on our initial training set (Fig. 1f). TYMS CONTEXT ALTERS THE SIGN AND MAGNITUDE OF DHFR MUTATIONAL EFFECTS To more rigorously test our model and
understand its predictions, we expanded our dataset to include more DHFR and TYMS variants with experimentally characterized activities. As our initial model was developed using only two
extreme TYMS variants (wild-type and a near complete loss of function variant, R166Q), we were particularly curious to evaluate model performance for TYMS mutations with intermediate effects
on catalysis and _E. coli_ growth. We identified candidate TYMS mutations by examining an earlier growth complementation study28. A handful of these mutants were then cloned, screened for
expression, and when possible, purified and characterized. Through this mini-screen we selected two mutations that stably expressed, purified robustly, and yielded intermediate activities:
TYMS R127A and Q33S (Fig. 2a). The R127A mutation is located in the TYMS active site and is one of four arginines that coordinate the substrate (dUMP) phosphate group. The Q33S mutation is
located at the TYMS dimer interface, distal to the active site. We observed that R127A was more deleterious to catalytic function than Q33S, but that both mutations were more active than
R166Q (which shows almost no measurable activity in vitro, Fig. 2b, Supplementary Table 3). We measured growth rates for seven catalytically characterized DHFR variants (a set of single and
double mutants selected to span a range of catalytic activities) in the background of these four TYMS mutants (WT, R127A, Q33S and R166Q). The point mutants were created in a plasmid
encoding both DHFR and TYMS (see Methods for details), and the plasmids were transformed into an E. coli selection strain lacking the endogenous DHFR and TYMS genes (ER2566 _ΔfolA ΔthyA_).
Growth rates were measured in triplicate using a plate-reader-based assay (28 measurements total; Fig. 2c, Supplementary Fig. 1a,c). We used this focused dataset to re-parameterize the model
equations, this time fitting five total parameters
(\([{{{{{{\rm{fol}}}}}}}_{{{{{{\rm{tot}}}}}}}],[{{{{{\rm{DHFR}}}}}}],[{{{{{{\rm{TYMS}}}}}}}_{{{{{{\rm{WT}}}}}}}],[{{{{{{\rm{TYMS}}}}}}}_{{{{{{\rm{Q}}}}}}33{{{{{\rm{S}}}}}}}],[{{{{{{\rm{TYMS}}}}}}}_{{{{{{\rm{R}}}}}}127{{{{{\rm{A}}}}}}}],[{{{{{{\rm{TYMS}}}}}}}_{{{{{{\rm{R}}}}}}166{{{{{\rm{Q}}}}}}}]\),
Supplementary Table 1). This second round of fitting tested the ability of growth rate data alone to constrain the model—an important step because metabolomics data are available for only a
limited number of DHFR and TYMS mutants and are inherently far lower throughput to collect than growth rates. This iteration of parameterization also tested the capacity of the model to
capture TYMS mutations with intermediate effects on activity. The data were again well described by the model (Fig. 2C, Supplementary Fig. 1b–d). The best fit parameters result in a growth
rate plateau near 0.9 for the most fit mutant combinations (experimental growth rates between 0.7 and 1.0), but allow resolution of growth rates amongst the least fit mutant combinations
(growth rates less that 0.7). We observed some variation in the fit parameters (relative to the older data in Fig. 1); this difference might be attributed to the fact that our newer
experiments used a revised selection vector backbone. As a control for overfitting, we tested the ability of the model to predict growth rates for arbitrary catalytic data. We randomly
shuffled the catalytic parameters (_k__cat_ and Km) among mutations for both DHFR and TYMS, refit all free model parameters, and calculated the RMSD and R2 values between the best fit model
and the shuffled data. Importantly, the model was generally unable to describe the experimental growth rate data when catalytic parameters were shuffled across both DHFR and TYMS
(Supplementary Fig. 1e, f). This indicated that the model provided a specific description of our experiment and was not trivially overfit. The model was less sensitive to shuffling TYMS
catalytic parameters (presumably because we included fit parameters describing the abundance of each TYMS mutation that can compensate for this shuffling, Supplementary Fig. 1h). However, it
was strongly sensitive to shuffling DHFR parameters (Supplementary Fig. 1g). Taken together, this analysis indicated that the model provides a good description of the
enzyme-velocity-to-growth-rate relationship and can be used to predict and interpret how molecular changes in DHFR and TYMS activity modulate growth rate phenotype. To examine the model more
closely, we considered the relationship between TYMS catalytic activity and DHFR mutational sensitivity. As in previous work, we observed that loss-of-function mutations in DHFR can be
partly or even entirely rescued by the loss-of-function mutation TYMS R166Q in the presence of thymidine (Supplementary Fig. 1a–c). TYMS R127A, a less severe loss of function mutation,
showed a similar albeit more modest trend—this mutation was able to partly rescue growth for some (though not all) DHFR mutations. A central factor behind DHFR and TYMS biochemical coupling
is that loss-of-function mutations in TYMS help to preserve reduced folate pools, allowing THF to shuttle one-carbon units in downstream biochemical processes like purine biosynthesis even
when DHFR activity is low. Moreover, loss of TYMS activity reduces accumulation of DHF, which can inhibit upstream reactions1,23. Thus, the TYMS R166Q and R127A variants show positive
(buffering) epistasis to low-activity DHFR mutations. In contrast to our expectation that a more intermediate mutation would also demonstrate intermediate levels of positive (buffering)
epistasis, TYMS Q33S showed negative epistasis to some DHFR mutations (Supplementary Fig. 1c). This means that these DHFR mutations are more deleterious in the background of TYMS Q33S than
in the native TYMS context. To account for this observation, the model predicted that the intracellular concentration of TYMS Q33S is increased relative to wildtype by about three fold (a
fit parameter, Supplementary Table 1) such that the Vmax of TYMS Q33S becomes greater than wildtype
(\([{{{{{{\rm{TYMS}}}}}}}_{{{{{{\rm{Q}}}}}}33{{{{{\rm{S}}}}}}}]{k}_{{{{{{\rm{cat}}}}}}}^{{{{{\rm{TYMS}}}}}\_{Q}}33{{{{{\rm{S}}}}}} >
[{{{{{{\rm{TYMS}}}}}}}_{{{{{{\rm{WT}}}}}}}]{k}_{{{{{{\rm{cat}}}}}}}^{{{{{{\rm{TYMS}}}}}\_{WT}}}\)). This in turn increased the intracellular requirement for DHFR activity, resulting in
negative epistasis. To test the prediction that TYMS Q33S is more highly expressed and thus attains a higher intracellular velocity even though less catalytically active, we conducted a
series of enzyme kinetics assays using cell lysates24. In these experiments we monitored the accumulation of DHF spectrophotometrically in cell lysates (extracted from early log phase cells
grown in conditions identical to our selection) following addition of saturating amounts of the TYMS substrates deoxyuridine monophosphate (dUMP) and N5,N10-methylene tetrahydrofolate (MTHF)
(see also Methods). DHF accumulation over time was compared for WT ER2566 lysates, the ER2566 DHFR/TYMS double knockout strain (ER2566 _ΔfolA ΔthyA_), and strains carrying selection
plasmids encoding TYMS Q33S, TYMS R127A, and TYMS R166Q (Fig. 2d). Importantly, as our plasmid system encodes both DHFR and TYMS, we mutated the DHFR of these plasmids to the
near-catalytically-inactive variant D27N. This prevents DHF from being recycled back to THF by DHFR (which would confound our estimations of intracellular TYMS concentration). In these
experiments, we observed that all plasmid-containing strains exhibited TYMS activity on par with or less than the native ER2566 strain, indicating that our TYMS enzyme is not strongly
overexpressed. Moreover, these data showed that the TYMS activity of Q33S-containing lysates is indeed modestly higher than WT TYMS-containing lysates as predicted by our model. By dividing
the measured lysate velocity (in Fig. 2d), by the _k__cat_ (from Fig. 2b), we obtain estimates of intracellular TYMS concentration and find that these are well-correlated to our model
predictions (Fig. 2e). To further explore the pattern of epistasis across TYMS backgrounds, we simulated growth rates over a range of DHFR _k__cat_ and Km values in each TYMS background
(Fig. 2f). This provided a comprehensive prediction of the TYMS-induced constraints on DHFR activity. In particular, we obtained a regime of DHFR _k__cat_ and Km values that is sufficient to
support growth for each TYMS mutation. From these data we computed epistasis as the difference in growth rates between a given TYMS background and the WT (see also methods). We observed
that TYMS Q33S has negative epistasis to DHFR variants spanning a well-defined band of catalytic parameters. This ridge of negative epistasis (green band in Fig. 2f) describes a range of
catalytic parameters that are sufficient to rescue growth in the context of WT TYMS, but which are deleterious in the context of TYMS Q33S. DHFR variants below this band (with increased
_k__cat_) grow well in both contexts, while DHFR variants above this band (with decreased _k__cat_) grow poorly in both contexts. The simulations also indicated that R127A has weak positive
epistasis over a regime of moderately impaired DHFR variants, but R127A is insufficient to rescue growth for the strongest loss of function variants. Finally, TYMS R166Q was observed to be
broadly rescuing; DHFR variants need only a negligible amount of activity to support growth in this context. Together, our simulations showed that the sign and magnitude of DHFR epistasis
are strongly tuned by TYMS background, and provided quantitative predictions of the catalytic regimes where epistasis is most apparent. THE SINGLE-MUTANT LANDSCAPE OF DHFR IS STRONGLY
MODULATED BY TYMS Next we examined the structural pattern of biochemical epistasis at the residue level across DHFR. This also provided an opportunity to see if the model
predictions—negative epistasis for Q33S and broadly positive epistasis for R166Q—held true across a larger dataset. To accomplish this, we created a plasmid-based saturation mutagenesis
library of DHFR containing all possible single mutations at every position (3002 total). This library was subcloned into all three TYMS backgrounds. Sequencing showed that these libraries
are well-distributed and approach full coverage of all single mutations (97.1%—WT TYMS, 94.6%—TYMS Q33S, 99.3%—TYMS R166Q) (Supplementary Fig. 2). We transformed these libraries into the _E.
coli_ selection strain (ER2566 _ΔfolA ΔthyA_). Transformants for each library were then grown as a mixed population in selective media in a turbidostat to ensure maintenance of exponential
growth and constancy of media conditions. By quantifying the change in the relative frequency of individual mutants over time with next generation sequencing, we obtained a growth rate
difference relative to WT DHFR for nearly all mutations in the library (Fig. 3a, Supplementary Data 1, see methods for details). All relative growth rate measurements were made in
triplicate, with good concordance among replicates (Supplementary Fig. 3). The entire dataset showed that the DHFR mutational landscape was strongly dependent on TYMS background (Fig. 3b–e).
In all three TYMS backgrounds, the distribution of growth rate effects was bi-modal and reasonably well-described by a double gaussian containing one peak of near-neutral mutations and
another (far smaller) peak of mutations with highly deleterious growth rate effects. This is the expected result for an enzyme that shows a sigmoidal relationship between activity and
growth. In the native TYMS context, the vast majority of mutations fall into the near-neutral peak. However, there is a substantial fraction (12%, 343 total) that display growth rates at or
below that of inactive, where inactive was defined as the average growth rate across nonsense mutations in the first 120 residues of DHFR. Consistent with expectation, mutations at known
positions of functional importance tended to be deleterious in the WT TYMS context (W22, D27, F31, T35, L54, R57, T113, G121, and D122)29. For example, both W22 and D27 are directly in the
active site and serve to coordinate substrate through a hydrogen bonding network30, G121 and D122 are part of the βf-βg loop and stabilize conformational changes associated to
catalysis31,32, and F31 contacts the substrate and is associated to the network of promoting motions33,34. In the TYMS Q33S context, many of these deleterious mutations had even more severe
effects or were classified as Null. Null mutations disappeared from our sequencing counts within the first three time points (8 h) of the selection experiment, preventing accurate inference
of growth rate. For example, mutations at position 22 are deleterious in the WT TYMS context, and appear as Null or very deleterious in the Q33S context. The same pattern can be readily
observed for positions 7,14,15, 22, 27, 31, 35, and 121. In contrast, multiple observations are consistent with TYMS R166Q broadly buffering DHFR variation. First, in the TYMS R166Q context,
there are very few deleterious mutations. Nearly all mutations are contained in the near-neutral peak, including mutations at highly conserved active site positions like M20, W22, and L28.
Stop codons and mutations at the active site residue D27 continued to be deleterious, indicating that DHFR activity was still under (very weak) selection in the TYMS R166Q background.
Second, we observed an average of 41 null mutations per experimental replicate in the WT TYMS context, 82 null mutations in the TYMS Q33S context, but only 7 null mutations per experimental
replicate in the R166Q TYMS context. Third, for the TYMS Q33S context as with WT TYMS, we saw that 12% of mutations have growth rates at or below that of inactive variants while only 5% of
mutations displayed growth rates at or below those of inactive mutations in the TYMS R166Q context. To quantify the context dependence of mutational effects, we computed epistasis relative
to WT TYMS for all DHFR mutations with measurable relative growth rates in each of the three TYMS backgrounds (2696 in total, see also methods) (Fig. 4, Supplementary Fig. 4, Supplementary
Data 2). We assessed the statistical significance of epistasis by unequal variance t-test under the null hypothesis that the mutations have equal mean growth rates in both TYMS backgrounds.
These _p_ values were compared to a multiple-hypothesis testing adjusted p-value determined by Sequential Goodness of Fit (_P_ = 0.035 for TYMS Q33S and _P_ = 0.029 for TYMS R166Q, Fig. 4a,
b)35. In the TYMS Q33S background, 95 mutations (3%) showed significant negative epistasis and 280 mutations (9%) showed significant positive epistasis. Many of the DHFR mutations with
positive epistasis to Q33S were near-neutral in the WT context, and displayed small improvements in growth rate that were highly significant due to the low experimental error for the
best-growing mutations (Fig. 4c). In contrast, the mutations with negative epistasis exhibited a range of growth rate effects in the WT context. For the TYMS R166Q background the overall
proportion of significant epistatic mutations was larger: while only 41 mutations (1%) showed significant negative epistasis, 851 mutations (28%) showed significant positive epistasis. A
smaller number of deleterious mutations in the WT context were not rescued by R166Q (gray points with relative growth rates near 0.25). One possibility is that these mutations are
deleterious for reasons beyond disrupting metabolic flux; for example they may result in protein aggregation or off-target physical interactions. Regardless, direct comparison of the
relative growth rates of mutations across the WT, Q33S, and R166Q TYMS backgrounds makes it very obvious that TYMS R166Q was broadly rescuing, while TYMS Q33S had a more subtle effect that
sometimes yielded negative epistasis (Fig. 4c, d). THE VELOCITY-TO-GROWTH MODEL CAPTURES OBSERVED FITNESS LANDSCAPES Next we sought to further test our enzyme velocity to growth-rate model
using the deep mutational scanning data. We refit the model a third time, in order to include all available experimental information in parameterizing the model for this larger set of
mutants. This included a larger dataset of 34 DHFR single mutants with previously reported _k__cat_ and Km values. We additionally characterized _k__cat_ and Km for four new DHFR mutations
(I5K, V13H, E17V and M20Q) that exhibited strong sign epistasis to TYMS to more completely test our ability to predict epistasis. Together this yielded a set of 114 growth rate measurements
with matched _k__cat_ and Km values for DHFR and TYMS (38 DHFR mutations in 3 TYMS backgrounds, Supplementary Table 2). We used these data to perform a bootstrap analysis; iteratively
subsampling the data and refitting the model 1000 times to obtain standard deviations in our model fit and the eight associated parameters (Fig. 5a). The inferred parameters for this large
set of sequencing-based growth rate measurements were qualitatively similar to those obtained for the smaller set of 28 plate-reader based growth rate measurements (7 DHFR mutants in 4 TYMS
backgrounds, Fig. 2), but we observed some discrepancy in the estimated total folate pool and intracellular concentrations of TYMS (Supplementary Table 1). In general, the total folate pool
and concentration of TYMS R166Q were more variable across the bootstrap analysis, indicating that these two parameters are less well constrained by our data (as indicated by the estimated
variances in Supplementary Table 1). Nevertheless, the inferred parameters in both the plate reader and sequencing-based experiments suggested similar relative expression levels for DHFR and
the three TYMS point mutants, with Q33S being roughly 2–3 times more abundant than WT TYMS. These model-fit concentrations of TYMS correlated with the estimates of intracellular
concentration from our lysate assay experiments (Fig. 2e), though they are roughly two orders of magnitude higher. This may reflect potential noise in the model or true biological variation,
as the deep mutational scanning experiment was performed in a turbidostat maintained at low optical density and rapid exponential growth rather than in small-volume batch cultures. Again,
we computed epistasis of each DHFR mutation in the Q33S and R166Q TYMS contexts relative to the WT TYMS background. Overall both the predicted growth rates and pattern of epistasis showed
good agreement to our experimental observations (Fig. 5a, b). The strong correlation (R2 = 0.84) between the experimental growth rate data and the model across hundreds of in vitro
characterized DHFR/TYMS genotypes indicated that our relatively simple two-equation model—which fits only a single parameter for DHFR enzyme abundance across all mutant variants—allows for
reasonable growth rate predictions. Having established model performance on this subset of 114 biochemically characterized DHFR and TYMS sequences, we next examined consistency of the model
with all growth rate measurements (the total model fit). However the effect of most mutations on catalysis is unknown. Thus, for each DHFR point mutant we used Monte Carlo sampling to
identify a space of _k_cat and Km values consistent with the three growth rate measurements (in the three TYMS backgrounds). While three growth rate measurements were insufficient to
uniquely constrain both _k_cat and Km (the solution space is degenerate), this process did permit estimation of log10 catalytic power (_k_cat/Km) for all 2696 characterized point mutants.
For the subset of biochemically characterized DHFR mutants we observed reasonable agreement between the in vitro measurements and those inferred from our experimental growth rate data (R2 =
0.74, Fig. 5c). In the current version of our model, mutant-specific changes in DHFR abundance will be collapsed into the _k_cat parameter. This results in an effective measure of _k_cat
that captures both changes in catalytic activity and abundance, introducing one potential source of inaccuracy in the inference of catalytic power. Nonetheless, we find our relatively
few-parameter model allows for reasonable inference of fold changes in catalytic power. Once catalytic parameters were estimated across all point mutants, we put them back into the model to
assess the correspondence between the predicted (modeled) growth rates, predicted epistasis, and our experimental observations, yielding a global picture of model fit quality. Overall, we
observed that the model well-described the data with two exceptions. First, there was a small proportion of DHFR mutations that were predicted to be rescued by TYMS R166Q but in actuality
were not (70 total, 2% of all DHFR mutations, the horizontal stripe of red dots in Fig. 5d). It is possible that these mutations caused a growth rate defect through toxicity linked to DHFR
mis-folding and aggregation, a mechanism not captured by our model. Alternatively, these mutations may have had differential effects on protein abundance. Second, there was a proportion of
DHFR mutations predicted to have negative epistasis to TYMS R166Q but observed to exhibit mild positive epistasis (Fig. 5e). Again, these differences may be related to the fact that DHFR
abundance is modeled with a single parameter across all mutants. These discrepancies between true and effective _k__cat_ could be addressed in future work by including additional
high-throughput assay data on stability or abundance to tease apart mutational effects on catalysis from stability and abundance36. Nevertheless, the data indicated that our model can
globally describe growth rate phenotypes given variation in enzyme velocity. The resulting model and inferred catalytic parameters now permit estimation of DHFR single mutant fitness in any
TYMS background. We computed the fraction of DHFR point mutants that are neutral (growth rate above 0.9) as a function of variation in TYMS _k_cat and Km. These calculations highlighted that
selection on DHFR activity is strongly shaped by TYMS background, with low-activity TYMS variants increasing the mutational tolerance of DHFR (Fig. 5f, g). This suggests that TYMS
inhibition or loss of function could promote the evolvability of DHFR both in the clinic and laboratory settings. DHFR/TYMS EPISTASIS IS ORGANIZED INTO STRUCTURALLY LOCALIZED GROUPS Next, we
examined the structural pattern of DHFR positions with epistasis to TYMS Q33S and TYMS R166Q. Given that mutations tend to have similar epistatic effects at a particular DHFR position in
our data set (Supplementary Fig. 4), we used k-means clustering to sort positions into four categories according to their pattern of epistasis: negative, insignificant, positive, and strong
positive (Fig. 6a, Supplementary Table 4). The strong positive category solely contained DHFR mutations in the TYMS R166Q background, while the negative epistasis category was predominantly
occupied by DHFR mutations in the TYMS Q33S background. Based on our mathematical model, we expect that positions with strong positive epistasis to TYMS R166Q will have large deleterious
effects on both _k_cat and/or Km. Likewise, we expect that mutations with negative epistasis to TYMS Q33S should reduce _k_cat and/or increase Km. Consistent with expectation, mapping the
strongly epistatic positions to the DHFR structure showed that epistasis is organized into spatially distinct regions of the tertiary structure known to play a key role in catalytic function
(Fig. 6b, c). Mutations with negative epistasis to Q33S tended to be proximal to the DHFR active site, particularly the folate binding pocket. The negative epistasis cluster included
several key positions near or in the Met-20 loop, which is known to undergo conformational fluctuations associated with catalysis (residues A9, V13, E17 and M20)29,32. It also encompassed
positions I5, L24, L28, and F31 which surround the folate substrate. Several of these positions have known roles in catalysis; mutations at position 31 promoted product release (while
slowing hydride transfer), and dynamics of the M20 loop (which includes V13,E17) are associated with substrate binding and product release33,37. Additionally, specific mutations at positions
5, 20, and 28 result in trimethoprim resistance by altering trimethoprim affinity37. These structural and biochemical observations are consistent with the finding that mutations with
negative epistasis tended to yield moderate to severe growth rate defects. Given their structural location near the folate binding pocket, it is possible that some of these negatively
epistatic mutations reduce affinity (and increase Km) for dihydrofolate substrate. For example, prior work has shown that mutations at position 28 increase Km38. In contrast, positions with
positive epistasis to Q33S often had very little (or sometimes a beneficial) effect on growth rate, and were distributed around the DHFR surface (Fig. 4c, Fig. 6b). In the context of TYMS
R166Q only one position—C85—was included in the negative epistasis cluster (Fig. 6c). A large fraction of DHFR positions (53%, 84 total) displayed positive epistasis to TYMS R166Q; these
positions were distributed throughout the DHFR structure. The positions in the strong positive epistasis cluster included mutations with some of the most severe effects on growth rate in the
WT TYMS context. A number of these positions were previously established as important to DHFR catalysis, including residues F31, L54, G121, D122, and S14829. Mutations at these sites can be
detrimental to _k__cat_, Km, or both. The finding that highly epistatic mutations are concentrated at positions associated with DHFR catalysis provides additional support for the model that
DHFR/TYMS coupling is mediated by a shared constraint on relative enzymatic activity. EPISTASIS AND THE STRUCTURAL ENCODING OF DHFR CATALYSIS When the epistatic clusters are viewed together
on the structure, one sees that they form approximate distance-dependent shells around the active site (Fig. 7a–d). Considering the pattern of epistasis to TYMS Q33S, positions with
negative epistasis were closest to the active site, surrounded by positions with insignificant epistasis, and finally positions in the positive epistasis cluster form an outer shell (Fig.
7a, b). For TYMS R166Q, positions in the strong positive epistasis cluster were closest to the active site, followed by positive epistasis positions, and finally those with insignificant
epistasis (Fig. 7c, d). For comparison, we also mapped the model-predicted catalytic power averaged across all mutations at a position to the structure (Fig. 7e). Together, these structural
images paint a picture of the molecular encoding of catalysis and epistasis. Mutations with predicted intermediate-to-large effects on catalysis were nestled near the active site and showed
negative epistasis to Q33S and strong positive to positive epistasis to R166Q. Mutations with more mild effects on catalysis showed weaker positive to insignificant epistasis to R166Q and
Q33S. Though catalysis and epistasis showed an approximate distance-dependent relationship to the DHFR active site, there a number of key positions distal to the active site that exhibited
large growth rate effects, strong positive epistasis to TYMS R166Q, and likely act allosterically to tune catalytic activity (e.g. L110, G121, D122, W133, S148, and Y151). The positions with
the largest estimated effects on catalysis were highly evolutionarily conserved (_P_ < 10−10 by Fisher’s exact test, Supplementary Table 5, Fig. 7f), indicating that our model and
experimental data are capturing features relevant to the fitness of DHFR. Together, these data show that TYMS metabolic context strongly modulates the constraints on DHFR activity and
catalysis. DISCUSSION It is well-appreciated that physical protein interactions place constraints on the individual interacting monomers. Protein interfaces are organized to bind with
appropriate affinity and avoid non-specific interactions39,40. The individual components of physical complexes tend to be expressed in similar ratios to avoid dosage related toxicity and
aggregation41,42. However the extent to which interactions mediated by biochemistry (rather than binding) constrain the function and sequence of individual monomers has remained less clear.
We have explicitly revealed these interactions at single-residue resolution for one model system and coupled them with a mathematical model to quantify the intracellular constraints on DHFR
and TYMS relative catalytic activities. While this study focuses on all possible DHFR single mutations in the context of a few TYMS variants, it would be reasonably straightforward to design
analogous experiments that more densely sample TYMS variation, or quantify amino acid resolution epistasis to other folate metabolic enzymes. While practical constraints on sequencing depth
make it difficult to imagine extending these experiments to cover all possible DHFR/TYMS double mutations, one could design targeted libraries that sample variation at evolutionarily
conserved and/or catalytically important positions. Our mutagenesis data and modeling show that TYMS activity strongly modifies the constraints on DHFR catalytic parameters; shaping both the
range and relative importance of _k__cat_ and Km in modulating growth. This biochemical interaction results in an approximately shell-like pattern of mutational sensitivity to TYMS
background (epistasis) in the DHFR tertiary structure. Extreme loss-of-TYMS function rescued strongly deleterious mutations in some of the most conserved DHFR active site positions, while
moderate TYMS loss-of-function rescued moderately deleterious or weakly deleterious mutations at more peripheral solvent exposed sites. Given these data, we anticipate that inhibition or
loss-of-function in TYMS could promote the evolvability of DHFR in nutrient rich environments by reducing the constraints on DHFR sequence and activity, a hypothesis with consequences for
both laboratory and clinical evolution. For example, inhibiting TYMS activity in the clinic may promote the evolution of drug resistance in DHFR, while activating TYMS may restrict
evolutionary accessible paths. We note that TYMS loss-of-function variants seem to be viable in some natural contexts, as prior work has identified trimethoprim-resistant clinical isolates
with thymidine auxotrophy19. In the laboratory, strains with reduced TYMS activity could provide a less stringent context for testing designed sequences or evolving new DHFR function. The
existence of an enzyme velocity to growth-rate mapping—by definition—allows us to relate variation in DHFR and TYMS catalytic parameters to growth rate. It also allows one (in principle) to
do the inverse: infer in vitro catalytic parameters from growth rate measurements. The intuition follows from classic steady-state Michaelis Menten experiments: to quantify steady state
kinetics in vitro one measures enzyme initial velocity as a function of substrate concentration. In our sequencing-based experiments, variation in TYMS background effectively titrates
intracellular concentrations of DHF (substrate) while growth rate provides an estimate of velocity. Though our current dataset of three TYMS backgrounds is insufficient to uniquely constrain
precise fits for _k__cat_ and Km, we anticipate that the addition of a few additional TYMS backgrounds and/or the use of more sophisticated fitting approaches will permit more accurate
biochemical parameter inference. Indeed, recent work on the small peptide binding proteins (the PDZ and SH3 domains) has shown how measuring the growth rate effect of mutations in different
genetic backgrounds and assay conditions can well-constrain biophysical parameters for binding affinity and protein stability36,43. One might follow a conceptually similar strategy to learn
quantitative biochemical parameters from high throughput growth rate data. New microfluidics-based approaches for high-throughput biochemistry could play a key role in refining and testing
such methodology44. Together our findings shape how we think about designing enzymes and metabolic systems. Typical strategies for designing enzymes do not explicitly consider cellular
context45. As a result, a significant fraction of designs could fail simply because they are not properly matched in terms of velocity to the surrounding pathway. The limited ability of
homologs to complement growth in another species has been observed for a number of enzymes46,47,48,49,50, including DHFR51,52. Thus, even a well-designed catalytically active synthetic
enzyme could fail to rescue growth if placed in the wrong cellular context. Just as computational protein design considers entire physical complexes to create binding interactions with
altered affinity and specificity, one might consider the joint design of biochemically-interacting enzymes to alter metabolic efficiency and growth. Further study of enzyme rates and
abundance across species, as well as characterizations of enzyme velocity to growth rate mappings, will help shape our understanding of the system level constraints placed on metabolic
enzymes. METHODS ESCHERICHIA COLI EXPRESSION AND SELECTION STRAINS ER2566 Δ_folA_ Δ_thyA E. coli_ were used for all growth rate measurements; this strain was a kind gift from Dr. Steven
Benkovic and is the same used in Reynolds et al. 2011 and Thompson et al.38,53. XL1-Blue _E. coli_ (genotype: _recA1 endA1 gyrA96 thi-1 hsdR17 supE44 relA1 lac_ [F’ _proAB lacI_q_ZΔM15_
Tn_10_(Tetr)]) from Agilent Technologies were used for cloning, mutagenesis, and plasmid propagation. BL21(DE3) _E. coli_ (genotype_: fhuA2 [lon] ompT gal (λ DE3) [dcm] ∆hsdS_. _λ DE3_ = _λ
sBamHIo ∆EcoRI-B int::(lacI::PlacUV5::T7 gene1) i21 ∆nin5_) from New England Biolabs were used for protein expression. SELECTION VECTOR FOR DHFR CONSTRUCTS DHFR variants were cloned into a
modified version of the pACYC-Duet 1 vector (Novagen), which we refer to as pTet-Duet. pTet-Duet is a low-copy number vector containing two multiple cloning sites; the first is under control
of the T7 promoter and the second was modified to be regulated by the tetracycline repressor (TetR). DHFR (_folA_) is cloned into the first MCS; TYMS (_thyA_) is cloned into the second MCS.
During selections we do not induce expression of either gene but instead rely on leaky expression in ER2566 Δ_folA_ Δ_thyA E. coli_. We have deposited all of these constructs with Addgene
alongside plasmid maps [https://www.addgene.org/browse/article/28229429/]. EXPRESSION VECTOR FOR DHFR CONSTRUCTS _E.coli folA_ (the gene encoding DHFR) was cloned into the pHis8-3 expression
vector using restriction sites NcoI and XhoI. DHFR was tagged in-frame with an N-terminal 8X-Histidine tag separated from the _folA_ reading frame by a thrombin cleavage site. Individual
point mutant clones were constructed using the Quikchange II site-directed mutagenesis kit (Agilent). EXPRESSION VECTOR FOR TYMS CONSTRUCTS The _thyA_ gene (encoding TYMS) was amplified by
PCR from _E. coli_ MG1655 and cloned into the vector pET24A using XbaI/Xho restriction sites. The point mutants of TYMS (Q33S, R127A, and Q33S) were made using the Agilent QuikChange II
site-directed mutagenesis kit. These constructs are available on AddGene [https://www.addgene.org/browse/article/28229429/]. PLATE-READER BASED GROWTH RATE ASSAYS We constructed a
plasmid-based series of DHFR point mutants (spanning a range of catalytic activities) in four TYMS backgrounds by quickchange mutagenesis. The sequence-verified mutants were deposited with
Addgene ([https://www.addgene.org/browse/article/28229429/]). DHFR and TYMS point mutant combinations in the selection vector were transformed into ER2566 Δ_folA_ Δ_thyA_ chemically
competent cells by heat shock. The cells were recovered for 60 min at 37 °C with shaking at 220 rpm, spread on agar plates (Luria Broth (LB) containing 30 µg/ml chloramphenicol and 50 µg/ml
thymidine), and grown at 37 °C overnight. The next day, liquid overnight cultures were inoculated from a streak over multiple colonies and grown overnight at 37 °C in LB supplemented with 30
µg/ml chloramphenicol and 50 µg/ml thymidine. These overnight cultures were pelleted and washed with M9 minimal media, then resuspended in pre-warmed M9 media supplemented with 0.4%
glucose, 0.2% amicase, 2 mM MgSO4, 0.1 µM CaCl2, 30 µg/ml chloramphenicol (henceforth referred to as M9 selection media). Next, OD600 for all resuspended cultures was measured in a Perkin
Elmer Victor X3 plate reader. Cultures were then diluted to OD600 = 0.1 in prewarmed M9 selection media with 50 µg/ml thymidine and incubated for 4 h at 30 °C, shaking at 220 rpm. After this
period of adaptation and regrowth, cultures were back-diluted to OD600 = 0.1 in 1 ml prewarmed M9 selection media with 50 µg/ml thymidine. These cells were inoculated into 96-well culture
plate at OD600 = 0.005 (10 µl cells into 200 µl total well volume) containing prewarmed M9 selection media with 50 µg/ml thymidine; plates were sealed with EasySeal permeable covers (Sigma
Aldrich). All growth rate measurements were made in triplicate. Plates were shaken for 10 seconds before reading, and Readings of OD600 were taken every 6 min over 24 h using a BioTek
Synergy Neo2 plate reader in a 30 °C climate-controlled room. DHFR SATURATION MUTAGENESIS LIBRARY CONSTRUCTION The DHFR saturation mutagenesis library was constructed as four sublibraries in
the pTet-Duet selection vector (see above for selection vector details). Each sublibrary combines mutations within 40 contiguous amino acid positions to ensure that the mutated region can
be completely covered with short read sequencing (a 300 cycle v2 Illumina sequencing kit). The regions spanned by each sublibrary were as follows: amino acid positions 1-40 (sublibrary 1,
SL1), 41-80 (sublibrary 2, SL2), 81-120 (sublibrary 3, SL3), and 121-159 (sublibrary 4, SL4). ‘Round the Horn’ or inverse PCR (iPCR) with mutagenic NNS primers (N = A/T/G/C, S = G/C) was
used to introduce all 20 amino acid substitutions at a single amino acid position as described in Thompson et al.38 (see Appendix 1 of Thompson et al for a complete list of primers). Library
completeness was verified by deep sequencing. In our initial validation sequencing run we found that mutations at positions W22 and L104 were systematically under-represented; iPCR was
repeated for these positions and they were supplemented into their respective assembled sublibraries. After sub-library assembly, restriction digest and ligation were used to subclone each
sublibrary into pTet-Duet plasmids containing the three different TYMS backgrounds (WT, R166Q, or Q33S). The entire DHFR coding region containing restriction sites (NotI and EcoNI) was
amplified by PCR. PCR reaction was size-verified with agarose gel electrophoresis with an expected band size of 627 bp. The library PCR products and target plasmids were double digested with
NotI and EcoNI for 3 h at 37 °C. To prevent re-circularization, the digested plasmid was treated with Antarctic phosphatase for 1 h at 37 °C. The DHFR insert and treated plasmid were
ligated with T4 DNA ligase overnight at 16 °C. The concentrated ligation product was then transformed into _E. coli_ XL1-blue by electroporation, and recovered in SOB for 1 h at 37 °C. 20 µL
of the recovery culture was serially diluted and plated on LB-agar with 50 µg/mL thymidine and 30 µg/mL chloramphenicol, to permit quantification of transformants and estimate library
coverage following ligation into the alternate TYMS backgrounds. The minimum library coverage was 1000 colony forming units (CFU) per mutation. The remaining recovery culture was grown in a
flask containing 12 ml LB with 30 µg/mL chloramphenicol and 50 µg/mL thymidine at 37 °C, with 220 rpm shaking overnight. 10 ml of the overnight culture was miniprepped with the Gene-Jet
Mini-prep kit (Fisher Scientific, K0503) to obtain the plasmid library. The DHFR deep mutational scanning libraries (in all three TYMS backgrounds) have been deposited at Addgene
([https://www.addgene.org/pooled-library/reynolds-dhfr-mutagenesis], under Addgene IDs 1000000194, 1000000195,1000000196). GROWTH RATE MEASUREMENTS IN THE TURBIDOSTAT FOR ALL DHFR MUTANT
LIBRARIES All sublibraries were inoculated, grown, and sampled in triplicate. Each plasmid sublibrary was transformed into the _E. coli_ double knockout strain ER2566 ∆_folA_ ∆_thyA_ by
electroporation and recovered in SOB for one hour at 37 °C. At this transformation step we again estimated library coverage to ensure the complete library was transformed into our selection
strain. To estimate library coverage, 20 µL of the recovery culture was serial diluted with SOB and plated on LB agar plates containing 30 µg/mL chloramphenicol and 50 µg/mL of thymidine.
The remainder of the recovery culture was inoculated into M9 selection media supplemented with 50 µg/mL thymidine and grown overnight at 37 °C. All selection experiments in this work had an
estimated library coverage of 1000 CFU/mutant or greater. The overnight liquid culture was washed and back-diluted to OD600 = 0.1 in M9 selection media supplemented with 50 µg/mL thymidine,
and incubated for four hours at 30 °C to allow adaptation to selection temperature and to return the culture to log-phase growth. Following adaptation, selection was initiated by
back-diluting these cultures to an OD600 of 0.1 into 17 mL of pre-warmed M9 selection media supplemented with 50 µg/mL thymidine in continuous culture vials. These vials were then incubated
in a turbidostat with a target OD600 of 0.15 at a temperature of 30 °C. The turbidostat maintained a set optical density by adding 2.8 mL dilutions of M9 selection media supplemented with 50
µg/mL thymidine in response to OD detection, and was built according to the design of Toprak et al.54 Culture samples (1 mL each) were taken at the beginning of selection (t = 0 hours) and
at 4, 8, 12, 20, and 24 h into selection. Immediately after each time point, these 1 mL samples were pelleted by centrifugation, supernatant removed, and stored at −20 °C. NEXT GENERATION
SEQUENCING AMPLICON SAMPLE PREPARATION Each turbidostat selection sample (representing a particular timepoint for a sublibrary and replicate) was prepared for sequencing as a PCR amplicon
using Illumina TruSeq-HT i5 and i7 indexing barcodes. To generate these amplicons, each cell pellet from the growth rate assay was thawed and lysed by resuspending the cells with 100 µL dH2O
and incubation at 95 °C for 5 min. Lysates were then clarified by centrifugation at maximum speed for 10 min in a room temperature bench top microcentrifuge. Supernatants containing
plasmids were isolated from the pellet. 1 µL of each supernatant was used as the template for an initial round of PCR with Q5-Hot Start Polymerase (NEB) that amplified the DHFR coding region
of the sublibrary (10 PCR cycles total, standard Q5 reaction conditions). From this first PCR reaction, 1 µL was used in a second round of PCR (22 cycles of denaturation/anneal/elongation)
with primers that added Illumina sequencing adaptors. The sequencing primers are specified in Supplementary Table 6. Together, these two rounds of PCR yielded a final product of size: 315 bp
(SL1), 308 bp (SL2), 298 bp (SL3), 304 bp (SL4). The amplicons were size verified using 1% agarose gel electrophoresis. In the case where a sample did not produce an amplicon, the first
round PCR was repeated with 2 µL of the supernatant rather than 1 µL, with the remaining preparation identical. All amplicons were individually quantified using with Quant-iT™ PicoGreen™
dsDNA Assay Kit (ThermoFisher Scientific) and mixed in equimolar ratio, with a final target amount greater than or equal to 2000 ng. Errors in pipetting volume were minimized by ensuring
that more than 2 µL was taken from each amplicon. This mixture was gel-purified and then cleaned and concentrated using the Zymo Research DNA Clean & Concentrator-5 kit. Purity was
assessed by A260/A80 and A260/A230 nm absorbance ratios. The sample library DNA concentration was measured using a Qubit dsDNA HS Assay in a Qubit 3 Fluorometer (Invitrogen by ThermoFisher
Scientific). The sample library was diluted to 30 nM in a volume of 50 µL of TE buffer (1 mM Tris-HCl (pH 8.5), 10 mM EDTA (pH 4)). This mixed and quantified library was sequenced on an
Illumina HiSeq (150 cycle x 2 paired-end) by GeneWiz. The NGS sequencing run resulted in 252 GB of data, consisting of 337,353,664 reads. The raw data have been deposited with the NCBI
sequencing read archive under project identifier PRJNA791680. DHFR EXPRESSION AND PURIFICATION DHFR mutant variants were expressed in BL21(DE3) _E. coli_ grown at 30 °C in 50 ml Terrific
Broth (TB) with 35 µg/ml Kanamycin (Kan) for selection. Expression was induced at an OD600 = 0.6–0.8 with 250 uM IPTG, and cells were grown at 18 °C for 16–18 h. Cultures were pelleted by
centrifugation for 10 min at 5000 × _g_, 4 °C and supernatant removed; cell pellets were stored at −80 °C. Thawed cell pellets were lysed by sonication in 10 ml lysis buffer (50 mM Tris, 500
mM NaCl, 10 mM imidazole, pH 8.0 buffer containing 0.1 mM PMSF, 0.001 mg/ml pepstatin, 0.01 mg/ml leupeptin, 20 µg/ml DNAseI and 5 µg/mL lysozyme). The resulting lysate was clarified by
centrifugation and incubated with 0.1 ml Ni-NTA agarose (Qiagen) slurry (0.05 ml column volume) equilibrated in Nickel Binding Buffer (NiBB, 50 mM Tris pH 8.0, 500 mM NaCl, 10 mM imidazole)
for 15 min on a tube rocker at 4 °C. The slurry was then transferred to a disposable polypropylene column (BioRad). After washing with 10 column volumes (CV) of NiBB, DHFR was eluted with
0.5 mL 50 mM Tris pH 8.0, 500 mM NaCl, 400 mM imidazole. The eluted protein was concentrated and buffer-exchanged to 50 mM Tris, pH 8.0 in a 10 kDa Amicon centrifugal concentrator
(Millipore) and centrifuged 15 min at 21,000 × _g_, 4 °C to pellet any precipitates. Following buffer exchange, the protein was purified by anion exchange chromatography (using a BioRad
HiTrapQ HP column on a BioRad NGC Quest FPLC). A linear gradient was run from 0 to 1 M NaCl in 50 mM Tris pH 8.0 over 30 ml (30 column volumes, CV) while collecting 0.5 ml fractions.
Fractions containing DHFR were combined, concentrated, flash-frozen in liquid nitrogen, and stored at −80 °C. TYMS EXPRESSION AND PURIFICATION Individual TYMS mutants were expressed in
BL21(DE3) _E. coli_ grown at 37 °C in 50 ml Terrific Broth (TB) with 35 µg/ml Kanamycin (Kan) for selection. Expression was induced with 1 mM IPTG when the cells reached an OD600 = 0.6–0.8,
and the cells were then grown at 18 °C for 16–18 h before harvesting pellets for storage at −80 °C. TYMS was purified from the frozen pellets following a protocol adapted from Changchien et
al.55 Cell pellets were thawed and resuspended in TYMS lysis buffer (20 mM Tris, 10 mM MgCl2, 0.1% deoxycholic acid, pH 7.5 with 5 mM DTT, 0.2 mg/ml lysozyme, 5 µg/ml DNAse I) and incubated
at room temperature while rocking for 15 min. The resulting supernatant was clarified by centrifugation. Next, streptomycin sulfate was added to a final concentration of 0.75% to separate
nucleic acids. The cells were incubated rocking at 4 °C for 10 min and the supernatant was retained following centrifugation for 10 min at >10,000 × _g_. Ammonium sulfate was then added
at 50% saturation (0.3 g/ml), mixed for 10 min at 4 °C, then centrifuged as above, retaining supernatant. Additional ammonium sulfate was then added to the supernatant at 80% saturation (an
additional 0.2 g/ml), mixed for 10 min at 4 °C, and centrifuged as above, retaining the pellet. The pellet was dissolved in 25 mM potassium phosphate pH 6.5 and dialyzed overnight at 4 °C
against 1 L 25 mM potassium phosphate pH 6.5. Following dialysis the protein was purified by anion exchange (HiTrap Q HP column, Cytiva) with a 25 CV linear gradient from 0 M NaCl to 1 M
NaCl in 25 mM potassium phosphate pH 6.5. FPLC fractions containing TYMS were combined and concentrated using a 10 kDa Amicon concentrator (Millipore) and stored at 4 °C for up to a week.
DHFR STEADY-STATE MICHAELIS MENTEN KINETICS DHFR _k__cat_ and Km were determined under Michaelis-Menten conditions with saturating concentrations of NADPH as in prior work53,56. Briefly,
DHFR protein concentration was determined by measuring A280 (extinction coefficient = 33500 M−1cm−1). DHF (Sigma Aldrich) was prepared in MTEN buffer (50 mM MES, 25 mM Tris base, 25 mM
Ethanolamine, 100 mM NaCl, pH 7.0) containing 5 mM DTT (Sigma Aldrich). 100 nM DHFR protein and 100 µM NADPH (Sigma Aldrich) were combined in MTEN buffer with 5 mM DTT and pre-incubated for
1 h at 25 °C prior to measurement. To initiate the reaction, the protein-NADPH solution was mixed with DHF in a quartz cuvette (sampling DHF over a range of concentrations, tuned to the Km
of the mutant). The initial velocity of DHFR was measured spectrophotometrically by monitoring the consumption of NADPH and DHF (decrease in absorbance at 340 nm, Δε340 = 13.2 mM−1 cm−1).
All measurements were made in triplicate; analysis was performed using the Michaelis-Menten nonlinear regression function of Graph Pad Prism. PREPARATION OF TYMS SUBSTRATE FOR ASSAYING
ENZYME ACTIVITY AND STEADY STATE KINETICS (6 R)-methylenetetrahydrofolic acid (MTHF, CH2H4fol) was purchased from Merck & Cie (Switzerland) and dissolved to 100 mM in nitrogen-sparged
citrate-ascorbate buffer (10 mM ascorbic acid, 8.5 mM sodium citrate, pH 8.0). 30 µL aliquots were made in light-safe microcentrifuge tubes, flash-frozen in liquid nitrogen, and stored at
−80 C. Before use, the stock was thawed and diluted to 10 mM in TYMS kinetic reaction buffer (100 mM Tris base, 5 mM Formaldehyde, 1 mM EDTA, pH 7.5) and quantified in an enzymatic assay: 50
µM MTHF, 200 µM dUMP and 1 µM TYMS protein were combined and A340 measured until steady-state reached. Actual concentration was then calculated from the difference in A340 before and after
the reaction using Beer’s Law (MTHF extinction coefficient: 6.4 mM−1cm−1). TYMS LYSATE ASSAYS FOR ESTIMATION OF INTRACELLULAR PROTEIN ABUNDANCE We measured the TYMS activity in crude lysates
of six _E. coli_ strains: (1) WT ER2566, (2) ER2566 ∆_folA_ ∆_thyA_, and the ER2566 ∆_folA_ ∆_thyA_ strain transformed with a pTet-Duet selection vector harboring DHFR D27N matched with
either (3) TYMS R166Q, (4) TYMS R127A, (5) TYMS Q33S, or (6) TYMS WT. Cultures of each were grown to mid-log phase (OD600 = 0.4–0.6) under media conditions identical to our selection (M9
selection media supplemented with 50 µg/mL thymidine). The resulting culture was pelleted for 10 min at 5000 x g, liquid media was decanted, and pellets were weighed and frozen at −80 °C. We
found that pellets from 250 ml of culture were sufficient for triplicate assays. Pellets were thawed and lysed with B-PERII reagent (Thermo Fisher) supplemented with 50 mM MgCl2, 5 mM DTT,
0.2 mg/mL lysozyme, and 0.005 mg/mL DNAseI. The lysis volume was normalized to 1 ml per 0.1 mg of pellet weight. Once resuspended, all pellets were incubated in lysis reagent for 15 min at
room temperature on a tube rotator, then pelleted at 21,000 × _g_ at 4 °C. The soluble fraction resulting from lysis was used for lysate based enzyme activity measurements. To measure enzyme
velocity, 720 µL of soluble cell lysate was combined with 80 µl of the TYMS substrates at saturating concentration (150 µM CH2H4fol and 100 µM dUMP in TYMS assay buffer with 5 mM DTT) in a
cuvette. Immediately following substrate addition, DHF accumulation was monitored spectrophotometrically by absorbance at 340 nM for 900 seconds. All measurements were referenced to a blank
cuvette containing lysate with no addition of substrates. DHF concentration was calculated from A340 using the DHF extinction coefficient (Δε340 = 6.4 mM−1 cm−1) and Beer-Lambert equation.
The resulting DHF concentration measurements over time were entered into GraphPad Prism and background-subtracted to zero for the initial time point across all samples. Enzyme velocity was
estimated by linear fit of the first 200 seconds of data. Dividing this velocity by the kcat values for TYMS (as previously measured in our lab) yielded an estimate of intracellular TYMS
concentration for each variant. TYMS STEADY-STATE MICHAELIS MENTEN KINETICS TYMS _k__cat_ and Km were determined for both dUMP and MTHF under Michaelis-Menten conditions by varying one
substrate and holding the other saturating as in prior work57,58. Briefly, TYMS protein concentration was determined by measuring A280 (extinction coefficient = 53400 M−1cm−1). TYMS protein
was prepared in TYMS assay buffer (100 mM Tris base, 5 mM Formaldehyde, 1 mM EDTA, pH 7.5) containing 50 mM DTT (Sigma Aldrich). 50 nM TYMS protein and either 100 µM dUMP (Sigma Aldrich) or
150 µM MTHF (Merck & Cie) were combined with varying concentrations of the other substrate to initiate the reaction. The production of DHF was monitored spectrophotometrically (increase
in absorbance at 340 nm, Δε340 = 6.4 mM−1cm−1) for 2 min per reaction. All measurements were made in triplicate; analysis was performed using the Michaelis-Menten nonlinear regression
function of Graph Pad Prism. ENZYME VELOCITY TO GROWTH RATE MODEL CONSTRUCTION AND PARAMETERIZATION For the purposes of modeling, we approximated DHFR and TYMS as a two-enzyme cycle in which
DHFR produces THF and consumes DHF, and TYMS produces DHF and consumes THF. This abstraction ignores the different carbon-carrying THF species, instead collapsing them into a single reduced
folate pool. This simplification allows us to construct an analytically solvable model for steady state THF concentration that we can then relate to growth (Eq. 3). First, we fit the free
parameters in the Goldbeter-Koshland equation ([DHFR], [TYMSWT], [TYMSR166Q], foltot) using a set of ten metabolomics measurements for the relative abundance of the 3-glutamate form of
formyl THF as obtained in prior work1. These measurements were made for DHFR mutations G121V, F31Y/L54I, M42F/G121V, F31Y/G121V and the WT in the background of WT TYMS and TYMS R166Q. So why
this particular folate species? We noticed that the relative abundance of many of the reduced THF species in our data set was correlated, and chose formyl THF to model because the
experimental data were less variable and showed a strong, monotonic relationship with cell growth. Then, we fit the free parameters in equation two (\({g}_{\max }\), \({g}_{\min }\), \(K\),
\(n\)) using a set of ten growth rate measurements for the same DHFR/TYMS mutation pairs. This process gave rise to the fits shown in Fig. 1. When assessing model performance against the
larger set of TYMS variants (as in Fig. 2) we refit all parameters (\({g}_{\max }\), \({g}_{\min }\), \(K\), \(n\), [DHFR], [TYMSWT], [TYMSR166Q], [TYMSQ33S], [TYMSR127A], foltot) to the
growth rate data only since we did not have metabolomics data for this larger set. All parameter fits were made in python using the least_squares fitting function of the scipy.optimize
module59; the complete fitting process is documented in Jupyter notebook 1_KGmodel.ipynb in the associated github repository. We assessed the model sensitivity to shuffling the data
(Supplementary Fig. 1e-h) by randomly shuffling all catalytic parameters (_k__cat_, Km) 50 times across DHFR and TYMS and computing an R2 value and RMSD. We also assessed model sensitivity
to subsampling the data; error bars in Figs. 1, 2, and Supplementary Fig. 1g,h correspond to SEM across jackknife re-samplings of the data wherein one DHFR/TYMS combination was left out for
each re-sampling. Finally, to assess the global model fit to the data (as in Fig. 5) we first fit the 9 model parameters (\({g}_{\max }\), \({g}_{\min }\), \(K\), \(n\), [DHFR], [TYMSWT],
[TYMSR166Q], [TYMSQ33S], foltot) using the growth rate measurements of 16 DHFR mutations for which experimental _k__cat_ and Km were known (48 total observations given the three TYMS
backgrounds). Then, fixing these parameters, we fit _k__cat_ and Km values for all 2696 mutations with growth rate measurements in all three TYMS backgrounds to the complete data set of 8088
sequencing-based growth rate observations. This process is documented in Jupyter notebook 4_ModelAndDMSData.ipynb in the associated github repository. NEXT GENERATION SEQUENCING DATA
PROCESSING AND READ COUNTING All PCR amplicons (corresponding to individual replicates, timepoints and sublibraries) were sequenced on an Illumina HiSeq using 2 × 150 paired end reads. The
resulting fastq files were processed and filtered prior to read counting. Briefly, the forward and reverse reads were merged using USEARCH. Each read was quality score filtered (Q-Score ≥20)
and identified as a WT or mutant of DHFR using a custom python script. This python script filtered for full length reads and base call quality scores greater than 20 (error rate ≤1:100).
The reads passing these quality control criteria were compared against the wild-type reference sequence to determine mutation identify. Reads that contained multiple point mutations or
mutations outside the sublibrary of interest were removed from analysis. This process resulted in counts for the WT and each mutant at each time point and replicate. These counts were
further corrected given the expected error in the data (q-score) and Hamming distance from the WT codon to account for potential hopping of WT reads to mutations; a process that was detailed
in McCormick et al.56. RELATIVE GROWTH RATE CALCULATIONS We calculated relative growth rates for individual mutations and the WT over time from the sequencing-based counts
(\({N}_{t}^{{mut}},{N}_{t}^{{WT}}\)). Mutants with fewer than 10 counts were considered absent from the data set and were set to zero to reduce noise. From these thresholded counts, we
calculated a log normalized relative frequency of each mutation over time: $${\log }_{2}\left(f\left(t\right)\right)={\log
}_{2}\left({N}_{{{{{{\rm{t}}}}}}}^{{{{{{\rm{mut}}}}}}}/{N}_{{{{{{\rm{t}}}}}}}^{{{{{{\rm{WT}}}}}}}\right)-{\log
}_{2}\left({N}_{{{{{{\rm{t}}}}}}=0}^{{{{{{\rm{mut}}}}}}}/{N}_{{{{{{\rm{t}}}}}}=0}^{{{{{{\rm{WT}}}}}}}\right)\,$$ (5) We then calculated relative growth rate
(\({m}_{{{{{{\rm{DHmut}}}}}}}^{{{{{{\rm{TS}}}}}}}\)) as the slope of the log relative frequency over time by linear regression. Linear regression was performed using scikit Learn, and
individual points were weighted by the number of counts (in order to down weight less-sampled mutants at later time points). Relative growth rate of a mutant was only calculated if the
mutant was present over at least the first three time points, otherwise it was classified as a Null mutant. Finally, all relative growth rates were normalized such that WT has a relative
growth rate of 1. Growth rates were additionally normalized by the bulk culture growth rate (estimated from the turbidostat, in units of generations per hour) to account for small
vial-to-vial variations culture doublings across the experiment. The standard error in growth rate was computed across triplicate measurements. All calculations are shown in Jupyter notebook
2_DMSGrowthRates.ipynb in the associated github repository. EPISTASIS ANALYSIS Epistasis was calculated according to an additive model: $${\varepsilon
}_{{{{{{\rm{DHmut}}}}}},{{{{{\rm{TSmut}}}}}}}={m}_{{{{{{\rm{DHmut}}}}}}}^{{{{{{\rm{TS}}}}}\_{mut}}}-{m}_{{{{{{\rm{DHmut}}}}}}}^{{{{{{\rm{T}}}}}}{{{{{{\rm{S}}}}}}}_{{{{{{\rm{WT}}}}}}}}\,$$
(6) In our experiments TYMS R127A, Q33S and R166Q have no growth rate effect in the WT DHFR context due to thymidine supplementation. Under this formalism, mutations that show improved
growth in the mutated TYMS background have positive epistasis, while mutations with reduced growth in the mutated TYMS background have negative epistasis. We assessed the statistical
significance of epistasis by unequal variance _t_ test under the null hypothesis that the mutations have equal mean growth rates in both TYMS backgrounds (across three replicate
measurements). These _p_ values were compared to a multiple-hypothesis testing adjusted _p_ value determined by Sequential Goodness of Fit (_P_ = 0.035 for TYMS Q33S and _P_ = 0.029 for TYMS
R166Q)35. K-means clustering of epistatic positions was performed using a custom script based on that described in Thompson et al. 38 All epistasis calculations are shown in Jupyter
notebook 3_Epistasis.ipynb in the associated github repository. REPORTING SUMMARY Further information on research design is available in the Nature Portfolio Reporting Summary linked to this
article. DATA AVAILABILITY The raw sequencing data generated in this study were deposited in FASTQ format in the NCBI sequencing read archive, under BioProject ID PRJNA791680
[https://www.ncbi.nlm.nih.gov/bioproject/791680]. The processed growth rates and epistasis measurements (as inferred from the sequencing data) are available as Supplementary Data 1 and 2.
All code used to process these data are available in github [release v1.0.0: https://doi.org/10.5281/zenodo.10845716]. The processed growth rates and epistasis measurements are also
available in github (see Output directory) as tab-delimited text and python-importable pickle files. Metabolomics data for formyl THF and DHF used in model fitting were previously described1
and are specified in the github python notebook 1_KGModel. Biochemical rate constants for DHFR and TYMS (compiled from both this study and other published works) used in model fitting can
be found in Supplementary Tables 2 and 3. Parameter fits from all described iterations of model fitting are in Supplementary Table 1. The structural data for TYMS (1BID
[https://doi.org/10.2210/pdb1BID/pdb]) and DHFR (1RX2 [https://doi.org/10.2210/pdb1RX2/pdb]) used in this study are available from the PDB. CODE AVAILABILITY Code for the enzyme velocity to
growth rate model, and analysis of all deep mutational scanning data is available on github: https://github.com/reynoldsk/dhfr-tyms-epistasis. The stable DOI for the code release associated
with this publication is: https://doi.org/10.5281/zenodo.10845716. REFERENCES * Schober, A. F. et al. A two-enzyme adaptive unit within bacterial folate metabolism. _Cell Rep._ 27,
3359–3370.e7 (2019). Article CAS PubMed PubMed Central Google Scholar * Kemble, H. E. et al. Flux, toxicity and protein expression costs shape genetic interaction in a metabolic
pathway. http://biorxiv.org/lookup/doi/10.1101/362327 (2018) https://doi.org/10.1101/362327. * Lilja, E. E. & Johnson, D. R. Metabolite toxicity determines the pace of molecular
evolution within microbial populations. _BMC Evol. Biol._ 17, 52 (2017). Article PubMed PubMed Central Google Scholar * Noor, E. et al. The protein cost of metabolic fluxes: Prediction
from enzymatic rate laws and cost minimization. _PLOS Comput. Biol._ 12, e1005167 (2016). Article PubMed PubMed Central Google Scholar * Dekel, E. & Alon, U. Optimality and
evolutionary tuning of the expression level of a protein. _Nature_ 436, 588–592 (2005). Article ADS CAS PubMed Google Scholar * Bar-Even, A. et al. The moderately efficient enzyme:
Evolutionary and physicochemical trends shaping enzyme parameters. _Biochemistry_ 50, 4402–4410 (2011). Article CAS PubMed Google Scholar * Lalanne, J.-B. et al. Evolutionary convergence
of pathway-specific enzyme expression stoichiometry. _Cell_ 173, 749–761.e38 (2018). Article CAS PubMed PubMed Central Google Scholar * Kacser, H. & Burns, J. A. The control of
flux. _Biochem Soc. Trans._ 23, 341–366 (1995). Article CAS PubMed Google Scholar * Dykhuizen, D. E., Dean, A. M. & Hartl, D. L. Metabolic flux and fitness. _Genetics_ 115, 25–31
(1987). Article CAS PubMed PubMed Central Google Scholar * Kacser, H. & Burns, J. A. The molecular basis of dominance. _Genetics_ 97, 639–666 (1981). Article CAS PubMed PubMed
Central Google Scholar * Domingo, J., Baeza-Centurion, P. & Lehner, B. The causes and consequences of genetic interactions (epistasis). _Annu. Rev. Genom. Hum. Genet._ 20, 433–460
(2019). Article CAS Google Scholar * Pokusaeva, V. O. et al. An experimental assay of the interactions of amino acids from orthologous sequences shaping a complex fitness landscape. _PLoS
Genet_ 15, e1008079 (2019). Article CAS PubMed PubMed Central Google Scholar * Kondrashov, A. S., Sunyaev, S. & Kondrashov, F. A. Dobzhansky-Muller incompatibilities in protein
evolution. _Proc. Natl. Acad. Sci. USA_ 99, 14878–14883 (2002). Article ADS CAS PubMed PubMed Central Google Scholar * Szathmáry, E. Do deleterious mutations act synergistically?
Metabolic control theory provides a partial answer. _Genetics_ 133, 127–132 (1993). Article PubMed PubMed Central Google Scholar * Noda-Garcia, L., Liebermeister, W. & Tawfik, D. S.
Metabolite-enzyme coevolution: From single enzymes to metabolic pathways and networks. _Annu. Rev. Biochem_ 87, 187–216 (2018). Article CAS PubMed Google Scholar * Ducker, G. S. &
Rabinowitz, J. D. One-carbon metabolism in health and disease. _Cell Metab._ 25, 27–42 (2017). Article CAS PubMed Google Scholar * Nijhout, H. F., Reed, M. C., Budu, P. & Ulrich, C.
M. A mathematical model of the folate cycle: New insights into folate homeostasis. _J. Biol. Chem._ 279, 55008–55016 (2004). Article CAS PubMed Google Scholar * Leduc, D. et al.
Flavin-dependent thymidylate synthase ThyX activity: Implications for the folate cycle in bacteria. _J. Bacteriol._ 189, 8537–8545 (2007). Article CAS PubMed PubMed Central Google
Scholar * King, C. H., Shlaes, D. M. & Dul, M. J. Infection caused by thymidine-requiring, trimethoprim-resistant bacteria. _J. Clin. Microbiol._ 18, 79–83 (1983). Article CAS PubMed
PubMed Central Google Scholar * Howell, E. E., Foster, P. G. & Foster, L. M. Construction of a dihydrofolate reductase-deficient mutant of Escherichia coli by gene replacement. _J.
Bacteriol._ 170, 3040–3045 (1988). Article CAS PubMed PubMed Central Google Scholar * Rodrigues, J. V. & Shakhnovich, E. I. Adaptation to mutational inactivation of an essential
gene converges to an accessible suboptimal fitness peak. _Elife_ 8, e50509 (2019). Article CAS PubMed PubMed Central Google Scholar * Kwon, Y. K., Higgins, M. B. & Rabinowitz, J. D.
Antifolate-induced depletion of intracellular glycine and purines inhibits thymineless death in _E. coli_. _ACS Chem. Biol._ 5, 787–795 (2010). Article CAS PubMed PubMed Central Google
Scholar * Kwon, Y. K. et al. A domino effect in antifolate drug action in Escherichia coli. _Nat. Chem. Biol._ 4, 602–608 (2008). Article CAS PubMed PubMed Central Google Scholar *
Rodrigues, J. V. et al. Biophysical principles predict fitness landscapes of drug resistance. _PNAS_ 113, E1470–E1478 (2016). Article CAS PubMed PubMed Central Google Scholar * Ferrell,
J. E. & Ha, S. H. Ultrasensitivity part I: Michaelian responses and zero-order ultrasensitivity. _Trends Biochem. Sci._ 39, 496–503 (2014). Article CAS PubMed PubMed Central Google
Scholar * Goldbeter, A. & Koshland, D. E. An amplified sensitivity arising from covalent modification in biological systems. _Proc. Natl. Acad. Sci._ 78, 6840–6844 (1981). Article ADS
MathSciNet CAS PubMed PubMed Central Google Scholar * Sotelo-Mundo, R. R., Changchien, L., Maley, F. & Montfort, W. R. Crystal structures of thymidylate synthase mutant R166Q:
Structural basis for the nearly complete loss of catalytic activity. _J. Biochem. Mol. Toxicol._ 20, 88–92 (2006). Article CAS PubMed Google Scholar * Michaels, M. L., Kim, C. W.,
Matthews, D. A. & Miller, J. H. Escherichia coli thymidylate synthase: Amino acid substitutions by suppression of amber nonsense mutations. _PNAS_ 87, 3957–3961 (1990). Article ADS CAS
PubMed PubMed Central Google Scholar * Schnell, J. R., Dyson, H. J. & Wright, P. E. Structure, dynamics, and catalytic function of dihydrofolate reductase. _Annu. Rev. Biophys.
Biomol. Struct._ 33, 119–140 (2004). Article CAS PubMed Google Scholar * Shrimpton, P. & Allemann, R. K. Role of water in the catalytic cycle of E. coli dihydrofolate reductase.
_Protein Sci._ 11, 1442–1451 (2002). Article CAS PubMed PubMed Central Google Scholar * Gekko, K., Kunori, Y., Takeuchi, H., Ichihara, S. & Kodama, M. Point mutations at glycine-121
of Escherichia coli dihydrofolate reductase: Important roles of a flexible loop in the stability and function. _J. Biochem_ 116, 34–41 (1994). Article CAS PubMed Google Scholar *
Sawaya, M. R. & Kraut, J. Loop and subdomain movements in the mechanism of _Escherichia coli_ dihydrofolate reductase: crystallographic evidence. _Biochemistry_ 36, 586–603 (1997).
Article CAS PubMed Google Scholar * Chen, J. T., Taira, K., Tu, C. P. D. & Benkovic, S. J. Probing the functional role of phenylalanine-31 of Escherichia coli dihydrofolate reductase
by site-directed mutagenesis. _Biochemistry_ 26, 4093–4100 (1987). Article CAS PubMed Google Scholar * Watney, J. B., Agarwal, P. K. & Hammes-Schiffer, S. Effect of mutation on
enzyme motion in dihydrofolate reductase. _J. Am. Chem. Soc._ 125, 3745–3750 (2003). Article CAS PubMed Google Scholar * Carvajal-Rodríguez, A., de Uña-Alvarez, J. & Rolán-Alvarez,
E. A new multitest correction (SGoF) that increases its statistical power when increasing the number of tests. _BMC Bioinforma._ 10, 209 (2009). Article Google Scholar * Faure, A. J. et
al. Mapping the energetic and allosteric landscapes of protein binding domains. _Nature_ 604, 175–183 (2022). Article ADS CAS PubMed Google Scholar * Tamer, Y. T. et al. High-order
epistasis in catalytic power of dihydrofolate reductase gives rise to a rugged fitness landscape in the presence of trimethoprim selection. _Mol. Biol. Evolution_ 36, 1533–1550 (2019).
Article CAS Google Scholar * Thompson, S., Zhang, Y., Ingle, C., Reynolds, K. A. & Kortemme, T. Altered expression of a quality control protease in E. coli reshapes the in vivo
mutational landscape of a model enzyme. _eLife_ 9, e53476 (2020). Article CAS PubMed PubMed Central Google Scholar * Karanicolas, J. & Kuhlman, B. Computational design of affinity
and specificity at protein–protein interfaces. _Curr. Opin. Struct. Biol._ 19, 458–463 (2009). Article CAS PubMed PubMed Central Google Scholar * McClune, C. J., Alvarez-Buylla, A.,
Voigt, C. A. & Laub, M. T. Engineering orthogonal signalling pathways reveals the sparse occupancy of sequence space. _Nature_ 574, 702–706 (2019). Article ADS CAS PubMed PubMed
Central Google Scholar * Li, G.-W., Burkhardt, D., Gross, C. & Weissman, J. S. Quantifying absolute protein synthesis rates reveals principles underlying allocation of cellular
resources. _Cell_ 157, 624–635 (2014). Article CAS PubMed PubMed Central Google Scholar * Papp, B., Pál, C. & Hurst, L. D. Dosage sensitivity and the evolution of gene families in
yeast. _Nature_ 424, 194–197 (2003). Article ADS CAS PubMed Google Scholar * Otwinowski, J. Biophysical inference of epistasis and the effects of mutations on protein stability and
function. _Mol. Biol. Evolut._ 35, 2345–2354 (2018). Article CAS Google Scholar * Markin, C. J. et al. Revealing enzyme functional architecture via high-throughput microfluidic enzyme
kinetics. _Science_ 373, eabf8761 (2021). Article CAS PubMed PubMed Central Google Scholar * Lovelock, S. L. et al. The road to fully programmable protein catalysis. _Nature_ 606, 49–58
(2022). Article ADS CAS PubMed Google Scholar * Michener, J. K., Camargo Neves, A. A., Vuilleumier, S., Bringel, F. & Marx, C. J. Effective use of a horizontally-transferred
pathway for dichloromethane catabolism requires post–transfer refinement. _eLife_ 3, e04279 (2014). Article PubMed PubMed Central Google Scholar * Michener, J. K., Vuilleumier, S.,
Bringel, F. & Marx, C. J. Phylogeny poorly predicts the utility of a challenging horizontally transferred gene in Methylobacterium strains. _J. Bacteriol._ 196, 2101–2107 (2014). Article
PubMed PubMed Central Google Scholar * Kim, J. & Copley, S. D. Inhibitory cross-talk upon introduction of a new metabolic pathway into an existing metabolic network. _Proc. Natl.
Acad. Sci. USA_ 109, E2856–E2864 (2012). Article ADS CAS PubMed PubMed Central Google Scholar * Kachroo, A. H. et al. Systematic humanization of yeast genes reveals conserved functions
and genetic modularity. _Science_ 348, 921–925 (2015). Article ADS CAS PubMed PubMed Central Google Scholar * Russ, W. P. et al. An evolution-based model for designing chorismate
mutase enzymes. _Science_ 369, 440–445 (2020). Article ADS MathSciNet CAS PubMed Google Scholar * Bhattacharyya, S. et al. Transient protein-protein interactions perturb E. coli
metabolome and cause gene dosage toxicity. _Elife_ 5, e20309 (2016). Article PubMed PubMed Central Google Scholar * Bershtein, S. et al. Protein homeostasis imposes a barrier on
functional integration of horizontally transferred genes in bacteria. _PLoS Genet_ 11, e1005612 (2015). Article PubMed PubMed Central Google Scholar * Reynolds, K. A., McLaughlin, R. N.
& Ranganathan, R. Hot spots for allosteric regulation on protein surfaces. _Cell_ 147, 1564–1575 (2011). Article CAS PubMed PubMed Central Google Scholar * Toprak, E. et al.
Building a morbidostat: an automated continuous-culture device for studying bacterial drug resistance under dynamically sustained drug inhibition. _Nat. Protoc._ 8, 555–567 (2013). Article
PubMed PubMed Central Google Scholar * Changchien, L.-M. et al. High-level expression of escherichia coli and bacillus subtilis thymidylate synthases. _Protein Expr. Purif._ 19, 265–270
(2000). Article CAS PubMed Google Scholar * McCormick, J. W., Russo, M. A., Thompson, S., Blevins, A. & Reynolds, K. A. Structurally distributed surface sites tune allosteric
regulation. _Elife_ 10, e68346 (2021). Article CAS PubMed PubMed Central Google Scholar * Agrawal, N., Hong, B., Mihai, C. & Kohen, A. Vibrationally enhanced hydrogen tunneling in
the Escherichia coli thymidylate synthase catalyzed reaction. _Biochemistry_ 43, 1998–2006 (2004). Article CAS PubMed Google Scholar * Wang, Z. et al. Mg2+ binds to the surface of
thymidylate synthase and affects hydride transfer at the interior active site. _J. Am. Chem. Soc._ 135, 7583–7592 (2013). Article CAS PubMed PubMed Central Google Scholar * Virtanen, P.
et al. SciPy 1.0: Fundamental algorithms for scientific computing in Python. _Nat. Methods_ 17, 261–272 (2020). Article CAS PubMed PubMed Central Google Scholar * Stout, T. J., Sage,
C. R. & Stroud, R. M. The additivity of substrate fragments in enzyme–ligand binding. _Structure_ 6, 839–848 (1998). Article CAS PubMed Google Scholar Download references
ACKNOWLEDGEMENTS We thank Olivier Rivoire for early conversations regarding our data, and Elliott Ross for critical feedback on both the modeling and manuscript. We also thank the Reynolds
lab for feedback on experimental design, data analysis, and the manuscript. Research reported in this publication was supported by the National Institute of General Medical Sciences of the
National Institutes of Health under Award Number R01GM136842 to KAR. The content is solely the responsibility of the authors and does not necessarily represent the official views of the
National Institutes of Health. This work was also partly supported in its early stages by the Gordon and Betty Moore Foundation’s Data Driven Discovery Initiative through grant GBMF4557 to
KAR. AUTHOR INFORMATION Author notes * Thuy N. Nguyen Present address: Form Bio, Dallas, TX, 75226, USA * Samuel Thompson Present address: Department of Bioengineering, Stanford University,
Stanford, CA, 94305, USA AUTHORS AND AFFILIATIONS * The Green Center for Systems Biology, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA Thuy N. Nguyen, Christine
Ingle & Kimberly A. Reynolds * The Lyda Hill Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA Thuy N. Nguyen, Christine Ingle &
Kimberly A. Reynolds * The Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA Thuy N. Nguyen, Christine Ingle & Kimberly A. Reynolds *
Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, 94158, USA Samuel Thompson Authors * Thuy N. Nguyen View author publications You can also
search for this author inPubMed Google Scholar * Christine Ingle View author publications You can also search for this author inPubMed Google Scholar * Samuel Thompson View author
publications You can also search for this author inPubMed Google Scholar * Kimberly A. Reynolds View author publications You can also search for this author inPubMed Google Scholar
CONTRIBUTIONS T.N.N. and K.A.R. conceptualized the work and designed experiments. K.A.R. created the mathematical model. T.N.N. collected all deep mutational scanning data. T.N.N. and K.A.R.
analyzed the data. C.I. collected plate-reader based growth rate measurements (used in model development), performed all Michaelis-Menten enzyme kinetics assays, and performed the TYMS
lysate assays. S.M.T. constructed the deep mutational scanning library and contributed to data interpretation. K.A.R. wrote the original draft. T.N.N, C.I., S.M.T. and K.A.R. revised the
paper. K.A.R. provided supervision and funding acquisition. CORRESPONDING AUTHOR Correspondence to Kimberly A. Reynolds. ETHICS DECLARATIONS COMPETING INTERESTS The authors declare no
competing interests. PEER REVIEW PEER REVIEW INFORMATION _Nature Communications_ thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is
available. ADDITIONAL INFORMATION PUBLISHER’S NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. SUPPLEMENTARY
INFORMATION SUPPLEMENTARY INFORMATION PEER REVIEW FILE DESCRIPTION OF ADDITIONAL SUPPLEMENTARY FILES SUPPLEMENTARY DATA 1 SUPPLEMENTARY DATA 2 REPORTING SUMMARY RIGHTS AND PERMISSIONS OPEN
ACCESS This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format,
as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third
party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the
article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright
holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. Reprints and permissions ABOUT THIS ARTICLE CITE THIS ARTICLE Nguyen, T.N., Ingle, C., Thompson,
S. _et al._ The genetic landscape of a metabolic interaction. _Nat Commun_ 15, 3351 (2024). https://doi.org/10.1038/s41467-024-47671-0 Download citation * Received: 17 August 2023 *
Accepted: 09 April 2024 * Published: 18 April 2024 * DOI: https://doi.org/10.1038/s41467-024-47671-0 SHARE THIS ARTICLE Anyone you share the following link with will be able to read this
content: Get shareable link Sorry, a shareable link is not currently available for this article. Copy to clipboard Provided by the Springer Nature SharedIt content-sharing initiative