A hydrophobic ratchet entrenches molecular complexes

Nature

Most proteins assemble into multisubunit complexes1. The persistence of these complexes across evolutionary time is usually explained as the result of natural selection for functional

properties that depend on multimerization, such as intersubunit allostery or the capacity to do mechanical work2. In many complexes, however, multimerization does not enable any known

function3. An alternative explanation is that multimers could become entrenched if substitutions accumulate that are neutral in multimers but deleterious in monomers; purifying selection

would then prevent reversion to the unassembled form, even if assembly per se does not enhance biological function3,4,5,6,7. Here we show that a hydrophobic mutational ratchet systematically

entrenches molecular complexes. By applying ancestral protein reconstruction and biochemical assays to the evolution of steroid hormone receptors, we show that an ancient hydrophobic

interface, conserved for hundreds of millions of years, is entrenched because exposure of this interface to solvent reduces protein stability and causes aggregation, even though the

interface makes no detectable contribution to function. Using structural bioinformatics, we show that a universal mutational propensity drives sites that are buried in multimeric interfaces

to accumulate hydrophobic substitutions to levels that are not tolerated in monomers. In a database of hundreds of families of multimers, most show signatures of long-term hydrophobic

entrenchment. It is therefore likely that many protein complexes persist because a simple ratchet-like mechanism entrenches them across evolutionary time, even when they are functionally

gratuitous.
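
The ratchet logic can be illustrated with a deliberately simple toy model. The sketch below is not the authors' analysis: it assumes a two-state (hydrophobic or not) description of interface sites, and the per-step gain and loss rates, the starting composition and the monomer tolerance threshold are all hypothetical values chosen only for illustration. It shows how a mutational bias alone can carry a buried interface past the hydrophobic fraction that an exposed monomer surface would tolerate.

```python
def hydrophobic_fraction_over_time(h0, gain, loss, steps):
    """Deterministic two-state model of interface composition.

    h0   : starting fraction of hydrophobic interface sites
    gain : per-step probability that a non-hydrophobic site becomes hydrophobic
    loss : per-step probability that a hydrophobic site becomes non-hydrophobic
    Selection is ignored: buried sites are assumed to evolve neutrally.
    """
    trajectory = [h0]
    h = h0
    for _ in range(steps):
        h = h + gain * (1.0 - h) - loss * h
        trajectory.append(h)
    return trajectory


def first_step_above(trajectory, tolerance):
    """Return the first step at which the fraction exceeds the tolerance, or None."""
    for step, h in enumerate(trajectory):
        if h > tolerance:
            return step
    return None


# All numbers below are hypothetical illustration values, not measurements.
GAIN, LOSS = 0.004, 0.006      # mutational equilibrium = GAIN / (GAIN + LOSS)
MONOMER_TOLERANCE = 0.3        # assumed maximum tolerated on an exposed surface

trajectory = hydrophobic_fraction_over_time(0.2, GAIN, LOSS, 500)
equilibrium = GAIN / (GAIN + LOSS)
entrenched_at = first_step_above(trajectory, MONOMER_TOLERANCE)
```

In this toy setting the interface composition relaxes towards `equilibrium`. Once it passes `MONOMER_TOLERANCE`, reverting to the monomer would expose a surface that is too hydrophobic, which is the sense in which the complex becomes entrenched.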

Data have been deposited in the Open Science Framework (https://osf.io/) under accession GTJ86, including alignment, phylogeny, sequences and posterior probability of ancestral

reconstructions; list of PDB identifiers for coordinates of dimers and monomers in our structural database; and molecular dynamics trajectories.

Scripts and code for structural bioinformatics analysis have been deposited at GitHub (https://github.com/JoeThorntonLab).

We thank J. Bridgham for cell culture training and advice, A. Pillai for assistance with experiments, and members of the Thornton Laboratory for comments. Molecular dynamics computations

were performed on resources provided by SNIC through Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX) under Projects SNIC 2019/8-36 and SNIC 2019/3-189. Supported

by a Chicago Fellowship (G.K.A.H.), NIH R01GM131128 (J.W.T.) and R01GM121931 (J.W.T.).

Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA

Georg K. A. Hochberg, Brian P. H. Metzger & Joseph W. Thornton

Department of Chemistry, Texas A&M University, College Station, TX, USA

Department of Chemistry – BMC, Uppsala University, Uppsala, Sweden

Department of Human Genetics, University of Chicago, Chicago, IL, USA

G.K.A.H. and J.W.T. conceived the project and oversaw the manuscript writing. G.K.A.H. performed phylogenetics, ancestral sequence reconstruction, protein purification, cell culture, and

biophysical experiments. Y.L. and A.L. performed and interpreted native MS experiments. E.G.M. performed and analysed molecular dynamics simulations. G.K.A.H. and B.P.H.M. designed

bioinformatic analyses, which G.K.A.H. performed. G.K.A.H. and J.W.T. interpreted all data. All authors contributed to manuscript writing.

Peer review information Nature thanks Douglas Theobald, Claus Wilke and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

a, Phylogeny of steroid receptors and related nuclear receptor family members. AR, androgen receptors; PR, progesterone receptors; GR, glucocorticoid receptors; MR, mineralocorticoid

receptors. Sequence identifiers are in brackets. This topology corresponds to the ‘Chordate tree’ in Extended Data Fig. 2. Scale bar, expected substitutions per site. b, Sequence alignment

of the human ER and GR LBDs, with the MAP sequences of AncSR1 and AncSR2. Green, C-terminal extension. Most ERs contain additional sequence on the C terminus that is unalignable, even among

ERs.

a,b, Distribution of posterior probabilities (PP) of the maximum a posteriori (MAP) state at each site in reconstructed LBDs (top) and DBDs (bottom) of AncSR1 (a) and AncSR2 (b). c,

Stoichiometry of purified alternative LBD reconstructions (AltAll) of AncSR1 (pink) and AncSR2 (green), as measured by SEC-MALS. AncSR1 is a dimer, AncSR2 a monomer. AltAll reconstructions

contain the MAP state at unambiguously reconstructed sites and the state with the next highest PP at all ambiguously reconstructed sites. d, The ‘chordate’ phylogeny (top) was used for

primary ancestral reconstructions; it places the gene duplication yielding ERs and kSRs within the chordates. An alternative, less parsimonious tree (bottom; ‘Bilaterian’ because it places the

duplication deep in the Bilateria) has very slightly higher likelihood but requires two additional gene losses (dashed lines). The Bilaterian topology was used for alternative

reconstructions (AltPhy). Node labels, approximate likelihood ratio test statistic and transfer bootstrap value. lnl, log-likelihood. e, Distribution of per-site posterior probabilities for

reconstructed LBDs on the Bilaterian topology for AncSR1 (top) and AncSR2 (bottom). f, Stoichiometry of purified AltPhy versions of AncSR1 (pink) and AncSR2 (green) LBDs, as measured by

SEC-MALS. The average molar mass and elution time of AltPhy-AncSR1-LBD are between those of a dimer and a monomer, indicating that it is a fast-exchanging, weaker dimer than other AncSR1-LBD

versions.
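
The AltAll construction described in panel c can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name, the input layout (one dictionary of state-to-PP per site) and the cutoff that decides when a site counts as ambiguously reconstructed are assumptions, and the caption does not state the cutoff that was actually used.

```python
def map_and_altall(site_posteriors, ambiguity_cutoff=0.2):
    """Build MAP and AltAll sequences from per-site posterior probabilities.

    site_posteriors  : list of dicts mapping amino-acid state -> posterior probability
    ambiguity_cutoff : hypothetical threshold; a site is treated as ambiguous when
                       its second-best state has PP at or above this value
    """
    map_seq, alt_seq = [], []
    for posteriors in site_posteriors:
        ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
        best_state = ranked[0][0]
        map_seq.append(best_state)
        if len(ranked) > 1 and ranked[1][1] >= ambiguity_cutoff:
            alt_seq.append(ranked[1][0])   # ambiguous site: take the next-best state
        else:
            alt_seq.append(best_state)     # unambiguous site: keep the MAP state
    return "".join(map_seq), "".join(alt_seq)


# Made-up posteriors for a four-site toy sequence.
toy_sites = [
    {"L": 0.98, "I": 0.02},
    {"K": 0.55, "R": 0.45},
    {"S": 0.70, "T": 0.25, "A": 0.05},
    {"E": 1.0},
]
map_seq, altall_seq = map_and_altall(toy_sites)
```

Because AltAll swaps every ambiguous site at once, it is a stringent test of robustness: if the alternative protein has the same stoichiometry as the MAP version, the inference is unlikely to depend on uncertainty at any individual site.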

a, Activation of AncSR1 from 40 ng ERE response element plasmid as a function of the AncSR1 plasmid concentration. Grey bar, concentration at which assays in Fig. 2f were performed. b, Molar

fraction in the dimeric form measured by nMS as a function of LBD concentration for AncSR1-LBD (purple) and dimerization-interface mutants SR1-LBD(+3) (black) and SR1-LBD(L184E) (grey).

Dissociation constant (Kd) estimated by nonlinear regression is indicated next to each curve. c, Dimeric fraction as a function of LBD concentration for AncSR1-LBD (purple) and

activation-helix mutant SR1-LBD(L126Q) (grey), which affects activation but not dimerization.
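
As a rough sketch of how a Kd can be estimated from curves like those in panels b and c, the code below fits a simple monomer-dimer equilibrium to dimeric fractions. It makes several assumptions that the caption does not specify: the fraction is taken to be the share of subunits found in dimers, Kd is defined as [M]²/[D], concentrations are total subunit concentrations in arbitrary units, and the fit uses a golden-section search rather than whichever regression routine the authors used. The data in the usage lines are synthetic.

```python
import math


def dimer_fraction(total_conc, kd):
    """Fraction of subunits in dimers for 2M <-> D with Kd = [M]^2 / [D].

    Mass balance: total_conc = [M] + 2[D], hence 2[M]^2/Kd + [M] - total_conc = 0.
    """
    if total_conc <= 0:
        return 0.0
    monomer = kd * (math.sqrt(1.0 + 8.0 * total_conc / kd) - 1.0) / 4.0
    return 1.0 - monomer / total_conc


def fit_kd(concs, fractions, lo=1e-3, hi=1e3, rounds=60):
    """Least-squares estimate of Kd by golden-section search on log10(Kd)."""
    def sse(log_kd):
        kd = 10.0 ** log_kd
        return sum((dimer_fraction(c, kd) - f) ** 2 for c, f in zip(concs, fractions))

    a, b = math.log10(lo), math.log10(hi)
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(rounds):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(c) < sse(d):
            b = d      # minimum lies in [a, d]
        else:
            a = c      # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)


# Synthetic, noise-free data generated from a hypothetical Kd of 2.0.
concs = [0.1, 0.5, 1.0, 5.0, 10.0, 50.0]
fractions = [dimer_fraction(c, 2.0) for c in concs]
kd_fit = fit_kd(concs, fractions)
```

Under this definition, half of the subunits are dimeric when the total subunit concentration equals Kd, which gives a quick visual check of a fitted value against the plotted curve.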

a, SEC of AncSR2 LBD (top) and mutants that delete the CTE (ΔCTE) or contain point mutations that impair CTE-LBD interactions (bottom), when fused to MBP. The mutants elute in the same

fraction as AncSR2, demonstrating that they are monomeric and that re-exposing the patch does not re-establish dimerization. b, TEV cleavage of AncSR2 mutants in the absence (left) and

presence (right) of 2% Triton X-100. The positions of bands corresponding to the uncleaved construct, cleaved MBP, cleaved LBD, and TEV protease are indicated. This experiment was performed

twice, with similar results. See Supplementary Fig. 1 for uncropped gels. c, Average root mean square deviation (r.m.s.d.) from replicate 2-μs molecular dynamics simulations of AncSR2-LBD

(WT) and ΔCTE mutant. The average Cα r.m.s.d. in pairwise comparisons of all simulations is shown as a heatmap. d, SEC-MALS trace of AncSR1-LBD fused to the CTE of AncSR2-LBD. The LBD is

still dimeric.

a, Difference between the fraction of residues that are hydrophobic in dimer interfaces versus that on solvent-exposed surfaces of the same proteins. The histogram shows the distribution of

this difference across every protein in our structural database. b, Fraction of hydrophobic residues in dimer interfaces as a function of the number of interface residues. The variation in

the fraction is caused mostly by very small interfaces. c, Expected equilibrium fraction of hydrophobic amino acids from mutation alone. Black: expectation based on GC content and the

genetic code. Red dots and lines: mean and standard deviation of the hydrophobic fraction of residues observed in 200 replicate simulations using mutational spectra from mutation

accumulation experiments (Fig. 4b), plotted against GC content of the organism tested. d, GC content of organisms represented by proteins in our database.
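
One simple reading of the "GC content and the genetic code" expectation in panel c can be sketched as follows. The code assumes that each codon position is drawn independently with P(G) = P(C) = GC/2 and P(A) = P(T) = (1 − GC)/2, that stop codons are excluded, and that the hydrophobic set is A, V, I, L, M, F and W. All three are assumptions made for illustration; the authors' calculation and hydrophobicity definition may differ.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical definition of "hydrophobic"; the source does not list the residues here.
HYDROPHOBIC = frozenset("AVILMFW")


def expected_hydrophobic_fraction(gc, hydrophobic=HYDROPHOBIC):
    """Expected hydrophobic fraction among sense codons at a given GC content."""
    base_prob = {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}
    total = 0.0
    hydro = 0.0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # stop codons do not encode a residue
        weight = base_prob[codon[0]] * base_prob[codon[1]] * base_prob[codon[2]]
        total += weight
        if aa in hydrophobic:
            hydro += weight
    return hydro / total


# Expected fraction across a range of GC contents.
curve = {gc / 10: expected_hydrophobic_fraction(gc / 10) for gc in range(0, 11)}
```

Under these assumptions the expected hydrophobic fraction is higher in AT-rich genomes, because several hydrophobic residues (F, I, L, M) are encoded by AT-rich codons.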

Supplemental Data 1: Raw gel images. Uncropped gels for data presented in Extended Data Figure 4b. Boxes are drawn around lanes that were used for the figure.

Supplemental Data 2: Scaled Q matrices based on mutation accumulation experiments. Row indicates the initial state, column the mutated state. a, M. musculus. b, S. cerevisiae. c, E. coli. d, P. aeruginosa.
