ABSTRACT Whole-cell cross-linking coupled to mass spectrometry is one of the few tools that can probe protein–protein interactions in intact cells. A very attractive reagent for this purpose
is formaldehyde, a small molecule which is known to rapidly penetrate into all cellular compartments and to preserve the protein structure. In light of these benefits, it is surprising that
identification of formaldehyde cross-links by mass spectrometry has so far been unsuccessful. Here we report mass spectrometry data that reveal formaldehyde cross-links to be the
dimerization product of two formaldehyde-induced amino acid modifications. By integrating the revised mechanism into a customized search algorithm, we identify hundreds of cross-links from
in situ formaldehyde fixation of human cells. Interestingly, many of the cross-links could not be mapped onto known atomic structures, and thus provide new structural insights. These
findings enhance the use of formaldehyde cross-linking and mass spectrometry for structural studies. INTRODUCTION Formaldehyde (FA) has been used as a fixative and preservative for many decades1,2. It is reactive toward both proteins and DNA, and forms inter-molecular cross-links
between macromolecules3, as well as intra-molecular chemical modifications4,5. The high reactivity of FA together with its high permeability into cells and tissues has led to its use in
numerous applications in biology, biotechnology, and medicine6. FA cross-linking of proteins is assumed to involve the formation of a methylene bridge between two proximal amino acids
(R1-CH2-R2)7,8. However, direct evidence to support this mechanism is sparse. In terms of mass, the methylene bridge adds 12 Da (one carbon atom) to the total mass of the two cross-linked
amino acids. Mass spectrometry has confirmed this 12 Da addition to the masses of short linear peptides after FA incubation5,9,10,11. Yet, these studies were not able to identify pairs of
peptides that were linked via methylene bridges. Thus, it is unclear whether the observed 12 Da additions were bona fide cross-links or simply local modifications of single peptides. Another
puzzling fact is the lack of reports on the use of FA in the experimental technique of cross-linking coupled to mass spectrometry (XL-MS)12,13,14. In XL-MS, mass spectrometry identifies the
protein residues that are linked based on the unique mass of the cross-linker. This information is then used to probe protein interactions15 and structures16. It seems fair to assume that
if the methylene bridge reaction were easy to detect, FA would have been commonly used for in situ XL-MS17,18,19,20. Yet, we were only able to find reports of FA being used to stabilize
protein complexes that were later cross-linked with a different reagent21,22. Given this lack of evidence, we hypothesize that FA cross-linking of proteins involves a different chemical
mechanism. Identification of cross-linked peptides requires accurate knowledge of the chemical mechanism in order to calculate the mass of the cross-link product. Specifically, a search of
mass spectrometry data with an incorrect mass of the adduct will not yield any identifications. Here we conduct an unbiased mass-spectrometric search for the FA adduct, which reveals a
different reaction product that adds 24 Da rather than the expected 12 Da. This reaction occurs only in structured proteins (rather than peptides), perhaps explaining why earlier studies
did not observe it. RESULTS FA CROSS-LINKING OF PURIFIED PROTEINS We first surveyed the FA cross-linking products that occur within structured proteins by cross-linking a mixture of three
purified proteins (bovine serum albumin (BSA), Ovotransferrin, and α-Amylase). The mixture was incubated with FA for twenty minutes, and then quenched, denatured, digested by trypsin into
peptides, and analyzed by mass spectrometry (Fig. 1). The general practice to identify a cross-link is by matching the measured mass to a theoretical total mass of the two peptides plus the
mass of the cross-linker. Here, we did not limit our search to one predetermined cross-linker mass, but rather scanned through a range of possible masses. Figure 2 shows the number of
cross-links that the scan identified for each cross-linker mass that was tested. It was surprising to see that the dominating reaction product adds exactly 24 Da (two carbon atoms) to the
total mass of the two peptides. This is different from the 12 Da mass expected under the methylene bridge mechanism7. The broadening of the peak, which apparently includes reactions that add
25, 26 and 27 Daltons, is an artifact resulting from incorrect assignment of the mono-isotopic mass by the mass spectrometer (Supplementary Fig. 1). This artifact is common in XL-MS
analysis23,24 and should not be interpreted as being due to alternative reaction products. We also tested a different brand of FA, which resulted in the same mass-scan profile (Supplementary
Fig. 2a). We conclude that the 24 Da reaction is not the sum of two separate 12 Da reactions occurring in parallel, for two reasons. First, although a parallel model predicts that a lower concentration of FA will show less
of the 24 Da reaction and more of the 12 Da reaction, we find that for both high and low concentrations of FA the mass-scan profiles are the same (Supplementary Fig. 2b). Second, ion
species corresponding to mass additions of 36 or 48 Da were not observed in Fig. 2, but such species should have occurred according to a parallel cross-linking model. Further support for the
uniqueness of the 24 Da reaction is seen in the unusual fragmentation pattern of its MS/MS spectra (Fig. 3a–c). We find that the cross-link is highly susceptible to higher-energy
collisional dissociation (HCD), and fragments in which it stayed intact could not be detected. Instead, it breaks symmetrically to give a mass addition of 12 Da on each peptide. Peaks
corresponding to the total mass of one of the peptides plus 12 Da were among the most intense in the observed MS/MS spectra. The two peptides then break a second time to yield the standard
b- and y-fragments as well as modified b- and y-fragments with an additional 12 Da mass. We find additional evidence for this two-step fragmentation model when we follow the change in
fragmentation as a function of the normalized collision energy (Supplementary Fig. 4). Low collision energies are sufficient to break the cross-links, but are insufficient to break the
stronger peptide-backbone bonds whose cleavage produces the b- and y-fragments. The unique fragmentation pattern associated with the 24 Da reaction resembles that of the cleavable cross-linkers frequently used in XL-MS25. Yet,
an important distinction is the 100% cleavage efficiency of the 24 Da reaction, much higher than observed with other cleavable cross-linking reagents. The unusual fragmentation may partly
explain why the 24 Da reaction was not reported in previous FA studies. With the understanding of the unique properties associated with XL-MS of FA, we designed an analysis application that
is tailored specifically to identify the 24 Da reaction and its subsequent MS/MS pattern. The application successfully identified cross-links in the three-protein mixture in a
concentration-dependent manner (Fig. 3d). Interestingly, the application could also detect a small number of cross-links corresponding to the 12 Da reaction, but at a ratio of less than 1:7
relative to the 24 Da reaction. Supplementary Data 1 lists an example of the identifications from one such cross-linking experiment. An attempt to analyze the same data with MeroX, an
application tailored for cleavable cross-linkers26, gave only a third of the identifications (Supplementary Data 2), and these were a subset of our results. The smaller number is caused by
certain features of FA cross-linking, such as multiple link sites, that are currently not supported by MeroX. The modified +12 Da fragments in the MS/MS spectra allowed us to better
characterize the amino acids that are most likely to partake in the reaction. To that end, we computationally modified in turn each residue along the cross-linked peptides, and determined
which modification site was most compatible with the observed fragmentation pattern. The number of times each amino acid was found to be the most compatible was then normalized by dividing
it by the total number of occurrences of that amino acid. This analysis clearly marks lysine and arginine residues to be the most prevalent in the 24 Da reaction (Supplementary Fig. 5). The
high reactivity of FA with these two amino acids is fully consistent with previous studies performed on peptides and single amino acids5,7,10. However, we note that a third of the identified
cross-links involve at least one peptide that does not have a lysine residue. In these particular peptides aspartic acid and tyrosine residues are the most likely to be the linked residues.
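The site-scoring step described above (placing the 12 Da modification on each residue in turn and counting the fragments that can then be assigned) can be sketched in Python as follows. This is a minimal illustration, not the authors' application: the function names, the restriction to singly charged b- and y-ions, and the four-residue toy peptide are assumptions made for the example; only the 12.0 Da shift and the 8 ppm tolerance come from the text.

```python
# Monoisotopic residue masses (Da) for the residues of the toy peptide only.
RESIDUE = {"A": 71.03711, "K": 128.09496, "D": 115.02694, "E": 129.04259}
PROTON, WATER, MOD = 1.007276, 18.010565, 12.0  # MOD: the +12 Da fragment shift


def fragment_mzs(peptide, site):
    """Singly charged b/y m/z values with +12 Da placed on residue index `site`."""
    masses = [RESIDUE[aa] for aa in peptide]
    n = len(masses)
    ions = []
    for i in range(1, n):
        # b_i covers residues 0..i-1, y_i covers residues n-i..n-1
        b = sum(masses[:i]) + PROTON + (MOD if site < i else 0.0)
        y = sum(masses[n - i:]) + WATER + PROTON + (MOD if site >= n - i else 0.0)
        ions += [b, y]
    return ions


def count_matches(predicted, observed, tol_ppm=8.0):
    """Number of predicted fragments found among the observed peaks."""
    return sum(
        any(abs(p - o) <= p * tol_ppm * 1e-6 for o in observed) for p in predicted
    )


def best_sites(peptide, observed, tol_ppm=8.0):
    """Try every residue as the modified site; return the top-scoring positions."""
    scores = [
        count_matches(fragment_mzs(peptide, s), observed, tol_ppm)
        for s in range(len(peptide))
    ]
    top = max(scores)
    return [s for s, sc in enumerate(scores) if sc == top], scores


# Synthetic spectrum: peptide AKDE carrying the modification on K (index 1).
observed = fragment_mzs("AKDE", 1)
sites, scores = best_sites("AKDE", observed)
```

With a complete synthetic fragment series the lysine position is the only top-scoring site. Real spectra are sparser, so several positions can tie, and per-residue counts would still need to be normalized by how often each amino acid occurs, as described above.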
Interestingly, tyrosine was previously shown to be the third most reactive residue toward FA under certain conditions5. We conclude that the majority of FA cross-links occur between lysine
or arginine residues, but a significant fraction of cross-links also involve asparagine, histidine, aspartic acid, tyrosine, and glutamine residues. The fragmentation pattern of the 24 Da
reaction does not enable identification of the two residues undergoing cross-linking. As a typical example, the fragmentation pattern of the peptide pair shown in Fig. 3b is consistent with
the cross-link occurring on any of the first four residues in the upper (red) peptide. The localization is also ambiguous in the lower (blue) peptide as the first aspartic residue and the
middle lysine-aspartic residues are all likely sites for the cross-link given the fragmentation. Therefore, the MS measurement shown in Fig. 3b may actually report a group of isomers of the
same two peptides with different cross-link sites on each. This ambiguity usually does not occur with cross-linking reagents with high chemical specificity toward one particular amino acid
type. The uncertainty in localizing the cross-link sites prevents the measurement of the exact distance spanned by an FA cross-link. Instead, we estimate the cross-link distance as the
minimal Cα–Cα distance between the two peptides on the protein structure. Supplementary Fig. 6 shows the histograms of the minimal distances observed for the cross-links from several FA
concentrations, and results from experiments with the cross-linking reagent disuccinimidyl suberate (DSS). This comparison indicates that FA cross-links are on average shorter than those of
DSS. FA MODIFICATIONS ON LINEAR PEPTIDES As a control to the experiments on structured proteins, we incubated the peptide digest from the same three proteins with FA, and analyzed the
products by mass spectrometry (Supplementary Fig. 7a). This analysis did not identify any cross-link between a pair of peptides in the digest. Yet, an analysis of single linear peptides
found a high abundance of FA-related modifications (Supplementary Fig. 7b, c). In contrast to the cross-links, which add 24 Da, these modifications are dominated by a reaction that adds 12 Da
to the peptides. Just 20 min of incubation with 2% FA is sufficient to form peptides with a single 12 Da modification in significant numbers. These modifications were nearly absent when the
digest was not treated with FA (No XL), and can therefore be attributed to the FA reactivity. Peptides with multiple modifications in parallel (24, 36, 48, and 60 Da) were also frequent,
and increased in frequency at longer incubation times. Such modifications are fully consistent with observations of previous mass spectrometry studies of FA effects in peptides5,9,10,11. We
conclude that the chemistry of local modifications is fundamentally different from that of long-range cross-linking. Whereas a 12 Da reaction is the most prevalent for local modifications, a
24 Da reaction dominates cross-linking. IN SITU FA CROSS-LINKING OF HUMAN CELL CULTURES With this clear understanding of the 24 Da cross-linking reaction, we attempt to identify FA
cross-links from in situ cross-linking experiments on intact human cells. PC9 adenocarcinoma cells were incubated in 1%, 2%, 3%, 4.5%, or 6% FA solutions for 10 min. After the FA was washed
out, the cells were lysed and the protein content prepared for mass spectrometry. We measured 10% of the peptide digest from each FA concentration directly in the mass spectrometer. The
other 90% were enriched for cross-linked peptides using SCX27, and then measured in the mass spectrometer. Standard proteomics analysis identified in the digests a set of 1692 proteins with
medium-to-high abundance. In order to speed up the search for cross-links, we took advantage of the complete dissociation of the FA cross-link during MS/MS fragmentation, which allows
matching each peptide to the fragments independently of the other in the pair. An application implementing this strategy analyzed each mass spectrometry run against the database of the 1692
proteins in about 5 min (“Methods”). Overall, the in situ cross-linking experiments involved 59 data-dependent mass spectrometry runs. The analyses of these runs searched for two separate
cross-linker masses: 12 and 24 Da. We then pooled together all the identifications from these analyses into a non-redundant list of 559 cross-links (Supplementary Data 3). The
false-detection rate for this list of cross-links was estimated to be 3% of the entire list, and 16% of the inter-protein list. The false-detection rate estimation was based on decoy
analysis that spiked the search database with reversed sequences (“Methods”). The 24 and 12 Da cross-linking reactions accounted for 74 and 26% of the cross-links, respectively. This
reaffirms the dominance of the 24 Da reaction in FA cross-linking also in the case of in situ FA applications. Interestingly, the 12 Da reaction is more prevalent in situ than it was for the
mixture of purified proteins, possibly reflecting influences of the cellular environment on its efficiency. The identified cross-links occur within a subset of 276 proteins that are of
relatively high abundance in the PC9 cell line28. This is expected because we did not enrich for any particular protein. Encouragingly, the cross-linked proteins originate from the nucleus
(histones), cytoplasm (ribosomes and TRiC/CCT), mitochondria (HSP60), and endoplasmic reticulum (BiP), indicating that the FA has reached most cellular compartments. We could map 280 of the
cross-links onto solved atomic structures. Figure 4a shows the histogram of the minimal Cα–Cα distances spanned by these cross-links. The histogram includes only cross-links between two
peptides that are not consecutive along the protein sequence. The FA cross-links fit the atomic structures well, having a minimal Cα–Cα distance below 25 Å for 97% of them (272 cases). Of
the 559 cross-links, 90 (16%) are inter-protein (between two different proteins in a complex) and the rest are intra-protein (within the same protein polypeptide). A subset of 28
inter-protein cross-links had no corresponding atomic structures, but they showed strong indications of being true positives. All had good fragmentation of both peptides (20 fragments or
more on the weakest peptide), and most were previously reported to be part of a protein complex (Table 1). These cross-links provide structural data—of in situ origin—on the relevant
interactions. Particularly, each cross-link narrows down the interaction site to the vicinity of the two linked peptides. We highlight two subsets of cross-links, which were employed for
constrained docking. The first subset involves the binding site of the nascent polypeptide-associated complex (NAC) on the ribosome. Previously, Pech et al.29 showed that a conserved region
in βNAC, which is predicted to form an α-helix, binds to the ribosome. Two in situ cross-links cover this sequence region, and link it to the C-terminus of ribosomal protein L22. We
applied PatchDock30 with the restraints of the cross-links to dock a model of that region onto the ribosome. The best-scoring model (Fig. 4b) was close to two ribosomal proteins, L22 and
L31, a binding mode that is consistent with previous in vitro evidence showing βNAC to also interact with L31 (ref. 29). A larger subset of cross-links mapped the interaction sites of several actin
regulators onto the outer surface of the actin filament (Fig. 4c). This is consistent with their functions in regulation of bundling and bifurcation of the filaments. We performed all-atom
docking onto the actin filament of plastin-2, for which a reliable homology model of the actin-binding CH domain could be built. This docking was restrained by the cross-link between actin
and plastin-2. Remarkably, the model that ranked third by its PatchDock score had a 3.2 Å deviation from a recent cryo-EM structure of filamin A (Fig. 4d), which is homologous to plastin.
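The cross-link restraints used in this section rest on the minimal Cα–Cα distance between the two linked peptides. A minimal Python sketch of that measure is given here. The function names, the toy coordinates, and the use of 25 Å as a pass/fail cutoff for a docking model are assumptions for illustration: the text reports 25 Å only as the distance within which 97% of the mapped cross-links fall, and does not state the cutoff passed to PatchDock.

```python
import math


def min_ca_distance(ca_a, ca_b):
    """Smallest distance (in Å) between any Cα of peptide A and any Cα of peptide B.

    ca_a, ca_b: iterables of (x, y, z) coordinates, one per residue.
    """
    return min(math.dist(p, q) for p in ca_a for q in ca_b)


def satisfies_restraint(ca_a, ca_b, cutoff=25.0):
    """True if the peptide pair is compatible with a cross-link (assumed cutoff)."""
    return min_ca_distance(ca_a, ca_b) <= cutoff


# Toy coordinates (hypothetical, not taken from any structure).
peptide_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
peptide_b = [(3.8, 4.0, 0.0), (10.0, 10.0, 10.0)]
closest = min_ca_distance(peptide_a, peptide_b)
```

Using the minimum over all residue pairs, rather than a distance between two specific residues, reflects the fact that the exact linked residues cannot be localized from the MS/MS data.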
The available cryo-EM structures31,32 were determined from in vitro reconstruction of actin filaments with a large excess of filamin A. Thus, our docking result provides in situ support for
the relevance of the cryo-EM structure. Moreover, it suggests that filamin and plastin bind to the actin filament in very similar ways. In contrast to the cross-links in Table 1, a
subset of nine inter-protein cross-links had two different indications of being false positives. First, they had marginal MS/MS fragmentation evidence (14–19 fragments on the weaker peptide
in the pair). Second, the two cross-linked proteins had never been reported in the literature to be interacting. Assuming that all the intra-protein cross-links are correct, then these nine
cross-links are the only false positives in the entire list. As they comprise 1.6% of the list (9 out of 559), this is in accord with our a priori estimation of the false-detection rate.
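The percentages quoted in this section can be rechecked from the counts given in the text. The short Python sketch below does only that bookkeeping; the variable and function names are illustrative, and the comparison with the decoy-based estimates (3% overall, 16% for inter-protein links) is a consistency check rather than a re-derivation of those estimates.

```python
# Counts quoted in the Results text.
total_links = 559        # non-redundant in situ cross-links
inter_protein = 90       # links between two different proteins
suspected_false = 9      # inter-protein links flagged as likely false positives
mapped = 280             # links that could be mapped onto solved structures
within_25A = 272         # mapped links with minimal Cα–Cα distance below 25 Å


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


summary = {
    "inter_protein_pct": percent(inter_protein, total_links),
    "false_pct_of_list": percent(suspected_false, total_links),
    "false_pct_of_inter": percent(suspected_false, inter_protein),
    "within_25A_pct": percent(within_25A, mapped),
}

for name, value in summary.items():
    print(f"{name}: {value}%")
```

Under the stated assumption that all intra-protein cross-links are correct, the nine flagged links amount to 1.6% of the full list and 10% of the inter-protein list, both below the corresponding decoy-based estimates.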
DISCUSSION We have established four features of long-range FA cross-links in proteins. First, they occur only in structured proteins. Hence, previous studies that relied on peptide
assays incorrectly classified the prevalent 12 Da modification as a cross-link. Second, the dominant cross-linking reaction involves two carbon atoms (24 Da) and not one. Third, these
cross-links are very labile and cleave completely under MS/MS fragmentation. Finally, the most intense MS/MS fragmentation products carry an unusual 12 Da modification. We believe that all
these factors have contributed to the fact that the chemistry of the long-range FA cross-link has not been characterized correctly. In light of the findings, we suggest the following
mechanism of FA cross-linking (Fig. 5). The reaction starts with the accepted imine formation on the side chains of lysines. The imine formation is in accord with the prevalent 12 Da
modification that others and we have observed on peptides and proteins. However, the cross-link itself forms by a dimeric interaction of two imines33. This symmetric formation is compatible
with three observations. First, it explains the symmetrical cleavage of the link under MS/MS fragmentation. Second, if one assumes that the imine modification is only mildly reactive, then
it is clear why cross-linking occurs only in structured proteins: the stable structure of the protein keeps the modifications in proximity for sufficient time for cross-linking to occur.
Third, the dimerization is consistent with the known reversibility of FA cross-linking, which implies that all steps of the mechanism are reversible. In particular, the MS/MS spectra clearly
demonstrate the full reversal of the last dimerization step by the introduction of mild collision energy. In Fig. 5, the cross-linking mechanism is exemplified on two lysine side chains,
but FA cross-linking does not necessarily require two lysines. Indeed, for many of the in situ cross-links (Supplementary Data 3) one of the peptides has no lysine residues. Therefore, the
hypothesized model would have to be revised for cross-linking in the more general case. The current data cannot conclusively determine the chemical structure of the linkage site. One
possibility is that the two imines undergo cycloaddition to form a 1,3-diazetidine linkage. Such a strained ring structure would be consistent with the tendency of the link to break
completely under HCD fragmentation. Nonetheless, other chemical structures are equally possible and efforts to better characterize the linkage site by NMR are ongoing. In our experience, FA
is not a more potent cross-linking reagent than those based on NHS-esters. Yet, it has several advantages, notably its solubility and proven ability to penetrate cells and tissues rapidly.
This makes FA an attractive reagent for in situ XL-MS, which is currently not as developed as XL-MS applications on purified protein solutions or lysates. We believe that the findings of
this work will now allow for a much wider use of FA for in situ XL-MS experiments. METHODS CROSS-LINKING OF THE THREE-PROTEIN MIXTURE A mixture solution of three purified proteins was
prepared by reconstituting lyophilized protein powder in PBS (all reagents were purchased from Sigma unless noted otherwise). The proteins were bovine serum albumin (BSA), Ovotransferrin,
and α-Amylase with respective final molarity in the mixture of 10, 10, and 20 µM. Each cross-linking experiment occurred in 108 µL of solution comprising a total protein mass of 260 µg. In
most experiments we cross-linked with a formalin solution (37% FA and 10% methanol) from Sigma (product number F8775). We also tested formalin with the same composition from another brand
(DAEJUNG chemicals, Korea, product number 4044-4400). The formalin was incubated with the protein mixture at the desired FA concentration and the cross-linking reaction occurred at room
temperature under gentle agitation. The cross-linking incubation time was 20 min. The cross-linking reaction was quenched by addition of ammonium bicarbonate to a final concentration of 0.5
M for 10 min before proceeding to mass spectrometry preparation. The results of each experimental condition are an average of six mass spectrometry runs from three experimental replicates,
each with two technical replicates. CROSS-LINKING OF DIGEST FROM THE THREE-PROTEIN MIXTURE Peptide digest was prepared from the three-protein mixture by trypsin digestion as described in the
Mass spectrometry subsection below. The peptides were desalted on a SepPak C18 column (Waters), eluted, dried in a SpeedVac, and reconstituted in PBS. FA was added to a concentration of 2% and
the incubation time was either 20 min, 2 h, or 24 h. The solution was quenched by addition of ammonium bicarbonate to a final concentration of 0.5 M for 10 min. The peptides were desalted on
C18 stage tips and eluted for mass spectrometry analysis. The results of each incubation time are an average of two experimental replicates, each with two technical replicates. IN SITU
CROSS-LINKING OF PC9 CELLS Cells of the human lung cancer cell line PC9 (ECACC, catalog No. 90071810) were seeded in Dulbecco’s modified Eagle’s medium supplemented with 1×
penicillin–streptomycin (Gibco Invitrogen) and 10% fetal bovine serum (Biological Industries) at 37 °C under 5% CO2/95% air. The cells were grown to 80% confluency in 10-cm plates. The
growth media was removed and the cells washed three times with 3 ml of warm PBS buffer. We added to each plate 2 ml of PBS with FA at different concentrations: 1, 2, 3, 4.5, or 6%. The cells
were incubated with FA for 10 min at 37 °C, and then washed three times with cold PBS to remove the FA. We incubated the cells with hypertonic buffer (50 mM HEPES pH = 7.5, 500 mM NaCl, 0.5
mM EDTA, 0.0005% Tween20) for 15 min, and then scraped the cells from the plate. The cells were centrifuged at 4 °C and the supernatant was discarded. The cell pellet was resuspended for 15
min with hypotonic buffer (above buffer without NaCl), and then further lysed with sonication (5 s on, 25 s off, 5 times, 50% amplitude). The cell lysate was centrifuged at 4 °C and the
supernatant was collected. The lysate was processed by the filter-aided sample preparation protocol34 in order to remove the detergent and nucleic acids prior to the mass spectrometry
analysis. ENRICHMENT BY STRONG CATION EXCHANGE (SCX) CHROMATOGRAPHY We followed the SCX protocol by Klykov et al.27. Briefly, desalted peptide digest was dried in SpeedVac and reconstituted
in 50 μl of buffer A (20% Acetonitrile, formic acid titrated to pH of 3.0). Separation was performed with an Äkta Pure system on a 100 × 1.0 mm PolySULFOETHYL A SCX column (PolyLC, USA)
using a gradient of buffer B (20% Acetonitrile, 0.5 M NaCl, formic acid titrated to pH of 3.0) and 100 μl fractions. Fractions corresponding to NaCl concentrations of 100 mM and higher were
desalted and used for mass spectrometry analysis. MASS SPECTROMETRY The proteins were precipitated in acetone at −80 °C for 1 h followed by centrifugation at 10,000 × g. The pellet was
resuspended in 20 μl of 8 M urea with 10 mM DTT. After 30 min, iodoacetamide was added to a final concentration of 50 mM and the alkylation reaction proceeded for 30 min. The urea was
diluted by adding 200 μl of digestion buffer (25 mM TRIS pH = 8.0; 10% acetonitrile), trypsin (Promega) was added at a 1:100 protease-to-protein ratio, and the protein was digested overnight
at 37 °C under agitation. Following digestion, the peptides were desalted on C18 stage tips and eluted by 55% acetonitrile. The eluted peptides were dried in a SpeedVac, reconstituted in
0.1% formic acid, and measured in the mass spectrometer. The samples were analyzed by a 120 min 0–40% acetonitrile gradient on a liquid chromatography system coupled to a Q-Exactive Plus
mass spectrometer (Thermo). We were careful not to raise the temperature of the sample above 40 °C through all the preparation stages (alkylation, digestion, desalting, and in the analytical
column of the LC) in order not to break the FA cross-links. The RAW data files from the mass spectrometer were converted to MGF format by Proteome Discoverer (Thermo), which was the input
format for our analysis applications. The method parameters of the run were as follows: data-dependent acquisition; Full MS resolution 70,000; MS1 AGC target 1e6; MS1 Maximum IT 200 ms; Scan
range 450–1800; dd-MS/MS resolution 35,000; MS/MS AGC target 2e5; MS2 Maximum IT 300 ms; Loop count Top 12; Isolation window 1.1; Fixed first mass 130; MS2 Minimum AGC target 800; HCD
energy (NCE) 26; Charge exclusion: unassigned, 1, 2, 3, 8, >8; Peptide match: off; Exclude isotope: on; Dynamic exclusion 45 s. SCANNING FOR THE MASS OF THE CROSS-LINKING REACTION We modified our
analysis application, FindXL35, so that it ran multiple times, each time with a different cross-linker mass. We scanned all the integer masses from −30 to 50 Da. FindXL exhaustively
enumerates all the possible peptide pairs and compares them to the measured MS/MS events in search of matches that fulfill the criteria below. The search parameters were as follows: Sequence
database—the sequences of BSA, Ovotransferrin, and α-Amylase; Protease—trypsin, allowing up to three miscleavage sites; Fixed modification of cysteine by iodoacetamide; Variable modification
of methionine by oxidation; Cross-linking can occur on any residue type; Cross-linker is non-cleavable; MS/MS fragments to consider—b-ions and y-ions as well as b-ions and y-ions with the
additional mass of the second peptide and the cross-linker; MS1 tolerance – 6 ppm; MS2 tolerance – 8 ppm. A cross-link was identified as a match between an MS/MS event and a peptide pair if it
fulfilled four conditions: (1) The mass of the precursor ion is the same as the expected mass of the cross-linked peptide pair within the MS1 tolerance; (2) At least four MS/MS fragments
(within the MS2 tolerance) were identified on each peptide; (3) The fragmentation score of the cross-link (defined as the number of matching MS/MS fragments divided by the combined length of
the two peptides) is 0.6 or higher; (4) The peptides are neither overlapping nor consecutive in the protein sequence. The purpose of the fourth criterion is to count only cross-links that span
a long range on the primary structure. IDENTIFYING THE AMINO ACIDS INVOLVED IN THE 24 DA REACTION The identified cross-links from all the replicates involving 2 and 4% FA cross-linking were
pooled together for this analysis. For each cross-link, we analyzed the two peptides independently of each other. For each peptide, we computationally modified each residue in turn (by adding 12 Da to that
residue). We then determined which residue position was most compatible with the MS/MS fragmentation pattern (highest number of fragments that can be assigned to the modified peptide at 8 ppm
tolerance). The number of times each amino acid was found to be the most compatible was then normalized by dividing it by the total number of occurrences of that amino acid in all the
peptides (normalized count). IDENTIFYING LINEAR PEPTIDES WITH MODIFICATIONS The identification of modifications formed by FA on linear peptides was based only on matching the mass of the
precursor ion (i.e., MS1) to the theoretical mass of the peptide+modification. This approach was taken because of insufficient knowledge as to where these modifications occur, or how they
affect the MS/MS fragmentation. To make the identification more stringent, we set a very narrow tolerance of 1 ppm on the match between the measured and theoretical mass of the peptide plus
the modification. Of note, with such a narrow tolerance we did not find any ambiguous cases in which the measured mass could be assigned to more than one peptide. We ran the analysis eight
times, each time searching for a different modification: 0.0 (no modification), 12.0, 24.0, 36.0, 48.0, 60.0, 57.0215 (off-target alkylation), and 15.9949 (oxidation) Da. The estimate of the
relative abundance of each modification was calculated as the ratio between the number of identified peptides with that modification and the number of identified peptides without
modification (0.0 Da). Other search parameters were: Sequence database—the sequences of BSA, Ovotransferrin, and α-Amylase; Protease—trypsin, allowing up to three miscleavage sites; Fixed
modification of cysteine by iodoacetamide. Methionine oxidation was not considered. CROSS-LINK IDENTIFICATION IN A SMALL SET OF PROTEINS This analysis application exhaustively enumerates all
the possible peptide pairs, and compares them to the measured MS/MS events in search of matches that fulfill the criteria below. The search parameters were as follows: Sequence database – the
sequences of BSA, Ovotransferrin, and α-Amylase; Protease—trypsin, allowing up to three miscleavage sites; Fixed modification of cysteine by iodoacetamide; Variable modification of
methionine by oxidation; Cross-linking can occur on any residue type; Cross-linker is always cleaved; MS/MS fragments to consider: b-ions, y-ions, *b-ions (b-ions plus 12.0 Da), and *y-ions
(y-ions plus 12.0 Da); MS1 tolerance—6 ppm; MS2 tolerance—8 ppm; Cross-linker mass—one of three possible masses: 24.0, 25.00335, and 26.0067. The three cross-linker masses were considered in
turn in the calculation of the theoretical mass of the two cross-linked peptides. These masses address the incorrect reporting of the mono-isotopic mass (Supplementary Fig. 1). A cross-link
was identified as a match between a measured MS/MS event and a peptide pair if it fulfilled five conditions: (1) The mass of the precursor ion is within the MS1 tolerance of the theoretical
mass of the linked peptide pair (with either of the three possible cross-link masses); (2) At least four modified MS/MS fragments (*b and *y) were identified within the MS2 tolerance on
each peptide; (3) The fragmentation score of the cross-link (defined as the number of all matching MS/MS fragments divided by the combined length of the two peptides) is 1.0 or higher; (4)
The peptides are not overlapping in the protein sequence; (5) There is no other peptide pair or linear peptide that match the data with equal or better fragmentation score. Given the small
size of the sequence database, we estimated the false-detection rate in the following way. The analysis of data from the 4% FA experiment was repeated ten times with an erroneous
cross-linker mass of 61.0, 62.0, 63.0, … 70.0 Da. This led to fragmentation scores that were much lower than the scores obtained with the correct cross-linker mass. On average, 2 erroneous
cross-links had a fragmentation score above 1.0 in each decoy run, whereas runs with the correct cross-linker mass (24.0 Da) identified ∼60 cross-links above the 1.0 score. We therefore
estimate the false-detection rate to be 2 in 60 cross-links, or ∼3%.

CROSS-LINK IDENTIFICATION IN A LARGE SET OF PROTEINS

This application relies on the complete cleavage of the FA cross-links in order to assign an MS/MS fragmentation score to each peptide separately. This division allows for a practical run time of _O(n)_ with suitable preprocessing. The search parameters were as follows. Sequence database: the 1692 human proteins that were identified in the samples (runs on the full human proteome of 20,000 proteins are possible, but take up to 4 h); Protease: trypsin, allowing up to two miscleavage sites; Fixed modification of cysteine by iodoacetamide; Cross-linking can occur on any residue type; Cross-linker is always cleaved; MS/MS fragments to consider: b-ions, y-ions, *b-ions (b-ions plus 12.0 Da), and *y-ions (y-ions plus 12.0 Da); MS1 tolerance: 4.2 ppm; MS2 tolerance: 6.5 ppm; Cross-linker mass: one of five possible masses, 24.0, 25.00335, 26.0067, 12.0, or 13.00335 Da. All of these masses were considered in turn in the calculation of the theoretical mass of the two cross-linked peptides. The five masses address the incorrect reporting of the mono-isotopic mass (Supplementary Fig. 1), as well as the much less frequent 12 Da reaction. A cross-link was reported if it fulfilled four conditions: (1) The mass of the precursor ion is within the MS1 tolerance of the theoretical mass of the cross-linked peptide pair (with any of the five cross-link masses); (2) Each peptide had at least 19 MS/MS fragments (b, y, *b and *y) within the MS2 tolerance, OR its fragmentation score (defined as the number of matching MS/MS fragments divided by its length) was 1.8 or higher; (3) The peptides are not overlapping in the protein sequence; (4) There is no other peptide pair or linear peptide that matches the data
with an equal or better fragmentation score. To estimate the false-detection rate of the reported list of cross-links, we spiked the sequence database with a decoy set comprising some of the sequences in reverse. The proteins used for the decoys were chosen randomly, and their number is user defined. In the case of the PC9 lysate, the number of decoy sequences was set to 1/15 of the total number of sequences. We therefore estimate the number of false positives in the cross-link list to be 15 times the number of cross-links that include a reverse decoy peptide.
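
As an illustration only, the precursor-mass condition, the per-peptide acceptance condition, and the decoy-based estimate described above can be sketched in Python. This is not the authors' Java application: all function and constant names are hypothetical, the masses are assumed to be neutral monoisotopic masses in Da, and the counting of matching fragments (b, y, *b and *y within the MS2 tolerance) is assumed to have been done elsewhere. The numeric constants are the ones quoted above for the large protein set.

```python
# Illustrative sketch only; names are hypothetical and not taken from the published Java code.

# Cross-linker masses (Da) considered in turn for the large protein set.
CROSS_LINKER_MASSES = (24.0, 25.00335, 26.0067, 12.0, 13.00335)
MS1_TOLERANCE_PPM = 4.2
MIN_FRAGMENTS = 19    # condition (2), first alternative
MIN_SCORE = 1.8       # condition (2), second alternative
DECOY_SCALE = 15      # decoys were 1/15 of the database sequences (PC9 lysate)


def within_ppm(measured, theoretical, tolerance_ppm):
    """True if the measured mass lies within tolerance_ppm of the theoretical mass."""
    return abs(measured - theoretical) <= theoretical * tolerance_ppm * 1e-6


def precursor_matches(precursor_mass, peptide_a_mass, peptide_b_mass):
    """Condition (1): precursor mass equals the peptide pair plus any allowed linker mass.

    Assumes all three arguments are neutral monoisotopic masses in Da.
    """
    return any(
        within_ppm(precursor_mass,
                   peptide_a_mass + peptide_b_mass + linker,
                   MS1_TOLERANCE_PPM)
        for linker in CROSS_LINKER_MASSES
    )


def peptide_passes(matched_fragments, peptide_length):
    """Condition (2): enough matched fragments, OR a high enough fragmentation score."""
    score = matched_fragments / peptide_length
    return matched_fragments >= MIN_FRAGMENTS or score >= MIN_SCORE


def estimated_false_positives(decoy_hits):
    """Scale the number of cross-links containing a reverse decoy peptide by 15."""
    return decoy_hits * DECOY_SCALE
```

For example, `precursor_matches(2024.0, 1000.0, 1000.0)` is true through the 24.0 Da linker mass, and three decoy-containing cross-links would correspond to an estimated 45 false positives under this scaling. Conditions (3) and (4) need the protein sequences and the full list of competing candidates, so they are left out of the sketch.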
COMPUTATIONAL DOCKING

Docking was performed with PatchDock30. The cross-link was implemented as distance constraints that must be under 12 Å in accepted models. Homology models of βNAC and Plastin-2 were generated by HHpred36.

REPORTING SUMMARY

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

DATA AVAILABILITY
The mass spectrometry data have been deposited to the ProteomeXchange Consortium via the PRIDE37 partner repository with the dataset identifier PXD015435. Source data are provided with this
paper. All other data are available from the corresponding author on reasonable request. CODE AVAILABILITY A standalone analysis application for identification of formaldehyde cross-links
is available at https://github.com/Kalisman-Lab/Search_Formaldehyde_Cross-links. The underlying source code in Java is available at
https://github.com/Kalisman-Lab/Search_Formaldehyde_Cross-links_Source_Code.

REFERENCES

1. Karnovsky, M. J. A formaldehyde-glutaraldehyde fixative of high osmolality for use in electron microscopy. _J. Cell Biol._ 27, 137–138A (1965).
2. Carson, F. L., Martin, J. H. & Lynn, A. J. Formalin fixation for electron microscopy: a re-evaluation. _J. Clin. Pathol._ 59, 365–373 (1973).
3. Solomon, M. J. & Varshavsky, A. Formaldehyde-mediated DNA-protein crosslinking: a probe for in vivo chromatin structures. _Proc. Natl Acad. Sci. USA_ 82, 6470–6474 (1985).
4. Chang, Y. T. & Loew, G. H. Reaction mechanisms of formaldehyde with endocyclic imino groups of nucleic acid bases. _J. Am. Chem. Soc._ 116, 3548–3555 (1994).
5. Metz, B. et al. Identification of formaldehyde-induced modifications in proteins: reactions with model peptides. _J. Biol. Chem._ 279, 6235–6243 (2004).
6. Hoffman, E. A., Frey, B. L., Smith, L. M. & Auble, D. T. Formaldehyde crosslinking: a tool for the study of chromatin complexes. _J. Biol. Chem._ 290, 26404–26411 (2015).
7. Fraenkel-Conrat, H. & Olcott, H. S. The reaction of formaldehyde with proteins. V. Cross-linking between amino and primary amide or guanidyl groups. _J. Am. Chem. Soc._ 70, 2673–2684 (1948).
8. Feldman, M. Y. Reactions of nucleic acids and nucleoproteins with formaldehyde. _Prog. Nucleic Acid Res. Mol. Biol._ 13, 1–49 (1973).
9. Metz, B. et al. Identification of formaldehyde-induced modifications in proteins: reactions with insulin. _Bioconjugate Chem._ 17, 815–822 (2006).
10. Toews, J., Rogalski, J. C., Clark, T. J. & Kast, J. Mass spectrometric identification of formaldehyde-induced peptide modifications under in vivo protein cross-linking conditions. _Anal. Chim. Acta_ 618, 168–183 (2008).
11. Wang, Z. J. et al. Chemical modifications of peptides and proteins with low concentration formaldehyde studied by mass spectrometry. _Chin. J. Anal. Chem._ 44, 1193–1199 (2016).
12. Leitner, A., Faini, M., Stengel, F. & Aebersold, R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. _Trends Biochem Sci._ 41, 20–32 (2016).
13. Schneider, M., Belsom, A. & Rappsilber, J. Protein tertiary structure by crosslinking/mass spectrometry. _Trends Biochem Sci._ 43, 157–169 (2018).
14. Sinz, A. Cross-linking/mass spectrometry for studying protein structures and protein-protein interactions: where are we now and where should we go from here? _Angew. Chem. Int Ed. Engl._ 57, 6390–6396 (2018).
15. Herzog, F. et al. Structural probing of a protein phosphatase 2A network by chemical cross-linking and mass spectrometry. _Science_ 337, 1348–1352 (2012).
16. Rappsilber, J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. _J. Struct. Biol._ 173, 530–540 (2011).
17. Weisbrod, C. R. et al. In vivo protein interaction network identified with a novel real-time cross-linked peptide identification strategy. _J. Proteome Res._ 12, 1569–1579 (2013).
18. Kaake, R. M. et al. A new in vivo cross-linking mass spectrometry platform to define protein-protein interactions in living cells. _Mol. Cell Proteom._ 13, 3533–3543 (2014).
19. Chavez, J. D. et al. Chemical crosslinking mass spectrometry analysis of protein conformations and supercomplexes in heart tissue. _Cell Syst._ 6, 136–141 (2018).
20. Fasci, D., van Ingen, H., Scheltema, R. A. & Heck, A. J. R. Histone interaction landscapes visualized by crosslinking mass spectrometry in intact cell nuclei. _Mol. Cell Proteom._ 17, 2018–2033 (2018).
21. Robinson, P. J. et al. Structure of a complete mediator-RNA polymerase II pre-initiation complex. _Cell_ 166, 1411–1422 (2016).
22. Wang, X. et al. The proteasome-interacting Ecm29 protein disassembles the 26 s proteasome in response to oxidative stress. _J. Biol. Chem._ 292, 16310–16320 (2017).
23. Lenz, S., Giese, S. H., Fischer, L. & Rappsilber, J. In-search assignment of monoisotopic peaks improves the identification of cross-linked peptides. _J. Proteome Res._ 17, 3923–3931 (2018).
24. Götze, M., Iacobucci, C., Ihling, C. H. & Sinz, A. A simple cross-linking/mass spectrometry workflow for studying system-wide protein interactions. _Anal. Chem._ 91, 10236–10244 (2019).
25. Sinz, A. Divide and conquer: cleavable cross-linkers to study protein conformation and protein-protein interactions. _Anal. Bioanal. Chem._ 409, 33–44 (2017).
26. Iacobucci, C. et al. A cross-linking/mass spectrometry workflow based on MS-cleavable cross-linkers and the MeroX software for studying protein structures and protein-protein interactions. _Nat. Protoc._ 13, 2864–2889 (2018).
27. Klykov, O. et al. Efficient and robust proteome-wide approaches for cross-linking mass spectrometry. _Nat. Protoc._ 13, 2964–2990 (2018).
28. Geiger, T., Wehner, A., Schaab, C., Cox, J. & Mann, M. Comparative proteomic analysis of eleven common cell lines reveals ubiquitous but varying expression of most proteins. _Mol. Cell Proteom._ 11, M111.014050 (2012).
29. Pech, M., Spreter, T., Beckmann, R. & Beatrix, B. Dual binding mode of the nascent polypeptide-associated complex reveals a novel universal adapter site on the ribosome. _J. Biol. Chem._ 285, 19679–19687 (2010).
30. Schneidman-Duhovny, D., Inbar, Y., Nussinov, R. & Wolfson, H. J. PatchDock and SymmDock: servers for rigid and symmetric docking. _Nucleic Acids Res._ 33, W363–W367 (2005).
31. Galkin, V. E., Orlova, A., Cherepanova, O., Lebart, M. C. & Egelman, E. H. High-resolution cryo-EM structure of the F-actin-fimbrin/plastin ABD2 complex. _Proc. Natl Acad. Sci. USA_ 105, 1494–1498 (2008).
32. Iwamoto, D. V. et al. Structural basis of the filamin A actin-binding domain interaction with F-actin. _Nat. Struct. Mol. Biol._ 25, 918–927 (2019).
33. Layer, R. W. The chemistry of imines. _Chem. Rev._ 63, 489–510 (1965).
34. Wiśniewski, J. R., Zougman, A., Nagaraj, N. & Mann, M. Universal sample preparation method for proteome analysis. _Nat. Methods_ 6, 359–362 (2009).
35. Kalisman, N., Adams, C. M. & Levitt, M. Subunit order of eukaryotic TRiC/CCT chaperonin by cross-linking, mass spectrometry, and combinatorial homology modeling. _Proc. Natl Acad. Sci. USA_ 109, 2884–2889 (2012).
36. Söding, J., Biegert, A. & Lupas, A. N. The HHpred interactive server for protein homology detection and structure prediction. _Nucleic Acids Res._ 33, W244–W248 (2005).
37. Perez-Riverol, Y. et al. The PRIDE database and related tools and resources in 2019: improving support for quantification data. _Nucleic Acids Res._ 47, D442–D450 (2019).
38. Kurokawa, H. L., Mikami, B. & Hirose, M. Crystal structure of diferric hen ovotransferrin at 2.4 Å resolution. _J. Mol. Biol._ 254, 196–207 (1995).
39. Natchiar, S. K., Myasnikov, A. G., Kratzat, H., Hazemann, I. & Klaholz, B. P. Visualization of chemical modifications in the human 80 s ribosome structure. _Nature_ 551, 472–477 (2017).
40. Beatrix, B., Sakai, H. & Wiedmann, M. The α and β subunit of the nascent polypeptide-associated complex have distinct functions. _J. Biol. Chem._ 275, 37838–37845 (2000).
41. Kobayashi, R., Kubota, T. & Hidaka, H. Purification, characterization, and partial sequence analysis of a new 25-kDa actin binding protein from bovine aorta: a SM22 homolog. _Biochem. Biophys. Res. Commun._ 198, 1275–1280 (1994).
42. Welch, M. D., Iwamatsu, A. & Mitchison, T. J. Actin polymerization is induced by Arp2/3 protein complex at the surface of Listeria monocytogenes. _Nature_ 385, 265–269 (1997).
43. Janji, B. et al. Phosphorylation on Ser5 increases the F-actin-binding activity of L-plastin and promotes its targeting to sites of actin assembly in cells. _J. Cell Sci._ 119, 1947–1960 (2006).
44. Huang, L., Wong, T. Y., Lin, R. C. & Furthmayr, H. Replacement of threonine 558, a critical site of phosphorylation of moesin in vivo, with aspartate activates F-actin binding of moesin. Regulation by conformational change. _J. Biol. Chem._ 274, 12803–12810 (1999).
45. Safer, D., Elzinga, M. & Nachmias, V. T. Thymosin beta 4 and Fx, an actin-sequestering peptide, are indistinguishable. _J. Biol. Chem._ 266, 4029–4032 (1991).
46. Hein, M. Y. et al. A human interactome in three quantitative dimensions organized by stoichiometries and abundances. _Cell_ 163, 712–723 (2015).
47. Soh, Y. M. et al. Molecular basis for SMC rod formation and its dissolution upon DNA binding. _Mol. Cell_ 57, 290–303 (2015).
48. Koegler, E. et al. p28, a novel ERGIC/cis Golgi protein, required for Golgi ribbon formation. _Traffic_ 11, 70–89 (2010).

ACKNOWLEDGEMENTS

This work was supported by the Israel Science Foundation grant number 1768/15. M.S. was supported by the FSHD
Global Foundation grant number 41. We thank Uri Raviv for his help and advice in various stages of this work. We thank David Morgenstern and Dina Schneidman for critical reading of the
manuscript. AUTHOR INFORMATION Author notes * These authors contributed equally: Tamar Tayri-Wilk, Moriya Slavin, Joanna Zamel. AUTHORS AND AFFILIATIONS * Institute of Life Sciences, The
Hebrew University of Jerusalem, Jerusalem, 9190401, Israel Tamar Tayri-Wilk, Moriya Slavin, Joanna Zamel, Ayelet Blass, Shon Cohen, Alex Motzik, Xue Sun, Oren Ram & Nir Kalisman *
Institute of Chemistry, The Hebrew University of Jerusalem, Jerusalem, 9190401, Israel Tamar Tayri-Wilk * Wolfson Centre for Applied Structural Biology, The Hebrew University of Jerusalem,
Jerusalem, 9190401, Israel Deborah E. Shalev * Department of Pharmaceutical Engineering, Azrieli College of Engineering, Jerusalem, Israel Deborah E. Shalev

CONTRIBUTIONS

Conceptualization, N.K.; Methodology, T.T-W., M.S., and N.K.; Investigation, T.T-W., M.S., and J.Z.; Software, A.B., S.C., and N.K.; Resources,
A.M., X.S., D.E.S., and O.R.; Writing—original draft, T.T-W., and N.K.; Writing—review & editing, D.E.S., O.R., and N.K.; Visualization, T.T-W., M.S., D.E.S., and N.K.; Supervision, O.R.
and N.K.; Funding acquisition, N.K. CORRESPONDING AUTHOR Correspondence to Nir Kalisman. ETHICS DECLARATIONS COMPETING INTERESTS The authors declare no competing interests. ADDITIONAL
INFORMATION PEER REVIEW INFORMATION _Nature Communications_ thanks Michael Trnka, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer
reports are available.

PUBLISHER’S NOTE

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

SUPPLEMENTARY INFORMATION

Supplementary Information; Peer Review File; Description of Additional Supplementary Files; Supplementary Datasets 1–5; Reporting Summary; Source Data.

RIGHTS AND PERMISSIONS

OPEN ACCESS This article is licensed under a Creative Commons Attribution 4.0 International License,
which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link
to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless
indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or
exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

ABOUT THIS ARTICLE

CITE THIS ARTICLE

Tayri-Wilk, T., Slavin, M., Zamel, J. _et al._ Mass spectrometry reveals the chemistry of formaldehyde cross-linking in structured proteins. _Nat Commun_ 11, 3128 (2020). https://doi.org/10.1038/s41467-020-16935-w

* Received: 10 March 2020
* Accepted: 02 June 2020
* Published: 19 June 2020