Structural and biochemical analyses of selectivity determinants in chimeric Streptococcus Class A sortase enzymes
Melody Gao and D. Alex Johnson contributed equally to this work.
Funding information: Research Corporation for Scientific Advancement, Grant/Award Number: Cottrell Scholar Award; National Science Foundation, Grant/Award Numbers: CHE-MRI-1429164, CHE-CAREER-2044958; U.S. Department of Energy, Grant/Award Number: DE-AC02-05CH11231
Abstract
Sequence variation in related proteins is an important characteristic that modulates activity and selectivity. An example of a protein family with a large degree of sequence variation is that of bacterial sortases, which are cysteine transpeptidases on the surface of gram-positive bacteria. Class A sortases are responsible for attachment of diverse proteins to the cell wall to facilitate environmental adaptation and interaction. These enzymes are also used in protein engineering applications for sortase-mediated ligations (SML) or sortagging of protein targets. We previously investigated SrtA from Streptococcus pneumoniae, identifying a number of putative β7–β8 loop-mediated interactions that affected in vitro enzyme function. We identified residues that contributed to the ability of S. pneumoniae SrtA to recognize several amino acids at the P1′ position of the substrate motif (the final Gly of LPXTG), in contrast to the strict P1′ Gly recognition of SrtA from Staphylococcus aureus. Motivated by the lack of a structural model for the active, monomeric form of S. pneumoniae SrtA, we have now expanded our studies to other Streptococcus SrtA proteins. We solved the first monomeric structure of S. agalactiae SrtA, which includes the C-terminus, and three others of β7–β8 loop chimeras from S. pyogenes and S. agalactiae SrtA. These structures and accompanying biochemical data support our previously identified β7–β8 loop-mediated interactions and provide additional insight into their role in Class A sortase substrate selectivity. A greater understanding of individual SrtA sequence and structural determinants of target selectivity may also facilitate the design or discovery of improved sortagging tools.
1 INTRODUCTION
Class A sortases are enzymes located on the surface of gram-positive bacteria that attach proteins to the cell wall. Sortase-mediated protein display allows bacteria to interact with their environments, for example, with proteins for bacterial adhesion and/or acquisition of nutrients, and can include pathogenic factors that enable the bacteria to infect host organisms.1, 2 The catalytic mechanism of sortases involves the recognition and cleavage of a specific sequence, followed by ligation of an incoming amine nucleophile.1-3 This reactivity has also been harnessed for protein engineering applications, and sortases have emerged as powerful tools for the post-translational derivatization of protein targets with various non-native modifications.3 The traditional recognition motif of Class A sortase (or SrtA) proteins, which is found within the cell wall sorting signal (CWSS) of gram-positive bacteria, is the sequence LPXTG (where X = any amino acid, and L = P4, P = P3, X = P2, T = P1, and G = P1′). This sequence is recognized by all Class A sortases investigated to date; however, other recognition sequences have been identified or engineered for several SrtA proteins in the last decade, greatly increasing the potential for sortase-mediated ligation (SML), or sortagging, applications.4-12
Despite a relatively large degree of sequence variation amongst the hundreds of identified SrtA proteins in bacteria, these cysteine transpeptidases contain a conserved catalytic triad, consisting of His, Cys, and Arg residues.1, 2, 13 The best-studied Class A sortase is that from Staphylococcus aureus (saSrtA), which continues to see frequent use in sortagging applications.3 As of 2019, there were approximately 10 known structures of Class A sortases, with several being of saSrtA.13 Overall, the sortase fold, consisting of a closed eight-stranded β-barrel architecture, is conserved in all structures of SrtA proteins solved to date; however, there are variations consistent with the degree of sequence differences.2 For example, between saSrtA and Streptococcus pyogenes SrtA (spySrtA), there are a number of unique structural characteristics that affect enzyme function (Figure 1). Specifically, saSrtA requires a Ca²⁺ cofactor and its β7–β8 loop near the active site contains an additional five residues and a Trp (W194) which dramatically affects activity (Figure 1a).12, 14 All structural comparisons with saSrtA will use the peptidomimetic-bound structure (PDB ID 2KID) as this is the only one to our knowledge of saSrtA in the active state.15-19 Previous work shows that allosteric activation, driven by Ca²⁺ binding, affects several structural features near the active site, including the relative conformation and/or location of the β6–β7 and β7–β8 loops.15-20 In the case of spySrtA, a partially helical C-terminal extension of 24 residues is evident in the reported crystal structure that is absent in saSrtA (Figure 1). A detailed description of how each of these features determines target recognition and selectivity remains incomplete, particularly for the Streptococcus SrtA proteins. Unique features of B. anthracis SrtA (baSrtA) have also been previously described, for example, regulation of enzymatic activity by an N-terminal appendage as well as a disordered-to-ordered transition in the β7–β8 loop upon ligand binding.21

A number of protein families use specificity-determining loops to encode differing target selectivity amongst members. Classic examples include kinases and serine proteases.22-28 Specific regions of the activation loop of kinases contribute to substrate specificity by directly interacting with amino acids adjacent to the phosphorylation site.25, 28 In serine proteases, substitution of two conserved surface loops (nine residues total) efficiently converts selectivity of trypsin to that of the related enzyme chymotrypsin.22, 23 There are also examples in scaffolding domains, including SH2 and SH3 domains, where conserved loops interact directly with the peptide and determine the selectivity of both SH2 (the EF and BG loops) and SH3 (the RT and n-Src loops) domains.29-37 Work from ourselves and others strongly indicates that Class A sortases are another protein family that exhibits functionally relevant sequence variation in specificity-determining loops.12, 38, 39
In our previous work, we investigated the selectivity determinants of Streptococcus pneumoniae SrtA (spSrtA) at the P1′ position of the CWSS.12 We found that the sequence of the β7–β8 loop dramatically affects enzyme activity and selectivity at this substrate position.12 Because spSrtA crystallizes as a domain-swapped dimer, which is enzymatically inactive in our hands, we used previously published Class A sortase structures to investigate the stereochemistry of our biochemical results.12, 40, 41 Now, we investigate two additional sortases, those from S. pyogenes (spySrtA) and S. agalactiae (sagSrtA), to see if the β7–β8 loop has broad effects on enzyme function and target recognition for Streptococcus Class A sortases.
We find that the β7–β8 loop affects spySrtA and sagSrtA in a manner consistent with that of S. pneumoniae SrtA. To investigate how the β7–β8 loop sequence affects each protein, we created a series of chimeric enzymes, swapping the loop sequences from several of those previously studied.11, 12 As seen previously, while some loop sequences hinder enzyme activity in our FRET-based assay, others improve target substrate cleavage, which is the presumed rate-limiting step of the sortase-catalyzed transpeptidation reaction.42 Here, we also use X-ray crystallography to look at the stereochemistry of both spySrtA and sagSrtA β7–β8 chimeric proteins. Finally, we use mutagenesis, structural, and sequence analyses to investigate conserved characteristics in the β7–β8 loops from Streptococcus SrtA proteins. Taken together, these analyses provide new insights on the role of conserved loops near the active site of Streptococcus Class A sortases.
2 RESULTS
2.1 Enzyme assays of wild-type S. pyogenes and S. agalactiae SrtA proteins
Based on our results using spSrtA, we designed a number of β7–β8 loop chimeras using spySrtA and sagSrtA as the “scaffolds.” The wild-type sequences used were spySrtA82-249 (PDB ID 3FN5), sagSrtA79-238 (the sequence crystallized previously, in PDB ID 3RCC), or sagSrtA79-247, which includes the final nine C-terminal residues of sagSrtA based on UniProt ID SRTA_STRA3.43-45 For simplicity, we will refer to these as: spySrtA, sagSrtA238, and sagSrtA247. The β7–β8 loop sequences of these wild-type constructs were as follows: spySrtA (sequence: CTDIEATER; the catalytic Cys and Arg, the first and last residues shown, are included as reference points for the loop boundaries) and sagSrtA (CTDPEATER). Notably, the β7–β8 loops of spySrtA and sagSrtA differ at only one position, three residues C-terminal to the catalytic Cys, which we will refer to as β7–β8+3. The β7–β8+3 residue is Ile in spySrtA and Pro in sagSrtA. The wild-type sequences of spySrtA and sagSrtA247 are overall 65% identical (Figure 2a), which is consistent with relative sequence identities amongst other representative Streptococcus Class A sortases (Figure S1).
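The loop-position naming used above can be illustrated with a short sketch. It is a minimal Python example, not part of the original analysis: the function name `loop_differences` is hypothetical, and the only data used are the two wild-type loop sequences quoted in the text. Offsets are counted from the catalytic Cys, so offset 3 corresponds to β7–β8+3.

```python
# Wild-type loop sequences quoted in the text; the first residue is the
# catalytic Cys and the last residue is the catalytic Arg.
SPY_LOOP = "CTDIEATER"
SAG_LOOP = "CTDPEATER"


def loop_differences(loop_a, loop_b):
    """Return (offset, residue_a, residue_b) for every mismatched position.

    Offsets count from the catalytic Cys (offset 0), so offset 3 is the
    position written as beta7-beta8+3 in the text. (Hypothetical helper.)
    """
    if len(loop_a) != len(loop_b):
        raise ValueError("loops must have equal length for a positional comparison")
    return [(i, a, b) for i, (a, b) in enumerate(zip(loop_a, loop_b)) if a != b]


differences = loop_differences(SPY_LOOP, SAG_LOOP)
print(differences)  # [(3, 'I', 'P')]
```

The single mismatch at offset 3 (Ile in spySrtA, Pro in sagSrtA) matches the one-position difference described in the text.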

In order to assess relative activity and selectivity, we used a FRET-based enzyme assay involving synthetic peptide substrates. This assay utilizes well-established FRET quencher probes consisting of a substrate sequence with an N-terminal 2-aminobenzoyl fluorophore (Abz) and C-terminal 2,4-dinitrophenyl (Dnp) quencher.12, 14, 46, 47 For all assays, fluorescence was monitored for 2 h at room temperature and analyzed relative to a benchmark reaction consisting of wild-type saSrtA and the Abz-LPATGG-K(Dnp) peptide.12 For simplicity, we will remove the “Abz-” and “G-K(Dnp)” from peptide names hereafter, as they are the same for all substrates (e.g., Abz-LPATGG-K(Dnp) will be referred to as LPATG). Additional experimental details are provided in the Materials and Methods, and all averaged assay data and standard deviation values are in Table S1. All sortase enzymes were expressed and purified as previously described and as in the Materials and Methods.12 Purity was assessed by SDS-PAGE and monomeric protein fractions were pooled following size exclusion chromatography, as previously described.12
With the necessary materials in hand, we first evaluated the reactivity of wild-type spySrtA, sagSrtA238, and sagSrtA247 proteins with peptides differing only at the P1′ position (the final residue of each abbreviated name): LPATA, LPATG, and LPATS (Figure 2b). We included LPATA and LPATS sequences because alanine and serine are well-recognized alternative nucleophiles for S. pyogenes SrtA.9, 43, 46, 48-51 CW-PRED, a computational tool for predicting cell wall-anchored proteins of gram-positive bacteria, also identifies two P1′ serine-containing sequences in three strains of S. agalactiae (2603, A909, and NEM316).52 Our data show that spySrtA exhibited robust reactivity that was comparable to the benchmark saSrtA/LPATG reaction. This protein also exhibited comparable reactivity at the 2 h reaction endpoint with G-, S-, and A-containing peptides (Figure 2b). Activity for spySrtA was also markedly higher than spSrtA, which is consistent with a loop interaction described in our previous work.12 Specifically, the β6−2 position in spSrtA is R184, which was found to have a negative impact on reactivity that was attributed to a putative interaction with the β7–β8−1 Glu of this enzyme.12 In spySrtA, the corresponding β6−2 position is T185, which likely minimizes this interaction and increases reactivity. Indeed, the spySrtA structure does not show evidence for this type of interaction (Figure S2a). Turning to the S. agalactiae constructs, sagSrtA238 was catalytically inactive for all peptides tested, while sagSrtA247 reacted with all three, albeit at a lower level than spySrtA (Figure 2b). SagSrtA247 also exhibited a preference for LPATA, and we observed 50% and 72% reductions in the relative activities for the G- and S-containing peptides as compared to LPATA, respectively. This is in contrast to spSrtA, which displayed almost identical relative fluorescence values after 2 h for G-, S-, and A-containing peptides (0.29 ± 0.04, 0.26 ± 0.02, and 0.26 ± 0.01, respectively) (Figure 2b).12
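Because the text reports some comparisons as percent reductions and others as fold changes, a small sketch of the arithmetic may help. This is an illustrative Python example under stated assumptions: the helper names `percent_reduction` and `reduction_to_fold` are hypothetical, the spSrtA values are the mean endpoint values quoted above (standard deviations are ignored), and the 50% and 72% figures are the reported sagSrtA247 reductions relative to LPATA.

```python
def percent_reduction(reference, value):
    """Percent decrease of `value` relative to `reference` (hypothetical helper)."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return 100.0 * (reference - value) / reference


def reduction_to_fold(percent):
    """Convert a percent reduction into the equivalent fold decrease."""
    if not 0 <= percent < 100:
        raise ValueError("percent must be in the range [0, 100)")
    return 100.0 / (100.0 - percent)


# Mean relative fluorescence values for spSrtA at 2 h, as quoted in the text.
sp_srta = {"LPATG": 0.29, "LPATS": 0.26, "LPATA": 0.26}
sp_vs_lpata = {
    peptide: percent_reduction(sp_srta["LPATA"], value)
    for peptide, value in sp_srta.items()
}

# Reported sagSrtA247 reductions relative to LPATA, expressed as fold decreases.
fold_g = reduction_to_fold(50.0)  # 50% reduction is a 2-fold decrease
fold_s = reduction_to_fold(72.0)  # 72% reduction is roughly a 3.6-fold decrease
```

On this scale, spSrtA shows no reduction for LPATS relative to LPATA and a slightly negative value (that is, marginally higher signal) for LPATG, consistent with the near-identical values described above.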
2.2 The β7–β8 loop of Streptococcus SrtA proteins broadly affects enzyme activity and selectivity
Based on our previous results with spSrtA, we next wanted to substitute β7–β8 loop sequences from other SrtA proteins into spySrtA and sagSrtA247.12 We chose to substitute the SrtA β7–β8 loop sequences from: S. aureus (CDDYNEKTGVWEKR), E. faecalis (CGDLQATTR), L. monocytogenes (CDKPTETTKR), and S. pneumoniae (CEDLAATER). These sequences were chosen due to their variable effects on spSrtA.12 For example, spSrtAaureus (subscript denotes origin of the β7–β8 loop sequence) was relatively active and selective for a P1′ Gly residue, spSrtAfaecalis was relatively active and non-selective at P1′, and spSrtAmonocytogenes was inactive.12
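The chimera design described above amounts to replacing the scaffold's native loop, bounded by the catalytic Cys and Arg, with a donor loop. The following Python sketch illustrates that idea only. The function names and the `toy_scaffold` string are hypothetical: the flanking residues are placeholders rather than real sortase sequence, and the actual constructs were made by standard molecular cloning as described in the Materials and Methods.

```python
# Donor beta7-beta8 loop sequences quoted in the text (catalytic Cys ... Arg).
LOOPS = {
    "S. aureus": "CDDYNEKTGVWEKR",
    "E. faecalis": "CGDLQATTR",
    "L. monocytogenes": "CDKPTETTKR",
    "S. pneumoniae": "CEDLAATER",
}


def interior_length(loop):
    """Number of loop residues between the catalytic Cys and Arg."""
    if not (loop.startswith("C") and loop.endswith("R")):
        raise ValueError("loop must start with Cys and end with Arg")
    return len(loop) - 2


def swap_loop(scaffold, native_loop, donor_loop):
    """Replace the single occurrence of native_loop in scaffold with donor_loop."""
    if scaffold.count(native_loop) != 1:
        raise ValueError("native loop must occur exactly once in the scaffold")
    return scaffold.replace(native_loop, donor_loop)


# Hypothetical toy scaffold: the 'G' flanks are placeholders, not sortase sequence.
toy_scaffold = "GGGG" + "CTDIEATER" + "GGGG"
chimera = swap_loop(toy_scaffold, "CTDIEATER", LOOPS["E. faecalis"])
lengths = {name: interior_length(seq) for name, seq in LOOPS.items()}
print(chimera)  # GGGGCGDLQATTRGGGG
print(lengths)
```

Counting residues this way gives 12 interior residues for the S. aureus loop, 8 for L. monocytogenes, and 7 for the E. faecalis and S. pneumoniae loops, so the swaps change loop length as well as composition.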
In total we tested eight additional variants: spySrtAaureus, spySrtAfaecalis, spySrtAmonocytogenes, spySrtApneumoniae, sagSrtAaureus, sagSrtAfaecalis, sagSrtAmonocytogenes, and sagSrtApneumoniae. All proteins were expressed, purified, and characterized as described previously and in the Materials and Methods.12 In general, we saw similar trends to those seen with spSrtA (Figure 3).12 For example, in the case of the S. aureus loop swaps we observed that both spySrtAaureus and sagSrtAaureus were selective for LPATG, as seen with spSrtA (Figure 3). All constructs containing the L. monocytogenes loop also showed a preference for LPATG, along with a clear reduction (2–3 fold) in activity as compared to the wild-type enzyme.

For the E. faecalis loop swaps, both the sagSrtAfaecalis and spySrtAfaecalis variants exhibited good reactivity that was generally higher than the corresponding wild-type enzyme. In contrast to spSrtA, however, the observed reactivity changes were not uniform across G-, S-, and A-containing peptides. Specifically, for sagSrtAfaecalis the relative activities for the LPATG and LPATS peptides were increased ~2.5-fold and 2.6-fold, respectively, as compared to wild-type, whereas for LPATA, it was only increased 1.2-fold (Figure 3b). In the case of spySrtAfaecalis, analysis at the 2 h reaction timepoint initially suggested activity comparable to the wild-type enzyme and no preference for G-, S-, and A-containing peptides (Figure 3a). However, differences between spySrtA and spySrtAfaecalis were evident at earlier reaction time points (Figure S3). In particular, at early stages in the reaction (e.g., 10 min) the E. faecalis β7–β8 loop in spySrtAfaecalis appeared to significantly increase reactivity with the G- and S-containing peptides (>2-fold relative to spySrtA), and may have also had an effect on LPATA, but it is inconclusive due to large error bars for early time points in this reaction (Figure S3d).
Finally, installation of the S. pneumoniae loop resulted in decreases in activity for most enzyme-substrate combinations. For spySrtApneumoniae, activity was ~20% lower than spySrtA for all peptides tested. In sagSrtApneumoniae, the activity of the protein for the G- and S-containing peptides was similar to sagSrtA247, but was reduced ~2.6-fold for LPATA (Figure 3b). Focusing on spSrtA and spySrtA, comparison of the β7–β8 loop sequences revealed that while the final three positions of the seven-residue loop are identical (ATE), there are three differences in the first four positions (EDLA for spSrtA versus TDIE for spySrtA, which differ at the first, third, and fourth positions). Based on our previous work, we attribute the lower relative activities for the G-, S-, and A-containing peptides in spySrtApneumoniae to the β7–β8+1 Glu.12 Specifically, we predicted this may be due to an interaction with the β6−2 R184 residue in spSrtA, which was supported by the observation that E208A and E208G spSrtA mutants, both at the β7–β8+1 position, each revealed ≥2-fold increases in relative reaction progress for all three peptides.12 Therefore, we modeled the T209E substitution in silico in the wild-type spySrtA structure to probe this hypothesis, and indeed saw that different rotamers of the mutated Glu are within distances consistent with forming a non-covalent interaction with the β6−2 T185 from the spySrtA scaffold (Figure S2b). Taken together, we consider this interaction to be a likely cause for the reduced activity in spySrtApneumoniae.
2.3 Structure determination and analysis of wild-type sagSrtA247 protein
Our rationale for choosing to investigate the role of β7–β8 loop residues in S. pyogenes and S. agalactiae was to develop a structural model for probing these selectivity determinants. Both spySrtA and sagSrtA238 were previously crystallized and we reasoned that experimental structural data would enable an understanding of the stereochemistry of sortase-substrate interactions in a way not available to spSrtA, which crystallizes as a catalytically inactive domain-swapped dimer.40, 41, 43, 45
Beginning with sagSrtA, an important consideration for the reported sagSrtA238 structure was that we found this variant to be inactive in our enzyme assays (Figure 2b). The crystal structure of sagSrtA238 shows a dodecameric protein composed of two hexameric rings (Figure S4a). To our knowledge, there is no physiological relevance to the dodecameric assembly and this may be a purification and/or crystallization artifact. The asymmetric unit contains one-and-a-half of these units, or 18 protomers total, with the other half of the second dodecamer present in a molecule related by symmetry. Each protomer is bound to three zinc ions, with additional ions modeled in the overall structure. There is no known biological requirement for higher-order oligomers in sagSrtA activity or for zinc binding, and presumably the presence of these ions is due to the crystallization conditions, for example, zinc acetate or zinc sulfate.44 This previous work was focused on comparing the structures of sagSrtA238 with the S. agalactiae Class C1 sortase and did not include enzyme activity data.45 Of particular interest to us, however, is that in all 18 protomers of the sagSrtA238 asymmetric unit, there are unresolved residues in either the β4–β5 loop, β7–β8 loop, or both (Figure S4b). Based on our enzyme assay data and previous structural analyses, we therefore sought to crystallize and determine a structure of sagSrtA247 that would display relevant loop residues in an enzymatically active protein construct.12
Crystallization conditions for sagSrtA247 were identified using the Hampton PEGRX screen and optimized, as described in the Materials and Methods. We ultimately solved the structure of sagSrtA247 to Rwork/Rfree = 0.186/0.207 at 1.4 Å resolution, which includes residues Q82-L247 (Figure 4). All diffraction and refinement statistics are in Table 1.

| | sagSrtA247 | ΔN188 sagSrtAaureus | spySrtAfaecalis | spySrtAmonocytogenes |
|---|---|---|---|---|
| Data collection | | | | |
| Space group | P 2₁2₁2₁ (19) | I 2 (5) | P 2₁2₁2₁ (19) | P 2₁2₁2₁ (19) |
| Unit cell dimensions | | | | |
| a, b, c (Å) | 44.13, 54.64, 60.25 | 71.30, 33.52, 115.6 | 33.60, 56.92, 71.35 | 34.29, 58.14, 73.27 |
| α, β, γ (°) | 90, 90, 90 | 90, 91.8, 90 | 90, 90, 90 | 90, 90, 90 |
| Resolution^a (Å) | 40.5–1.4 (1.5–1.4) | 35.6–1.8 (1.9–1.8) | 44.5–1.7 (1.8–1.7) | 45.5–1.6 (1.7–1.6) |
| Rsym^b (%) | 5.3 (69.4) | 9.6 (111.4) | 7.1 (190.9) | 6.7 (44.2) |
| I/σI^c | 29.08 (3.78) | 11.60 (1.53) | 22.91 (1.5) | 16.36 (3.02) |
| Completeness (%) | 99.6 (98.6) | 99.3 (98.0) | 99.7 (98.0) | 98.5 (99.6) |
| Refinement | | | | |
| Total # of reflections | 29,340 | 25,788 | 15,645 | 19,635 |
| Reflections in the test set | 1,463 | 1,254 | 769 | 960 |
| Rwork^d/Rfree^e (%) | 18.5/20.7 | 18.5/24.9 | 18.0/21.7 | 26.3/30.1 |
| Number of atoms | | | | |
| Protein | 1,277 | 2,567 | 1,245 | 1,241 |
| Water | 206 | 213 | 107 | 228 |
| Ramachandran plot^f (%) | 99.38/0.62/0 | 97.23/2.77/0 | 99.37/0.63/0 | 97.39/1.96/0.65 |
| B_av (Å²) | | | | |
| Protein | 15.70 | 17.89 | 24.85 | 15.24 |
| Bond length RMSD (Å) | 0.010 | 0.007 | 0.007 | 0.006 |
| Bond angle RMSD (°) | 1.102 | 0.943 | 0.949 | 0.925 |
| PDB code | 7S56 | 7S54 | 7S57 | 7S53 |

- ^a Values in parentheses are for data in the highest-resolution shell.
- ^b $R_{\mathrm{sym}} = \sum_h \sum_i |I(h) - I_i(h)| / \sum_h \sum_i I_i(h)$, where $I_i(h)$ and $I(h)$ are the $i$th and mean measurements of the intensity of reflection $h$.
- ^c SigAno $= |F(+) - F(-)|/\sigma$.
- ^d $R_{\mathrm{work}} = \sum_h \big|\, |F_{\mathrm{obs}}|_h - |F_{\mathrm{calc}}|_h \big| / \sum_h |F_{\mathrm{obs}}|_h$, $h \in \{\text{working set}\}$.
- ^e $R_{\mathrm{free}}$ is calculated as $R_{\mathrm{work}}$ for the reflections $h \in \{\text{test set}\}$.
- ^f Favored/allowed/outliers.
The sagSrtA247 structure adopts the conserved sortase fold, with a closed eight-stranded antiparallel β-sheet at its core (Figure 4a).2 Residues G225-F238, which are present in the crystallized sagSrtA238 construct, but are not resolved in that crystal structure, form a C-terminal helix that directly interacts with residues in the β1, β2, β5, and β6 strands, in a hydrophobic manner (Figure 4b). There are also several hydrogen bonds formed in residues C-terminal to F238, specifically S239, K240, N243, and Q244 which are not present in the crystallized sagSrtA238 construct. These interactions are largely mediated by mainchain atoms, but also include the sidechains of S87, N104, K240, N243, and Q244 (Figure 4c). In addition, the side chain of I245 is a part of a hydrophobic pocket formed with V142, L145, and L152, which are residues in the β4–β5 loop (Figure 4d). We predict that the lack of these interactions destabilizes the sagSrtA238 monomer, resulting in an inactive enzyme.
Interestingly, the C-terminus of saSrtA is substantially shorter than that of sagSrtA, or other Streptococcus SrtA proteins. Alignment of available saSrtA structures, including PDB IDs 1IJA (NMR), 1T2P (X-ray crystallography), and 2KID (NMR, +LPAT* peptidomimetic), indicates that the C-terminus of saSrtA, K206 (using 2KID and 1T2P numbering), corresponds stereochemically to K223 in sagSrtA (Figure S5a).15, 20, 53 Structural analyses of the two hydrophobic pockets that involve C-terminal residues in sagSrtA247 suggest that the saSrtA sequence would be unlikely to accommodate a similar C-terminal extension (Figure S5b,c). Specifically, an overlay of relevant structures suggests steric clashes between E77 in saSrtA with F238 in sagSrtA247 and R124 in saSrtA with I245 in sagSrtA247 (Figure S5b,c).
Structural alignments with protomers in sagSrtA238 (PDB ID 3RCC), spySrtA (3FN5), and S. mutans SrtA (4TQX) reveal that overall, sagSrtA247 adopts a conformation most similar to that of spySrtA (Figure S6a). Alignment with mainchain atoms in each of the 18 protomers of the sagSrtA238 asymmetric unit reveals an average root-mean-square deviation (RMSD) value of 0.690 Å over 384 atoms, with the highest similarity between our structure and chain O (0.544 Å over 371 atoms) and lowest with chain K (0.792 Å over 394 atoms). Alignment with the two protomers of spySrtA revealed RMSD values of 0.503 Å (551 atoms, with chain A) and 0.475 Å (535 atoms, chain B), and with S. mutans SrtA, the mainchain atoms align with an RMSD value of 0.566 Å over 549 atoms (Figure S6a). The largest differences between sagSrtA247 and spySrtA occur at the N-termini of both proteins (Figure S6b). In addition, we see ~1 Å shifts in two of the structurally-conserved loops that surround the peptide-binding cleft, the β4–β5 and β7–β8 loops, likely due to differences in crystal packing (Figure S6c). Notably, the sidechain location and orientation of residues in these loops previously identified as being selectivity determinants of spSrtA activity are stereochemically conserved (Figure S6d).12 Taken together, our structure suggests that the active sagSrtA protein adopts a similar monomeric conformation as spySrtA.
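For readers less familiar with the RMSD values quoted above, the quantity itself can be sketched in a few lines. This is a minimal Python illustration, not the procedure used for the reported alignments: it assumes the two coordinate sets are already superposed and that atoms are paired one-to-one, and the function name `rmsd` and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) in Å.

    Assumes the structures are already superposed; no fitting is performed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: two atoms, each displaced by 1 Å along x.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(rmsd(toy_a, toy_b))  # 1.0
```

Because the value depends on which atoms are paired, RMSD figures are only comparable alongside their atom counts, which is why each value above is reported with the number of aligned atoms.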
2.4 Structural analyses of chimeric Streptococcus SrtA proteins
We next wanted to investigate how our β7–β8 loop chimeras affect the structures of sagSrtA and spySrtA. We attempted to crystallize all eight of our loop chimeras, using previously optimized conditions for sagSrtA and spySrtA, as well as by setting up commercially available crystal screens (e.g., Hampton PEG/ION, Index, and/or PEGRx). We were able to crystallize and solve structures of two of our chimeric proteins: spySrtAfaecalis and spySrtAmonocytogenes (Table 1). In addition, we successfully crystallized spySrtApneumoniae and sagSrtApneumoniae, but they were not of diffraction quality.
We resolved all residues of the β7–β8 loop of our spySrtAfaecalis structure (Figure S7a), and all but the middle two residues of the spySrtAmonocytogenes loop (Figure S7b). The overall spySrtA variant conformations are identical to the wild-type protein, and alignments of mainchain atoms revealed RMSD values of 0.351 Å (533 atoms) and 0.252 Å (488 atoms) for spySrtAfaecalis and spySrtAmonocytogenes, respectively (Figure S7c).
In the spySrtA variant structures, the orientation of the spySrtAfaecalis loop is very similar to the wild-type protein (Figure 5a), and the intraloop hydrogen bond between the conserved β7–β8+2 Asp and β7–β8+6 Thr is maintained. This is not the case in the spySrtAmonocytogenes β7–β8 loop, as compared to the L. monocytogenes SrtA (lmSrtA) structure (Figure 5b,c).54 Here, the wild-type position of the β7–β8+3 Pro sterically clashes with the β4–β5+3 F145 residue in spySrtA and as a result, the β7–β8+3 Pro in spySrtAmonocytogenes is shifted away relative to the β4–β5 loop (Figure 5b,c). This results in breakage of the intraloop hydrogen bond; whereas the distance between β7–β8+1 D118 and β7–β8+6 T123 is 2.8 Å in lmSrtA, it is 7.7 Å in spySrtAmonocytogenes (black arrows in Figure 5b,c).

In the active conformation of saSrtA (PDB ID 2KID), the 12 residues in the β7–β8 loop adopt a tight structure, mediated by several intraloop hydrogen bonds and non-covalent interactions, all of which include the sidechain atoms of N188 (Figures 5e and S7d). Therefore, we wanted to test the contribution of this residue on the spySrtAaureus and sagSrtAaureus proteins, with the variants ΔN188 spySrtAaureus and ΔN188 sagSrtAaureus (Figure 3a,b). These proteins were similar to other saSrtA loop variants in that they were selective for LPATG; however, the relative activities were reduced by half as compared to spySrtAaureus and 4.5-fold as compared to sagSrtAaureus, respectively (Figure 3a,b). We next crystallized and solved the structure of ΔN188 sagSrtAaureus (Table 1), and were able to resolve all 11 residues in the loop in one of the protomers (Figure S7e). Alignment of the sagSrtA247 and ΔN188 sagSrtAaureus structures reveals the greatest structural variability in the β6–β7 and β7–β8 loops, and the overall RMSD = 0.298 Å (496 mainchain atoms) (Figure S7f).
Comparison of the β7–β8 loops in ΔN188 sagSrtAaureus and sagSrtA247 indicates that the loop in ΔN188 sagSrtAaureus adopts a more open shape (Figure 5d). In the absence of N188, it is unsurprising that the β7–β8 loop in ΔN188 sagSrtAaureus is missing the equivalent saSrtA interactions and shows only very weak, and likely unfavorable repulsive electrostatic ones between D185 and E195 (Figure S7g). Finally, we see displacement of the W194 residue in the ΔN188 sagSrtAaureus structure (Figure 5f). We hypothesize this is the largest contributor to the weaker activity in ΔN188 sagSrtAaureus, as mutation to alanine at this residue was previously shown to reduce the activity of saSrtA by approximately 2-fold, which is of a similar magnitude to the effect of ΔN188 on spySrtAaureus (Figure 3a,b).14 Interestingly, despite differences in overall shape of the β7–β8 loop in ΔN188 sagSrtAaureus, the protein retains the stringent selectivity of the saSrtA protein, recognizing only LPATG (Figure 3b).
2.5 Mutagenesis of Streptococcus SrtA proteins
Based on our structures, it is not immediately clear why sagSrtA247 is less active than spySrtA (Figure 3a,b). It is also not obvious why, for example, the ΔN188 mutation described above reduced spySrtAaureus activity 2-fold, but sagSrtAaureus activity 4.5-fold, or why there appears to be reduced ability to recognize S-containing peptides for several of the sagSrtA variants, as compared to spySrtA (Figure 3a,b). In the vicinity of the peptide-binding cleft, there are four non-conservative mutations (Figure 6a). Of these, we identified two that may contribute directly to the relatively low activity of sagSrtA247: K183 and P209.

Beginning with the K183 residue of sagSrtA247, we noted that it occupied the β6−2 position of the enzyme. In spySrtA, a threonine (T185) is present at the β6−2 position, which should not interact with the β7–β8−1 Glu, thereby avoiding an interaction that was previously shown to reduce enzyme activity in spSrtA.12 The Lys substitution in sagSrtA, however, would allow the β6−2 K183 to interact with the β7–β8−1 E213 (Figure 6b), and potentially reduce activity in a manner similar to the hypothesized interaction of the β6−2 R184 with β7–β8−1 E214 in spSrtA.12
With respect to the second residue (P209), we noticed that it occupied the β7–β8+3 loop position in sagSrtA247, similar to that in lmSrtA. We previously hypothesized that the β7–β8+3 Pro negatively affected spSrtAmonocytogenes and mutation of the wild-type Leu in L209P spSrtA reduced activity by about twofold.12 To test this, we mutated the β7–β8+3 loop residue in sagSrtA247 and spySrtA to that of the other protein, or P209I sagSrtA247 and I211P spySrtA, respectively. Relative enzyme activities were assayed and indeed, we saw an ~2-fold increase in activity for P209I sagSrtA for G-, S-, and A-containing peptides (Figure 6c). Interestingly, we saw minimal differences in the activities of I211P spySrtA as compared to wild-type spySrtA, suggesting that the β7–β8+3 residue interaction may not be as critical for spySrtA, which is relatively more active than either sagSrtA or spSrtA (Figure 2b).
2.6 Sequence patterns in the β7–β8 loops of Streptococcus SrtA proteins
Finally, we wanted to gain a general understanding of sequence patterns in the β7–β8 loops of Streptococcus SrtA proteins. Therefore, we created a WebLogo of 37 β7–β8 loops from Streptococcus SrtA proteins from the UniProt database (Figure 6d and Table S2).55, 56 We identified the loop sequences by using the catalytic cysteine and arginine residues to mark the N- and C-terminal residues of the β7–β8 loop, respectively. Our WebLogo analysis agreed with our biochemical and structural observations. The β7–β8+2 residue is an Asp in all of the loops, while the β7–β8+6 residue is a Thr (or Ser) in 35/37 sequences, consistent with an intraloop hydrogen bond interaction observed here and previously.12 We also observed an interaction between the β7–β8+3 position and the β4–β5 loop, typically of a hydrophobic nature.12 Consistent with this, our WebLogo analysis showed that the β7–β8+3 position is hydrophobic (Ala, Ile, Leu, Met, or Tyr) in 31 of the sequences, with the remaining sequences containing either Gln (5/37 sequences) or a Pro that is present in only the sagSrtA enzyme. It is unclear if the Gln can interact with the β4–β5 residues previously identified and if, as discussed for spySrtA, this interaction is correlated to the presence of a β7–β8−1/β6−2 interaction.
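The per-position residue counts that underlie a sequence logo can be sketched as follows. This is an illustrative Python example and not the WebLogo analysis itself: the 37 UniProt loop sequences are listed in Table S2 and are not reproduced here, so the sketch uses only the four nine-residue loops quoted in the text (one of which, from E. faecalis, is not a Streptococcus sequence). The function name `column_counts` is hypothetical.

```python
from collections import Counter

# Nine-residue beta7-beta8 loops quoted in the text (catalytic Cys ... Arg).
loops = [
    "CTDIEATER",  # S. pyogenes
    "CTDPEATER",  # S. agalactiae
    "CEDLAATER",  # S. pneumoniae
    "CGDLQATTR",  # E. faecalis (not Streptococcus; included for illustration)
]


def column_counts(seqs):
    """Return one Counter of residues per aligned position (offset from Cys)."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    return [Counter(column) for column in zip(*seqs)]


counts = column_counts(loops)
print(counts[2])  # Counter({'D': 4})
print(counts[3])
print(counts[5])  # Counter({'A': 4})
```

Even in this small set, offsets 2, 5, and 6 are invariant (Asp, Ala, and Thr), while offset 3 varies (Ile, Pro, Leu), which mirrors the pattern described for the full set of 37 sequences.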
Notably, all 37 sequences contain a β7–β8+5 Ala residue (Table S2). Analyses of this residue in the sagSrtA247 and spySrtA structures show that it is solvent-exposed (Figure S8). However, alignment of sagSrtA247 with saSrtA-LPAT* (2KID) revealed that the β7–β8+5 A211 sidechain points directly towards the peptide (Figure 6e). Furthermore, the carbonyl of A211 interacts with the guanidinium group of the catalytic arginine and the Ala is in the same stereochemical position as W194 in saSrtA (Figure 6e). Taken together, this suggests that the β7–β8+5 Ala residue in Streptococcus SrtA proteins may play an important role in enzyme function, just as the β7–β8+10 Trp residue does in saSrtA.
3 DISCUSSION
Work by us and others indicates that the structurally conserved, yet sequence-variable, β6–β7 and β7–β8 loops in Class A sortases directly affect target recognition and enzyme activity.12, 38, 39 Previously, we used the Class A sortase from S. pneumoniae to investigate the differences in selectivity and activity at the P1′ position as compared to SrtA from S. aureus and seven other organisms.11, 12 Here, we extended these studies to look at similar chimeric SrtA enzymes from Streptococcus pyogenes and Streptococcus agalactiae, which were previously crystallized.43-45 Using protein biochemistry and structural biology, we find additional evidence in support of our hypothesis that the β7–β8 loop residues in these proteins determine overall enzyme activity and selectivity in a similar manner to spSrtA. Specifically, our data strongly support the presence of three interactions mediated by β7–β8 loop residues in Streptococcus SrtA proteins that can affect enzyme function.12 Although the exact nature of these interactions can vary in SrtA proteins from different organisms, for example, S. aureus, we argue that related ones are likely present across the broad sortase superfamily.
Our work also highlights the need for additional sortase structures that are paired with biochemical data. For example, we discovered that the sagSrtA construct previously crystallized is an inactive enzyme and lacks several important C-terminal interactions.44, 45 This is also notable because the C-terminus of saSrtA is substantially shorter than that of the Streptococcus SrtA proteins; without biochemical knowledge of enzyme activity, fundamental information about these enzymes is missed. In addition, the only available spSrtA structures in the Protein Data Bank are of domain-swapped dimers, which are not enzymatically active in our hands (data not shown).40, 41 Related to activity, calcium is known to allosterically activate saSrtA and an N-terminal appendage to regulate B. anthracis SrtA, but it is not clear if there are inter- or intra-protein interactions that regulate the function of additional sortase enzymes.18, 19, 21 Sequence variation, as studied here, may play critical roles in these interactions, and complementary structural and biochemical information will be necessary to thoroughly elucidate these mechanisms. Overall, considering the observations by us and others about the contributions of individual residues to activity and/or selectivity, there remains much to be learned from the study of individual sortase enzymes.
The work presented here may also have implications for the continued development of sortase-mediated ligation as a tool for protein engineering. Recent applications of sortagging in cells and the evolution of sortases to recognize specific targets for potential therapeutics are among a number of exciting developments in the field.6, 57 A greater understanding of substrate selectivity and target recognition could enable more sophisticated orthogonal labeling schemes in which multiple sortase enzymes are used to recognize and modify distinct sequences on a single protein, or to label multiple targets simultaneously.3, 9, 49, 58 This ability to add numerous site-specific tags to protein targets in vitro and in vivo would be a powerful addition to the arsenal of protein engineering.
4 EXPERIMENTAL PROCEDURES
4.1 Sequences used
The wild-type spySrtA sequence used is from the published structure, PDB ID 3FN5. This sequence was originally amplified from serotype M1 S. pyogenes strain SF370 genomic DNA, as previously described.43 This sequence is 74% identical (85% similar) to the S. pyogenes Class A sortase in UniProt, A0A2W5CEK0_STRPY (unreviewed). The wild-type sagSrtA sequence used is from the published structure, PDB ID 3RCC. This sequence was originally amplified from genomic DNA of S. agalactiae strain 2603 V/R (locus tag of SAG0961), as previously described.44 This sequence is 99% identical to S. agalactiae SrtA in UniProt, SRTA_STRA3 (reviewed), differing only at Q132, which is a proline in 3RCC. This substitution occurs in the β3–β4 loop. All constructs in this work, including chimeric and mutant proteins, were purchased from Genscript in the pET28a(+) vector.
4.2 Protein expression and purification
All proteins were expressed and purified as previously described for related SrtA proteins.11, 12 Briefly, plasmids were transformed into Escherichia coli BL21 (DE3) competent cells and grown in LB media, with protein induction at OD600 0.6–0.8 using 0.15 mM IPTG for 18–20 h at 18°C. The cells were harvested in lysis buffer [0.05 M Tris pH 7.5, 0.15 M NaCl, 0.5 mM ethylenediaminetetraacetic acid (EDTA)] and whole cell lysate was clarified using centrifugation. The supernatant was filtered and loaded onto a 5 ml HisTrap HP column (GE Life Sciences, now Cytiva), followed by washing (0.05 M Tris pH 7.5, 0.15 M NaCl, 0.02 M imidazole, 0.001 M TCEP) and then elution (wash buffer with 0.3 M imidazole) of the desired protein. The His-tags of proteins prepared for crystallography were proteolyzed using tobacco etch virus (TEV) protease overnight at 4°C and a ratio of ~1:100 (TEV:protein). The His6-TEV sequence was left on proteins used for enzyme assays. Size exclusion chromatography (SEC) was conducted using a HiLoad 16/600 Superdex 75 column (GE Life Sciences, now Cytiva) in SEC running buffer (0.05 M Tris pH 7.5, 0.15 M NaCl, 0.001 M TCEP). Purified protein corresponding to the monomeric peak was concentrated using an Amicon Ultra-15 centrifugal filter unit (10,000 NMWL) and analyzed by SDS-PAGE and analytical SEC.12 Protein concentrations were determined using theoretical extinction coefficients calculated using ExPASy ProtParam.59 Protein not immediately used was flash-frozen in SEC running buffer and stored at −80°C.
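As an illustration of how a concentration follows from a theoretical extinction coefficient, the sketch below applies the Beer-Lambert law with the per-residue values that ProtParam documents for 280 nm (Trp 5,500, Tyr 1,490, and cystine 125 M⁻¹ cm⁻¹). It is a minimal sketch and not the authors' workflow. The function names, the toy sequence, the absorbance reading, and the molecular weight are hypothetical, and the default of zero cystines is an assumption that cysteines remain reduced in TCEP-containing buffer.

```python
def extinction_coefficient_280(sequence, n_cystines=0):
    """Theoretical molar extinction coefficient at 280 nm (M^-1 cm^-1).

    Uses Trp = 5500, Tyr = 1490, cystine = 125. n_cystines defaults to 0,
    i.e. all cysteines are assumed to be reduced (an assumption).
    """
    seq = sequence.upper()
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * n_cystines


def molar_concentration(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/L."""
    if epsilon <= 0:
        raise ValueError("extinction coefficient must be positive")
    return a280 / (epsilon * path_cm)


def mg_per_ml(molar, mw_da):
    """Convert mol/L to mg/ml (numerically equal to g/L)."""
    return molar * mw_da


# Hypothetical inputs, for illustration only
toy_sequence = "MWYYKC"   # 1 Trp, 2 Tyr
eps = extinction_coefficient_280(toy_sequence)
conc = molar_concentration(a280=0.848, epsilon=eps)
print(eps, conc, mg_per_ml(conc, mw_da=20000))
```

For a protein with no Trp or Tyr residues the calculated coefficient is zero, so the sketch raises an error rather than dividing by zero; such proteins need a different quantification method.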
4.3 Crystallization
Prior to crystallization, spySrtA variants were dialyzed into crystallization buffer (20 mM Tris pH 7.5, 150 mM NaCl), based on previously published conditions.43 The protein concentrations used for crystallization were as follows: sagSrtA247 (15 mg/ml), ΔN188 sagSrtAaureus (16 mg/ml), spySrtAfaecalis (42 mg/ml), and spySrtAmonocytogenes (20 mg/ml). The proteins were crystallized using the hanging drop vapor diffusion technique with well and protein solution mixed in a 1:1 ratio (2 μl:2 μl). Crystallization conditions for the spySrtA variants were optimized using the crystal conditions for the apo protein.43 For sagSrtA247, initial crystallization conditions were identified using the PEGRx screen from Hampton Research. The crystallization conditions of the crystals used for data collection were: sagSrtA247 [20% (vol/vol) 2-propanol, 0.1 M MES monohydrate pH 6.1, 20% (wt/vol) PEG monomethyl ether 2,000], ΔN188 sagSrtAaureus [12% (vol/vol) 2-propanol, 0.02 M MES monohydrate pH 6, 24% (wt/vol) PEG monomethyl ether 2,000], spySrtAfaecalis [0.2 M sodium acetate, 0.1 M Tris pH 6, 30% (wt/vol) PEG 8,000], and spySrtAmonocytogenes [0.2 M sodium acetate, 0.1 M Tris pH 6.5, 24% (wt/vol) PEG 8,000]. For all proteins, glycerol was used as a cryoprotectant and the cryo solutions were equal to crystallization conditions plus 20% (vol/vol) glycerol for all except sagSrtA247 (plus 15% (vol/vol) glycerol). The crystals were flash-cooled by plunging into liquid nitrogen.
4.4 Data collection, structure determination, and protein analyses
Initial data for sagSrtA247 were collected to 2.0 Å on a Bruker Apex CCD diffractometer at λ = 1.54056 Å. Data were collected at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL) on beamlines 5.0.1 and 5.0.2, at λ = 1.00004 Å or 0.99988 Å over 360°, with Δϕ = 0.25° frames and an exposure time of 0.5 s per frame. Data were processed using the XDS package (Table 1).60, 61 Molecular replacement was performed using Phenix with the following search models: spySrtA (PDB ID 3FN5) for spySrtAfaecalis and spySrtAmonocytogenes, sagSrtA238 (3RCC) for sagSrtA247, and sagSrtA247 for ΔN188 sagSrtAaureus. Refinement was performed using Phenix, manual refinement was done using Coot, and model geometry was assessed using MolProbity and the PDB validation server.62-64 Phenix.Xtriage was also used to assess data quality, specifically to identify a number of outliers in the spySrtAmonocytogenes data.64 All crystal data and refinement statistics are in Table 1. Sequence alignments were performed using T-coffee or BlastP.65, 66 Visualization of alignments was done using Jalview or Boxshade.67 WebLogo was also used to visualize sequences.68 Structural analyses and figure rendering were done using PyMOL. PDB accession codes for the structures presented here are: sagSrtA247 (7S56), ΔN188 sagSrtAaureus (7S54), spySrtAfaecalis (7S57), and spySrtAmonocytogenes (7S53).
4.5 Peptide synthesis
Model peptide substrates were synthesized via manual Fmoc solid phase peptide synthesis (SPPS) as previously described.12
4.6 Fluorescence assay for sortase activity
Enzyme assays were conducted using a Biotek Synergy H1 plate reader as previously described.12 The fluorescence intensity of each well was measured at 2-min time intervals over a 2-h period at room temperature (λex = 320 nm, λem = 420 nm, and detector gain = 75). All reactions were performed in at least triplicate. For each substrate sequence, the background fluorescence of the intact peptide in the absence of enzyme was subtracted from the observed experimental data. Background-corrected fluorescence data were then normalized to the fluorescence intensity of a benchmark reaction between wild-type saSrtA and Abz-LPATGG-K(Dnp).12 Data figures were prepared using GraphPad Prism 9.1.2 or Kaleidagraph.
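The background subtraction and normalization steps described above can be sketched in Python as follows. This is a minimal illustration and not the analysis code used in this work. The function names and the numerical values are hypothetical, and the choice of a single benchmark intensity as the normalization denominator is an assumption, since the exact time point used for the benchmark is given in the cited protocol rather than here.

```python
from statistics import mean


def mean_trace(replicates):
    """Average replicate fluorescence traces time point by time point."""
    if len({len(r) for r in replicates}) != 1:
        raise ValueError("all replicates must have the same number of reads")
    return [mean(values) for values in zip(*replicates)]


def background_correct(reaction, peptide_only):
    """Subtract the enzyme-free (intact peptide) trace from the reaction trace."""
    if len(reaction) != len(peptide_only):
        raise ValueError("traces must have the same number of reads")
    return [r - b for r, b in zip(reaction, peptide_only)]


def normalize(trace, benchmark_signal):
    """Express a corrected trace relative to a benchmark intensity.

    benchmark_signal is assumed to be the background-corrected intensity of
    the wild-type saSrtA / Abz-LPATGG-K(Dnp) reaction at a chosen time point.
    """
    if benchmark_signal == 0:
        raise ValueError("benchmark signal must be non-zero")
    return [value / benchmark_signal for value in trace]


# Hypothetical three-read traces in arbitrary fluorescence units
replicates = [[110, 210, 310], [100, 200, 300], [90, 190, 290]]
peptide_only = [100, 100, 100]

corrected = background_correct(mean_trace(replicates), peptide_only)
print(normalize(corrected, benchmark_signal=400))
```

Averaging replicates before subtraction gives the same mean as subtracting first, but keeping the replicate traces separate until the final step makes it easier to report the spread between wells.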
ACKNOWLEDGEMENTS
The authors would like to thank the other members of the Amacher and Antos labs for helpful discussions and assistance. They would also like to thank the Berkeley Center for Structural Biology (BCSB) for being an excellent resource for the crystallography community. The BCSB is supported in part by the National Institutes of Health, National Institute of General Medical Sciences, and the Howard Hughes Medical Institute. The Advanced Light Source is supported by the Director, Office of Science, Office of Basic Energy Sciences, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. Other grant information: JFA and JMA were both funded by Cottrell Scholar Awards from the Research Corporation for Science Advancement. JFA was also funded by NSF CHE-CAREER-2044958. The Rigaku X-ray Diffractometer was funded by NSF CHE-MRI-1429164 and used to collect initial sagSrtA247 diffraction data. In addition, IMP and HMK received Elwha Undergraduate Summer Research Awards and DAJ received a Joseph & Karen Morse Student Research in Chemistry Fellowship to fund summer research.
AUTHOR CONTRIBUTIONS
Melody Gao: Investigation (equal); writing – review and editing (supporting). David Alex Johnson: Investigation (equal); writing – review and editing (supporting). Isabel M. Piper: Investigation (supporting); writing – review and editing (supporting). Hanna M. Kodama: Investigation (supporting). Justin E. Svendsen: Investigation (supporting). Elise Tahti: Investigation (supporting). Frederick Longshore-Neate: Investigation (supporting). Brandon Vogel: Investigation (supporting).
CONFLICT OF INTERESTS
The authors declare no competing interest.