Volume 63, Issue 12 e202310910
Review
Open Access

Empowering Site-Specific Bioconjugations In Vitro and In Vivo: Advances in Sortase Engineering and Sortase-Mediated Ligation

Dr. Zhi Zou

Corresponding Author

DWI – Leibniz-Institute for Interactive Materials, Forckenbeckstraße 50, 52074 Aachen, Germany

RWTH Aachen University, Institute of Biotechnology, Worringerweg 3, 52074 Aachen, Germany

These authors contributed equally to this work.

Dr. Yu Ji

RWTH Aachen University, Institute of Biotechnology, Worringerweg 3, 52074 Aachen, Germany

These authors contributed equally to this work.

Prof. Dr. Ulrich Schwaneberg

Corresponding Author

DWI – Leibniz-Institute for Interactive Materials, Forckenbeckstraße 50, 52074 Aachen, Germany

RWTH Aachen University, Institute of Biotechnology, Worringerweg 3, 52074 Aachen, Germany

First published: 11 December 2023
Citations: 5

Graphical Abstract

Sortase-mediated ligation (SML) is extensively employed in site-specific bioconjugations. This review presents an overview of recent advancements in sortase engineering and the resulting innovative in vitro and in vivo SML applications. Moreover, we explore the potential future impact of integrating SML with cutting-edge techniques, such as click chemistry, unnatural amino acid incorporation, and CRISPR-Cas9, for interdisciplinary research.

Abstract

Sortase-mediated ligation (SML) has emerged as a powerful and versatile methodology for site-specific protein conjugation, functionalization/labeling, immobilization, and design of biohybrid molecules and systems. However, the broader application of SML faces several challenges, such as limited activity and stability, dependence on calcium ions, and reversible reactions caused by nucleophilic side-products. Over the past decade, protein engineering campaigns, particularly directed evolution, have been extensively employed to overcome sortase limitations, thereby expanding the potential application of SML in multiple directions, including therapeutics, bioorthogonal chemistry, biomaterials, and biosensors. This review provides an overview of achieved advancements in sortase engineering and highlights recent progress in utilizing SML in combination with other state-of-the-art chemical and biological methodologies. The aim is to encourage scientists to employ sortases in their conjugation experiments.

1 Introduction

Functional protein modification enables many biological and therapeutic applications.1 Classical approaches for protein modification typically involve utilizing reactive side-chain groups of amino acid residues such as -NH2 in lysine, -SH in cysteine, or -OH in serine, threonine, and tyrosine.2 However, these approaches often result in heterogeneous labeling due to the presence of multiple reactive groups in the protein of interest. In contrast, enzymatic conjugation approaches offer a high degree of site/sequence specificity and have emerged as crucial alternatives for protein functionalization. Various enzymes, including sortase,3 intein,4 asparaginyl endopeptidases (butelase[5] and OaAEP1[6]), subtiligase,7 microbial transglutaminase,8 lipoic acid ligase,9 biotin ligase,10 Spy-ligase,11 and Snoop ligase12 have been developed for this purpose. Sortases are a large group of transpeptidases found in Gram-positive bacteria.13 Sortases exhibit a high level of overexpression efficiency (up to 232.4 mg yield per liter),14 robustness,15 and resilience in reprogramming,16 surpassing many other peptidase enzymes. Due to these features, sortases stand out as the most widely utilized enzymes in the field of site-specific bioconjugation. Nonetheless, enzymatic reactions often exhibit comparatively low robustness under varied application conditions when compared to chemical methods. Thus, optimizing the catalytic performance and robustness of bond-forming enzymes towards different user-defined applications and circumstances has been a central focus in the field of bioconjugation.

Enzyme engineering usually follows one of three approaches or combinations thereof: chemical modifications, semi/rational design, and directed evolution.17 Chemical modifications, such as PEGylation, glycosylation, cyclization, and immobilization, are frequently utilized to enhance enzyme properties such as biocompatibility, low biodegradability, and robustness toward stress conditions.18 Semi/rational design requires a comprehensive understanding of the structure-function relationships of the targeted enzymes in order to identify beneficial positions/regions.19 Conversely, directed evolution offers a structure-independent approach that can, in principle, be applied to engineer any protein of interest. Directed evolution, awarded the Nobel Prize in Chemistry in 2018, has become a routinely applied methodology to engineer proteins for a wide range of biotechnological/pharmaceutical products.20

In this review, we summarize how enzyme engineering has expanded the utility of sortases in bioconjugation. We begin by summarizing successful engineering efforts on sortases for advanced site-specific bioconjugations, highlighting the noteworthy achievements in enhanced activity/robustness, reprogrammed selectivity, calcium independence, and reduced reversibility. These improvements have been achieved through different protein engineering strategies, primarily employing directed evolution. Subsequently, we discuss the broadened applications of sortase-mediated ligation (SML) facilitated by engineered sortase variants, along with the exploration of noncanonical pathways. Additionally, we provide a summary of recent advancements in combining SML with chemical and biological tools, such as click chemistry, unnatural amino acid incorporation, and CRISPR-based genome editing. These combinations address a broad spectrum of interdisciplinary research topics such as bioorthogonal chemistry, therapeutics, and biosensors.

2 Advances in Sortase Engineering and Sortase-Mediated Ligation

2.1 Sortases and Sortase-Mediated Ligation

Sortases are a family of cysteine transpeptidases that play crucial roles in anchoring proteins on the cell surface of Gram-positive bacteria.13, 21 Sortases specifically recognize secreted proteins bearing the respective sorting motifs and covalently attach them to positioned amino groups on the cell surface. As a result, sortases are involved in various cellular processes, such as peptidoglycan synthesis, pili formation, the assembly of multi-subunit hair-like fibers, iron acquisition, and spore formation.22 The transpeptidation function of sortase was first reported in 1999 through the study of a housekeeping sortase A from Staphylococcus aureus (SaSrtA, Figure 1a).23 SaSrtA cleaves the amide bond between threonine and glycine within one polypeptide (acyl donor), generating a sortase-peptide intermediate. This intermediate is then intercepted by a glycine nucleophile (acyl acceptor), resulting in fusion of the acyl donor and acceptor in a highly site-specific manner (Scheme 1).23 To date, a superfamily of sortases has been identified in Gram-positive bacteria and partitioned into six distinct classes (sortase A to F) based on their functions.21 A discussion of the biological functions of sortases in Gram-positive bacteria exceeds the scope of this review; we recommend other reviews for a more complete overview.13, 21, 22, 24

Figure 1. Representative structures of sortases. (a) NMR structure of SaSrtA with a covalently bound LPAT motif (PDB: 2KID). The eight-stranded β-barrel core is labeled in light blue and numbered. The three active-site residues (H120, C184 and R197), Ca2+, LPAT motif, β6/β7 loop and β7/β8 loop are highlighted in green, yellow, purple, magenta and blue, respectively. (b-e) Structures of sortase A from Bacillus anthracis (BaSrtA, PDB: 2KW8), sortase A from Streptococcus pyogenes (SpSrtA, PDB: 3FN5), sortase B from Staphylococcus aureus (SaSrtB, PDB: 1NG5), and sortase A from Corynebacterium diphtheriae (CdSrtA, PDB: 5K9A). The catalytic cysteine residues, β6/β7 loop, and β7/β8 loop are highlighted in green (stick presentation), magenta, and blue, respectively. Unique structural features (versus SaSrtA) are highlighted in red.

Scheme 1. Sortase-mediated ligation of an LPXTG-tagged protein and a glycine nucleophile (X means any amino acid residue, POI: Protein of Interest) by the sortase A from Staphylococcus aureus (SaSrtA).
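The transpeptidation described above can be sketched at the level of sequence bookkeeping. The snippet below is only an illustration, not a model of the enzyme: it locates an LPXTG motif in the acyl donor, cuts between Thr and Gly, and joins the glycine nucleophile to the new C-terminal Thr. The function name `sortase_ligate` and both example sequences are hypothetical and were chosen for illustration.

```python
import re

# LPXTG sorting motif in one-letter code; "." stands for X (any residue).
LPXTG = re.compile(r"LP.TG")


def sortase_ligate(acyl_donor, acyl_acceptor):
    """Return the product sequence of an idealized SaSrtA ligation.

    The donor is cut between Thr and Gly of the first LPXTG motif and the
    acceptor (which must start with Gly) is joined to the new C-terminal Thr.
    Kinetics, reversibility, and hydrolysis are deliberately ignored.
    """
    match = LPXTG.search(acyl_donor)
    if match is None:
        raise ValueError("acyl donor carries no LPXTG motif")
    if not acyl_acceptor.startswith("G"):
        raise ValueError("acyl acceptor needs an N-terminal glycine")
    cut = match.end() - 1  # index of the motif's Gly, i.e. just after Thr
    return acyl_donor[:cut] + acyl_acceptor


# Hypothetical protein of interest with an LPETG tag and a His-tag,
# ligated to a hypothetical triglycine probe.
print(sortase_ligate("MKVAALPETGHHHHHH", "GGGK"))  # MKVAALPETGGGK
```

The product regenerates an LPXTG motif (here LPETG), which hints at why the reaction is reversible in practice: the product is itself a potential acyl donor.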

Early structural studies revealed that SaSrtA harbors an eight-stranded β-barrel structure housing three active site residues: His120, Cys184, and Arg197 (Figure 1a).25 Subsequent studies have demonstrated the conservation of this eight-stranded β-barrel core across all sortase classes (Figure 1).26 A wealth of evidence from mutagenesis and structural investigations has shown that the flexible loops that connect strands β6 and β7 (β6/β7 loop) as well as strands β7 and β8 (β7/β8 loop) play pivotal roles in modulating the activity and sorting specificity of sortases.27 For instance, upon binding to the LPXTG substrate, SaSrtA undergoes adaptive recognition by closing and immobilizing the β6/β7 loop, while the β7/β8 loop undergoes conformational changes, unmasking the binding site for the tri-glycine nucleophile and thereby facilitating catalysis towards product formation.25a Unlike the conserved eight-stranded β-barrel backbone, the length and configuration of the loops of sortases, even within the same class, exhibit significant diversity (Figure 1). This diversity may be a primary reason for the variations in activity and sequence specificity among sortases.

SaSrtA requires Ca2+ for efficient ligation.28 Ca2+ binds to a pocket situated between the β3/β4 and β6/β7 loops (Figure 1a). Analysis of NMR dynamics indicates that Ca2+ binding leads to alterations in mobility and a restructured β6/β7 loop.25a, 28 Both of these changes are believed to facilitate the accurate positioning of the LPXTG motif within the binding groove. Interestingly, this Ca2+-dependent regulatory mechanism has thus far only been identified in SaSrtA and not in other sortases.

The first proof of principle for sortase-mediated protein ligation was reported in 2004.3 Since then, SML has emerged as a powerful tool for protein modifications. Figure 2 provides an overview of the typical applications of SML, including protein/protein ligations,2a, 29 functionalization of protein termini,30 head-to-tail cyclization of proteins,31 surface labeling of phages or cells,32 and immobilization of proteins onto solid or soft carriers.33 Notably, SML has shown its general compatibility with various types of conjugations, extending beyond protein-protein ligations. Additional examples include the site-specific peptide-sugar,34 peptide-nucleotide,35 protein-lipid,36 protein-amine,37 protein-hydrazine,38 protein-polymer,39 protein-phage,32b, 40 protein-cell,32a polymer-polymer41 conjugations and more.29b, 29c, 42 In this review, the primary focus is to present a protein engineer's perspective on the approaches successfully employed to customize the characteristics of sortases. These approaches are aimed at exploring new conjugation possibilities and/or meeting the demands of further applications. Table 1 summarizes the protein engineering campaigns of sortases. Specific examples are highlighted in the following sections to illustrate their novelty and application impact.

Figure 2. Canonical applications of sortase-mediated ligation (SML) in protein engineering.

Table 1. Reports on sortase engineering campaigns and the resulting SML applications. Engineering campaigns were extensively conducted on Staphylococcus aureus sortase A (SaSrtA). A few examples were performed on Streptococcus pyogenes sortase A (SpSrtA) and Corynebacterium diphtheriae sortase A (CdSrtA).

| Engineering strategies | Target sortase | Engineered property | Expanded SML | Reference |
|---|---|---|---|---|
| Directed evolution | SaSrtA | Variants with up to 140-fold higher kcat/Km value than WT | HeLa cell surface labeling | Chen et al. (2011)43 |
| Directed evolution | SaSrtA | Variant with a further 5-fold kcat value compared to Chen et al. (2011)43 | Efficient antibody labeling/protein PEGylation | Chen et al. (2016)44 |
| Directed evolution | SaSrtA | Variants with up to 13.3-fold higher kcat/Km value than WT | Not mentioned | Suliman et al. (2017)45 |
| Directed evolution | SaSrtA | Variants with up to 78.5-fold kcat/Km (vs. WT) toward N-terminal monoglycine | Proximity labeling for detecting cell-cell interactions | Ge et al. (2019)46 |
| Directed evolution | SaSrtA | Variants with up to 114-fold enhancement in kcat/Km (vs. WT) in the absence of calcium | Efficient intracellular cyclization of eGFP | Gianella et al. (2016)47 |
| Directed evolution | SaSrtA | Variants showed reprogrammed substrate specificity (up to 51000-fold change of specificity from LPXTG to LAXTG or LPXSG) | Synthesis of tandem fluorophore-protein-PEG conjugates; orthogonal conjugation of fluorescent peptides onto surfaces | Brent et al. (2014)16 |
| Directed evolution | SaSrtA | Reprogrammed substrate specificity (1400-fold change of specificity from LPESG to LMVGG) | Labeling, censoring and modification of endogenous amyloid-β (Aβ) protein | Podracky et al. (2021)48 |
| Directed evolution | SaSrtA | Variants showed up to 6.3-fold higher kcat/Km (vs. WT) in 45 % (v/v) DMSO | Ligation of hydrophobic substrates in organic co-solvents | Zou et al. (2018)49 |
| Rational design or semi-rational design | SaSrtA | Altered (700000-fold) substrate specificity towards NPQTN motif | Not mentioned | Bentley et al. (2007)50 |
| Rational design or semi-rational design | SaSrtA | Altered substrate specificities (from LPXTG to FPXTG, APXTG or SPXTG) | Traceless semisynthesis of histone H3 | Piotukh et al. (2011)51; Schmohl et al. (2017)52 |
| Rational design or semi-rational design | SaSrtA | Calcium independence (retained 35 % activity in absence of calcium) | Protein/protein conjugations in absence of calcium | Hirakawa et al. (2011)53 |
| Rational design or semi-rational design | SaSrtA | Calcium independence | In cellulo histone H3 tail editing on chromatin | Yang et al. (2022)54 |
| Rational design or semi-rational design | SaSrtA | Variants harbored up to 22-fold improved catalytic efficiency | Providing a sortase-mediated screening platform of enzyme with minimized background activity | Zou et al. (2018)55 |
| Rational design or semi-rational design | SaSrtA | Variants exhibited enhanced thermal stability (8.0 °C improvement in melting temperature vs. parent variant) | High processing stability for protein labeling in batch and continuous-flow systems | Zou et al. (2020)56 |
| Rational design or semi-rational design | SpSrtA | Variants showed higher activity (up to 6.6-fold in kcat/Km) when compared to SpSrtA WT | Versatile N-terminal labeling/head-to-tail cyclization; expanded peptide-amine ligations | Zou et al. (2020)57 |
| Rational design or semi-rational design | CdSrtA | A variant (CdSrtA3M) showed significantly enhanced (vs. CdSrtA wild-type) activity in lysine-isopeptide formation | Dual and orthogonal protein labeling via lysine-isopeptide and backbone-peptide bonds | McConnell et al. (2018)58 |
| Rational design or semi-rational design | CdSrtAΔ | Lid removed variant CdSrtA showed a 3.6-fold enhanced catalytic efficiency (vs. CdSrtA3M) in lysine-isopeptide formation | Labeling >90 % of N-terminal of pilus SpaA with peptides within 6 h | Sue et al. (2020)59 |
| Chemical modification | SaSrtA | Modified SaSrtA showed higher thermal stability (11.2 °C improvement in melting temperature) | Efficient labeling of peptides under high temperature (65 °C) or in presence of denaturant guanidinium hydrochloride | Pelay-Gimeno et al. (2018)60 |
| Chemical modification | SaSrtA | Bicyclized xS11 (based on SrtA 8 M) exhibited improved thermal stability (12 °C improvement in melting temperature) and high activity under denaturing conditions | xS11 showed 41 and 20 % of its initial activity under 3 M urea and 1 M guanidine chloride in transpeptidation of FRET assay, respectively | Kiehstaller et al. (2023)61 |
| Chemical modification | SaSrtA | Backbone cyclized SaSrtA harbored enhanced stability/activity in denaturing conditions | High efficiencies in protein/protein ligations in presence of 2.5 mM urea | Zhulenkovs et al. (2014)62 |
| Proximity-based fusion design (PBSL) | SaSrtA | Improves the ligation efficiency to over 95 %; traceless: the fused and expressed SpyTag and the linker in the protein of interest were cleaved and separated by SpyCatcher | PBSL labeled about 2.5-fold more anti-CD3 single-chain variable fragment (scFv) than traditional SML; PBSL results in shorter reaction time by significantly (> 100-fold) increasing the effective concentration of the target protein | Wang et al. (2017)63 |

2.2 Engineering for Enhanced Transpeptidation Activity

Prior to 2011, almost all SMLs were conducted using SaSrtA wild-type (SaSrtA WT). A notable drawback of SaSrtA WT is its relatively low activity (kcat ≤ 1.5 s−1).27, 64 Consequently, either extended reaction times or large quantities of the enzyme were often used to achieve substantial yields of conjugates. Engineering SaSrtA for enhanced activity was therefore highly desirable to promote an efficient SML process. In 2011, Liu et al. reported the first directed evolution campaign for enhancing SaSrtA activity.43 They developed an ultrahigh-throughput screening strategy via yeast surface display to evolve sortases or, in principle, any bond-forming enzyme. Notably, after nine rounds of evolution, variants were identified with up to 45-fold improvement in LPETG affinity (Km reduced from 7.6 to 0.17 mM) and a 3.6-fold increase in turnover number (kcat increased from 1.5 to 5.4 s−1). Subsequently, the authors demonstrated that the significant improvement in LPETG affinity of evolved variants enabled efficient SML on the cell surface of living HeLa cells.43 Consequently, the penta-mutant (P94R/D160N/D165A/K190E/K196T, hereafter referred to as SaSrtA 5 M, Figure 3) has been used in many reports of SMLs.65
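The fold changes quoted above follow directly from the reported kinetic constants, as the short calculation below shows. It is a sketch for checking the arithmetic only. The text gives the best Km and the best kcat as "up to" values and does not state that both belong to the same variant, so the two constants are deliberately not multiplied into a single catalytic efficiency for an evolved enzyme. The helper name `catalytic_efficiency` is an assumption made for this example.

```python
# Kinetic constants quoted in the text (Km for LPETG in mM, kcat in s^-1).
KM_WT_MM, KM_BEST_MM = 7.6, 0.17
KCAT_WT, KCAT_BEST = 1.5, 5.4


def catalytic_efficiency(kcat_per_s, km_mm):
    """Return kcat/Km in M^-1 s^-1 for a Km supplied in mM."""
    return kcat_per_s / (km_mm * 1e-3)


# A lower Km means tighter apparent binding, so the fold change is WT / evolved.
km_fold = KM_WT_MM / KM_BEST_MM
kcat_fold = KCAT_BEST / KCAT_WT
wt_efficiency = catalytic_efficiency(KCAT_WT, KM_WT_MM)

print(f"Km improvement:   {km_fold:.1f}-fold")    # 44.7-fold, i.e. "up to 45-fold"
print(f"kcat improvement: {kcat_fold:.1f}-fold")  # 3.6-fold
print(f"WT kcat/Km:       {wt_efficiency:.0f} M^-1 s^-1")
```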

Figure 3. Selected SaSrtA variants from evolution campaigns. Left: sites of mutations present in the SaSrtA 5 M variant,43 8 M variant,44 Barton variant,45 and mgSrtA variant46 are collected and shown in stick presentation. The active cysteine residue (C184), Ca2+, and covalently bound LPAT motif (PDB: 2KID) are highlighted in green, yellow, and purple, respectively.25a The β3/β4, β4/β5, β6/β7, and β7/β8 loops are labeled in orange, purple, magenta, and blue, respectively. Mutations of variants are shown. 5 M: P94R/D160N/D165A/K190E/K196T; 8 M: 5M-D124G/Y187L/E189R; Barton variant: P94H/A104T/E105D/G167E/Q172H; mgSrtA: 8M-F200L. Right: kinetics of SaSrtA WT and the selected variants.

The kinetics of single mutational variants in SaSrtA 5 M (e.g. D160N, D165A, and K196T) revealed moderate beneficial effects on turnover (kcat) and more significant beneficial effects on LPETG motif recognition.43 Structurally, the identified residues are all located around the LPAT-binding groove (Figure 3). P94 lies at the N terminus of the H1 helix, while D160/D165 and K190/K196 lie in the β6/β7 and β7/β8 loops, respectively. Furthermore, residues that are located close to the N-terminus of the LPETG motif tend to be substituted with less charged or hydrophobic amino acids (e.g. D160N, D165A, K196T), suggesting that hydrophobic interactions with the Leu and Pro within the LPETG motif are favored. Taken together, the locations and substitutions of residues in 5 M indicate that conformational changes in the important β6/β7 and β7/β8 loops may improve both substrate binding and activity.

In 2016, Chen et al. conducted a directed evolution campaign on SaSrtA 5 M with the aim of further improving its transpeptidation activity.44 Through this campaign, several beneficial substitutions, including D124G, Y187L, and E189R, were identified and subsequently integrated into SaSrtA 5 M, resulting in a variant called SaSrtA 8 M (Figure 3). The kcat of SaSrtA 8 M was further significantly improved to 22.2 s−1, enabling more efficient ligations (versus WT or 5 M) at termini or endogenous lysine residues of proteins/antibodies.44 D124 lies in the β4/H2 loop, and the kcat of 5M-D124G is 4.5-fold higher than that of 5 M (16.5 s−1 versus 3.7 s−1).44 It is hypothesized that the D124G mutation may alter the orientation of the β4/H2 loop, affecting intermediate formation and leading to enhanced enzymatic activity. Like K190, Y187 and E189 lie in the middle region of the β7/β8 loop (Figure 3). Kinetic results demonstrated that the recombination of Y187L/E189R into 5M-D124G enhanced the binding affinity of both the LPETG motif and the tri-glycine nucleophile. In a later study, the same group evolved SaSrtA to enhance its activity towards N-terminal monoglycine proteins.46 The resulting variant, named mgSrtA (Figure 3), has been successfully employed for intercellular labeling to monitor cell-cell interactions.46

Further successful evolution campaigns have been reported to enhance the activity of SaSrtA.45 Notably, most of the beneficial substitutions are located within the loop regions, such as the β3/β4 loop,45 the β6/β7 loop,43 and the β7/β8 loop (Figure 3).43, 44 Considering the conservation of these loops among various sortases, applying the insights gained from engineering SaSrtA activity, particularly in these loop regions, is a reasonable approach to improving the transpeptidase activity of other sortases.

As previously mentioned, the low binding affinity of SaSrtA WT for the LPETG motif (Km > 5 mM) is a primary factor behind its poor reaction kinetics. Rather than embarking on a directed evolution campaign to enhance binding affinity, Tsourkas and colleagues ingeniously addressed this challenge by devising a proximity-based sortase-mediated ligation (PBSL) approach.63 In PBSL, SaSrtA is expressed as a fusion with the SpyCatcher protein (12.3 kDa) and subsequently immobilized on a cobalt resin. The protein of interest is expressed with an LPXTG motif and a short SpyTag (thirteen amino acids) appended at its C-terminus. SpyTag spontaneously reacts with SpyCatcher, forming an irreversible isopeptide bond.66 The SpyCatcher-SaSrtA resin is used to initially isolate the protein of interest from cell lysate via isopeptide bond formation. Subsequently, PBSL on the column is initiated by the addition of the desired peptide nucleophile and Ca2+. Due to the increased local concentration of the protein of interest in proximity to the tethered SaSrtA, PBSL proceeds rapidly and efficiently (> 95 % efficiency). Following the ligation reaction, only the product is released into the solution, while both SpyCatcher-SaSrtA and SpyTag remain bound to the resin. Notably, PBSL streamlines SML by achieving protein purification, labeling, and tag removal in a single step, thus expanding SML applications.67
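A rough way to see why raising the effective substrate concentration helps is to look at the fraction of maximal rate predicted by a Michaelis-Menten term, v/Vmax = [S]/(Km + [S]). The sketch below rests on several assumptions that are not in the source: it treats the LPXTG substrate alone (ignoring the nucleophile and the actual two-substrate mechanism), uses the WT Km of 7.6 mM quoted in Section 2.2, picks a hypothetical bulk substrate concentration of 0.05 mM, and applies the > 100-fold increase in effective concentration listed in Table 1 as a flat factor of 100.

```python
def fractional_rate(substrate_mm, km_mm):
    """Return v/Vmax for a single-substrate Michaelis-Menten model."""
    if substrate_mm < 0 or km_mm <= 0:
        raise ValueError("concentration must be >= 0 and Km must be > 0")
    return substrate_mm / (km_mm + substrate_mm)


KM_WT_MM = 7.6      # Km of SaSrtA WT for LPETG, as quoted in the text
BULK_MM = 0.05      # hypothetical bulk concentration of the tagged protein
PROXIMITY_BOOST = 100  # assumed factor; Table 1 reports "> 100-fold"

for label, conc in (("free in solution", BULK_MM),
                    ("tethered (PBSL)", BULK_MM * PROXIMITY_BOOST)):
    print(f"{label:17s} [S] = {conc:5.2f} mM -> v/Vmax = "
          f"{fractional_rate(conc, KM_WT_MM):.3f}")
```

Under these assumptions the enzyme operates far below saturation in free solution, and the same enzyme without any mutation moves much closer to its maximal rate once the substrate is held nearby.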

2.3 Engineering of Altered Substrate Specificities

SML with broadened substrate specificity, rather than reliance on the canonical oligoglycine/LPXTG pair, is of high interest in protein modification. The ability to combine different substrate specificities offers a way to achieve multiple and orthogonal protein modifications. However, despite the diverse motifs recognized by various sortases (Table 2), their activities are in general very low (kcat in general ≤ 0.02 s−1 except for SaSrtA)57, 68 and insufficient for practical SMLs (low yield; long conversion times). Among the different sortases, SaSrtA stands out for its remarkably high activity level.68 Instead of attempting to enhance the activity of a sortase with a different sorting motif starting from low kcat values, numerous studies have attempted to alter the sequence specificity of SaSrtA.

Table 2. Specificity of acyl donor and acceptor of sortases (wild-type and engineered variants). SrtA-LS: the chimeric β6/β7 loop swap SaSrtA; SaSrtB: Staphylococcus aureus sortase B; SpSrtC2: Streptococcus pyogenes sortase C2; SavSrtE: Streptomyces avermitilis sortase E; Pilin domain: WXXXVXVYPKH (X means any amino acid residue); SpnSrtA: Streptococcus pneumoniae SrtA; SpnSrtAfaecalis: SpnSrtA with the β7-β8 loop residues from Enterococcus faecalis; SpnSrtAlactis: SpnSrtA with the β7-β8 loop residues from Lactococcus lactis; SpnSrtAoralis: SpnSrtA with the β7-β8 loop residues from Streptococcus oralis; SpnSrtAsuis: SpnSrtA with the β7-β8 loop residues from Streptococcus suis; CgSrtE: Corynebacterium glutamicum sortase E.

| Sortase | Recognized acyl donor | Acyl acceptor | Reference |
|---|---|---|---|
| SaSrtA wild-type | LPXTG (sparing activity to MPKTG, IPKLG, LPRAG, or MPXTG) | N-glycine, lysine (ϵ-amine, in pilin domain), unbranched (at α-carbon) primary amines, hydrazine | Kruger et al. (2007)69; Piotukh et al. (2011)51; Glasgow et al. (2016)37a; Li et al. (2014)38 |
| SrtA-LS | NPQTN, LPXTG | N-glycine | Bentley et al. (2007)50 |
| SaSrtA F21 | FPKTG | N-glycine | Piotukh et al. (2011)51 |
| SaSrtA F40 | APKTG, DPKTG, SPKTG | N-glycine | Piotukh et al. (2011)51 |
| SaSrtA A1-22 | APXTG | N-glycine | Schmohl et al. (2017)52 |
| SaSrtA F1-21 | APXTG, FPXTG | N-glycine | Schmohl et al. (2017)52 |
| SaSrtA r4M | LPXTG, LAETG | N-glycine, unbranched (at α-carbon) primary amines | Heck et al. (2014)72; Zou et al. (2018)49 |
| SaSrtA 5 M | LPXTG, LAETG, LPEAG, LPECG, LPESG | N-glycine | Brent et al. (2014)16 |
| SaSrtA 2 A-9 | LAETG | N-glycine | Brent et al. (2014)16 |
| SaSrtA 4S-9 | LPEAG, LPECG, LPESG | N-glycine | Brent et al. (2014)16 |
| SrtAβ | LMVGG, LPVGG | N-glycine | Podracky et al. (2021)48 |
| SaSrtA 7 M | LPXTG, LPAAG, MPATG, LPALG, VTAS | Glycine | Li et al. (2017)73 |
| SaSrtA | LPATG | N-glycine | Yang et al. (2022)54 |
| SpSrtA wild-type | LPXTG, LPETA, LPRLG, LPRAG | N-glycine, N-alanine, N-serine (sparing activity to N-asparagine, N-valine, N-threonine, or N-cysteine) | Schmohl et al. (2017)68 |
| SpSrtA M3 | LPXTG, LPETA, LPRLG | N-glycine, N-alanine, N-serine (sparing activity to N-cysteine), branched or unbranched (at α-carbon) primary amines | Zou et al. (2020)57 |
| SaSrtB wild-type | NPQTN | N-glycine | Bentley et al. (2007)50 |
| SpSrtC2 wild-type | QVPTG | N-glycine | Barnett et al. (2004)74 |
| SavSrtE wild-type | LAXTG | N-glycine | Das et al. (2017)75 |
| CdSrtA3M | LPRTG | Lysine (ϵ-amine, in pilin domain) | McConnell et al. (2018)58 |
| CdSrtAΔ | LPLTG | Lysine (ϵ-amine, in pilin domain) | |
| SpnSrtA | LPATA, LPATG, LPATS | Hydroxylamine | Piper et al. (2021)70 |
| SpnSrtAfaecalis | LPATA, LPATF, LPATG, LPATL, LPATS, LPATV | Hydroxylamine or N-alanine, N-serine, and N-valine | Piper et al. (2021)70 |
| SpnSrtAlactis | LPATA, LPATF, LPATG, LPATL, LPATS, LPATV | Hydroxylamine | Piper et al. (2021)70 |
| SpnSrtAoralis | LPATA, LPATG, LPATS | Hydroxylamine | Piper et al. (2021)70 |
| SpnSrtAsuis | LPATA, LPATF, LPATG, LPATM, LPATS, LPATY | Hydroxylamine | Piper et al. (2021)70 |
| CgSrtE | LAHTG | N-glycine | Susmitha et al. (2023)76 |

The specificity of SaSrtA at each position of the five-amino-acid sorting motif was thoroughly analyzed by McCafferty et al.69 The results demonstrated that the LPXTG motif is the most favored recognition motif of SaSrtA, but that recognition is not completely stringent. Activity toward different motifs such as MPXTG, LAXTG, LPXAG, LPXSG, LPXVG, or LPXLG (Table 2) has been observed. This finding offers promising opportunities for protein engineers to alter or expand the substrate specificity/scope of SaSrtA, particularly through the engineering of loop regions such as the β6/β7 and β7/β8 loops.

McCafferty et al. reported the first specificity engineering campaign, wherein they swapped the β6/β7 loop from SaSrtB (recognizing an orthogonal NPQTN motif) into SaSrtA.50 The resulting chimeric enzyme (SrtA-LS, Table 2) gained the ability to recognize NPQTN-containing substrates and cleave the amide bond between threonine and asparagine, although it was unable to perform the transpeptidation stage of the reaction.50

Another directed evolution campaign targeting the β6/β7 loop of SaSrtA for altered recognition motif specificity was reported in 2011.51 A phage display-based “panning” method was used to screen a simultaneously randomized library containing six amino acid residues on the β6/β7 loop. Finally, the variant F40 was identified (Table 2), which exhibited a comparable transpeptidase activity towards an orthogonal APKTG motif.51 Altered recognition motif specificity or substrate specificity represents a gain-of-function that was exploited for the traceless semi-synthesis of histone H3.51 Building on this work, a second library containing nine amino acid residues on the β6/β7 loop was generated and screened. This led to the generation of a variety of SaSrtA variants, such as SaSrtA A1-22 and SaSrtA F1-21, which displayed significantly enhanced activity for SML of APKTG/FPKTG and oligoglycine.52

In parallel, Liu and colleagues reprogrammed the substrate specificity of SaSrtA 5 M through directed evolution using the developed yeast display screening platform.43 They generated two families of SaSrtA variants that recognized orthogonal LAXTG and LPXSG motifs with catalytic efficiencies comparably high to those for the ‘standard’ LPXTG motif.16 The versatility of these evolved SaSrtA variants was demonstrated in site-specific modifications of endogenous fetuin A in human plasma, N- and C-termini dual functionalization of two therapeutically relevant fibroblast growth factor proteins (FGF1 and FGF2), and the orthogonal conjugation of fluorescent peptides onto different surfaces.16 More recently, the Liu lab used the yeast display screening platform once again to reprogram the specificity of SaSrtA to selectively recognize an endogenous LMVGG motif in Alzheimer's disease (AD)-associated amyloid-β (Aβ) protein.48 They identified a sortase variant, SrtAβ, with a > 1400-fold change in substrate preference for the LMVGG over the LPESG motif. SrtAβ efficiently conjugates a hydrophilic peptide to Aβ, significantly delaying the initiation of detectable aggregation, which could contribute to the development of new AD treatments.
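The idea of orthogonal motifs can be illustrated with a small lookup that reports which variants list a motif present in a given sequence. This is plain string matching against the donor motifs in Table 2, not a prediction of enzymatic activity: real variants show residual cross-reactivity that a lookup cannot capture. The variant labels are written without the spacing artifacts of the table, and the function names and all example sequences are hypothetical.

```python
import re

# Donor motifs as listed in Table 2; "X" stands for any residue.
VARIANT_MOTIFS = {
    "SaSrtA WT": ["LPXTG"],
    "SaSrtA 2A-9": ["LAETG"],
    "SaSrtA 4S-9": ["LPEAG", "LPECG", "LPESG"],
    "SrtAβ": ["LMVGG", "LPVGG"],
}


def motif_regex(motif):
    """Turn a motif such as 'LPXTG' into a compiled regular expression."""
    return re.compile(motif.replace("X", "[A-Z]"))


def recognizing_variants(sequence):
    """List the variants whose tabulated motifs occur in the sequence."""
    hits = []
    for variant, motifs in VARIANT_MOTIFS.items():
        if any(motif_regex(m).search(sequence) for m in motifs):
            hits.append(variant)
    return hits


# Hypothetical tagged substrates and a hypothetical LMVGG-containing fragment.
for seq in ("MKTLPETGHH", "MKTLAETGHH", "MKTLPESGHH", "GAIIGLMVGGVV"):
    print(seq, "->", recognizing_variants(seq))
```

In this idealized picture each example sequence is matched by exactly one variant, which is the property that dual or sequential labeling schemes rely on.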

In a further loop-swapping study, Amacher et al. generated chimeric sortases.70 They designed eight chimeric SrtAs by replacing the β7/β8 loop of Streptococcus pneumoniae sortase A (SpnSrtA) with loops from SrtAs of other species. Some of these chimeric SrtAs retained the activity and substrate specificity of the wild-type protein from which the loop sequence was derived (e.g., SaSrtA), while other chimeric SrtAs (e.g., SpnSrtAfaecalis and SpnSrtAlactis) recognized a range of residues in the final position of the substrate motif such as LPATA, LPATF, LPATG, LPATL, LPATS, and LPATY (Table 2).

The screening of acyl acceptors revealed that SaSrtA stringently recognizes proteins with an N-terminal glycine residue.68 Interestingly, specificity engineering of the acyl acceptor has barely been reported for SaSrtA. Sortase A homologs from other bacteria show a broad acyl acceptor scope (Table 2).68 A typical example is the sortase A from Streptococcus pyogenes (SpSrtA), which exhibits comparable activity towards peptides with N-terminal alanine or serine and towards the canonical N-terminal glycine. Several studies have demonstrated dual modifications of proteins using SaSrtA and SpSrtA with orthogonal motifs.40, 71 However, the low activity of SpSrtA often requires large SpSrtA quantities. Enhancing the activity of SpSrtA for different acyl acceptors through engineering would therefore expand applications of SML.

Recently, we reported the first rational design campaign of SpSrtA.57 After conducting a sequence and structure-based analysis, substitutions were introduced in positions near the active sites or within the β6/β7 and the β7/β8 loops. One variant, SpSrtA M3 (E189H/V206I/E215A), was identified and exhibited up to 6.6-fold improved activity towards N-terminal alanine or serine compared with wild-type (Table 2). Furthermore, the specificity of SpSrtA M3 for amine compounds was investigated. Unlike SaSrtA, which only recognizes primary amines that are unbranched at the α-carbon,37a SpSrtA M3 accepts both unbranched and branched primary amines as acyl acceptors (Table 2).57 This expanded specificity could facilitate peptide-amine bioorthogonal conjugations, for instance in vivo protein labeling and cell imaging.

2.4 Engineering of Calcium Independency

Specific modifications of intracellular proteins hold significant potential in applications such as tumor therapies, tissue engineering, cell imaging, and biosensing.77 The efficacy of SML lies in its high specificity and accessibility to the incorporated LPXTG or oligoglycine sequences, making it a broadly applicable tool for protein modifications. Importantly, its applicability extends beyond in vitro settings, as evidenced by emerging in vivo applications.37a However, the most widely used sortase, SaSrtA, is dependent on calcium for its activity and typically requires a high Ca2+ concentration (≥ 3 mM) to be highly functional.14, 28, 78 This poses a significant challenge for in vivo SML since the cytoplasmic concentration of calcium ions is approximately 100 nM, significantly lower than the optimal required concentration.47 Therefore, the development of SaSrtA variants with calcium-independent activity is crucial to broaden in vivo applications of SML.

The first study of Ca2+-independent SaSrtA variants was reported in 2012.53 Structural analysis revealed that four negatively charged residues, E105, E108, D112, and E171, are responsible for Ca2+ binding in SaSrtA (Figure 4a). Sequence alignment of the β3/β4 loop region of SaSrtA with that of six other, Ca2+-independent sortases A demonstrated that K105 and Q108 (SaSrtA numbering) are highly conserved in all of these sortases but not in SaSrtA (Figure 4b). Consequently, site-directed mutagenesis was performed at these two positions. Two variants, E105K/E108A and E105K/E108Q, were identified; in the absence of Ca2+, they exhibited approximately 35 % of the catalytic efficiency of SaSrtA WT measured in the presence of 10 mM Ca2+ (Table 3).53 Later, the identified substitutions at positions 105 and 108 were combined with SaSrtA 5 M and 8 M (Figure 3) to generate the Ca2+-independent variants 7 M79 and 7+.80 SaSrtA 7 M efficiently catalyzed the fusion of tyrosine kinases and maltose-binding protein in the cytoplasm of E. coli.79 SaSrtA 7+ exhibited comparable activity in the presence and absence of Ca2+, surpassing SaSrtA 7 M in both scenarios (Table 3).80


Figure 4. Rational design of Ca2+-independent SaSrtA. (a) The Ca2+ binding pocket between the β3/β4 loop (orange) and the β6/β7 loop (magenta). The residues that bind Ca2+ are highlighted in stick representation. (b) Amino acid sequence alignment of different sortases within the β3/β4 loop region.53 The residues E105 and E108, which were mutated for the engineering of Ca2+ independency, are highlighted in color. EfSrtA: sortase A from Enterococcus faecalis V583; LpSrtA: sortase A from Lactobacillus plantarum WCFS1; LlSrtA: sortase A from Lactococcus lactis subsp. lactis Il1403; LmSrtA: sortase A from Listeria monocytogenes EGD-e.

Table 3. Kinetics of engineered Ca2+-independent SaSrtA variants. Mutations of variants are shown. 5 M: P94R/D160N/D165A/K190E/K196T; 7 M: 5M-E105K/E108Q; 7+: 7M-D124G/Y187L/E189R; R15-78: Q60L/K67R/P94R/E105V/E108G/D124N/K138I/D160V/D185V/E189K/K196N/E204V.

| SaSrtA variant | kcat (s−1) in absence of calcium | Km LPXTG (mM) in absence of calcium | kcat/Km (s−1 mM−1) | Reference |
| --- | --- | --- | --- | --- |
| WT | 0.019 | 7.28 | 0.004 | Gianella et al. (2016)47 |
| E105K/E108A | 0.18 | 1.55 | 0.116 | Hirakawa et al. (2011)53 |
| E105K/E108Q | 0.16 | 1.40 | 0.114 | Hirakawa et al. (2011)53 |
| 5 M | 0.47 | 4.56 | 0.103 | Gianella et al. (2016)47 |
| 7 M | 1.85 | 0.32 | 5.781 | Hirakawa et al. (2015)79 |
| 7+ | Higher than 7 M | Not determined | Higher than 7 M | Joeng et al. (2017)80 |
| R15-78 | 0.57 | 1.94 | 0.294 | Gianella et al. (2016)47 |
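The catalytic efficiencies in Table 3 are simply the ratio of the two preceding columns. The following minimal Python sketch recomputes kcat/Km and a fold-change from the tabulated kcat and Km values. The dictionary and function names are illustrative assumptions, not part of any cited work, and ratios recomputed this way may differ slightly from the tabulated kcat/Km values, which were taken from the cited studies.

```python
# Values copied from Table 3 (absence of Ca2+): (kcat in s^-1, Km in mM).
# The 7+ variant is omitted because no numerical values are tabulated for it.
TABLE3 = {
    "WT": (0.019, 7.28),
    "E105K/E108A": (0.18, 1.55),
    "E105K/E108Q": (0.16, 1.40),
    "5 M": (0.47, 4.56),
    "7 M": (1.85, 0.32),
    "R15-78": (0.57, 1.94),
}


def catalytic_efficiency(kcat, km):
    """Return kcat/Km in s^-1 mM^-1."""
    return kcat / km


def fold_over(reference, variant, table=TABLE3):
    """Fold-change in kcat/Km of `variant` relative to `reference`."""
    ref = catalytic_efficiency(*table[reference])
    var = catalytic_efficiency(*table[variant])
    return var / ref


if __name__ == "__main__":
    for name, (kcat, km) in TABLE3.items():
        eff = catalytic_efficiency(kcat, km)
        print(f"{name:12s} kcat/Km = {eff:.3f} s^-1 mM^-1")
    print(f"R15-78 vs. WT: {fold_over('WT', 'R15-78'):.0f}-fold")
```

With the tabulated kcat and Km values, the evolved variant comes out at a little over 110-fold above WT, in line with the enhancement reported for the directed evolution campaign.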

In 2016, a compartmentalization-based ultra-high-throughput screening method was developed to evolve SaSrtA towards high activity in the absence of Ca2+ ions.47 After 15 rounds of evolution, a variant R15-68 was identified, showing a 114-fold enhancement in kcat/Km (vs. SaSrtA WT) in the absence of Ca2+ (Table 3).47 Interestingly, the substitutions E105V and E108A were identified in the variant R15-68, at the same positions that the previous study had targeted for calcium independency.53 Additionally, the substitutions P94R, D124N, D160V, E189K, and K196N were obtained in the variant R15-68, in agreement with previous studies on improved catalytic efficiency.44 The intracellular activity of the variant R15-68 was further validated by catalyzing a head-to-tail cyclization of green fluorescent protein (GFP) in the cytoplasm.

More recently, Wu et al. conducted an engineering campaign of SaSrtA for Ca2+-independent ligation activity specifically towards histone H3.54 By deleting positive charges and introducing negative charges around the substrate binding pocket of the SaSrtA 5 M variant, specifically through R94S and A165D (the latter being a reversion to the wild-type residue), a hexa-mutant 6 M was generated. This 6 M variant exhibited a 13.2-fold improved catalytic efficiency for ligation of the histone H3L (26–39) peptide in comparison to the 7 M variant.81 Notably, 6 M was successfully utilized to edit the H3L tail on native chromatin in isolated nuclei and living cells.

2.5 Engineering of Enhanced Robustness

SML has been studied for the labeling of proteins in batch and continuous-flow systems.15 Flow-based systems offer several advantages over one-pot SML, including the elimination of purification steps for LPXTG substrates, a low glycine nucleophile concentration that minimizes the back reaction, and immediate product release. However, the storage and process robustness of SaSrtA remains a significant challenge for cost-effective applications in flow systems. Therefore, there is a need for engineered SaSrtA with enhanced robustness with respect to process conditions, such as elevated temperatures, the presence of proteases, and denaturing agents.

In 2014, the first study aiming to improve the resistance of SaSrtA against denaturing agents was reported.62 The researchers implemented a head-to-tail backbone cyclization of SaSrtA using an intein-mediated posttranslational modification. The cyclized SaSrtA, referred to as CsrtAΔN59, exhibited high resistance and activity in the ligation of peptides and proteins in the presence of up to 2.5 M urea.62

In 2018, Grossmann et al. introduced an approach called in situ cyclization of proteins (INCYPRO) to stabilize SaSrtA through intramolecular crosslinks.60 In detail, they introduced the three substitutions D111C, E149C, and K177C into the sortase enzyme (referred to as SaSrtA S7) and achieved bicyclic stabilization upon reaction with a triselectrophile t1. The resulting bicyclic enzyme, named SaSrtA S7-t1, exhibited significantly improved thermal stability (melting temperature Tm=70.6 °C vs. Tm=59.4 °C for SaSrtA WT) and increased enzymatic activity (8.7-fold) at 65 °C.60 SaSrtA S7-t1 also demonstrated remarkable resistance to denaturing conditions, retaining 40 % activity in the presence of 1 M GdnHCl, whereas SaSrtA WT was completely inactivated.60 More recently, the authors further utilized INCYPRO to engineer SaSrtA 8 M (Figure 3) and generate a bicyclized variant xS11, which showed a 12 °C improvement in Tm and high resistance under denaturing conditions.61

In 2020, we conducted an engineering campaign to enhance the robustness of a highly active variant, SaSrtA rM4 (P94S/D160N/D165A/K196T).43 The variant rM4 exhibited a 140-fold improved activity but showed a 10.8 °C reduced Tm compared to SaSrtA WT (48.6 °C (rM4) vs. 59.4 °C (WT)).56 The latter was attributed to the β6/β7 loop of SaSrtA, which is reported to have high mobility and to be involved in substrate binding and ligation.25a, 49 To address the lowered Tm value, which often correlates with process robustness, we screened eleven single site-saturation mutagenesis (SSM) libraries at positions 159, 161, 162, 163, 164, 166, 167, 168, 169, 170, and 172, which lie in the C-terminus of the β6 strand and the β6/β7 loop of SaSrtA rM4. Subsequent recombination of beneficial mutations resulted in a variant called M6 (rM4-R159N/K162P, Figure 5a). The M6 variant has a melting temperature of 54.6 °C, a 6.0 °C increase over rM4. Furthermore, by subjecting M6 to head-to-tail backbone cyclization, we generated a cyclized variant CyM6 with a melting temperature of 56.1 °C. In comparison to rM4, CyM6 showed up to a 4.6-fold increase in resistance and up to a 2.6-fold increase in the yield of peptide and primary amine conjugates under denaturing conditions.56


Figure 5. (a) Structural representation of mutated residues in SaSrtA (PDB: 2KID) for enhanced thermal stability (R159N, K162P) and solvent resistance (R159G, D165Q, D186G, and K196V). (b) Mutated residues (D81G, W83G, and N85A) in the lid region of CdSrtA (PDB: 5K9A) for enhanced activity in lysine-isopeptide bond formation. The active Cys residue, the β6/β7 loop, and the β7/β8 loop are shown in green (stick representation), magenta, and blue, respectively.

As mentioned above, the relatively high Km value (> 5 mM)55, 82 of SaSrtA often requires high concentrations of peptide nucleophiles in SML to achieve a high bioconjugation yield. However, the latter can be limited by the solubility of the peptides. To overcome this limitation, organic solvents such as dimethyl sulfoxide (DMSO) and dimethylformamide (DMF) are commonly added as co-solvents to improve the solubility of hydrophobic molecules. It has been previously reported that the activity of SaSrtA is significantly reduced at co-solvent concentrations ≥ 30 % v/v.14 In 2018, we carried out a directed evolution campaign to improve the resistance and activity of SaSrtA in high concentrations of DMSO (≥ 45 % v/v). Two SaSrtA variants, M1 (R159G) and M3 (D165Q/D186G/K196V), with increased resistance (2.2-fold) and activity (6.3-fold) in 45 % (v/v) DMSO were identified (Figure 5a). We also tested the activity of these sortase variants in other co-solvents. Interestingly, M1 retained its high resistance profile in DMF and methanol, while M3 showed remarkably enhanced specific activity (vs. WT) in 30 % (v/v) DMF, 30 % (v/v) ethanol, and up to 50 % (v/v) methanol.49 Notably, the position R159, which lies in the C-terminus of the β6 strand (Figure 5a), was also found to contribute to thermal stability in the study described above.56 This finding suggests that mutations at position 159 may affect the flexibility/mobility of the β6/β7 loop. D165Q is located at the N-terminus of the 3₁₀ helix (within the β6/β7 loop), which forms upon substrate binding. Additionally, D186G and K196V are situated at the N- and C-termini of the β7/β8 loop, respectively (Figure 5a). Collectively, these results suggest that alterations in these loops impact not only the activity but also the stability of SaSrtA. Proof of concept was successfully demonstrated for M3 by ligating a hydrophobic peptide and by biohybrid conjugation of primary amines with up to 94 % conversion in the presence of DMSO (45 % (v/v)) as a co-solvent.49 These findings highlight the promising potential of SML for versatile bioconjugation in synthetic applications beyond natural aqueous solvents.

2.6 Engineering for Enhanced Activity in Lysine-Isopeptide Bond Formation

SML is usually restricted in two ways: (1) modifications take place only at the termini of the protein of interest, and (2) only a single molecule can be grafted to the protein harboring the recognition motif. In nature, Gram-positive bacteria such as Corynebacterium diphtheriae and Actinomyces naeslundii assemble pilin monomers into pilin polymers through isopeptide bond formation catalyzed by sortase A.83 In this process, the sortase A-protein thioester undergoes a nucleophilic attack by the ϵ-amino group of a lysine residue in a pilin domain (WXXXVXVYPKH, where X can be any amino acid residue) (Scheme 2a).83b
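Because both recognition elements mentioned here are short sequence patterns, candidate sites in a protein sequence can be located with a simple pattern search. The sketch below is a minimal illustration in Python: the LPXTG sorting motif and the WXXXVXVYPKH pilin motif are taken from the text, whereas the function names and the toy sequence are hypothetical and only serve the example. A sequence match alone does not imply that a site is accessible or reactive.

```python
import re

# "X" (any amino acid) is written as "." in the regular expressions.
MOTIFS = {
    "sorting motif LPXTG": r"LP.TG",
    "pilin motif WXXXVXVYPKH": r"W...V.VYPKH",
}


def find_motifs(sequence, motifs=MOTIFS):
    """Return {motif name: [(1-based start position, matched text), ...]}.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    seq = sequence.upper()
    hits = {}
    for name, pattern in motifs.items():
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in regex.finditer(seq)]
    return hits


def acceptor_lysine_position(motif_start):
    """1-based position of the lysine (K) within a WXXXVXVYPKH match.

    The lysine is the tenth residue of the motif, so it sits nine
    positions after the motif start.
    """
    return motif_start + 9


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real protein.
    toy = "MGSWAAAVAVYPKHGGSLPETGHHHHHH"
    for name, found in find_motifs(toy).items():
        print(name, found)
```

In the toy sequence, the pilin motif starts at position 4, which places the acceptor lysine at position 13, and the LPETG sorting motif starts at position 18.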


Scheme 2. (a) Sortase-mediated isopeptide formation between an LPXTG-tagged probe and the ϵ-amino group of a lysine residue in the pilin domain sequence WXXXVXVYPKH (X represents any amino acid residue).58, 84 (b) Reducing the reversible reaction of sortase-mediated ligation (SML) by introducing a rigid β-hairpin structure around the LPXTG recognition site in the fusion product.86 (c) Reducing the reversibility of SML by incorporating Ni2+-assisted inactivation to prevent competitive nucleophilic attacks.87 POI: Protein of Interest.

In light of these observations, Roy and co-workers investigated the capability of SaSrtA in isopeptide SML.84 Using a synthesized model peptide containing the above sequence, they showed that SaSrtA attached LPXTG peptide substrates to the side chain of lysine residues in the pilin domain and formed cyclized and branched oligomers.84 As a step towards more efficient isopeptide ligation, engineering campaigns of Corynebacterium diphtheriae sortase A (CdSrtA) were reported in 2018.58, 85 After a high-resolution crystal structure of CdSrtA had been obtained,85 substitutions were introduced to unmask the lid of the active site. A variant with three substitutions, CdSrtA3M (D81G/W83G/N85A), was finally identified and showed up to 10.5-fold improved yield (vs. WT) and 95 % conjugation efficiency in SpaA isopeptide modification (Figure 5b).58 Notably, the ϵ-amino group of lysine is a distinct acyl acceptor in comparison to oligoglycine. Therefore, CdSrtA3M and SaSrtA were used to orthogonally label the small ubiquitin-like modifier (SUMO) protein at its N- and C-termini in a one-pot reaction.58 In further studies, the inhibitory polypeptide appendage, a lid that normally masks the active site, was removed. This resulted in a more active variant, CdSrtAΔ, which showed a more than 3.6-fold increase in catalytic efficiency compared to CdSrtA3M.59

2.7 Reducing Reversible Reaction

SML is susceptible to a reverse reaction because the released peptide fragment carries a free primary amine and therefore acts as a nucleophile that competes with the intended substrate. To address this challenge, Nagamune and colleagues introduced a rigid and stable β-hairpin structure around the LPXTG recognition motif, resulting in a product that is not recognized by sortase A (SrtA) (Scheme 2b).86 They demonstrated that a fusion protein consisting of thioredoxin (Trx) and maltose binding protein (MBP), incorporating a GSKKWTWTWLPATGGWTWTWQESS-based β-hairpin, exhibited significantly enhanced resistance towards sortase A-mediated cleavage in the presence of an excess amount of triglycine. Meanwhile, the ligation of Trx-GSKKWTWTWLPATGG and GGWTWTWQESS-MBP resulted in a 70 % conversion to the fusion product, whereas the control achieved a conversion of less than 50 %. In 2015, Antos et al. reported a metal-assisted sortase-mediated ligation (MA-SML) strategy to improve the ligation efficiency of SML.87a This approach involved a minor modification of the substrate motif, replacing the last glycine with histidine (LPXTGGH). During the SML process, the released byproduct GGH, which is both a Ni2+-binding peptide and an active nucleophile, was inactivated by the addition of Ni2+, thereby reducing competitive nucleophilic attacks and minimizing reversibility (Scheme 2c). By employing MA-SML, ligation conversions of up to 91 % were achieved using equimolar ratios of the glycine nucleophile. Recent developments in MA-SML have further expanded its applications to the ligation of diverse cargos with high yields.87b
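To illustrate why reversibility caps the conversion, the following Python sketch treats SML as a simple mass-action equilibrium, substrate + nucleophile ⇌ product + byproduct. This is a deliberately crude model under stated assumptions: the equilibrium constant K = 1 is assumed (the cleaved and the newly formed peptide bond are chemically similar) rather than measured, and enzyme kinetics, hydrolysis, and the function name itself are not taken from the cited studies.

```python
import math


def equilibrium_conversion(nucleophile_equiv, K=1.0):
    """Fraction of LPXTG substrate converted at equilibrium.

    Assumed model: substrate + nucleophile <=> product + byproduct, starting
    from 1 equivalent of substrate and `nucleophile_equiv` equivalents of
    nucleophile, with no product or byproduct present initially.
    Solves K * (1 - x) * (r - x) = x**2 for the conversion x.
    """
    r = float(nucleophile_equiv)
    if r <= 0 or K <= 0:
        raise ValueError("nucleophile_equiv and K must be positive")
    if math.isclose(K, 1.0):
        # The quadratic term cancels for K = 1, leaving x = r / (1 + r).
        return r / (1.0 + r)
    a = K - 1.0
    b = -K * (1.0 + r)
    c = K * r
    disc = b * b - 4.0 * a * c
    # This root is the physically meaningful one (0 <= x <= min(1, r)).
    return (-b - math.sqrt(disc)) / (2.0 * a)


if __name__ == "__main__":
    for equiv in (1, 2, 5, 10, 20):
        x = equilibrium_conversion(equiv)
        print(f"{equiv:>2} equiv nucleophile -> {100 * x:.1f} % conversion")
```

Under these assumptions, equimolar substrates cannot exceed 50 % conversion, and a large excess of nucleophile is needed to approach completion. Strategies that render the product unrecognizable or inactivate the byproduct remove one side of the back reaction, so that high conversions become accessible at equimolar ratios.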

In addition to the previously mentioned strategies for reducing reversibility, several other methods have been explored to inhibit or minimize reversibility in SML. However, these methods often require the use of non-natural substrates (see next chapter).

2.8 Development of Unnatural Sortase-Mediated Ligations

The initial investigation into unnatural SMLs employing unnatural substrates was conducted in 2012.88a The authors synthesized an array of depsipeptides bearing an LPET motif and used them as substrates for N-terminal labeling of proteins through SML. The resulting side-products, alcohols, are poor nucleophiles for sortase, and therefore the reverse reaction in SML is minimized (Scheme 3a). The SML reaction proved highly efficient, and its widespread utility for the N-terminal modification of proteins was reported.88b In addition to the remarkable ligation efficiency, this research also provided evidence that sortases can recognize unnatural motifs beyond the typical amide-bond backbone structure. This observation highlights the potential of SML for noncanonical ligations.


Scheme 3. Unnatural sortase-mediated ligations (SMLs). (a) Using depsipeptide substrates for SML results in nonnucleophilic hydroxyacetyl byproducts.88 (b) Concurrent inactivation of the nucleophilicity of the byproduct through diketopiperazine (DKP) formation.89 (c) Preventing the reversible reaction of SML by generation of a protein hydrazide that is no longer recognized by sortase.38 (d) Achieving irreversible activity by thioester-assisted SML.90 POI: Protein of Interest.

In 2014, an irreversible ligation of modified substrates (LPETGG-isoacyl-Ser and LPETGG-isoacyl-Hse, Scheme 3b) to amine nucleophiles of interest with SML was reported.89 The irreversibility was achieved through the concurrent inactivation of the SrtA-excised peptide fragment by diketopiperazine (DKP) formation.

Liu et al. introduced a novel and irreversible SML process based on the hydrazinolysis of proteins (Scheme 3c).38 The generated protein hydrazide is no longer recognized by sortases, which results in an irreversible reaction with minimized hydrolysis. Under optimized conditions, yields > 95 % of protein hydrazides were achieved within 1 h. The utility of this process was exemplified in the semi-synthesis of deoxy-D-ribose 5-phosphate aldolase (DERA), the functionalization of ubiquitin, and the fluorescent labeling of anti-EGFR (epidermal growth factor receptor) nanobodies. Of particular interest, SaSrtA WT exhibited comparable activity in the hydrazinolysis of the LPXAG and LPXTG motifs, while previous studies indicated only partial activity in conjugation to LPXAG using triglycine as the nucleophile.68

More recently, Liu et al. reported another irreversible SML method called thioester-assisted SrtA-mediated ligation (Scheme 3d).90 This technique efficiently enables bioconjugation between a protein thioester and a protein with an N-terminal Gly. Notably, the method tolerates a wide variety of LPXT-derived sequences as substrates, thereby overcoming the two primary limitations of the SrtA technique: reversibility and a strong preference for the LPXTG motif.90 The availability of peptide/protein thioesters through many recombinant methods or chemical synthesis further facilitates thioester-assisted SrtA-mediated ligation, enabling the efficient functionalization of proteins from both termini with expanded recognition motifs beyond the canonical LPXTG. These advancements address the yield and back-reaction challenges in SML.

3 Recent Advancements: Integrating Sortase-Mediated Ligation with Cutting-Edge Chemical and Biological Tools

In the preceding chapters, numerous innovative applications of SML that utilize engineered variants or explore noncanonical ligation pathways have been discussed. However, these examples are by no means an exhaustive list. In the past few years, remarkable advances of SML have been reported for vaccine engineering,91 multi-fragment assemblies,92 proximity-based ligation with shorter reaction times and greater efficiency,63, 67 high-strength force spectroscopy,93 biomaterials,33b, 94 high-throughput screening,55, 95 ligation of d-peptides,96 and more. In this chapter, several recent examples are highlighted to demonstrate the expanded versatility of SML in combination with other chemical and biological methodologies, such as click chemistry, incorporation of unnatural amino acids (UAAs), and CRISPR-based genome editing (Figure 6). These combinations allow us to address forefront research challenges in fields such as biosensors, bioorthogonal chemistry, and therapeutics.


Figure 6. Recent advances in integrating sortase-mediated ligation (SML) with state-of-the-art chemical and biological tools. (a) SML integrated with click chemistry (Huisgen cycloaddition as an example) for terminus labeling of a protein (C-terminus as an example). (b) Site-specific functionalization of proteins through the combination of unnatural amino acid incorporation, bioorthogonal Staudinger reduction, and SML.102 (c) Site-specific functionalization of proteins in living cells through the combination of CRISPR-Cas9-based genomic editing and SML.

3.1 Combination of Sortase-Mediated Ligation and Click-Chemistry

Click chemistry reactions, such as Huisgen cycloaddition and Staudinger ligation, which incorporate small abiotic functional groups, are well recognized for their remarkable reaction efficiency, unparalleled chemoselectivity, and biocompatibility under mild reaction conditions. These features make click chemistry a valuable tool for biomolecule functionalization, particularly for proteins, both in vitro and in vivo.97 The emergence of click chemistry led to the development of bioorthogonal chemistry, and both were jointly recognized with the Nobel Prize in Chemistry in 2022. The combination of the bioorthogonality of click chemistry with the sequence specificity of SML provides exceptionally powerful tools for protein bioconjugation. In 2011, Ploegh et al. first demonstrated the application of SML for grafting click handles onto the N- or C-terminus of proteins. Subsequently, these clickable proteins were conjugated via a strain-promoted cycloaddition in an aqueous environment at neutral pH, resulting in the formation of noncanonical N-to-N and C-to-C fused dimers.98 The same sortase-click strategy was then used to generate a C-to-C-fused antibody with bispecific anti-influenza virus activity and in many other applications.99

Another notable application of the sortase-click strategy was carried out by Kondo et al.100 They developed a hydrogel by combining streptavidin and branched poly(ethyleneglycol) to efficiently immobilize biomolecules. In their approach, tetrameric streptavidin molecules containing four LPETG tags were linked to azido-GGG peptides using sortase A. Subsequently, by introducing dibenzylcyclooctyne-modified branched poly(ethyleneglycol) and azido-modified branched poly(ethyleneglycol) as flexible spacers, a streptavidin-based hydrogel was formed through copper-free click chemistry. By leveraging the strong interaction between streptavidin and biotin, the hydrogel functioned as a carrier for immobilizing biotinylated proteins. To showcase its practicality, the researchers immobilized glucose dehydrogenase and applied a coating of this gel onto a glassy carbon electrode for glucose fuel cells.100

Recently, we have developed a versatile toolbox for surface functionalization of a broad range of materials, including polymers, metals, and silicon, using the sortase-click strategy.101 In this study, an evolved sortase variant M3 was used to ligate the hydrophobic click reagent dibenzocyclooctyne (DBCO) to the C-terminus of a universal material binding peptide called LCI. Subsequently, LCI-DBCO adhered to different materials such as polypropylene, polystyrene, gold, stainless steel, and silicon. By conducting the click reaction between surface-bound LCI-DBCO and a fluorescent Cy-3 azide, the samples coated with LCI-DBCO exhibited up to 3.9-fold higher fluorescence intensity compared to the corresponding controls.101 These results highlight the versatility of the surface functionalization approach, paving the way for novel coatings with unique properties. The combination of material-specific binding peptides and sortase conjugation opens up a new field for sortase-mediated functionalization of materials/surfaces decorated with material binding peptides.

3.2 Combination of Sortase-Mediated Ligation and Unnatural Amino Acid Incorporation

Incorporation of UAAs bearing functional groups at specific positions in a target protein offers a powerful tool for bioorthogonal chemistry. Leveraging the capabilities of UAAs, significant advancements have been made in site-specific protein labeling and functionalization, both in vitro and in vivo.103 The combination of UAAs and SML techniques for site-specific protein engineering holds significant potential that has not yet been fully realized. In 2019, the Lang group reported the combination of UAA incorporation, bioorthogonal Staudinger reduction, and SML to develop a versatile tool termed Sortylation for the controlled ubiquitylation of proteins (Figure 6b).102 Specifically, they performed site-specific incorporation of a UAA, AzGGK, into target proteins using genetic-code expansion methodology. Subsequently, the AzGGK side chain, carrying a terminal azide moiety, was converted in vivo to GGK through Staudinger reduction. The resulting GGK, bearing a sortase-accepted primary amine moiety, was then ligated to an LPLTG- or LALTG-tagged ubiquitin using the evolved SaSrtA 5 M variant (Figure 3).43 The versatility of Sortylation was shown in the ubiquitylation/SUMOylation of different proteins in living cells such as E. coli and HEK293T cells. Based on Sortylation, the Lang lab further developed a modular toolbox, namely Ubl-tools, to generate polymeric ubiquitin architectures stepwise using evolved SaSrtA variants that recognize orthogonal motifs.104

3.3 Combination of Sortase-Mediated Ligation and CRISPR-Based Genome Editing

In the past decade, we have witnessed the impact of CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) in biological research and medicinal applications.105 A prerequisite for using SML is the presence of a sortase recognition motif such as LPXTG within the target protein. For this reason, in most cases, mutagenesis methods are required to incorporate the motif at the gene level. CRISPR technology offers several significant advantages over traditional PCR-based mutagenesis methods: simplicity in handling (no requirement for an isolated template or PCR-based in vitro cloning), simultaneous targeting of multiple sites within the genome, and high flexibility for gene knockouts, insertions, and other complex modifications.106 All these features make CRISPR an extremely powerful tool for in vivo applications in general.107

The first proof of principle of such a CRISPR-Sortase strategy was carried out in 2014 by Lodish et al., who used CRISPR-Cas9 to insert an LPETG motif at the C-terminus of murine endogenous Kell, a type-II membrane protein of red blood cells. The generated Kell-LPETG did not inhibit erythroid differentiation and was retained on the plasma membrane of mature red blood cells. Further studies demonstrated that the LPXTG-tagged red blood cells can be labeled by SML with probes such as biotin with an efficiency of 81±28 %, as determined by flow cytometry.108 Based on this work, the authors further covalently attached disease-associated autoantigens to red blood cells as a means of inducing antigen-specific tolerance. The results showed that transfusion of red blood cells bearing self-antigen epitopes alleviated and prevented signs of disease in experimental autoimmune encephalomyelitis and maintained normoglycemia in a mouse model of type 1 diabetes.109 All these results highlight the immense potential of applying the CRISPR-Sortase strategy for the prophylaxis and therapy of autoimmune diseases.

In 2018, Muzykantov and his colleagues developed a straightforward approach to produce site-specifically modified antibodies using the CRISPR-Sortase strategy. They utilized CRISPR-Cas technology to genetically incorporate an LPETGG motif into the C-terminus of the third immunoglobulin heavy chain constant region (CH3) within a hybridoma cell line. This resulted in the production of antibodies that were readily capable of site-specific bioconjugation by SML. By performing SML, fluorescent and radioactive cargoes were conjugated to the antibodies. Notably, these immunoconjugates exhibited almost double the specific targeting in the lung compared to chemically conjugated maternal mAb. Furthermore, there was a concurrent reduction in uptake in the liver and spleen. The approach described in this study offers a straightforward method for cost-effective production of homogeneous, effective, and scalable antibody conjugates for a broad range of therapeutic and diagnostic applications.110

More recently, Scheeren et al. developed a versatile CRISPR/HDR methodology to efficiently engineer the constant immunoglobulin domains and generate recombinant hybridomas that secrete antibody products, such as antigen-binding fragments, isotype-switched chimeric antibodies, and Fc-silent mutants, in preferred formats, species, and isotypes. Furthermore, incorporating the sortase recognition motif into the antibodies enabled site-specific attachment of cargo to these antibody products via chemoenzymatic modification.111 Based on this work, an expanded genomic engineering toolbox enabling dual modification of antibodies at the heavy-chain (HC) and light-chain (LC) loci of a mouse IgG1 (mIgG1) hybridoma was reported. In brief, two orthogonal sortase A recognition motifs were incorporated into the HC and LC of the Fab using CRISPR/Cas9, which resulted in a dual-tagged Fab. Subsequently, two distinct cargos were sequentially and site-specifically conjugated to the dual-tagged Fab using two evolved sortase variants with the corresponding specificities. The latter case demonstrates the versatile use of multifunctional conjugates for therapeutic applications.112

4 Summary and Outlook

Given the remarkable progress in protein engineering, biorthogonal chemistry, and bioinformatics, significant advancements have been achieved in site-specific bioconjugation methodologies over the past two decades, facilitating a broad spectrum of biotechnological and therapeutic applications.2a, 113 This review aims to provide a comprehensive overview of the versatility and utility of sortase-mediated ligation (SML), which stands as the most extensively employed methodology in this field. However, it is essential to acknowledge that sortases were not initially the most widely utilized enzymes for site-specific bioconjugation. Most identified wild-type sortases possess certain limitations, including suboptimal catalytic efficiency, limited robustness, dependence on calcium ions, reversibility, and restricted specificity. Fortunately, many of these challenges have been effectively or partially addressed (Table 1). Through directed evolution campaigns, the catalytic efficiency of SaSrtA has been significantly improved by approximately 100-fold to 16.7 s−1⋅mM−1 (Figure 3).43, 44 This improvement has enabled rapid and efficient in vitro ligation, thereby expanding the application range of SML (Table 1 and Figure 6).114 By implementing directed evolution and loop engineering,16, 48, 51, 70 the specificity of sorting motifs has been partially altered and reprogrammed, resulting in a broadened motif scope (Table 2). 
These newly acquired recognition specificities have demonstrated substantial utility in targeting previously unexplored proteins48 and facilitating the simultaneous labeling of a single protein bearing orthogonal motifs.94a, 104 The robustness and process stability of SaSrtA have been remarkably enhanced through chemical modification and directed evolution, thereby enabling its versatile utilization in site-specific bioconjugation under challenging conditions such as organic co-solvents,49 elevated temperatures,60 and chaotropic agents.61 These advancements highlight the potential of SML in industrial applications such as flow-based enzymatic ligation platforms.15 By employing semi-rational/rational design53 and directed evolution,47 key residues and substitutions have been identified that boost the activity of SaSrtA in the absence of Ca2+, thereby unlocking the potential of SML for in vivo applications (Table 3).37a, 81 Through substrate engineering and modification, the reversibility of SML can be partially or fully suppressed (Schemes 2 and 3). Furthermore, protein engineering has significantly improved the activity of sortases in isopeptide bond formation,59 although the overall efficiency still lags behind that of canonical terminal ligations (Table 1).

Despite remarkable progress in the field, challenges remain to be addressed before SML can be utilized more widely. One of the main limitations lies in the restricted range of sorting motifs. Ideally, modification should occur readily with minimal or no genetic manipulation of the target proteins, particularly for in vivo applications. To overcome these limitations, further advancements are expected in the coming years by harnessing genomic mining and protein engineering techniques.115 Specifically, exploring other site/sequence-specific sortases is promising, as numerous housekeeping sortases are abundantly found in Gram-positive bacteria, yet their characterization remains incomplete. Understanding the mechanisms by which sortase recognizes its protein substrate(s), combined with loop engineering and directed evolution, can pave the way for the discovery, engineering, and reprogramming of more versatile sortase enzymes.25a, 70, 116 These enzymes would enable efficient labeling of a broader range of motifs at protein termini.116, 117 An ideal outcome would be a platform comprising a library of sortases from various species and their variants. Each sortase (or variant) would exhibit high sequence specificity in recognizing one or a few corresponding motifs, thereby empowering researchers to carry out site-specific bioconjugations for multiple targets with minimal or even no genetic manipulation.

Protein functionalization at endogenous residues offers remarkable advantages and potential compared with functionalization at the termini, owing to the larger selection of available sites and the possibility of attaching multiple labels. Currently, the majority of SML applications focus on the termini of target proteins, and the potential of SML at endogenous lysine residues through the formation of isopeptide bonds remains largely untapped. The challenges in this respect stem from two main factors. Firstly, the catalytic activity is relatively low, with a kcat < 4×10−4 s−1.59 Secondly, the large endogenous recognition motif, WXXXVXVYPK, presents difficulties in genetic engineering and carries a high risk of protein inactivation or defunctionalization. While protein engineering has partially addressed the first challenge,58, 59 catalytic efficiency for isopeptide bond formation remains nearly four orders of magnitude lower than that of the most active SaSrtA variants used for termini ligation (kcat/Km of 2.7 vs. 16722 s−1 M−1).44, 59 Regarding the second challenge, recent progress has demonstrated proximity-based sortase-mediated isopeptide ligation at lysine residues in antibodies, but it relies on the specific attachment of sortase to the target and provides only partial site-specificity.67 Further improvements, such as shorter pili motifs with higher specificity, can potentially be achieved through directed evolution of sortase or the discovery of new functional sortase species, as mentioned earlier.
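The gap in catalytic efficiency quoted in this paragraph can be checked with a short calculation. The following is a minimal sketch, not part of the original analysis: it uses only the two kcat/Km values given in the text, and the names (ISOPEPTIDE_KCAT_KM, TERMINAL_KCAT_KM, fold_difference) are illustrative assumptions.

```python
import math

# kcat/Km values quoted in the text, in s^-1 M^-1.
ISOPEPTIDE_KCAT_KM = 2.7      # sortase-mediated isopeptide bond formation
TERMINAL_KCAT_KM = 16722.0    # most active SaSrtA variant, termini ligation


def fold_difference(high: float, low: float) -> float:
    """Return how many times larger `high` is than `low`."""
    if low <= 0:
        raise ValueError("catalytic efficiency must be positive")
    return high / low


# Fold difference and its base-10 logarithm (orders of magnitude).
fold = fold_difference(TERMINAL_KCAT_KM, ISOPEPTIDE_KCAT_KM)
orders = math.log10(fold)

print(f"fold difference: {fold:.0f}")
print(f"orders of magnitude: {orders:.2f}")
```

With these two values the ratio is roughly 6200-fold, that is, between three and four orders of magnitude.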

In the preceding chapter, we discussed and summarized the emerging advancements of SML in several interdisciplinary research fields. These advancements have been achieved by integrating SML with other powerful cutting-edge technologies such as click chemistry, unnatural amino acid (UAA) incorporation, and genome editing, resulting in numerous innovative applications. For instance, sortase-click technology has enabled the synthesis of noncanonical N-to-N and C-to-C fused protein dimers.98 Combining material-binding peptides with sortase-click technology has opened new avenues for SML in material science.101 The integration of UAA incorporation, the bioorthogonal Staudinger reduction, and SML has led to a novel methodology called “Sortylation”, which has proven to be a powerful tool in bioorthogonal applications.102, 104 The combination of SML with CRISPR-based genome editing tools has been successfully employed for the treatment of autoimmune diseases in a mouse model and for the efficient production of versatile antibody conjugates for a broad range of therapeutic and diagnostic purposes.110-112

Despite these achievements, the full potential of these integrations, particularly for in vivo applications, has yet to be realized. We anticipate that the convergence of SML with other methodologies will continue to drive breakthroughs and practical applications, especially in in vivo research. In conclusion, this review highlights the significant impact of enzyme engineering on sortases and emphasizes their potential for advancing site-specific bioconjugation research. As SML continues to evolve and integrate with other cutting-edge approaches, we expect it to open up new avenues for research and practical applications across multiple disciplines.

Acknowledgments

Open Access funding enabled and organized by Projekt DEAL.

    Conflict of interests

    The authors declare no conflict of interest.

    Data Availability Statement

    The data that support the findings of this study are available from the corresponding author upon reasonable request.

    Biographical Information

    Zhi Zou studied biochemistry and molecular biology at the East China University of Science and Technology. He obtained his Ph.D. in 2019 from the RWTH Aachen University under the supervision of Prof. Ulrich Schwaneberg. During his doctoral studies, he specialized in protein engineering of sortases and the development of sortase-mediated methodologies for site-specific protein bioconjugation and surface functionalization. In 2020, he joined the research group of Prof. T. R. Ward at the University of Basel as a post-doctoral fellow. In this role, he has focused on the development of artificial metalloenzymes for new-to-nature biocatalysis and subsequent in vivo applications.

    Biographical Information

    Yu Ji received her bachelor's and master's degrees in chemistry from the Beijing University of Chemical Technology. She obtained her Ph.D. degree with distinction (summa cum laude) from the RWTH Aachen University in 2020, with a research focus on the directed evolution of enzymes for the synthesis of high-value compounds. She now works as a project leader at RWTH Aachen University, with a research focus on enzymatic plastic degradation.

    Biographical Information

    Ulrich Schwaneberg graduated in chemistry in 1996 and obtained his Ph.D. in 1999 under the supervision of Prof. R. D. Schmid at the University of Stuttgart. Following a postdoctoral stint at Caltech, where he worked in the laboratory of the Nobel laureate Prof. Frances H. Arnold, he was appointed as a professor at Jacobs University Bremen in 2002. In January 2009, he relocated to RWTH Aachen University, where he currently serves as the head of the Institute of Biotechnology. Additionally, since 2010, he has been coappointed to the scientific board of directors at the DWI Leibniz Institute for Interactive Materials. Furthermore, Ulrich Schwaneberg actively collaborates with Prof. Bergs at the competence center Bio4MatPro and holds the position of speaker for the RWTH profile area Molecular Science & Engineering. He is also a cofounder of the companies SeSaM Biotech & Aachen Proteineers. His primary focus lies in protein engineering, wherein he aims to comprehend fundamental structure-function relationships. Through this work, he provides tailored proteins as building blocks for the biological transformation of material science and production, along with customized enzymes for bio- and biohybrid catalysis.
