Exploring the diversity of plant proteome
Edited by: Sixue Chen, University of Florida, USA
Abstract
The tremendous functional, spatial, and temporal diversity of the plant proteome is regulated by multiple factors that continuously modify protein abundance, modifications, interactions, localization, and activity to meet the dynamic needs of plants. Dissecting the proteome complexity and its underlying genetic variation is attracting increasing research attention. Mass spectrometry (MS)-based proteomics has become a powerful approach in the global study of protein functions and their relationships on a systems level. Here, we review recent breakthroughs and strategies adopted to unravel the diversity of the proteome, with a specific focus on the methods used to analyze posttranslational modifications (PTMs), protein localization, and the organization of proteins into functional modules. We also consider PTM crosstalk and multiple PTMs temporally regulating the life cycle of proteins. Finally, we discuss recent quantitative studies using MS to measure protein turnover rates and examine future directions in the study of the plant proteome.
INTRODUCTION
The enormous complexity of the plant proteome involves differences in protein abundance, domain structure, conformation, modifications, interactions, stability, and localization. This complexity completely obviates the old “one gene, one protein, one function” model. Indeed, genome sequencing of the model plant Arabidopsis thaliana revealed over 27,000 protein-coding genes; however, the difference between low- and high-abundant proteins spans several orders of magnitude (Mergner et al., 2020), and alternative splicing of RNAs generates thousands of different protein isoforms in plants (Figure 1A). Moreover, most proteins exist in various protein complexes or in several modified forms, and although many thousands of modification sites are predicted to exist across the plant proteome (Figure 1A), their combinatorial interactions remain mostly unknown. Furthermore, the plant proteome is modulated by ubiquitin- and lysosomal-mediated degradation systems. These regulatory pathways lead to dynamic changes in protein conformation, protein degradation, and the formation of multiprotein complexes. The mechanisms that underlie these modifications, dynamics, and the interactions of most proteins are poorly understood at the global level.

Plant proteome functional diversity and shotgun proteomics workflow
(A) The generation of cellular proteins increases the complexity of the plant functional genome. Multiple mechanisms contribute to the generation of the complex proteome, including the increase in coding potential using alternative transcription start sites and alternative splicing. The diversity of proteins is further increased through alternative upstream open reading frames in messenger RNA (mRNA) translation, and the different efficiencies of mRNA translation itself. The attachment of posttranslational modifications (PTMs) to proteins can lead to further diversification. Finally, the dynamic assembly of protein complexes with varied compositions that can potentially perform diverse molecular functions adds additional complexity. P, phosphorylation; Ub, ubiquitin. (B) Bottom-up shotgun proteomics workflow. In shotgun proteomic studies, protein samples are purified and digested into peptides. After separation using high-performance liquid chromatography (HPLC), the peptides are ionized and transferred into the vacuum of the mass spectrometry (MS) instrument through electrospray ionization (ESI). In the MS analyzer, data-dependent acquisition (DDA) or data-independent acquisition (DIA) methods can be used for peptide analysis. The interpretation of fragment spectra and protein database searching are performed using software programs such as Mascot, MaxQuant, or Skyline. (C) MS-based quantitation in proteomics. Peptides can be quantified by label-free methods at the MS1 and MS2 levels by determining the precursor and fragment ion intensities, respectively, as an estimate of the amount of each protein (left and center). In the labeling-based quantitative approach, samples are labeled with different stable isotopes at the peptide or protein level. The quantification is then performed via reporter ions in the tandem mass spectra (right).
Proteomics enables the system-wide characterization of protein abundance, function, and interactions in plant cells, providing a comprehensive understanding of complex biological processes. Many non-mass spectrometric (non-MS) proteomic techniques, such as protein microarray analysis and high-throughput protein crystallization, have improved our understanding of protein abundance and structures (Li et al., 2020; Wang et al., 2020); however, these approaches are limited in dynamic range and resolution. Mass spectrometry-based proteomics has emerged as a powerful approach to study the interactions, modifications, and localizations of proteins in biological systems. With its rapidly expanding analytical tools, peptide-based “shotgun” proteomics has become a major method for studying the system-wide changes that occur in response to the external environment and developmental events in plants, changes that include posttranslational modifications (PTMs), localization, and protein turnover (Chen et al., 2010; Lopez-Torrejon et al., 2013; Larance and Lamond, 2015; Marx et al., 2016; Valdes-Lopez et al., 2019).
In this review, we highlight the advances made using MS technologies to investigate molecular diversity on a systems level, with a focus on PTM and organellar proteomics. We describe recent applications of proteomic methods, which have led to the assignment of the functions of thousands of PTM sites and various subcellular localizations in plants. We discuss how exploring the diversity of the cellular proteome could provide new insight into plant biology.
BOTTOM-UP PROTEOMICS IN THE META-DATA ERA
Advances in high-throughput omics methodologies for the global analysis of genes or proteins have reshaped the types and scope of biological experiments that can be performed. Traditional approaches, such as site-directed mutagenesis, immunoblotting, and cell imaging, only assess the functions of single proteins, whereas omics technologies enable high-throughput analysis of samples in response to control and time-series perturbations. System-wide omic studies generally rely on gathering meta-data and using external databases for data processing, mathematical modeling, and annotation mapping to elucidate a posteriori the molecular foundations that explain biological observation (Chen and Weckwerth, 2020).
Current MS methods have been used for systematic studies of the complexity of the proteomes of plant systems at all analytical levels, including analysis to generate genetic maps or comparative analysis to detect condition-specific changes. Owing to its high accuracy, resolution, and sensitivity, MS-based proteomics can be broadly used for targeted or untargeted biological analyses (Figure 1B). Thus, numerous bottom-up proteomic studies have been performed for large-scale, quantitative, and qualitative analyses of plant proteins, yielding large lists of peptides in specific samples. Quantitative proteomic approaches have enhanced the characterization of protein abundance and PTMs; several comprehensive reviews have covered the working principles, and readers are referred to these reviews for aspects of both labeled and label-free quantitation methods (Schulze and Usadel, 2010; Cox and Mann, 2011; Ankney et al., 2018; Ludwig et al., 2018; Chen and Weckwerth, 2020).
MS technologies have been widely used in both small-scale mechanistic and data-driven systematic studies. These techniques were initially used for single-protein analysis; however, as exemplified by liquid chromatography tandem MS (LC-MS/MS)-based methods, these techniques can now be used to perform high-throughput proteomics studies to generate novel biological insights from mega-data analysis. These methods have different performance characteristics that are suited to different applications, as described in this review.
HIGH-THROUGHPUT ANALYSIS OF THE PLANT PROTEOME USING MS
Tools for in-depth, reliable identification of proteins are necessary for mechanistic, hypothesis-driven studies, as well as global omics investigations. An accurate, systematic, quantitative proteomic method is a crucial prerequisite for the development of all MS techniques (Figure 1B, C). Over the past decade, the most widely used method for proteome discovery was data-dependent acquisition (DDA), leading to striking advances. The data-independent acquisition (DIA) method is an emerging MS technique for proteomic analysis that greatly enhances proteome coverage and improves quantification accuracy (Ludwig et al., 2018). In these proteomic analyses, most proteins are accessible to instruments during sample measurements; however, a variety of proteins do not generate sufficient detectable MS signal and therefore remain unidentified. This is mostly because, given the complex nature of the samples, many more high-abundant proteins are present than can be targeted for MS fragmentation or detection during analysis.
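The undersampling problem described above can be illustrated with a minimal sketch of intensity-ranked precursor selection, the basic idea behind DDA. The peptide labels, intensities, and the `select_top_n` helper are hypothetical and chosen only for illustration; real instruments add features such as dynamic exclusion that are not modeled here.

```python
from typing import List, Tuple


def select_top_n(precursors: List[Tuple[str, float]], n: int) -> List[str]:
    """Return the labels of the n most intense precursors in one survey scan."""
    ranked = sorted(precursors, key=lambda p: p[1], reverse=True)
    return [name for name, _ in ranked[:n]]


# Hypothetical MS1 survey scan: (peptide label, precursor intensity).
survey_scan = [
    ("rubisco_pep1", 9.0e8),
    ("rubisco_pep2", 7.5e8),
    ("atpase_pep1", 2.0e7),
    ("kinase_pep1", 4.0e4),  # low-abundance signaling protein
    ("tf_pep1", 1.5e4),      # low-abundance transcription factor
]

# Only the n most intense precursors are fragmented in this acquisition cycle.
selected = select_top_n(survey_scan, n=3)
missed = [name for name, _ in survey_scan if name not in selected]

print("fragmented:", selected)
print("not fragmented in this cycle:", missed)
```

In this toy scan the two low-abundance peptides are never selected for fragmentation, which is one way to picture why prefractionation or DIA can improve coverage.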
Prefractionation can be used to reduce sample complexity, which simplifies downstream analysis and enhances the detection of low-abundant proteins, thereby increasing the proteome coverage. Analyzing the sub-proteome of a specific organ can increase the proteome coverage because fewer proteins are expressed in specific organs compared with the whole-plant proteome. An early proteomic study aimed at compiling a comprehensive proteome profile identified over 10,000 proteins in Arabidopsis through extensive LC-MS analysis of fractionated proteins from different organs, developmental stages, and cell cultures (Baerenfaller et al., 2008). A similar global analysis was performed for the Medicago truncatula proteome, leading to the identification of key regulators that control gene expression in its nitrogen-fixing bacterium and revealing organ-specific networks regulating symbiosis (Marx et al., 2016).
These studies achieved in-depth proteomic analysis and determined the cellular abundance of significant fractions in the plant proteome. However, many predicted proteins cannot yet be reliably detected using MS due to its limited dynamic range. This shortcoming is partly because peptides do not ionize with equal efficiency, making it challenging to detect some proteins (and their associated PTMs). Nevertheless, proteome coverage can be greatly enhanced by combining extensive peptide prefractionation with improved peptide fragmentation methods and various enzyme digestion steps.
In addition, higher efficiency peptide separations have increasingly been achieved at higher LC operating pressures by extending the LC gradient time and using less hydrophobic chromatographic materials, such as C4-columns for the separation of hydrophobic peptides. For example, ultra-high-performance LC using optimized columns or LC gradients enabled the detection of many more low-abundant peptides with high MS-spectra quality (Howard et al., 2012). A recent study using robust analytical and fractionation approaches led to the construction of a quantitative Arabidopsis proteome map covering over 18,000 proteins (Mergner et al., 2020). This is a substantial increase over the percentage of protein-coding genes for which the corresponding proteins were reported in UniProt, achieving more in-depth proteome coverage than reported in an earlier study. A similar high-throughput analysis was also reported for the maize (Zea mays) proteome (Walley et al., 2016).
However, due to the existence of multiple nearly identical proteins, standard proteomics analysis can reduce the number of peptides that are uniquely matched to a protein, thereby reducing peptide recovery. To address this problem when performing peptide mapping of highly redundant genomes, a novel statistical method was recently developed that allows mass spectral observations to be interpreted in terms of protein orthogroups rather than individual proteins (McWhite et al., 2020). This comprehensive proteomic technique combined with co-fractionation MS led to the generation of millions of peptide mass spectra and systematically uncovered multiple protein complexes in numerous tissues from diverse species (McWhite et al., 2020). Furthermore, this study dramatically increased the protein recovery of unique peptide mapping for species with highly redundant proteomes, such as wheat (Triticum aestivum), providing a mechanistic framework for cross-species functional annotation of the proteome.
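The effect of interpreting peptides at the orthogroup level can be illustrated with a toy example. This is only a sketch of the grouping idea, not the statistical method of McWhite et al. (2020); all peptide, protein, and orthogroup identifiers below are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical database-search result: peptide -> proteins it matches.
peptide_to_proteins: Dict[str, Set[str]] = {
    "PEP_A": {"TaA1", "TaB1", "TaD1"},  # shared by three near-identical copies
    "PEP_B": {"TaA1", "TaB1"},
    "PEP_C": {"TaD1"},
    "PEP_D": {"TaA2"},
}

# Hypothetical orthogroup assignment for each protein.
protein_to_orthogroup: Dict[str, str] = {
    "TaA1": "OG1",
    "TaB1": "OG1",
    "TaD1": "OG1",
    "TaA2": "OG2",
}


def unique_peptides(mapping: Dict[str, Set[str]]) -> List[str]:
    """Peptides that map to exactly one entry (protein or orthogroup)."""
    return sorted(pep for pep, targets in mapping.items() if len(targets) == 1)


# Collapse protein matches into orthogroup matches.
peptide_to_orthogroups = {
    pep: {protein_to_orthogroup[prot] for prot in prots}
    for pep, prots in peptide_to_proteins.items()
}

unique_at_protein_level = unique_peptides(peptide_to_proteins)
unique_at_orthogroup_level = unique_peptides(peptide_to_orthogroups)

print("unique per protein:", unique_at_protein_level)
print("unique per orthogroup:", unique_at_orthogroup_level)
```

In this toy case only two of the four peptides are unique to a single protein, whereas all four become informative once near-identical proteins are treated as one orthogroup.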
Although these methodologies have only been performed by a few specialized laboratories, they will enable MS-based proteomic studies to become increasingly applicable to everyday plant biological research, including in laboratories that still use classic antibody or transgenic epitope tagging-based techniques. Furthermore, these studies indicate that MS methods can be used to turn everyday proteomic experiments into deep investigations, enabling the detection of novel molecules and connections.
CHARACTERIZING PTMS AND CELL SIGNALING
Comprehensive phosphoprotein analysis using MS
Protein phosphorylation plays pivotal roles in plant growth and development; phosphorylation adds tremendous complexity to the plant proteome by affecting many properties including protein localization, turnover rates, and interactions (Figures 2, 3). Mass spectrometry-based methodologies are well suited for the study of PTMs because modified peptides result in mass shifts and can be identified at a specific amino acid resolution through the production of fragmented peptide ion spectra. The large-scale identification of phosphoproteins is challenging due to transient changes in the modification of signaling proteins, the low stoichiometry of modified peptides in a proteome, as well as the heterogeneity of the phosphorylation-bearing peptides of a given protein.
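As a rough illustration of the mass-shift principle, the sketch below computes the monoisotopic mass of a peptide with and without a phosphate group (+79.966331 Da, HPO3) and lists singly charged b-ion m/z values for two candidate sites. The peptide sequence `ASTPK` and the helper names are hypothetical; the residue masses are standard monoisotopic values, and only b-ions are modeled.

```python
from typing import List, Optional

# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
PHOSPHO = 79.966331  # mass added by one phosphate group (HPO3)


def peptide_mass(sequence: str, n_phospho: int = 0) -> float:
    """Neutral monoisotopic mass of a peptide carrying n_phospho phosphate groups."""
    return sum(MONO[aa] for aa in sequence) + WATER + n_phospho * PHOSPHO


def b_ions(sequence: str, phospho_index: Optional[int] = None) -> List[float]:
    """Singly charged b-ion m/z values; phospho_index is the 0-based modified residue."""
    ions = []
    running = 0.0
    for i, aa in enumerate(sequence[:-1]):
        running += MONO[aa]
        if i == phospho_index:
            running += PHOSPHO
        ions.append(running + PROTON)
    return ions


peptide = "ASTPK"  # hypothetical peptide with two candidate sites (S and T)
shift = peptide_mass(peptide, 1) - peptide_mass(peptide)
on_ser = b_ions(peptide, phospho_index=1)
on_thr = b_ions(peptide, phospho_index=2)

print(f"precursor mass shift: {shift:.4f} Da")
for i, (s, t) in enumerate(zip(on_ser, on_thr), start=1):
    print(f"b{i}: pSer={s:.4f}  pThr={t:.4f}")
```

Both candidates share the same precursor mass, so only fragment ions that separate the two residues (here b2) distinguish phosphorylation on Ser from phosphorylation on Thr, which is why high-quality fragmentation spectra matter for site localization.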

The functional diversity and complexity of protein posttranslational modifications (PTMs)
Cellular proteins can be modified through the addition of chemical groups, polypeptides, or complex molecules. The most commonly studied PTMs in plant cells are shown (center). PTMs enable various dynamic signaling processes in plant cells, including alterations in protein conformation (through phosphorylation) and subsequent allosteric regulation; changes in enzyme activity; crosstalk resulting from the same proteins being targeted by more than one type of PTM; alterations in the subcellular localization of proteins; changes in protein binding and interaction; and alterations in protein turnover. Ac, acetylation; P, phosphorylation; Ub, ubiquitination.

Large-scale proteomics for unraveling post-translational modification (PTM) mediated cellular signaling networks
(A) Location of PTMs in the peptide chain. After a peptide has been identified using liquid chromatography (LC)-mass spectrometry (MS), the amino acid in the peptide chain to which the PTM is attached must be determined. Ac, acetylation; P, phosphorylation; Ub, ubiquitination. (B) Life cycle of PTMs. Proteins are modified by multiple PTMs throughout their lifetimes. The timing of protein modifications is shown by colored lines. (C) Model for the systematic regulation of protein phosphorylation in plant growth and development. Protein phosphorylation regulates plant development, including leaf initiation, growth, and maturation. In addition, stress-induced signaling can be transduced by calmodulin or other receptors to regulate the phosphorylation of calcium-dependent protein kinase (CDPK) or sucrose nonfermenting-related kinase (SnRK). SnRK and CDPK positively regulate stress responses in plants by phosphorylating downstream regulators such as transcription factors, promoting their movement to the nucleus, where they interact with transcription factors to reduce their stabilities (Ding et al., 2015), whereas herbivory usually leads to the biosynthesis of jasmonic acid and its activation of phospho-signaling networks (Zander et al., 2020).
These challenges can be addressed by PTM enrichment (Table 1). Ideally, all phosphorylation-modified peptides would be enriched during chromatographic separation, but in practice, purification methods can be more or less specific, ranging from ~100% specificity for Thr/Ser-phosphorylated proteins to ~5% specificity for acetylated proteins. In studies of phosphorylated proteins, metal oxide affinity chromatography (MOAC) and immobilized metal affinity chromatography (IMAC) of the phospho-group are the preferred approaches for purifying modified peptides, but many other strategies can be employed (Table 1; Bian et al., 2016; 2018). Phosphorylated Tyr residues can be purified using phospho-Tyr-specific antibodies, which are well suited for analyzing Tyr-phosphorylated proteins and peptides obtained from tryptic digestions (Sugiyama et al., 2008). In addition, multidimensional fractionation of protein samples is useful for enhancing the depth of phosphoproteome coverage, but this technique usually requires large amounts of starting material and long measurement times. Recently developed methods, such as tandem MOAC and EasyPhos, capture unexpected signaling targets and are powerful techniques for achieving in-depth phosphoproteome coverage within hours and with fewer sample preparation steps (Hoehenwarter et al., 2013; Humphrey et al., 2018; Chen and Hoehenwarter, 2019).
PTM | Resin type | Metal ion or binder beads | Binding efficiency (%) | Comments | Examples |
---|---|---|---|---|---|
Phosphorylation | TiO2 | Ti4+ | 70–90 | More efficient for singly phosphorylated peptides | Beckers et al., 2014; Chen and Hoehenwarter, 2015 |
Phosphorylation | ZrO2 | Zr4+ | 50–80 | More efficient for acidic phospho-sites | Chen et al., 2010 |
Phosphorylation | IMAC | Fe3+, Ga3+, Ni2+, Zn2+ | 60–90 | More efficient for multiply phosphorylated peptides | Potel et al., 2018; Wang et al., 2018 |
Phosphorylation | Al(OH)3 | Al3+ | 50 | Can be used to enrich phosphoproteins | Chen and Hoehenwarter, 2019; Wolschin and Weckwerth, 2005 |
N-glycosylation | Lectins | Concanavalin A, snowdrop lectin, and lentil lectin | 30–90 | Combinations of lectins are optimal for N-glycoproteins | Ruiz-May et al., 2014; Scheys et al., 2020 |
O-glycosylation | Lectins | WGA, Jacalin | 30–80 | More efficient for GalNAc residues | Chalkley et al., 2009 |
Lysine acetylation | Antibody affinity | Anti–acetyl-lysine antibody | 30–80 | Can be used to enrich acetyl peptides and acetyl proteins | Walley et al., 2018 |
SUMOs | Tag affinity | Epitope-tagged SUMO | 30–90 | SUMOylated peptides can be enriched from cells expressing decahistidine (His10)-tagged SUMO-2 | Hendriks et al., 2014 |
SUMOs | Immunoprecipitation | Anti–SUMO-antibody | 30–70 | More efficient for in vitro analysis of SUMOylated peptides | Breucker and Pichler, 2019 |
- Note: The overlap between two different enrichment methods is less than 40%, so more than one method can be used to improve the coverage of peptides carrying a specific PTM. Abbreviations: IMAC, immobilized metal affinity chromatography; PTM, posttranslational modification; SUMO, small ubiquitin-like modifier.
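The benefit of combining enrichment methods noted under Table 1 can be pictured with a small set calculation. The phosphopeptide identifiers and the two result sets below are hypothetical, and overlap is defined here as the shared fraction of the combined identifications, which is an assumption about how the overlap is measured.

```python
# Hypothetical phosphopeptide identifications from two enrichment methods.
tio2_ids = {"pp1", "pp2", "pp3", "pp4", "pp5"}
imac_ids = {"pp4", "pp5", "pp6", "pp7", "pp8"}

shared = tio2_ids & imac_ids
combined = tio2_ids | imac_ids

# Overlap expressed as shared identifications over the combined set (assumed definition).
overlap_fraction = len(shared) / len(combined)
# Additional identifications gained over the better single method.
gain = len(combined) - max(len(tio2_ids), len(imac_ids))

print(f"overlap: {overlap_fraction:.0%}, combined IDs: {len(combined)}, gain: {gain}")
```

With a low overlap such as this, each method contributes identifications the other misses, so running both increases coverage at the cost of additional sample and instrument time.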
Phosphoproteomic analysis of signaling networks
Arabidopsis has more than 1,000 kinases, but only ~150 phosphatases (Durek et al., 2010; Zulawski et al., 2014; Bhaskara et al., 2019), highlighting the functional importance of protein phosphorylation in plants. Traditional biochemical approaches for characterizing phosphoproteins are often performed at the single-protein level using antibodies, which is laborious and does not localize the modification sites. Phosphoproteomics is the tool of choice for analyzing phosphorylation signaling networks because it can reveal these modifications in an unbiased manner. System-wide phosphoproteomic analyses have led to the identification of over 1,000 phosphoproteins in various plant species, providing crucial insight into the regulation of protein functions (Wang et al., 2013; Qian et al., 2015; Valdes-Lopez et al., 2019; Van Leene et al., 2019; Zander et al., 2020).
Most phosphoproteomic studies aim to perform time-resolved quantitative mapping of changes in phosphorylation associated with a given stimulus. Over the past decade, quantitative phosphoproteomic analyses have been widely used to monitor the phosphorylation dynamics of plants during developmental changes and under biotic or abiotic stresses, providing insights into diverse environmental signaling events and identifying many new research targets (Figure 3C; Chen et al., 2010; Yang et al., 2013; Stecker et al., 2014; Chen and Hoehenwarter, 2015; Minkoff et al., 2015; Qing et al., 2015; Haj Ahmad et al., 2019). A streamlined strategy was recently used to analyze in vivo signaling dynamics with high temporal resolution (Vu et al., 2016; Van Leene et al., 2019). Such studies lay the foundation for identifying regulated phosphorylation sites that are functionally important for the cellular process of interest.
Global phosphoproteomic profiling combined with site-directed mutagenesis has been used to investigate the molecular functions of the corresponding sites in a protein of interest (Perraki et al., 2018; Kadota et al., 2019; Wu et al., 2019). Many phosphorylation sites are coregulated rather than occurring independently, and many function in a pattern that allows sensitive switches to be constructed. Typical phosphorylation events that occur on a specific amino acid residue of a protein will trigger a particular cellular process. One example is the phosphorylation of activation loops in a kinase domain, resulting in the regulation of a downstream pathway. Thus, quantitative phosphoproteomics has been used for proteome-wide mapping of the substrates of particular protein kinases, such as the sucrose nonfermenting-related kinase (SnRK), mitogen-activated protein (MAP) kinase, calcium-dependent protein kinase (CDPK), or receptor-like kinases (Cox and Mann, 2012; Hoehenwarter et al., 2013; Umezawa et al., 2013; Wang et al., 2013; Chen and Hoehenwarter, 2019; Wu et al., 2019).
Functional phosphoproteomic studies revealed a critical link between signaling by the Ser/Thr protein kinase STN8 and the fine-tuning of cyclic electron flow in plastids (Reiland et al., 2011). A similar functional analysis of two nuclear protein kinase mutants uncovered an important role for phosphorylation in the repair of DNA damage (Roitinger et al., 2015). In addition, phosphopeptide sequences can be bioinformatically analyzed to detect overrepresented phosphorylation motifs, yielding clues to help establish kinase and substrate relationships and providing insight into the extent of kinase activity.
Intriguingly, a meta-analysis of phosphoproteomics data provided evidence that phosphorylation motifs are compartmentalized into groups specific to a particular subcellular organelle in plants (van Wijk et al., 2014). This powerful motif analysis indicated that, although most phosphorylation sites exhibit unique dynamic behavior, quantitative phosphorylation profiles can be categorized into functional groups, such as groups corresponding to early responses at plasma membrane receptors and late responses of nuclear transcription factors.
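The idea of detecting overrepresented phosphorylation motifs can be sketched as a comparison of residue frequencies around identified phosphosites with those around background Ser/Thr sites. The sequence windows below are hypothetical, and the fold-enrichment ratio stands in for the statistical tests that dedicated motif tools apply.

```python
from typing import List


def motif_frequency(windows: List[str], offset: int, residue: str) -> float:
    """Fraction of windows carrying `residue` at `offset` from the central site."""
    center = len(windows[0]) // 2
    hits = sum(1 for w in windows if w[center + offset] == residue)
    return hits / len(windows)


# Hypothetical 7-residue windows centered on the modified Ser/Thr (index 3).
phospho_windows = ["AGLSPEK", "RRASPVD", "KLTTPGS", "EDGSDEE"]

# Hypothetical background windows centered on any Ser/Thr in the proteome.
background_windows = [
    "AGLSAEK", "RKASPVD", "KLTTGGS", "EDGSLEE",
    "MNQSKLV", "PPATRDE", "GGHSIIK", "LLKTPAA",
]

# Test for a proline-directed motif: Pro at position +1 ([S/T]P).
observed = motif_frequency(phospho_windows, offset=1, residue="P")
expected = motif_frequency(background_windows, offset=1, residue="P")
fold_enrichment = observed / expected

print(f"Pro at +1: observed {observed:.2f}, background {expected:.2f}, "
      f"fold enrichment {fold_enrichment:.1f}")
```

An enriched [S/T]P pattern of this kind would point toward proline-directed kinases such as MAP kinases as candidate upstream regulators, which is the type of kinase and substrate clue that motif analysis provides.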
The roles of nonphosphorylation PTMs in signaling
Other PTMs are also important in plant signaling. For example, N-linked glycosylation plays an essential role in protein folding within the lumen of the endoplasmic reticulum and in protein sorting and trafficking in the endomembrane system. Although most proteomics studies in plants have focused on analyzing phosphorylation involved in signaling events, in principle, MS could be used to analyze any PTM, such as acetylation, ubiquitination, glycosylation, and sumoylation; however, global insights into these PTMs are limited. This is because most PTMs are particularly dynamic and labile, posing many challenges for their detection using MS. For example, O-linked glycosylation (O-GlcNAc) can generate several types of glycosylated protein modifications, including nucleotide modifications and the addition of large glycans. N-linked glycosylation is more commonly analyzed than O-GlcNAc. Hydrophilic enrichment followed by high mass accuracy MS employing complementary fragmentation techniques (higher-energy collision dissociation and electron transfer dissociation) proved to be an effective method for N-glycopeptide identification (Zeng et al., 2018). The study of N-linked glycosylation was recently enhanced using a newly developed method that elegantly combines filter-aided sample preparation with lectin affinity chromatography (Scheys et al., 2020). Thousands of N-GlcNAc modified sites have been identified using these enrichment methods, and the functions of these modifications have been interpreted (Bi et al., 2021).
Affinity purification is the preferred method for enriching PTM-bearing peptides when analyzing polypeptide or chemical modifications (Table 1). Moreover, the development of innovative peptide fragmentation technologies has greatly enhanced the identification of PTMs compared with conventional techniques (Marx et al., 2013). The confident identification of modified peptides is much more difficult than that of unmodified peptides, and the unambiguous localization of PTM sites is challenging, since additional possibilities must be considered by search programs (Figure 3A). It is important to rigorously determine the false-discovery rates used for PTM determination (Elias and Gygi, 2007; Tyanova et al., 2016). High mass accuracy and the generation of high-quality fragmentation spectra also facilitate the confident localization of PTM sites.
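False-discovery rates are commonly estimated with a target-decoy strategy, in which spectra are searched against real (target) and reversed or shuffled (decoy) sequences. The sketch below assumes the simple estimate of decoy hits divided by target hits above a score threshold; the scores, the toy peptide-spectrum matches (PSMs), and the function names are hypothetical, and search engines may use different estimators.

```python
from typing import List, Optional, Tuple

PSM = Tuple[float, bool]  # (search score, is_decoy)


def fdr_at_threshold(psms: List[PSM], threshold: float) -> float:
    """Estimate FDR as decoy hits / target hits among PSMs scoring >= threshold."""
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def lowest_threshold_for_fdr(psms: List[PSM], max_fdr: float) -> Optional[float]:
    """Lowest observed score whose accepted set stays within max_fdr, or None."""
    passing = [score for score, _ in psms if fdr_at_threshold(psms, score) <= max_fdr]
    return min(passing) if passing else None


# Hypothetical PSMs for modified peptides.
psms = [
    (95.0, False), (90.0, False), (88.0, False), (80.0, False),
    (75.0, True), (70.0, False), (60.0, True), (55.0, False),
]

cutoff = lowest_threshold_for_fdr(psms, max_fdr=0.1)
print("score cutoff at 10% FDR:", cutoff)
print("estimated FDR if every PSM were accepted:", round(fdr_at_threshold(psms, 55.0), 3))
```

Because modified peptides enlarge the search space, applying such a cutoff to the modified subset rather than to the whole data set gives a more conservative picture of PTM identifications.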
Large-scale quantitative analysis of specific types of PTMs in plants has already provided numerous biological insights. High-resolution MS combined with isobaric tags-based quantitation led to the identification of over 2,000 acetylated peptides on more than 900 proteins in maize plants (Walley et al., 2018). This acetylomic study revealed that the acetylation of nonhistone proteins can be modulated by changing the activities of histone deacetylases in response to a cellular stimulus. Similarly, quantitative PTM proteomics combined with density gradient centrifugation-based separation revealed over 5,000 acetylation sites in Arabidopsis (Liu et al., 2018). In plant cells, Lys is most frequently modified by ubiquitination, and the modified peptides can be efficiently purified and analyzed using linkage-specific methods (Rose and Mayor, 2018). Effective strategies have also been developed to enrich sumoylated proteins, poly-ubiquitin chains, and small ubiquitin-like modifier (SUMO), expanding our knowledge of ubiquitin proteasome systems on a broad scale (Hendriks et al., 2014; Guo et al., 2017; Breucker and Pichler, 2019). Such studies have uncovered the importance and functional complexity of PTMs in regulating numerous cellular processes.
Multiple PTMs: Protein interaction and crosstalk
Proteomic approaches have been used to systematically characterize multiple PTMs simultaneously. PTMs influence protein synthesis and functions over their lifetimes, and have increasingly been recognized as critical directors of the functional diversity of the proteome and major determinants of plant phenotypes (Figure 2). Various studies have revealed extensive interplay and crosstalk between multiple PTMs (Friso and van Wijk, 2015); for example, the integration of ubiquitination and phosphorylation affects the activities and stabilities of inducer of CBF expression transcription factors (Ding et al., 2020) and the phyB–phytochrome interacting factor module (Sadanandom et al., 2015). Previous studies reported that over 800 proteins are modified by at least three different types of PTMs and over 100 by four different types of PTMs (Millar et al., 2019). Emerging evidence indicates that modification at one amino acid site can interact with another modification on neighboring residues (Figure 2). For example, the oxidation of Met can inhibit the phosphorylation of neighboring Ser/Thr residues (Korkuc and Walther, 2017).
To gain broad insight into multiple PTMs, a systematic study was performed to investigate their functional associations, leading to the construction of a network of PTMs that regulate the molecular functions of multiple proteins (Minguez et al., 2012; Hartl et al., 2017). Another high-throughput proteomic study found that phosphorylation and lysine acetylation are critical co-regulators of nitrogen fixation in legumes (Marx et al., 2016). Recent phosphoproteomic and acetylomic studies also demonstrated that multiple PTMs co-regulate diurnal changes in plants (Uhrig et al., 2019). These studies indicated that multiple PTMs are involved in a sophisticated communication process in which protein functions are sequentially and exclusively regulated by antagonistic mechanisms. Various methods have been developed to purify modified peptides and enable the integrated analysis of different PTMs (Swaney et al., 2013; Uhrig et al., 2019); however, current MS approaches are available for only a few types of modifications. Therefore, it remains challenging to comprehensively detect crosstalk among PTMs in plants, and more reliable methods are needed to study these interactions.
DECIPHERING CELL BIOLOGY USING ORGANELLAR PROTEOMICS
Single-organelle proteome profiling
The organization of plant cells into different sub-organelles is vital for all biological processes. Capturing the organellar proteome, which involves comprehensively determining the subcellular localizations of proteins and their dynamics, is therefore essential for obtaining a system-wide view of cellular organization (Table 2). Organellar proteomic analysis has emerged as a major method for studying protein localization, organelle composition, dynamics, and function across numerous plant species (Tanz et al., 2013; Liang et al., 2018; Pan et al., 2018; Goto et al., 2019). Conventional organelle proteomic studies have relied on centrifugation-based cell fractionation coupled with MS analysis; such studies have led to the characterization of all major organelles in plant cells (Figure 4A; Table 2; Majeran et al., 2012). Whereas it is relatively easy to obtain pure fractionated proteins from some plant organelles, such as mitochondria, chloroplasts, peroxisomes, and nuclei, the proteins from many endomembranes are difficult to enrich without substantial contamination from other organellar proteins. For instance, proteins in the endo-/exocytic pathways might traffic between endomembrane systems and their final destination. Furthermore, some organelles interact with cytosolic biomolecules to carry out defined functions. The use of a control sample for organelle proteomics allows the specificity of purified proteins in the sample of interest to be analyzed and is required to determine whether a particular protein in a sample indicates specific enrichment or is a contaminating abundant protein (Bouchnak et al., 2019). The advent of quantitative proteomics allows the characterization of the distribution patterns of organelle-specific proteins among partially purified fractions generated using various separation methods, permitting the discrimination of genuine organelle-specific proteins from contaminants (Chen and Heazlewood, 2021; Figure 4B).
Organellar proteomic strategy | Method | Organellar separation technique | Number of subcellular compartments purified | Number of LC-MS measurements | Quantification method | Instrument requirements | Purity of organelle proteins | Strength |
---|---|---|---|---|---|---|---|---|
Traditional organelle proteomics | Target organelle purification | Ultra-centrifugation | 1 | 1 | − | − | + | Simple, slow |
Single-organelle proteomic profiling | Partial purification | Density gradient centrifugation | 1 | > 2 | Label-free | MS2 | ++ | Relatively rapid, sensitive |
Multiple organelle proteomic profiling | PCP | Velocity gradient centrifugation | > 2 | > 2 | Label-free | MS2 | +++ | No labeling reagents, sensitive |
LOPIT | Differential /density centrifugation | > 2 | 1 | TMT, iTRAQ | MS2/MS3 | +++ | Very sensitive, deep coverage from one experiment |
- Abbreviations: iTRAQ, isobaric tags for relative and absolute quantitation; LC-MS, liquid chromatography-mass spectrometry; LOPIT, localization of organelle proteins by isotope tagging; PCP, protein correlation profiling; TMT, tandem mass tag; −, no quantitation or instrument methods required; +, level of purity.

Analysis of organelle proteins using organellar proteomics
(A) Strategy for traditional organelle proteomics. Plant materials are lysed, the target organelle is purified by gradient ultra-centrifugation, and proteins are analyzed by liquid chromatography (LC)-mass spectrometry (MS). The identified proteins include genuine constituents of the target organelle (green) and co-enriching contaminants (gray and blue), which cannot be objectively distinguished. (B) Strategy for single-organelle proteome profiling. Plant materials are lysed, and target organelle fractionation is carried out (as in part (A)) to enrich a target organelle (green), followed by quantitative MS of the enriched fraction and one or more of the subfractions from the enrichment protocol (such as neighboring fractions on a gradient or crude fractions). For each protein, an abundance distribution profile is obtained. Proteins associated with the target organelle (green) have similar distribution profiles and can be discriminated from contaminants (gray and red) using statistical analysis. Contaminants are recognized as such, but are not necessarily resolved into distinct classes. This method can be extended to encompass multiple subcellular compartments simultaneously. (C) Strategy for multiple organelle proteome profiling. Plant materials are lysed, and most organelles are partially separated simultaneously (six fractions are shown to illustrate the principle). There is no true organellar “purification” in this workflow, and the organelles have largely overlapping distributions. All organellar fractions are subjected to labeling-based quantitative proteomic analysis (TMT or iTRAQ). The resulting data are interpreted using principal component analysis (PCA), and the annotation of the PCA plot with established organelle markers reveals the protein localizations. This workflow takes advantage of multiplex labeling to reduce MS analysis time and improve quantification accuracy.
TMT, tandem mass tag; iTRAQ, isobaric tags for relative or absolute quantitation.
Organelle profiling involves purifying a target organelle and identifying and quantifying proteins across the differentially enriched subfractions. Proteins associated with target organelles have similar distribution profiles following cluster analyses and can therefore be distinguished from contaminants. This powerful approach was implemented using label-free quantitative MS for correlation profiling of proteins in lipid droplets and centrosomes, as well as global organelle analyses (Andersen et al., 2003). Conceptually related studies of mitochondria-enriched fractions from Arabidopsis cells enabled the identification of the membrane-spanning proteome of plant mitochondria (Brugiere et al., 2004). However, this approach is more commonly used to address targeted research questions due to its focus on specific organelles.
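The idea that genuine residents of a target organelle share a distribution profile can be illustrated with a minimal sketch. It assumes label-free intensities measured across gradient fractions and scores each protein by its Pearson correlation with the mean profile of known marker proteins. The intensity values, the function names, and the 0.9 cutoff are hypothetical and chosen only for illustration; published studies may use other distance measures and statistical tests.

```python
import numpy as np

def normalize_profiles(abundance):
    """Scale each protein's abundances so they sum to 1 across fractions."""
    abundance = np.asarray(abundance, dtype=float)
    return abundance / abundance.sum(axis=1, keepdims=True)

def correlation_to_markers(profiles, marker_rows):
    """Pearson correlation of every profile with the mean marker profile."""
    consensus = profiles[marker_rows].mean(axis=0)
    centered = profiles - profiles.mean(axis=1, keepdims=True)
    ref = consensus - consensus.mean()
    numerator = centered @ ref
    denominator = np.linalg.norm(centered, axis=1) * np.linalg.norm(ref)
    return numerator / denominator

# Hypothetical intensities: rows = proteins, columns = gradient fractions.
intensities = np.array([
    [1.0, 4.0, 9.0, 3.0, 1.0],   # marker protein of the target organelle
    [2.0, 7.0, 20.0, 6.0, 2.0],  # second marker protein
    [0.5, 2.2, 4.4, 1.6, 0.4],   # candidate that co-fractionates with the markers
    [9.0, 5.0, 2.0, 1.0, 0.5],   # candidate that peaks in a different fraction
])

profiles = normalize_profiles(intensities)
scores = correlation_to_markers(profiles, marker_rows=[0, 1])
is_candidate_resident = scores > 0.9  # illustrative cutoff, not a published threshold
```

In this toy dataset, the third protein tracks the marker profile closely and would be retained, whereas the fourth peaks in the first fraction and would be flagged as a likely contaminant.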
Proteome profiling in multiple organelles
Single-organelle proteomics techniques can be extended to study all subcellular organelles simultaneously (Table 2). To achieve this, multiple organelles are separated by non-aqueous fractionation, allowing different subcellular compartments to be purified from freeze-dried material. All fractions from the gradient are analyzed using label-free quantification methods, yielding a quantitative organelle map for each protein. These strategies were used for organelle proteome analysis of Arabidopsis leaves, resulting in the mapping of over 1,000 proteins to 12 fractions (Arrivault et al., 2014). This technique was successfully used to study organelles, sub-organelles, the distribution of proteins, and their associations with metabolites.
Methods that do not require the complete isolation of individual organelles have also been developed (Chen and Heazlewood, 2021; Figure 4C). The localization of organelle proteins by isotope tagging (LOPIT) technique is a powerful method for generating dynamic organellar maps from complex biological mixtures (Foster et al., 2006; Mulvey et al., 2017). This strategy involves the partial purification of organelles using density gradient separation, followed by quantitative proteomic analysis of the gradient fractions using tandem mass tag (TMT) or isobaric tags for relative and absolute quantitation (iTRAQ) labeling. LOPIT coupled with highly accurate MS proteomic approaches was successfully used to analyze up to 10 subcellular fractions in the same experiment (Baers et al., 2019). LOPIT approaches rely on the correlation of protein abundances across these subcellular fractions, using stable isotope tagging coupled with multivariate data analysis to group proteins with similar fractionation profiles (Geladaki et al., 2019). Organelle-specific profiling is obtained by comparing the distributions of proteins with those of pre-established organelle markers to reveal the cluster identities. This method generates highly reproducible organellar maps, allowing dynamic mapping of perturbation-induced changes in protein subcellular localization in plants.
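The marker-based interpretation of a LOPIT-style dataset can be sketched as follows. This example assumes that reporter-ion intensities have already been normalized so that each protein's profile sums to 1 across six fractions; it projects the profiles onto principal components and assigns each protein to the organelle whose marker centroid lies nearest. The nearest-centroid rule is a deliberately simple stand-in for the multivariate classifiers used in practice, and the profile values, marker assignments, and function names are hypothetical.

```python
import numpy as np

def pca_scores(profiles, n_components=2):
    """Project row-wise profiles onto their leading principal components via SVD."""
    centered = profiles - profiles.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def assign_by_markers(scores, marker_labels):
    """Assign every protein to the organelle with the nearest marker centroid.

    marker_labels maps row index -> organelle name for known markers only.
    """
    organelles = sorted(set(marker_labels.values()))
    centroids = np.array([
        scores[[i for i, o in marker_labels.items() if o == name]].mean(axis=0)
        for name in organelles
    ])
    distances = np.linalg.norm(scores[:, None, :] - centroids[None, :, :], axis=2)
    return [organelles[j] for j in distances.argmin(axis=1)]

# Hypothetical normalized profiles: rows = proteins, columns = six fractions.
profiles = np.array([
    [0.40, 0.30, 0.15, 0.08, 0.05, 0.02],  # plastid marker
    [0.38, 0.32, 0.14, 0.09, 0.04, 0.03],  # plastid marker
    [0.03, 0.05, 0.10, 0.17, 0.30, 0.35],  # Golgi marker
    [0.02, 0.06, 0.09, 0.18, 0.31, 0.34],  # Golgi marker
    [0.37, 0.31, 0.16, 0.08, 0.05, 0.03],  # protein of unknown localization
    [0.04, 0.05, 0.11, 0.16, 0.29, 0.35],  # protein of unknown localization
])
markers = {0: "plastid", 1: "plastid", 2: "golgi", 3: "golgi"}

scores = pca_scores(profiles)
labels = assign_by_markers(scores, markers)
```

Because all fractions are quantified together, every protein has a complete profile, which is what makes this kind of distance-based assignment straightforward. A real analysis would also report a confidence score and leave low-confidence proteins unassigned.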
An early organellar proteomics study involving the use of LOPIT to analyze Arabidopsis membrane fractions revealed novel Golgi-localized proteins involved in cell wall biosynthesis (Nikolovski et al., 2012). Similar studies established a high-confidence dataset of trans-Golgi network proteins in Arabidopsis root tissue (Groen et al., 2014). Importantly, LOPIT enables the simultaneous quantification of various purifications in a single experiment, reducing technical variability and alleviating the issues of missing values from different MS runs. These organelle-profiling strategies were used to map thylakoid membrane proteins in Cyanobacteria, yielding a detailed organellar map of the cell (Baers et al., 2019). LOPIT was recently used to determine protein distribution on the Golgi cisternae, revealing a continuum of transmembrane features guiding proteins to different locations within the Golgi stack (Parsons et al., 2019). These organellar proteomic tools are essential for better understanding the functional roles of endomembrane proteins and how they operate sequentially.
Importantly, organellar proteomics can be used to study the subcellular localizations of proteins at peptide-level resolution, reveal the subcellular localizations of proteolytically processed peptides and different protein splice isoforms, and uncover PTM-induced changes in protein localization. Modified and unmodified peptides often have different subcellular localizations (such as phosphorylated vs. non-phosphorylated transcription factors; Figure 3), resulting in different clusters in peptide-level maps. In addition to studying proteins in organelles, organellar proteomics has been used to study proteins in large cellular structures, such as cytoskeletal proteins (Chuong et al., 2004; Derbyshire et al., 2015). Overall, these studies demonstrated that high-resolution mapping of organelle proteins via organellar proteomics, combined with gene-knockout techniques, is a powerful approach for the systematic analysis of cellular mechanisms.
QUANTITATIVE PROTEOMICS TO ANALYZE PROTEIN TURNOVER RATES
Mass spectrometry-based quantification is of central importance for proteomic studies, allowing relative or absolute protein abundance to be determined. This technique is also well suited for characterizing protein turnover rates. Earlier methods for measuring protein turnover rates involved the quantification of protein abundances after blocking protein biosynthesis using a translational inhibitor (Vogtle et al., 2009); however, such studies are not accurate for measuring protein turnover rates because inhibitor treatment can be stressful and have pleiotropic effects on the cell.
Isotope-based quantitative proteomics was recently developed to allow old and newly synthesized protein populations to be differentiated without using a translational inhibitor. This technique has emerged as a major method for protein turnover studies. However, stable isotope labeling by amino acids in cell culture (SILAC) is generally best suited for auxotrophic organisms (Ong, 2012). Because plants synthesize and interconvert amino acids, label incorporation is incomplete, which has restricted the use of this technique in plants. By contrast, metabolic labeling with 13C, 2H, or 15N performs well for this purpose in plants. One study followed the degradation rates of selected proteins using 13CO2 in soil-grown Arabidopsis, revealing different turnover rates of free amino acids in the light and dark (Ishihara et al., 2015).
15N-based metabolic labeling has been demonstrated to be an effective method for stable isotope labeling to determine the degradation or turnover rates of plant mitochondrial proteins (Nelson et al., 2013). The relative isotope abundances of a given peptide can be extracted from proteomic data, and different methods can be used to calculate protein turnover rates. A high-throughput quantitative proteomic study involving 15N metabolic labeling and the use of statistical machine learning tools in Arabidopsis leaves determined the degradation rates of 1,228 proteins corresponding to over 60,000 peptides (Li et al., 2017b). This analysis revealed that protein degradation rates are positively or negatively correlated with the leaf growth rate. Similar studies using progressive 15N labeling and blue native-polyacrylamide gel electrophoresis separation revealed that specific subunits of mitochondrial complex I have much higher turnover rates than the same proteins in fully assembled complexes (Li et al., 2013). In addition, several subunits of mitochondrial complexes showed increased degradation rates and decreased abundances in Lon1 protease knockout lines, indicating that Lon1 functions as a chaperone during protein complex assembly (Li et al., 2017a). These proteomic studies revealed the molecular mechanisms that plants use to balance and fine-tune protein biosynthesis and degradation to meet various environmental or developmental needs.
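As one simple illustration of how a degradation rate can be calculated from isotope abundances, the sketch below assumes first-order kinetics and an instantaneous, complete switch to the labeled nitrogen source. Under these assumptions, the labeled peptide fraction (LPF) rises as LPF(t) = 1 − exp(−(k_deg + k_growth)·t), where k_growth accounts for dilution of old protein by tissue growth. This is a simplified model chosen for illustration and is not necessarily the calculation used in the studies cited above; the function names and input values are hypothetical.

```python
import math

def degradation_rate(labeled_fraction, days, growth_rate=0.0):
    """First-order degradation rate constant (per day) from a labeled peptide fraction.

    Assumes LPF(t) = 1 - exp(-(k_deg + k_growth) * t), so that
    k_deg = -ln(1 - LPF) / t - k_growth.
    """
    if not 0.0 <= labeled_fraction < 1.0:
        raise ValueError("labeled_fraction must be in [0, 1)")
    if days <= 0:
        raise ValueError("days must be positive")
    total_loss_rate = -math.log(1.0 - labeled_fraction) / days
    return total_loss_rate - growth_rate

def half_life(k_deg):
    """Half-life (days) of a protein that decays with rate constant k_deg."""
    if k_deg <= 0:
        raise ValueError("k_deg must be positive")
    return math.log(2.0) / k_deg

# Hypothetical measurement: half of the peptide pool is labeled after 2 days
# in tissue growing at 0.1 per day.
k_deg = degradation_rate(labeled_fraction=0.5, days=2.0, growth_rate=0.1)
t_half = half_life(k_deg)
```

Omitting the growth term would attribute dilution by growth to degradation and therefore overestimate turnover in rapidly expanding tissue, which is why the growth rate is subtracted explicitly here.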
INTEGRATIVE OMIC STUDIES
Various omics techniques, such as transcriptomics, proteomics, and metabolomics, have been readily used to investigate different types of cellular molecules. Combinations of these approaches can generate diverse types of omics data from the same samples, allowing detailed investigations into the mechanisms involved in different cellular processes. When applying multi-omics strategies to given conditions, the diagnostic identification of candidate genes, proteins, or metabolites involved in a particular process can be performed (Shi et al., 2014; McLoughlin et al., 2018).
Systematic multi-omics tools have increasingly been used to analyze the diverse functions of proteins in many plant species (Kushalappa and Gunnaiah, 2013; Angione and Lio, 2015; Cerny et al., 2015; Lecourieux et al., 2020). Multi-omics combined with targeted mutant validation provide a robust framework for functional investigations of novel signaling components (Zander et al., 2020). These studies demonstrated that a set of genes (or proteins and metabolites) involved in a cellular process will generally be coregulated and thus co-expressed under the control of a shared molecular system.
Integrating genomics and proteomics enables functional-genomic and meta-proteomic approaches, whereas integrating metabolomics and phosphoproteomics links biochemical activity profiles to expressed genes and proteins. Previous reports took a multi-omics systems biology approach linking phosphoproteomics to metabolomics to uncover a complex metabolic network between protein phosphorylation, enzyme activity, photosynthesis, and sugar metabolism (Chen and Hoehenwarter, 2015). Software that enables the projection of various data onto maps can be used to support these systematic analyses (Boekel et al., 2015). However, integrating different types of experimental data is challenging due to huge differences in these technological approaches and the diversity of biological components, including post-transcriptional regulation and PTMs. Hence, a technique involving 2D annotation enrichment was designed to establish a correspondence between omics spaces (Cox and Mann, 2012). Several bioinformatics tools have been developed to integrate proteomics with other data to study signaling pathways (Chen et al., 2012; Boekel et al., 2015; Van Leene et al., 2019). Overall, multi-omics studies have become a reality. Such studies are largely driven by the availability of robust, high-speed sequencing technologies for transcriptomes and translatomes, as well as the use of high-resolution MS for the comprehensive characterization of metabolomes and proteomes.
CONCLUDING REMARKS
Multiple factors influence the plant proteome, making it much more diverse than previously predicted based on genome annotation. Protein abundance, subcellular localization, PTMs, and interactions are dynamically modulated during cellular responses to various biological conditions. The complexity of these signaling components requires the use of high-throughput proteomic technologies to study the underlying molecular mechanisms and to examine protein–protein interactions. Mass spectrometry allows researchers to obtain an in-depth view of all facets of the proteome. In combination with bioinformatics tools, MS-based interactomics has become a powerful tool for determining protein–protein interactions in plants (Zhang et al., 2019). We anticipate that further advances will allow in-depth single-cell proteomic analysis and robust computational processing, making proteomic analysis even more useful.
ACKNOWLEDGEMENTS
We apologize to colleagues whose work could not be cited due to space limitations. The work was supported by grants from the National Natural Science Foundation of China (31470345 to Y.C.; 32070300 to S.D.).