Proteins in the cell
The history of protein science is largely the history of test tubes. Our first glimpses of the molecular basis of cellular function came with the development of techniques to purify a protein with a particular interesting activity away from all other biomolecules present in a crude cell extract. Once a protein was purified to homogeneity, its enzymatic activity, binding activity, structural features and so forth could be studied exhaustively using the in vitro ensemble techniques pioneered by Gerty and Carl Cori, Arthur Kornberg, Albert Lehninger and other giants of protein science.1 This test tube approach was essential for establishing who does what in the cell, and how, in some cases providing hydrogen bond-level resolution. However, by focusing so intensely on dilute, homogeneously distributed, highly purified proteins, we sacrificed the spatial and temporal heterogeneity that defines the cell. As a result, much remains unknown about how proteins work (and work together) in their distinct cellular microenvironments.
The 16 articles in this special issue, “Proteins in the Cell,” explore our emerging understanding of the effect of the cellular environment on protein function. To be clear, this special issue is not intended to be a comprehensive sampling of the field. Rather, these articles highlight some common themes while also providing an excellent introduction to the diversity of cell-based protein studies. As with all emerging areas, this includes a significant focus on the development of new enabling technologies (both experimental and computational) designed to answer previously impenetrable questions.
An early peek into the extent to which the cellular environment might modulate protein function came from the discovery of moonlighting proteins, individual polypeptide sequences that perform two or more entirely distinct yet physiologically relevant functions. Constance Jeffery (https://doi.org/10.1002/pro.3645) provides a fascinating review of our current (but still growing) understanding of the wide range of moonlighting protein functions, as well as the cellular features, such as subcellular compartmentalization and cofactor availability, that can regulate switching of a single protein between its distinct functions. Such dual-acting proteins highlight the limitations of test tube experiments for predicting protein behavior in the cell, a recurring theme throughout this issue and one likewise highlighted in Charles Sanders's review of our current (and still quite limited) understanding of the role of amyloid precursor protein amyloidogenesis in Alzheimer's disease etiology (https://doi.org/10.1002/pro.3606).
New experimental methods will be essential to deepen our understanding of protein function in the cell, as many traditional test tube approaches do not provide the specificity, signal-to-noise ratio, or temporal resolution required to interrogate protein function within intact cells. Cathy Royer reviews recent advances in super-resolution fluorescence microscopy that are helping to overcome some of these hurdles (https://doi.org/10.1002/pro.3630). Likewise, Fu and Chang review the promise and potential of site-specific incorporation of photo-crosslinking probes as another approach to visualize proteins in their native environment (https://doi.org/10.1002/pro.3627). Gary Pielak and coworkers quantify protein abundance in a “tunable” expression system and use these results to speculate on what types of experiments will or will not be amenable to current in-cell NMR methods (https://doi.org/10.1002/pro.3637).
Many classic test tube techniques used to interrogate protein function rely on measuring small perturbations from a thermodynamically equilibrated system. Yet a living cell is never at equilibrium: in the cell, kinetics is king. This shift in emphasis has required the development of new computational approaches to model complex nonequilibrium systems, two of which are shared here: (i) Karen Fleming and coworkers develop a kinetic model (https://doi.org/10.1002/pro.3641) to help explain how the promiscuous and fleeting binding of “holdase” chaperones effectively shepherds transmembrane β-barrel proteins across the periplasm to the outer membrane of Gram-negative bacteria, including Escherichia coli; and (ii) the kinetic model developed by Lila Gierasch, Evan Powers and coworkers is used to examine the relationship between protein folding rates and a “critical time scale” that characterizes a protein's lifetime in a given environment (https://doi.org/10.1002/pro.3639).
Even in a relatively simple cell like E. coli, most proteins function as multimeric assemblies.2 This means that adjustments to intracellular protein concentration can provide a sensitive mechanism to regulate protein assembly and, by extension, protein function. Such concentration adjustments could be made either by adjusting protein production or by selective degradation. But with so many different proteins in the cell, what signals are used to achieve the specificity necessary to degrade only a single protein of interest? Tomita and Matouschek review our current understanding of sequence features that lead some proteins to be inherently or situationally more prone to degradation by the proteasome than other proteins in the eukaryotic cytosol (https://doi.org/10.1002/pro.3642). The article by Tania Baker and coworkers explores the extent to which differences in protein assembly itself can modulate the ability of the multimeric E. coli protease Lon to efficiently digest different substrates (https://doi.org/10.1002/pro.3553). Completing the circle, Duran and Lucius use a classic in vitro biophysical technique, analytical ultracentrifugation, to identify a distinct dodecameric assembly state of the (usually hexameric) E. coli protease ClpAP whose population depends on ATP concentration (https://doi.org/10.1002/pro.3638).
No collection of articles devoted to our molecular understanding of cell function would be complete without articles focusing on the contributions of ATP, that essential energetic lever used to drive thermodynamically unfavorable cellular reactions. Two are included here: Heedeok Hong and coworkers investigate the unusual ATP concentration dependence of the E. coli protease FtsH, discovering how this enables FtsH to do the hard work of extracting transmembrane proteins from the lipid bilayer (https://doi.org/10.1002/pro.3629). In yeast, Jeff Brodsky and coworkers investigate a similarly energetically challenging job, extraction of misfolded proteins from the endoplasmic reticulum, catalyzed by Hsp104 (https://doi.org/10.1002/pro.3636). Following on this important theme of sorting proteins into, across, and out of cellular membranes, van der Sluijs and coworkers introduce us to the Canopy protein family, a relatively poorly described set of four to five proteins important for supporting proper secretion of a wide variety of eukaryotic proteins (https://doi.org/10.1002/pro.3635).
Evolution provides an important lens through which to understand the development and diversity of protein function. Harms and coworkers (https://doi.org/10.1002/pro.3644) use ancestral sequence reconstruction to recreate protein progenitors of extant versions of Toll-like receptor 4 and trace the development of its functional characteristics. Shakhnovich and coworkers use a chimera approach to test the contributions of sequence “blocks” with different ancestral lineages to protein function and cellular phenotypes (https://doi.org/10.1002/pro.3646).
As the articles above highlight, one of the challenges of understanding protein function in vivo is the sheer diversity of proteins themselves. Crucial to developing a comprehensive understanding of protein function will be the development of experimental methods that can be broadly applied, not just to a few “well-behaved” model systems but across the breadth of a proteome. Hideki Taguchi and coworkers supply one option here: a system to evaluate folded status after in vitro translation of essentially any protein of interest (https://doi.org/10.1002/pro.3624).
My hope is that the 16 articles collected here highlight to the reader that we have arrived at a new era for protein science, an era leading us to reexamine textbook paradigms for protein function in a new light. Only within the cellular environment can we develop a fully predictive understanding of protein function. By developing new methods and model systems to interrogate proteins in the cell, we can revolutionize our understanding of cell function at the molecular level, much like the advent of in vitro ensemble protein measurements >50 years ago and single-molecule techniques ~20 years ago, both of which ushered in profoundly new levels of understanding of protein and cellular function.