Volume 7, Issue 5-6, pp. 388-391
Personal Perspective

Bringing proteomics into the clinic: The need for the field to finally take itself seriously

Lennart Martens

Department of Medical Protein Research, VIB, Ghent, Belgium

Department of Biochemistry, Ghent University, Ghent, Belgium

Correspondence: Professor Lennart Martens, Department of Medical Protein Research, VIB, Ghent University, A. Baertsoenkaai 3, B-9000 Ghent, Belgium

E-mail: [email protected]

Fax: +32-92649484

First published: 02 May 2013

Abstract

Proteomics has fast become a standard tool in the life sciences, with increasingly sophisticated approaches and instruments delivering ever-growing numbers of identified and quantified proteins. Yet despite the enormous technological progress, and the triumphant papers published on whole-cell proteomes being collected and analyzed, proteomics has so far failed to enter the clinic for routine applications. This is a peculiar contradiction, and one that warrants some closer study. I here argue that for proteomics to make a difference in the clinic, it needs to stop shirking responsibility, and to mature into an analytical, transparent, and reproducible discipline that also invests in the consolidation of its technology rather than only focusing on the next big leap forward. A key enabling factor in this maturation process is quality control and quality assurance, with bioinformatics, in its least noticeable but most influential form, as a key underlying technology.

Abbreviations

  • CPTAC: Clinical Proteomic Tumor Analysis Consortium
  • QC: quality control
Proteomics has matured considerably in technological terms. The advent of improved instrumentation [1] and methods [2] has enabled the field to dig ever deeper into the proteomes of cells and tissues, identifying and quantifying thousands of proteins in the process [3-5]. As a result of this continuous increase in analytical power, several exultant papers on the technology have been published [6-8]. However, growth in a field does not (and should not) only consist of raw analytical power. Indeed, besides such technological development, a field can also mature as a respected supplier of information to downstream research, and it can become an accredited, production-grade discipline in terms of accuracy, precision, and reproducibility. The former, where proteomics can for instance function as a key data supplier for biological model development, has recently been addressed in an editorial on publication guidelines in the field, where it is pointed out that proteomics datasets should be aimed at such predictive modeling efforts in order for them to be considered of real value [9]. The latter type of growth, on the other hand, concerning the maturation of proteomics as a quality-assured analytical platform, seems to be less fashionable at the present time. Yet, for a field that has entered its second decade as a high-throughput means to detect and analyze entire proteomes, consolidation efforts to establish the mass spectrometer as a reliable and reproducible tool to analyze hundreds to thousands of samples with consistent sensitivity and specificity are already overdue.

That is not to say that efforts have not been undertaken, as recent publications on across-site reproducibility illustrate [10-12]. Yet despite the effort spent, all these studies are comparatively small scale, relying on relatively simple samples and on a high degree of coordination and crosstalk between the partners involved. It is also worth noting that some of these studies consider discovery experiments, while others look at validation experiments using targeted proteomics. These two types of experiments typically come with different robustness requirements, where discovery experiments can be more tolerant of limited robustness if follow-up studies are cheap and fast anyway. As such, a variable threshold for reproducibility should be taken into account, much as variable thresholds for statistical significance can be chosen a priori when applying a test.

Overall, however, as a field, we are quite far away from demonstrating that we can run something as straightforward as a set of 20 patient-derived samples reliably at 50 different sites in either discovery or validation type experiments, even if standardized instrumentation and protocols could be used. Note that there is no implicit statement here that we cannot in principle perform this feat with current technology; it is just that as a field we have never seriously invested in building up the required analytical rigor. The NCI CPTAC (Clinical Proteomic Tumor Analysis Consortium) project has probably come closest to such an effort, and it is telling that one of the main papers to come out of this project has been the definition of a set of automatically extracted quality control features that can be used to assess system performance [13].
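To make this more concrete, the sketch below illustrates what the automated extraction of run-level QC metrics can look like in practice. It is a minimal Python illustration, assuming the open source pyteomics library and an mzML input file; the function name and the handful of metrics computed here are illustrative choices, and the metric set defined in the CPTAC work [13] is considerably more extensive.

```python
# Minimal sketch: extract a few run-level QC metrics from an mzML file.
# Assumes the pyteomics library; the CPTAC metric set is far more
# extensive than the examples computed here.
from statistics import median

from pyteomics import mzml


def extract_qc_metrics(mzml_path):
    """Compute simple run-level quality control metrics from an mzML file."""
    ms1_tics = []   # total ion current per MS1 scan
    ms2_count = 0   # number of fragmentation spectra acquired
    with mzml.read(mzml_path) as reader:
        for spectrum in reader:
            if spectrum.get("ms level") == 1:
                ms1_tics.append(spectrum.get("total ion current", 0.0))
            elif spectrum.get("ms level") == 2:
                ms2_count += 1
    return {
        "ms1_scans": len(ms1_tics),
        "ms2_scans": ms2_count,
        "median_ms1_tic": median(ms1_tics) if ms1_tics else 0.0,
        # Ratio of MS2 to MS1 scans: a crude proxy for sampling efficiency.
        "ms2_per_ms1": ms2_count / len(ms1_tics) if ms1_tics else 0.0,
    }


if __name__ == "__main__":
    print(extract_qc_metrics("example_run.mzML"))  # hypothetical input file
```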
The CPTAC project work also showed the importance of consistent and standardized sample handling, and the need to limit preanalytical effects as much as possible, issues that are of course not unique to proteomics, and that need to be taken carefully into account in any analytical endeavor. Since the publication of the quality control paper by the CPTAC consortium, a follow-up implementation has been published to generalize the allowed input formats and fix some minor flaws [14], with another, more focused approach to automated quality control recently published as well [15]. Compared to an earlier effort to perform online quality scans to halt a system when its performance dropped below a threshold [16], these more recent papers all discuss extensive a posteriori metrics meant to be used exclusively for quality control and reporting. It cannot be emphasized sufficiently how important such a development is for the long-term growth of the field of proteomics, and the authors of the above-mentioned papers correspondingly deserve the community's respect for their role as intrepid trailblazers in this neglected area. Indeed, where previous reporting on documentation and quality filtering methods has covered the details of the experiment [17], or the specifics of protein or peptide identification [18, 19], there has as yet been no movement on the part of the journals to mandate concise, standardized quality control reports alongside proteomics data. It is worth noting in this context that the mandatory accompaniment of experimental data with a standardized quality control report is not an uncommon practice in other, closely related fields. The sibling field of protein structure analysis, for instance, has long mandated that every structure file be processed through publicly available, standardized online tools that produce standardized quality control reports (see http://www.rcsb.org/pdb/static.do?p=software/software_links/analysis_and_verification.html) as a prerequisite for structure deposition. It is a legitimate question, then, why a similar set of tools, standards, and stringent guidelines does not yet exist for MS-based proteomics.
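As a purely illustrative sketch of what such a standardized, machine-readable QC report accompanying a data deposition could look like, consider the following; the schema, field names, and identifiers are hypothetical, since no such community standard is currently mandated.

```python
# Minimal sketch of a standardized, machine-readable QC report that could
# accompany a dataset deposition. The field names and schema are
# hypothetical illustrations, not an established community standard.
import json
from datetime import date


def write_qc_report(run_id, metrics, instrument, out_path):
    """Serialize run-level QC metrics into a simple standardized report."""
    report = {
        "report_version": "0.1",   # hypothetical schema version
        "run_id": run_id,
        "instrument": instrument,
        "report_date": date.today().isoformat(),
        "metrics": metrics,        # e.g. output of extract_qc_metrics()
    }
    with open(out_path, "w") as handle:
        json.dump(report, handle, indent=2)


write_qc_report(
    run_id="lab42_run_0173",       # hypothetical identifiers throughout
    metrics={"ms1_scans": 4812, "ms2_scans": 28344, "ms2_per_ms1": 5.9},
    instrument="instrument-01",
    out_path="lab42_run_0173.qc.json",
)
```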

I believe that one of the main issues is the lack of confidence that proteomics researchers have in their own capabilities. Despite the headline success stories that proteomics analyses generate, very few labs actually accumulate a continuous and comprehensive set of metrics that show their overall analytical performance over time. That is not to say that no control is performed; rather, such procedures are in the eye of the beholder, and are usually based on ad hoc information gleaned from the overall results, or on incidental data from instrument reporting. The way I see to break out of this rather unproductive predicament is to move forward in small, incremental steps. First of all, the key concept upon which to base all other steps is informed self-confidence. For this, there is a need for freely available, open source, automated, and easy-to-use software to extract quality control metrics from MS datasets. These tools (there can be, and probably should be, more than one, as is the case for protein structure validators) then need to be put in place in proteomics labs, but initially only for internal use. As noted above, several such tools already exist, although they can likely be improved further. When researchers are confronted with standardized, easily inspected and analyzed reports on their own performance, corrective measures can readily be taken where required, and overall self-confidence will grow as the metrics show good performance. Notice how easy a sell this should be: free software, easy to use, that provides information to make you more confident when things go right, and that helps you take immediate corrective action if things go wrong. An ideal partner to such quality control (QC) introspection will be the availability of standardized and proven (through peer replication and review) key protocol steps. Interesting efforts in this direction have also shown up in the literature, with a recent example by the Sickmann group where trypsin digestion is scrupulously evaluated, even going to the effort of comparing the performance of trypsin batches obtained from different vendors [20].

The next step up the ladder, then, introduces the concept of transparency, where the in-house quality control becomes public, through association with publicly shared data in repositories such as PRIDE [21]. Obviously, there will be more hesitation about this step, even if researchers feel quite confident of their own data. Will the community accept the achieved levels of quality, or will the data look bad in comparison to others? I firmly believe that this sort of transparency in data quality reports will actually show that proteomics is a technique that works well in the hands of many, rather than showing that it is an arcane art best left to only a select few specialists. Note that this does leave ample room for specialization in terms of the complexity of the analytical task that can be handled, yet robust, analytical proteomics will never be cutting-edge proteomics. It will rather be last year's proteomics, but done carefully and consistently. The positive news here is that the required bioinformatics infrastructure to support such transparency is already very much in place, and that little additional investment is needed to produce a working system where mandatory deposition of QC data along with experimental data can be safely instituted by journals in the field.
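The kind of longitudinal self-monitoring advocated above can be illustrated with a short sketch: a lab tracks a QC metric across runs and flags a new run that falls outside control limits derived from its own history. The 3-sigma Levey-Jennings-style rule and the numbers below are illustrative assumptions, not a prescribed standard.

```python
# Minimal sketch of longitudinal QC self-monitoring: flag a new metric value
# that drifts beyond control limits derived from a lab's own historical
# baseline. The 3-sigma rule is an illustrative choice.
from statistics import mean, stdev


def check_metric(history, new_value, n_sigma=3.0):
    """Compare a new metric value against control limits from past runs."""
    if len(history) < 10:
        return "baseline"  # too few runs to derive meaningful limits
    center, spread = mean(history), stdev(history)
    if abs(new_value - center) > n_sigma * spread:
        return "out of control"  # corrective action warranted
    return "in control"


# Hypothetical median MS1 TIC values from the last ten runs of an instrument:
tic_history = [3.1e9, 2.9e9, 3.3e9, 3.0e9, 3.2e9,
               2.8e9, 3.1e9, 3.0e9, 2.9e9, 3.2e9]
print(check_metric(tic_history, new_value=1.2e9))  # prints "out of control"
```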

Once a system is in place that records and shares standard QC reports from labs all over the world, it will be time to advance another step. A variety of labs should then be enlisted to perform across-laboratory comparisons on one or more reference samples. Here too, trailblazing efforts have already been carried out, including the famous ABRF Proteomics Research Group studies that brought us the UPS (universal proteomics standard) samples, the HUPO (Human Proteome Organisation) test samples study (rather unfortunately and incorrectly referred to colloquially by some as the HUPO “irreproducibility study”) [22], the NCI CPTAC yeast samples [11], and even standard samples of phosphorylated peptides [23]. By working toward a consistent, high-quality analysis of such a standard (or set of standards), the field would effectively be moving toward support for accreditation testing, where a prescribed standard of analytical prowess becomes formalized and routinely tested. Such a level of analytical rigor would break the last conceptual barriers toward running proteomics analyses routinely in the clinic.

At the same time, it needs to be pointed out that this last step constitutes a large amount of work, and that it is inherently difficult to reconcile with continuous technology development. Indeed, it is extremely hard to standardize and consolidate to such high levels when the underlying instrumentation and methodologies remain in constant flux. It thus becomes important to recognize whether the development of the field is best characterized by a hyperbola, indicative of continuous asymptotic growth, or whether it manifests itself as a series of sigmoids, where periods of intense and rapid development alternate with relatively stable periods of only incremental advances. Perhaps needless to say, the latter scenario is much more conducive to consolidation efforts, while being less appealing to instrument vendor marketing departments. It is also understood that such a level of consolidation is decidedly less appealing than continuous development, and that it is correspondingly harder to obtain funding for these efforts. Yet, despite this hurdle, sufficient examples already exist where large-scale funders with important stakes in delivering on this technology have provided clear vision with respect to funding efforts to push quality control in the field of proteomics. Notable examples include the EU-funded standardization and QC efforts in the FP6 ProDaC grant and the FP7 ProteomeXchange and PRIME-XS grants, the above-mentioned NCI CPTAC program, and the Wellcome Trust's support for quality control at the PRIDE database [24].
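As an illustration of what such formalized, routine testing could look like, the sketch below scores each participating lab's measurement of a reference sample as a z-score against a robust consensus value, in the style of classical proficiency testing; the scoring thresholds, lab names, and values are hypothetical.

```python
# Minimal sketch of a proficiency-testing style evaluation for
# across-laboratory comparisons: score each site's measurement of a
# reference sample against the robust consensus of all sites.
# Thresholds, lab names, and values are hypothetical illustrations.
from statistics import median


def score_labs(measurements, warn=2.0, fail=3.0):
    """Assign proficiency z-scores to per-lab measurements of one analyte."""
    values = list(measurements.values())
    consensus = median(values)
    # Robust spread estimate: scaled median absolute deviation.
    mad = median(abs(v - consensus) for v in values) * 1.4826
    results = {}
    for lab, value in measurements.items():
        z = (value - consensus) / mad if mad else 0.0
        status = ("pass" if abs(z) <= warn
                  else "warning" if abs(z) <= fail
                  else "fail")
        results[lab] = (round(z, 2), status)
    return results


# Hypothetical abundances reported by five sites for one reference protein:
print(score_labs({"lab_A": 101.0, "lab_B": 98.5, "lab_C": 103.2,
                  "lab_D": 99.8, "lab_E": 131.0}))  # lab_E fails
```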

As a field, I therefore believe we should now take a moment to reflect upon our future, and on the role and relative importance of quality control in our growth strategy. Allowing proteomics to mature to the point that it delivers in the clinic is not a pipe dream, but it is a lot of work. And while this work may not seem cutting edge, and does not hold the appeal of the next big leap forward, it will be essential for the long-term viability of the field. The basic premise, after all, is simply that the field takes up its responsibility as a unique and powerful analytical tool in the life sciences, and vows to continue to do so well into the future.

    Acknowledgments

    The author acknowledges the support of Ghent University (Multidisciplinary Research Partnership “Bioinformatics: from nucleotides to networks”), and the PRIME-XS and ProteomeXchange projects funded by the European Union 7th Framework Program under grant agreement numbers 262067 and 260558, respectively.

    The author has declared no conflict of interest.
