Raman spectroscopy of protein pharmaceuticals
Abstract
Recent advances in optical and spectroscopic technologies have enabled a plethora of Raman spectrometers that are suitable for studies of protein pharmaceuticals. Highly sensitive Raman spectrometers have overcome the handicap of the fundamentally weak Raman effect that hampered their application to protein pharmaceuticals in the past. These Raman spectrometers can now routinely measure protein therapeutics at concentrations as low as 1 mg/mL, which is on par with other spectroscopic methods such as CD, fluorescence and FTIR spectroscopies. In this article, various Raman techniques that can be used for protein pharmaceutical studies are reviewed. Novel Raman markers of proteins discovered from fundamental studies of protein complexes are examined along with established Raman spectra and structure correlations. Examples of Raman spectroscopic studies of protein pharmaceuticals are presented. Future applications of Raman spectroscopy to protein pharmaceuticals are discussed. © 2007 Wiley-Liss, Inc. and the American Pharmacists Association J Pharm Sci 96: 2861–2878, 2007
INTRODUCTION
Raman spectroscopy was born nearly 80 years ago, in 1928, when Sir C.V. Raman, working with K.S. Krishnan, discovered the inelastic scattering of light by molecules (the Raman effect, named after him), using their eyes as the detector.1 It measures the frequency shifts of the scattered light, which correspond to the energy differences between the ground and excited vibrational states of molecules interacting with a beam of light. The result of the measurement is a Raman spectrum of the molecules containing a large number of bands that correspond to the molecular configuration. The Raman spectral pattern is regarded as the fingerprint of the molecules. As a versatile technique, Raman spectroscopy has found wide applications in physics, chemistry, biology and material science.2 Materials ranging from organics, polymers, semiconductors, and ceramics to biological molecules including DNA, proteins, and carbohydrates can all be studied with Raman spectroscopy. The principle and practice of Raman spectroscopy have been documented in numerous monographs.2-7 Its applications to conventional pharmaceuticals have been reviewed recently.4
The principle of Raman scattering is briefly illustrated in the electronic-vibronic energy diagram in Figure 1. In infrared absorption spectroscopy, the molecules absorb infrared radiation and undergo a transition from the ground vibrational state to the excited vibrational state, generating an IR absorption spectrum. Raman scattering is a two-photon process involving the interaction of light with the molecule. The molecule in a laser beam first absorbs a photon and is promoted to a virtual state, and then emits a photon that has exchanged energy with the vibrating molecule. If the energy of the emitted photon is the same as that of the incoming photon, the result is Rayleigh scattering, which is millions of times stronger than Raman scattering. If the energy of the emitted photon is lower than that of the incoming photon, the molecule in the ground state has gained energy from the photon and made a transition to the excited vibrational state. This results in Stokes Raman scattering, which is measured in a conventional Raman experiment. On the other hand, if the photon gains energy from a molecule that is already in an excited vibrational state, the emitted photon has a higher energy than the incoming photon while the molecule falls back to the ground state. This process is called anti-Stokes Raman scattering. The intensity of anti-Stokes Raman scattering is weaker than that of Stokes Raman scattering because the population of molecules in the excited vibrational state is always smaller than the population in the ground state, according to the Boltzmann distribution law.
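The Boltzmann argument for the weakness of anti-Stokes scattering can be made concrete with a short calculation. The following is a minimal sketch, not part of the original article: the function name, the room-temperature default, and the example band positions are illustrative assumptions. It evaluates only the population factor exp(−hcν̃/kT); the measured anti-Stokes/Stokes intensity ratio also contains a frequency-dependent scattering factor that is not included here.

```python
import math

# Physical constants (exact SI values)
H = 6.62607015e-34   # Planck constant, J*s
C = 2.99792458e10    # speed of light in cm/s, so wavenumbers in cm^-1 can be used directly
K_B = 1.380649e-23   # Boltzmann constant, J/K


def excited_state_fraction(shift_cm1: float, temperature_k: float = 298.15) -> float:
    """Boltzmann population ratio N(v=1)/N(v=0) for a vibration at shift_cm1 (cm^-1)."""
    return math.exp(-H * C * shift_cm1 / (K_B * temperature_k))


# Illustrative band positions: a low-frequency mode, the Phe ring mode, and Amide I
for shift in (400.0, 1002.0, 1655.0):
    print(f"{shift:7.0f} cm^-1  N1/N0 = {excited_state_fraction(shift):.2e}")
```

Under these assumptions the excited-state population falls off steeply with increasing Raman shift, which is consistent with the Stokes side being the one recorded in a conventional experiment.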

Figure 1. The energy diagram of vibrational transitions between the vibrational energy levels corresponding to the processes of infrared absorption, Rayleigh scattering and Stokes and anti-Stokes Raman scattering. E0 and E1 are the electronic ground and excited states. υ0 and υ1 are the vibrational ground and excited states.
Figure 2 depicts a typical experimental set-up for a visible dispersive Raman spectrometer based on a reflective grating in the Czerny–Turner configuration. The laser beam is directed onto the sample cuvette with a focusing lens. The scattered Raman light is collected with a collection lens, which focuses it through a slit onto the first mirror of the spectrograph. This mirror reflects the light onto the grating, which disperses it. The dispersed light is then collected by the second mirror of the spectrograph, which focuses it onto the CCD detector, and the detector output is processed by a computer.

Figure 2. A typical Raman experimental set-up based on a single grating spectrograph with Czerny–Turner configuration.
Although Raman spectroscopy has been employed for fundamental studies of proteins since the 1960s, following the advent of the laser, its use to study protein pharmaceuticals was hampered in the past because older Raman spectrometers, with their low sensitivity, required concentrated protein samples of 20–50 mg/mL.5, 6 Moreover, the early protein therapeutics were generally developed at low concentration dosages. For instance, both the red blood cell regulator rhEPO and the neutropenia therapeutic rhGCSF were formulated in dosages of ∼1 mg/mL.8 Raman spectroscopy has therefore remained an underutilized technique for the characterization of protein pharmaceuticals, unlike other spectroscopic techniques such as CD, fluorescence and FTIR9 that are employed routinely.
The recent development of highly sensitive Raman spectrometers has completely changed the landscape of Raman spectroscopy, thanks to a series of innovations in optical and spectroscopic technologies over the last two decades, including the thinned back-illuminated CCD detector,10 the holographic notch filter,11 the UV edge filter,12 the fiber optic probe,13 the fast spectrograph based on a single holographic transmissive grating,11 and the continuous-wave UV laser.14 Modern Raman spectrometers that incorporate these novel optical components have increased instrument sensitivity by many orders of magnitude. The newly developed highly sensitive UV resonance Raman and visible dispersive Raman spectrometers can measure protein solutions at a concentration of 1 mg/mL.2-4 Figure 3 shows the high quality Raman spectrum of a monoclonal antibody at 2 mg/mL obtained with an excitation wavelength of 532 nm on a dispersive Raman spectrometer. Figure 4 displays a UVRR spectrum of GCSF at 0.6 mg/mL excited at 229 nm. This instrument sensitivity is comparable to that of circular dichroism, fluorescence and FTIR spectroscopies.

Figure 3. Visible Raman spectrum of a monoclonal antibody (2 mg/mL) in aqueous solution at neutral pH excited with 532 nm laser line.

Figure 4. UVRR spectrum of GCSF (0.6 mg/mL) in aqueous solution at pH 7 with 229 nm excitation line.
The purpose of this article is to provide an overview of the recent progress in Raman instrumentation that has overcome the previous detection limits, and to demonstrate the application of Raman techniques to protein pharmaceuticals. The Raman spectrometers available for protein pharmaceutical characterization will be discussed, including FT-Raman, dispersive visible Raman, Raman optical activity, UV resonance Raman, surface enhanced Raman spectroscopy, and Raman microscopy. We will also examine some novel Raman markers of proteins that were discovered recently in studies of protein–DNA complexes, along with previously established Raman spectra-structure correlations of proteins. These Raman features provide the foundation for investigating the local environments, hydrogen bonding, orientation, and interactions of side chains such as aromatics, free sulfhydryl groups, and disulfide bonds, which play a key role in protein stability and bioactivity. Finally, we will review examples of Raman spectroscopic applications to protein pharmaceuticals in the native and denatured states, conformational change, protein–protein interactions, protein aggregation and protein drug product characterization. The protein therapeutics that have been investigated by Raman spectroscopy encompass recombinant proteins from E. coli and mammalian cells, pegylated proteins, glycoproteins, and monoclonal antibodies.8
The Merits of Raman Spectroscopy for Protein Biopharmaceutical Studies
FTIR spectroscopy has been a routine tool for protein pharmaceutical analysis for many years,9 yielding information on the secondary structure and conformational changes of proteins as well as intermolecular interactions. As a complementary vibrational spectroscopic technique, Raman spectroscopy can provide similar information on the secondary structure of the protein backbone while offering more structural details on chromophores, aromatics, and other side chains in proteins that reflect the tertiary structure. Moreover, there are a number of advantages of Raman spectroscopy over FTIR for protein pharmaceutical characterization from the perspective of fundamental physical chemistry.
First, the vibrational selection rules of molecules give Raman a definitive advantage over FTIR for probing the structure of side chains in proteins. The molecular groups that reside in the protein side chains, such as the CC, SS, CS, and SH groups, are rich in π-electrons and thus have larger polarizability. Molecular groups that have large polarizability and totally symmetric vibrational modes are Raman active but may not be IR active, in which case they show no IR bands. The SH group in proteins gives weak signals in both the Raman and IR spectra, but its IR signal is extremely weak and requires very concentrated samples to be measured.15 Second, water, the natural medium for proteins, is a weak Raman scatterer. The Raman spectrum of a protein in aqueous solution can therefore be measured directly without further sample handling, whereas water strongly absorbs infrared radiation and thus interferes significantly with the IR spectrum of proteins in aqueous solution. Third, Raman scattering is an inelastic light scattering effect that can be excited with any single-wavelength beam from the deep UV to the near infrared. This wavelength selectivity offers enormous opportunities for measuring the Raman spectra of molecules by selecting the appropriate excitation wavelength. It is limited only by the availability of laser emission lines across the UV-visible region. Fourth, if the excitation laser wavelength falls within an absorption band of a chromophore in the molecule, a resonance Raman enhancement effect results.7, 16, 17 For instance, the UV resonance effect can enhance the Raman signal by 100–10000 fold, depending on the nature of the chromophores and the excitation wavelength. In the deep UV region, this allows UVRR to detect aromatic molecules below the ppm level in solution, or to detect a single molecule when combined with the surface enhanced Raman effect.
Finally, Raman experiments require as little sample as 1 ng and the Raman spectrum can be obtained from samples in almost any physical state whether they are liquids, solutions, crystals, powders or amorphous solids.
The pitfall of Raman spectroscopy is that it is a fundamentally weak physical phenomenon involving a second order perturbation in terms of quantum mechanics. It has been estimated that of a million incoming photons interacting with molecules, only approximately one scattered Raman photon is available for analysis. In addition, in the visible region, fluorescence of unknown origin emitted by the samples may overwhelm the Raman signal, as the quantum yield of fluorescence is much higher than that of Raman scattering. Recent progress in optical technology and spectroscopic instrumentation has overcome the handicap of the fundamentally weak Raman effect. Fluorescence interference also becomes less problematic for protein pharmaceuticals, as they are purified through multiple HPLC columns to the very high purity required for injection into the human body. Even if some minor residual fluorescence remains in the samples, it can be reduced to insignificant levels by letting the sample sit in the laser beam for a while before starting to collect the Raman spectrum. FT-Raman18 and UVRR can virtually eliminate fluorescence interference by using excitation wavelengths of 1064 nm and below 250 nm,19 respectively.
ADVANCES IN RAMAN SPECTROMETERS
The renaissance of Raman spectroscopy has resulted in a plethora of Raman spectrometers, each offering unique features for protein studies.18-25 In the following section, we briefly describe the various Raman spectrometers available for protein pharmaceutical applications. Numerous other exotic Raman techniques are beyond the scope of this article, and further details can be found in the relevant literature.2 The salient feature of FT-Raman is that it alleviates fluorescence interference, making it possible to measure the Raman spectra of molecules that previously could not be measured because of a strong fluorescence background.18 Dispersive Raman based on a CCD detector and a transmissive grating spectrograph offers the highest instrument sensitivity in the visible region.20 Raman optical activity spectroscopy, which uses right and left circularly polarized light, has the unique capability to probe molecular chirality. It has found many applications in protein, DNA and carbohydrate investigations as they are naturally chiral molecules.21, 22 UV resonance Raman16, 17, 19 is able to selectively probe chromophores and aromatic structures in complex molecules by allowing the selection of the excitation wavelength of interest. Surface enhanced Raman spectroscopy has reached the ultimate sensitivity and can probe a single molecule.23, 24 Raman microscopy provides microanalysis of pigments and of microparticles as well as chemical mapping of heterogeneous materials.25 It is a particularly powerful tool for solid state characterization with applications to samples such as lyophilized products and protein crystals.26
FT-Raman
FT-Raman was invented two decades ago and has become a routine tool for chemical analysis and conventional pharmaceutical applications.18 It employs a 1064 nm near infrared laser as the light source. The scattered Raman signal is processed by a Michelson interferometer and then Fourier transformed into a Raman spectrum in exactly the same way as in FTIR. The key optical component in FT-Raman is the Rayleigh scattering filter, which separates the scattered Rayleigh light from the scattered Raman signal. It completely blocks the Rayleigh scattering but allows the Raman signal to pass through to the FT-Raman optical bench for analysis.
The major advantage of FT-Raman is its applicability to samples that may emit strong fluorescence when excited with a visible laser beam. This is because the fluorescence quantum yield is significantly diminished in the near infrared region with a 1064 nm laser. Another advantage of FT-Raman is the precision of the Raman band frequencies, as they are calibrated against a reference laser frequency in the interferometer. This allows accurate spectral subtraction for complex or mixed samples. FT-Raman also covers the entire vibrational spectral region (4000–400 cm−1) in a single scan. The pitfall of FT-Raman for protein pharmaceutical study is its poor instrument sensitivity. This results from the fundamental physics of light scattering and the low quantum yield of the detector in the near infrared region. The Rayleigh law of light scattering states that the scattered light intensity is inversely proportional to the fourth power of the incoming light wavelength ($I_s \sim I_0/\lambda^4$). The longer the excitation wavelength, the weaker the scattered intensity. For instance, comparing the two excitation wavelengths of 1064 nm in the near infrared and 532 nm in the visible region, the scattered Raman intensity for the same vibrational band would be 16 times lower with the 1064 nm wavelength than with the 532 nm line ($I_{532}/I_{1064} = (1064\ \mathrm{nm}/532\ \mathrm{nm})^4 = 2^4 = 16$). It generally takes 2 or more hours to acquire a good Raman spectrum of protein samples at 30–50 mg/mL by FT-Raman. FT-Raman has nevertheless been applied to study protein pharmaceuticals in a number of cases, but it is handicapped by its low instrument sensitivity and is not suitable for characterizing protein pharmaceuticals at low concentrations.
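The wavelength scaling quoted above is easy to reproduce for other laser lines. The following is a minimal sketch under the same simplification the text uses, namely that the scattered intensity scales as the inverse fourth power of the excitation wavelength with everything else (laser power, detector efficiency, resonance effects) held equal. The function name `lambda4_gain` is illustrative and not taken from the article.

```python
def lambda4_gain(short_nm: float, long_nm: float) -> float:
    """Intensity ratio I(short)/I(long) under the simplified I ~ 1/lambda^4 rule."""
    if short_nm <= 0 or long_nm <= 0:
        raise ValueError("wavelengths must be positive")
    return (long_nm / short_nm) ** 4


# 532 nm versus 1064 nm, the comparison made in the text: 2**4 = 16
print(lambda4_gain(532.0, 1064.0))

# A 785 nm diode laser versus the 1064 nm FT-Raman line
print(lambda4_gain(785.0, 1064.0))
```

The same function covers any pair of excitation lines, so it can be used to judge how much scattering efficiency is traded away when a longer wavelength is chosen to suppress fluorescence.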
Visible Dispersive Raman
In contrast to FT-Raman, dispersive Raman uses a grating as the central optical element to analyze the scattered Raman signal. A dispersive Raman spectrometer consists primarily of a visible laser, a holographic grating spectrograph and a CCD detector, as shown in Figure 2. The samples are illuminated with a laser line in the visible region. Commonly employed lasers include the argon ion laser that emits the powerful 514.5 nm and 488.0 nm lines, the solid state laser at 532 nm, and the He–Ne laser at 633 nm, as well as the 785 nm diode laser, which can partially reduce fluorescence interference as it is near the edge of the infrared region. Conventionally, a dispersive Raman spectrometer employed two or even three gratings arranged in a complex configuration to obtain high spectral resolution and to completely suppress the Rayleigh scattering wing at low frequency for solid state and crystal studies.3 Nowadays, however, the majority of dispersive Raman spectrometers employ only a single grating as the dispersive component, because the single grating spectrograph has a much higher overall throughput than the double grating spectrograph or triplemate and still provides good spectral resolution with a higher groove density grating.3, 20
Two major types of dispersive visible Raman spectrographs are available commercially. The first is based on a traditional single reflective holographic grating together with a CCD detector and a laser. The second type employs a high throughput transmissive grating spectrograph with fixed collection optics.3, 20 Each spectrograph design has its own advantages and shortcomings. The single reflective grating spectrograph has the flexibility to collect the entire vibrational Raman spectrum by scanning the grating, and it may also achieve better resolution when a narrower slit is used, but it suffers from lower throughput, as the reflective grating has only about 50% reflection efficiency versus more than 95% for the transmissive grating in the visible region. The transmissive grating offers the best throughput, but it can only measure a spectral segment of about 2000 cm−1 in a fixed configuration. The spectral resolution and coverage are predetermined by the configuration of the spectrograph, with no capability to scan to different spectral regions. A novel design of a transmissive grating Raman spectrometer combines the transmissive grating with a multi-track CCD detector, which can simultaneously measure two segments of the Raman spectrum that are diffracted at different angles. It projects one segment of the Raman spectrum onto the upper track of the CCD and the other segment onto the lower track, and then splices them together.3 This overcomes the limitation of the narrow spectral range of a fixed grating configuration. Dispersive Raman based on this design offers the best instrument sensitivity and spectral resolution available so far.20
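Because a fixed-grating spectrograph covers a set window of Raman shifts, it can help to see which absolute wavelengths that window occupies on the detector for each laser line mentioned in this section. The following is a minimal sketch using the standard relation between absolute wavenumber and wavelength (1 cm−1 = 10^7 / wavelength in nm). The function name and the 400–2400 cm−1 window are illustrative assumptions chosen to match a segment of about 2000 cm−1; they do not describe any particular instrument.

```python
def stokes_wavelength_nm(excitation_nm: float, shift_cm1: float) -> float:
    """Absolute wavelength (nm) of Stokes-scattered light for a given Raman shift."""
    excitation_cm1 = 1.0e7 / excitation_nm       # nm -> absolute wavenumber in cm^-1
    scattered_cm1 = excitation_cm1 - shift_cm1   # the Stokes photon has lost energy
    if scattered_cm1 <= 0:
        raise ValueError("Raman shift exceeds the excitation photon energy")
    return 1.0e7 / scattered_cm1


# Laser lines named in the text; the shift window is a hypothetical 2000 cm^-1 segment
for laser_nm in (488.0, 514.5, 532.0, 633.0, 785.0):
    start = stokes_wavelength_nm(laser_nm, 400.0)
    end = stokes_wavelength_nm(laser_nm, 2400.0)
    print(f"{laser_nm:6.1f} nm laser: 400-2400 cm^-1 spans {start:.1f}-{end:.1f} nm")
```

The same Raman shift window maps onto a wider wavelength interval as the excitation moves toward the red, which is one reason grating and detector choices are tied to the laser line.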
Raman Optical Activity
Raman optical activity spectroscopy (ROA)21, 22, 27-32 measures the Raman intensity difference spectrum of chiral molecules interacting with right and left circularly polarized light. This is a special Raman technique that was discovered 30 years ago but was hampered by its extremely weak signal, because ROA is generally 10000 times weaker than the parent Raman intensity. ROA therefore demands the most sensitive dispersive Raman spectrometer to detect the weak signal. In addition, ROA needs extra optical components to convert the incoming laser light or the scattered Raman photons into right and left circularly polarized light. The necessary polarization modulation of light is achieved either in the incident light beam27, 29 or in the scattered Raman photons.30 The first commercial ROA spectrometer (Biotools Inc., IL) implemented the scattered circular polarization (SCP) strategy with a dual optical fiber collection31 together with a double track CCD detector, in which each CCD track records one of the circularly polarized scattered Raman signals. The right and left circularly polarized Raman spectra are then subtracted to generate the Raman difference spectrum.
An ROA spectrum can provide stereochemical information on chiral molecules. It contains Raman bands related to every part of the chiral molecule, each having a sign that depends on the absolute configuration of the substructure. The most important structural information obtained from an ROA measurement is as follows. First and foremost is the absolute configuration of the molecule, which is indicated by the plus or minus sign of the ROA bands. ROA thus offers the capability to determine the absolute configuration of a chiral molecule in solution. This is particularly attractive for applications in pharmaceutical chemistry, as often only one enantiomer is the active pharmaceutical ingredient in small molecule drug development. The second piece of information is the magnitude of the ROA intensity, the Δ value, which is a dimensionless quantity defined as $\Delta = (I_R - I_L)/(I_R + I_L)$. Its value varies from $10^{-2}$ to $10^{-5}$ depending on the molecular structure. The ROA Δ value is a good measure of the rigidity or flexibility of the chiral molecule.28 Generally, a rigid molecule has a larger Δ value while a flexible molecule has a smaller one. The third important piece of information is contained in the entire pattern of the ROA spectrum, which reflects the overall three dimensional structure of a complex molecule such as a protein, DNA or carbohydrate. Completely deciphering the ROA pattern of large chiral molecules such as proteins remains a formidable challenge in this emerging field. Nevertheless, ROA measurement combined with chemometrics has shown promising results in correlating the ROA pattern with protein structure by systematically analyzing, with principal component analysis, the ROA spectra of proteins whose structures are known from X-ray crystallography.
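The dimensionless Δ value is simply the difference of the two circularly polarized intensities normalized by their sum, evaluated point by point across the spectrum. The following is a minimal sketch of that arithmetic; the function name and the detector counts are hypothetical and serve only to show the order of magnitude involved.

```python
def roa_delta(i_right, i_left):
    """Point-by-point dimensionless ROA ratio (I_R - I_L) / (I_R + I_L)."""
    if len(i_right) != len(i_left):
        raise ValueError("spectra must have the same number of points")
    deltas = []
    for r, l in zip(i_right, i_left):
        total = r + l
        # Guard against empty channels where the ratio is undefined
        deltas.append((r - l) / total if total != 0 else float("nan"))
    return deltas


# Hypothetical detector counts for three spectral channels
i_r = [100010.0, 250000.0, 80004.0]
i_l = [99990.0, 250000.0, 79996.0]
print(roa_delta(i_r, i_l))
```

With counts of this size a difference of a few tens of counts out of a few hundred thousand gives Δ on the order of 10^-4, which illustrates why ROA needs a very sensitive spectrometer and long accumulation.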
Raman Microscopy25, 26, 33-35
Raman microscopy is the result of the marriage of a Raman spectrometer and an optical microscope. Most Raman microscopes combine a dispersive Raman spectrometer with an optical microscope, either by using an optical fiber to transmit the Raman signal from the microscope stage to the Raman analyzer34 or by integrating the two systems completely as a stand-alone instrument. Often one or two excitation laser lines are available on a Raman microscope, and they can be selected according to the needs of the sample. A 785 nm line is preferred to reduce fluorescence background, and a shorter wavelength line at 633 or 532 nm is chosen for better instrument sensitivity with samples that have little fluorescence. The Raman microscope is particularly powerful for microanalysis and chemical mapping of materials in the solid state.35 Raman microscopy is an essential technique for forensic analysis in the pharmaceutical industry in conjunction with FTIR microscopy.35 Raman microscopy has also been shown to be particularly powerful for studies of protein crystals26 and for determining the orientation of side chain groups in protein complexes.36
UV Resonance Raman
UV resonance Raman spectroscopy measures the Raman spectrum of the chromophore in a large molecule that absorbs in the UV region. It has certain advantages over other types of Raman spectroscopy. First, of course, it has the resonance enhancement effect, which can increase the Raman signal by 100–10000 fold. Second, it provides the choice of the excitation wavelength that would generate the strongest Raman intensity. For instance, to probe a virus that is assembled from protein and DNA, one can choose the favorable excitation wavelengths of 244 and 257 nm for DNA and 229 and 238 nm for protein aromatics.37-40 Many overlapped weak Raman bands of the aromatic species are well resolved, as a large part of the molecule is not resonance enhanced. The third advantage is that it is free of fluorescence interference below 250 nm.19 Finally, since UVRR employs a short wavelength UV laser, it naturally inherits the advantage of the Rayleigh light scattering law that favors short wavelengths. For instance, comparing the excitation wavelength of 514 nm in the visible region and 257 nm in the UV region, even without resonance enhancement, the Raman scattering intensity of a molecule would be 16 times stronger at 257 nm than at 514 nm, assuming equivalent efficiency of the Raman spectrometer and the same laser power. The drawbacks of UVRR are that it requires an expensive CW-UV laser and more complicated instrumentation. Another major concern is that the samples may be compromised by the UV radiation if higher laser power is required to measure a weak signal. Fortunately, recent advances in UV Raman instrumentation have enabled measurements of protein samples at 1 mg/mL using less than 1 mW of laser power focused at the sample position without compromising the sample.
Surface Enhanced Raman
Surface enhanced Raman spectroscopy (SERS) has gained significant interest in the last decade.23, 24 The Raman scattering signals of molecules are dramatically enhanced when the molecules are adsorbed on silver colloid substrates. The Raman spectrometers used to measure SERS are exactly the same as those for visible Raman or UVRR measurements, except that sampling employs a nanosized substrate made of silver or gold in the range of 10–150 nm. The maximum value of the SERS enhancement factor can be on the order of $10^6$ to $10^8$. Surface enhanced resonance Raman scattering (SERRS) uses an excitation wavelength in resonance with an absorption band of the molecule. The combined surface enhancement and resonance enhancement factors would be on the order of approximately $10^{10}$ to $10^{11}$ for some chromophores adsorbed on colloidal silver. This tremendous boost in signal has enabled the measurement of the Raman spectrum of single molecules.41 SERS has been exploited to study amino acids, peptides, and small proteins.23, 24 It could in principle be applied to study large protein pharmaceuticals as well.
SPECTRAL AND STRUCTURE CORRELATIONS
The correlation between the Raman band frequencies and intensities and the protein structure is necessary for the analysis of protein pharmaceuticals by Raman spectroscopy. These correlations are the result of extensive research over the last four decades, comprising both theoretical calculations on model compounds such as N-methylacetamide and small peptides42, 43 and experimental studies of proteins.5-7, 42-46 The main contributors to the Raman spectrum of a protein are the vibrational modes of the polypeptide backbone, overlapped with bands of the side chain groups of the 20 different amino acids. A typical protein exhibits about 30 Raman bands in the fingerprint region between 2000 cm−1 and 400 cm−1, plus a few more bands in the interval of 4000–2500 cm−1 that are due to the vibrational modes of localized groups such as NH, OH, CH3, CH2, and SH. The NH, OH, and SH groups can form hydrogen bonds with proton acceptors and may show broadened band profiles depending on their hydrogen bonding state. The correlation between the Raman spectra and protein structure may be conveniently divided into polypeptide backbone modes and side chain modes. The former can be used to monitor the secondary structure of the protein, while the latter are exploited to probe the hydrogen bonding, local environment, and intermolecular interactions of the side chains.
The Secondary Structure of the Polypeptide Backbone
The correlation between the Raman band frequency and the secondary structure of the protein arises from the fact that the hydrogen bonding of the peptide bond differs among α-helix, β-sheet, and disordered structures. The Raman bands that are related to the peptide linkage (OCNH) are designated as the amide bands, based on the normal mode calculation of N-methylacetamide as a model compound for the peptide bond.43 The conformationally sensitive Amide I band arises mainly from the CO stretching mode of the peptide linkage. The CO group can form a hydrogen bond with the NH group of a peptide bond on another chain (inter-chain) or on the same chain at a different sequence position (intra-chain). If the polypeptide backbone is in an α-helix, in which the hydrogen bonds are formed between the CO and NH groups on the same chain, the Amide I band occurs at ∼1655 cm−1. Both rhGCSF and rhEPO are proteins that contain primarily α-helix structure and have the typical Amide I band at 1655 cm−1. The Amide I band occurs at ∼1670 cm−1 when the polypeptide backbone is predominantly β-sheet, in which the hydrogen bonds are formed between the CO and NH groups of adjacent chains arranged in either parallel or anti-parallel fashion. For loose β-sheet and disordered structure, it occurs at ∼1640 cm−1. Typical β-sheet proteins include monoclonal antibodies, which show a characteristic Amide I band at 1670 cm−1.
The Amide III band is also considered to be sensitive to the protein secondary structure and exhibits a medium-intensity band in the interval of 1235–1250 cm−1 for β-sheet structure. On the other hand, this band occurs in the interval between 1300 and 1340 cm−1 in α-helix. Most monoclonal antibodies exhibit a medium band at ∼1245 cm−1, as they contain primarily β-barrel structure. A weak band at ∼950 cm−1, originating from the skeletal stretching mode, was proposed as a good indicator of α-helix.6 Quantitative analysis of the secondary structure components from Raman spectra has been proposed based on deconvolution and band fitting of the Amide I band.44 Recent developments in deep UVRR allow the peptide backbone to be probed directly and have shown that the Amide III bands are extremely sensitive to protein backbone conformation. Table 1 summarizes the correlation between the amide band frequency and the secondary structure of proteins.
Band Frequency (cm−1) | Amide Bands | Vibration Mode | Secondary Structure |
---|---|---|---|
1680 | Amide I | H-bonded CO stretch | β-Turn |
1670–1680 | Amide I | H-bonded CO stretch | β-Sheet and β-barrel |
1650–1655 | Amide I | H-bonded CO stretch | α-Helix |
1640 | Amide I | H-bonded CO stretch | Loose β-sheet |
1300–1340 | Amide III | NH and CH bend | α-Helix |
1260 | Amide III | NH and CH bend | Disordered |
1235–1250 | Amide III | NH and CH bend | β-Sheet |
930–950 | Skeleton | NCαC stretch | α-Helix |
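The correlations in Table 1 can be encoded as a simple lookup for annotating observed band positions. The following is a minimal sketch, not a validated assignment tool: the frequency ranges are transcribed from Table 1, while the function name and the ±3 cm−1 tolerance are illustrative assumptions. Because some ranges in Table 1 touch or overlap (for example β-turn at 1680 cm−1 and β-sheet at 1670–1680 cm−1), the function returns every candidate rather than a single answer.

```python
# (low, high, band, assignment) transcribed from Table 1; frequencies in cm^-1
TABLE_1 = [
    (1680.0, 1680.0, "Amide I", "beta-turn"),
    (1670.0, 1680.0, "Amide I", "beta-sheet and beta-barrel"),
    (1650.0, 1655.0, "Amide I", "alpha-helix"),
    (1640.0, 1640.0, "Amide I", "loose beta-sheet"),
    (1300.0, 1340.0, "Amide III", "alpha-helix"),
    (1260.0, 1260.0, "Amide III", "disordered"),
    (1235.0, 1250.0, "Amide III", "beta-sheet"),
    (930.0, 950.0, "Skeleton", "alpha-helix"),
]


def assign_band(freq_cm1, tolerance_cm1=3.0):
    """Return every (band, assignment) whose Table 1 range covers freq within tolerance."""
    return [
        (band, structure)
        for low, high, band, structure in TABLE_1
        if low - tolerance_cm1 <= freq_cm1 <= high + tolerance_cm1
    ]


print(assign_band(1655.0))  # Amide I position reported for rhGCSF and rhEPO
print(assign_band(1670.0))  # Amide I position reported for monoclonal antibodies
print(assign_band(1680.0))  # ambiguous: two Table 1 rows match
```

Returning a list keeps the ambiguity visible, which matters in practice because a band position alone rarely settles the secondary structure without supporting evidence from other bands.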
Raman Marker of Aromatics
The Raman markers of side chain groups of protein derive from the localized vibrational mode of specific amino acid groups. Raman marker bands of aromatics in protein are relatively strong and are well recognized in visible Raman spectra;5-7, 45, 46 however, they are more or less overlapped with amide bands and other side chain groups. Protein aromatics are best resolved by UVRR as it selectively excites only the tryptophan and tyrosine in protein with a deep UV laser.7, 16, 17, 19, 47-52 Phenylalanine shows an intense band at 1002 cm−1 in visible Raman but would not be seen in the UVRR spectrum of a protein that contains tryptophan and tyrosine. The phenylalanine bands are overwhelmed by the Trp and Tyr bands as their Raman cross sections are almost 10 times stronger than those of Phe.19
The Raman frequency and intensity of the tryptophan and tyrosine bands have been extensively examined by normal mode calculation and experiments.53, 54 The correlations between the structure and spectrum have been well established for the majority of the bands.53-57 For instance, the relative intensity of the Fermi doublet of tryptophan at 1360 and 1340 cm−1 has been suggested to be a good indicator of the hydrophobic/hydrophilic environments of the tryptophan indole ring.54 If the relative intensity ratio of the Fermi doublet I1360/I1340 is smaller than 1.0, the tryptophan indole ring is considered to be in a hydrophilic environment or exposed to aqueous medium, on the other hand, if it is greater than 1.0, they are considered to be in a hydrophobic environment or in contact with aliphatic side chains. The Trp band frequency of the W3 mode near 1550 cm−1 has been correlated to the dihedral angle between the indole ring and the peptide bond plane.55 It has been noted that the ROA sign of this band is sensitive to the orientation of the indole ring and may provide a probe of quasi-absolute configuration of tryptophan in protein.58 The frequency of the band near ∼875 cm−1 has been found to be sensitive to the hydrogen bond of the NH group of the indole ring.58
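The Fermi doublet criterion amounts to a threshold on a single intensity ratio. The following is a minimal sketch of that rule as stated in the text; the function name and the example intensities are hypothetical, and how band intensities are extracted from a real spectrum (baseline correction, peak heights versus areas) is outside its scope.

```python
def trp_environment(i_1360: float, i_1340: float) -> str:
    """Classify the Trp indole environment from the Fermi doublet ratio I1360/I1340."""
    if i_1360 < 0 or i_1340 <= 0:
        raise ValueError("intensities must be non-negative, with I1340 > 0")
    ratio = i_1360 / i_1340
    if ratio > 1.0:
        return "hydrophobic"    # buried, or in contact with aliphatic side chains
    if ratio < 1.0:
        return "hydrophilic"    # exposed to the aqueous medium
    return "indeterminate"      # the stated rule does not cover a ratio of exactly 1.0


# Hypothetical baseline-corrected band intensities
print(trp_environment(1200.0, 1000.0))
print(trp_environment(700.0, 1000.0))
```

In a protein with several tryptophan residues the measured ratio is an average over all of them, so the classification describes the dominant environment rather than any individual residue.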
One of the new findings on tryptophan Raman markers from UVRR is that some tryptophan band frequencies are sensitive to the cation–π interaction between tryptophan and positively charged side chains in proteins. Five distinct bands at 763, 1228, 1370, 1560 and 1776 cm−1 were identified as indicators of the specific interaction of tryptophan with a cation.19 At least two of the marker bands, at 763 and 1370 cm−1, are also distinguishable in the visible Raman spectra of proteins,59 so the cation–π interaction can also be probed by visible Raman. The cation–π interaction has been recognized as an important inter-molecular interaction in proteins, receptors, DNA and macromolecular complexes.60 These interactions have been systematically surveyed based on X-ray diffraction structures. They were found to give rise to anomalous spectroscopic characteristics of Trp in fluorescence spectroscopy, in solid state NMR, and now in UVRR and visible Raman spectroscopy.60-63 The cation–π interaction plays a critical role in protein stability and in DNA–protein binding that involves the aromatic moieties of biological macromolecules and positively charged groups. Table 2 summarizes the Raman markers of Trp and Phe that are sensitive to the local environment and hydrogen bonding as well as to inter-molecular interactions.
Band Frequency, Normal (cm−1) | Band Frequency, Cation–π (cm−1) | Vibrational Mode | Local Environment |
---|---|---|---|
757 | (763) | W18 | Cation–π interaction |
880 | | W17 | Hydrogen bonding |
1002 | | F1 | Intense in hydrophobic state |
1240 | (1235) | W10 | Cation–π interaction |
1340 | | W7 Fermi doublet | |
1360 | (1370) | W7 Fermi doublet | Sensitive to local environment and cation–π interaction |
1552 | (1556) | W3 | Sensitive to W orientation and cation–π interaction |
1767 | (1773) | W18 + W16 | Cation–π interaction |
Another new insight into the Raman markers of protein aromatics was gained from Raman studies of virus assemblies.45, 65 It was noted that the conventional Fermi doublet of tyrosine at 850/830 cm−1 became a singlet at 850 cm−1 in the filamentous viruses fd and Pf1.37, 38, 64 This was attributed to the extremely hydrophobic environment in which the tyrosine residues are located in these filamentous viruses. Detailed Raman analysis of the tyrosine model compound p-cresol in the vapor phase confirmed that the tyrosine Fermi doublet becomes an unusual pair of bands at 839/812 cm−1 when the tyrosine is not hydrogen bonded and is in an extremely hydrophobic environment.65 Moreover, the 839/812 cm−1 bands exhibit an intensity ratio of 6.7, far larger than the previously proposed maximum of 2.5. Table 3 lists the tyrosine Fermi doublet and correlates its relative intensity ratio to the donor or acceptor state of the tyrosine phenoxyl groups.
Apart from the Raman band frequencies that are sensitive to hydrogen bonding, cation–π interaction and indole ring orientation, information on the local environment of aromatic rings can be inferred from the Raman cross sections of bands measured in UVRR experiments. Extensive UVRR experiments on the aromatic amino acids and proteins have established that the Raman cross sections of aromatic bands are significantly larger in a hydrophobic environment than in water.7, 37-40, 48-52 The Raman cross sections of the aromatic bands in buried hydrophobic environments can be two to three times larger than those of aromatics exposed to solvent. Chi and Asher51, 52 have shown that the higher Raman cross section of aromatics in a hydrophobic environment is the result of a red shift of the Trp and Tyr absorption bands in the deep UV region. Other factors may also contribute to the difference. For instance, buried aromatics are less susceptible to UV radiation degradation than aromatics exposed at the protein surface. The Raman cross section of aromatics in a protein may thus be used to predict whether their environment is hydrophobic or hydrophilic. Table 4 gives the major Raman cross section values of the aromatic amino acids obtained from UVRR spectroscopy at four different excitation wavelengths: 229, 238, 244 and 257 nm.19
Frequency (cm−1) | Raman Cross Section (mbarn/(molecule·sr)), 229 nm | 238 nm | 244 nm | 257 nm | Assignments |
---|---|---|---|---|---|
Tryptophan | |||||
757 | 2.3 | 10 | 18 | 400 | W18 |
880 | 2.3 | 4.0 | 3.0 | 130 | W17 |
1010 | 4.5 | 14 | 34 | 470 | W16 |
1131 | 4.5 | 5.0 | 5.0 | — | W13 |
1238 | 6.6 | 4.0 | 25 | 102 | W10 |
1340 | 12 | 15 | 27 | 180 | W7 Fermi doublet |
1360 | 8.0 | 7.5 | 20 | 195 | W7 Fermi doublet |
1460 | 6.0 | 5.5 | 10 | 77 | W5 |
1550 | 16 | 18 | 38 | 528 | W3 |
1575 | 27 | 10 | — | — | W2 |
1616 | 39 | 36 | 45 | 200 | W1 |
1767 | — | — | — | 52 | W18 + W16 |
Tyrosine | |||||
830 | — | 5.0 | 6.0 | 44 | 2Y16a |
850 | — | 5.0 | 8.0 | 61 | Y1 |
1178 | 4.8 | 12 | 24 | 254 | Y9a |
1209 | 3.6 | 6.0 | 14 | 106 | Y7a |
1613 | 7.9 | 20 | 53 | 587 | Y8a |
Phenylalanine | |||||
1002 | — | 2.8 | 3.0 | 6.2 | F1 |
1030 | — | 0.5 | 0.5 | 1.0 | F18a |
1205 | — | 1.2 | 1.2 | 3.6 | F7a |
1602 | — | 2.5 | 2.0 | 8.0 | F8a |
The Free Sulfhydryl and Disulfide Bonds
The majority of proteins contain free sulfhydryls, disulfide bonds or both. The Raman bands of free SH groups in proteins are extremely weak in the visible Raman spectrum. Nevertheless, the SH groups in proteins can be detected with highly sensitive Raman spectroscopy. Furthermore, they are easily recognized, as no other bands are present in the frequency interval between 2500 and 2700 cm−1. The band frequencies of free sulfhydryls have been correlated to the hydrogen bonding status of the sulfhydryl groups based on both normal mode calculation and experimental studies of model compounds.66-68 The SH stretching mode appears at 2585 ± 5 cm−1 when the free cysteine is not hydrogen bonded. The frequency of the SH group is lowered by 25–60 cm−1 for strong SH donors, 10–25 cm−1 for moderate SH donors, and 5–10 cm−1 for weaker donors. However, when the SH group does not donate a hydrogen bond and the sulfur atom acts only as a proton acceptor, the band shifts slightly to higher frequency, by about 4 cm−1.66 The SH stretching mode is only weakly dependent upon the conformation of the CαCβSH side chain. In contrast, the CβS stretching mode is very sensitive to rotation about the CαCβ bond: its frequency can differ by 30–50 cm−1 depending on the cysteine side chain conformation.
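The shift ranges quoted above lend themselves to a simple lookup. The following Python sketch is a hedged illustration: the function name `classify_sh_band`, the assignment of the shared boundary values (10 and 25 cm−1) to the stronger class, and the "outside quoted range" label are assumptions of this example. Only the 2585 cm−1 reference and the shift ranges come from the text.

```python
FREE_SH_CM1 = 2585.0  # non-hydrogen-bonded SH stretch from the text (+/- 5 cm-1)


def classify_sh_band(frequency_cm1: float) -> str:
    """Map an observed S-H stretching frequency to a donor-strength class."""
    downshift = FREE_SH_CM1 - frequency_cm1
    if downshift > 60:
        # Larger shifts than any range given in the text.
        return "outside quoted range"
    if downshift >= 25:
        return "strong SH donor"
    if downshift >= 10:
        return "moderate SH donor"
    if downshift >= 5:
        return "weak SH donor"
    # Within the +/- 5 cm-1 uncertainty, or slightly upshifted
    # (sulfur acting only as a proton acceptor).
    return "no SH donation"
```

Because the reference frequency itself carries an uncertainty of ±5 cm−1, assignments near a class boundary should be treated as tentative.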
Disulfide bonds in proteins are formed when free cysteines are oxidized and linked. Raman markers of disulfide bonds occur in the low frequency region between 700 and 450 cm−1. They are weak but discernible bands. The band frequencies of both the CS stretching and SS stretching modes of the disulfide bridge are sensitive to the conformation of the CCSSCC moiety. The correlation between the band frequencies and the conformers of the disulfide bond has been well established69-73 through normal coordinate analysis and extensive experimental investigation. Sugeta et al.69-71 performed a detailed normal mode analysis of disulfide bond model compounds and concluded that the frequency of the SS stretching mode depends on the conformation of the CCSSCC moiety. The GGG conformer exhibits an SS stretching band at 508 cm−1, the GGT conformer has a peak at 527 cm−1, and the TGT conformer produces a band at 544 cm−1. The GGG isomer is the most stable, followed by the GGT and TGT isomers. Table 5 summarizes the Raman marker band frequencies of free sulfhydryl and disulfide bonds.
Band Frequency (cm−1) | Vibrational Mode | Local Environment and Conformers |
---|---|---|
>2585 | Free SH stretch | Exposed |
2575 | Weakly H-bonded | Partially exposed |
2565 | Moderately H-bonded | Partially exposed |
<2560 | Strongly H-bonded | Buried |
704 | CS stretch | Trans conformer |
655 | CS stretch | Gauche conformer |
540–545 | SS stretch | TGT conformer |
523–528 | SS stretch | GGT conformer |
508–512 | SS stretch | GGG conformer |
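The SS stretch assignments in the text and in Table 5 can be expressed as a nearest-reference lookup. The Python sketch below is only an illustration: the function name `ss_conformer`, the nearest-match rule and the 10 cm−1 tolerance are assumptions of this example, while the reference frequencies (508, 527 and 544 cm−1) are the values attributed to Sugeta et al. in the text.

```python
# SS stretching reference frequencies (cm-1) quoted in the text.
SS_REFERENCE_CM1 = {"GGG": 508.0, "GGT": 527.0, "TGT": 544.0}


def ss_conformer(frequency_cm1: float, tolerance_cm1: float = 10.0) -> str:
    """Return the conformer whose reference SS stretch is closest.

    Bands farther than tolerance_cm1 from every reference value are
    reported as "unassigned". The tolerance is an assumed value.
    """
    label, reference = min(
        SS_REFERENCE_CM1.items(),
        key=lambda item: abs(item[1] - frequency_cm1),
    )
    if abs(reference - frequency_cm1) > tolerance_cm1:
        return "unassigned"
    return label
```

With these assumptions a band at 516 cm−1 maps to GGG and a band at 540 cm−1 to TGT. A real assignment would also consider the CS stretching region and overlapping bands.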
APPLICATIONS TO PROTEIN PHARMACEUTICALS
Conformation Determination
One of the most important applications of Raman spectroscopy to protein pharmaceutical studies is to determine the protein conformation, along with CD, fluorescence and FTIR spectroscopies, in solution43, 44 and in the solid state.75 The majority of the early studies of protein conformation by visible Raman spectroscopy took advantage of the sensitivity of the Amide I band frequency to the protein conformation. The Amide I band profiles were deconvoluted into individual components and the secondary structure composition of the protein was then analyzed.44 Recently, Asher et al.76 developed a deep UV resonance Raman method to directly probe the protein conformation through the polypeptide backbone vibrations of the Amide III mode and the CCαH bending mode. These vibrations are the most conformationally sensitive modes of the polypeptide backbone and depend on the amide dihedral angle ψ. The two vibrations (the NH bending and the CCαH bending modes) of the polypeptide backbone are intimately mixed at ψ ∼ 120°, which is associated with the β-sheet and disordered structures. Conversely, these two modes are uncoupled in the α-helix conformation, where ψ is at −60°. They employed this method to accurately determine the conformational change of horse holo- and apo-myoglobin upon acid-induced unfolding.52, 77 At neutral pH, the predominant secondary structure component of holo- and apo-myoglobin is α-helical; the helical contents are ∼80% and ∼62%, respectively. Dramatic conformational changes upon acid denaturation of holo- and apo-myoglobin were monitored by the Amide III and CCαH bending bands in UVRR. The amount of helix present in this predominantly α-helical protein decreased from ∼80% at pH 7 to ∼20% at pH 2, accompanied by an increase in unordered structure. They estimated that the absolute accuracy of the secondary structure composition determined by this method is better than 3%.
Protein Aggregation and Fibrillation
The unfolding and misfolding of protein structure are a central concern in maintaining protein pharmaceutical stability and activity. The denaturation of native protein structure, misfolding and subsequent aggregation often involve a change in the secondary structure. Raman spectroscopy is particularly suited to characterizing the transformation of native protein structure into protein aggregates, which are often dominated by inter-molecular β-sheet formation that acts as a nucleation motif. Paul Carey's lab has carried out an elegant Raman microscopy study of the transition of native insulin in crystals into β-sheet structure upon reduction of the disulfide bridges.78 Insulin is a protein that contains predominantly α-helical secondary structure in the native state. Upon reduction of the disulfide bridges of insulin in crystals, dramatic changes occur in the disulfide region between 490 and 570 cm−1: these bands disappear in the reduced insulin and a new band appears at 2573 cm−1 due to free SH stretching. Concurrently, the predominant α-helical Amide I band at 1657 cm−1 shifts to 1669 cm−1, which is the characteristic β-sheet band. In the Amide III region, the β-sheet marker at 1236 cm−1 also becomes the predominant signal, while the distinctive α-helix band at 946 cm−1 disappears. All of these band frequency changes confirm that the secondary structure of insulin in the crystal has changed from predominantly α-helix to β-sheet. It was noted that the transformation of native insulin to the β-sheet form in crystals is not reversible. The crystals can no longer diffract X-rays, although they maintain their original appearance. Furthermore, the β-sheet insulin crystals cannot be analyzed with gel electrophoresis under denaturing or non-denaturing conditions, which indicates the formation of large aggregates. This evidence suggests that a significant amount of irreversible intermolecular β-sheet structure has been formed.
Raman spectroscopy has also been employed to probe the insulin fibrillation mechanism using pro-insulin as it is a good prototype for fibrillation studies of globular protein.79-82
Side Chains Structure
The side chain structure and local environment in protein therapeutics can be studied with dispersive Raman and UVRR. A good example is the study of the side chain groups of cysteines and aromatics in interleukin-1 receptor antagonist (IL-1ra).83 IL-1ra binds to the interleukin receptor and blocks the binding of IL-1 to the receptor but does not elicit a biological response.84 It thus has therapeutic application, as IL-1 plays a key role in auto-immunity and inflammation. The protein has been developed as a human therapeutic to treat rheumatoid arthritis. It has an unusual proportion of aromatic amino acids (9F, 2W and 3Y) and four cysteine residues for a small protein with a molecular weight of 17 kDa. There has been a longstanding controversy over the status of the cysteines in IL-1ra, arising from two independent X-ray crystal diffraction studies. One proposed that a pair of cysteines in IL-1ra forms a disulfide bridge,85 while the other showed no disulfide bond at all.86 In addition, titration of IL-1ra in aqueous solution with DTNB, monitored by UV spectroscopy, indicated that there were only three free cysteines in IL-1ra, in apparent contradiction with the primary sequence.
The Raman spectra of IL-1ra in solution demonstrated that the four cysteines are all in the free sulfhydryl state.83 The spectra exhibit two SH bands at 2565 and 2590 cm−1, while no disulfide band is present at all. Moreover, the relative intensity of the two free sulfhydryl bands is approximately 3:1. The frequencies and the intensity ratio suggest that three of the four cysteines are not hydrogen bonded and are exposed to solvent, while one of the cysteines is strongly hydrogen bonded and buried, which explains why it was not accessible to DTNB titration. This resolved the controversy raised by X-ray diffraction and demonstrated that Raman spectroscopy offers certain advantages over X-ray diffraction in determining the structure and local environments of side chain groups such as free cysteines and disulfide bonds in proteins. As is well known, X-ray diffraction of protein crystals does not have the resolving power to determine the exact position of hydrogen atoms, because hydrogen has a low electron density and scatters X-rays poorly. Therefore, it relies solely on inference from the distance between the sulfur atoms to determine whether there are disulfide bridges in a protein.
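The step from a 3:1 intensity ratio to "three exposed, one buried" is simple arithmetic, sketched below in Python. The sketch assumes that every SH group contributes equally to the band intensity, which is a simplification made for this example and is not stated in the text; the function name `sh_populations` is likewise hypothetical.

```python
def sh_populations(n_cysteines: int, ratio_a: float, ratio_b: float):
    """Split n_cysteines between two SH bands in proportion to a:b.

    Assumes each SH group has the same Raman cross section, so band
    intensity scales directly with the number of contributing groups.
    """
    total = ratio_a + ratio_b
    if n_cysteines < 0 or total <= 0:
        raise ValueError("need a non-negative count and a positive ratio sum")
    return (n_cysteines * ratio_a / total, n_cysteines * ratio_b / total)


# Four cysteines and an approximately 3:1 intensity ratio, as in the text.
exposed, buried = sh_populations(4, 3, 1)
```

If the cross sections of hydrogen-bonded and free SH groups differ appreciably, the populations would need to be corrected accordingly.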
The Raman and UVRR spectra of IL-1ra in solution also showed normal tryptophan band frequencies at 760, 1245, 1360 and 1770 cm−1. These Raman markers would rule out cation–π interactions between the tryptophans and cationic groups of IL-1ra in solution. An aggregation study of IL-1ra87 had proposed possible cation–π interactions between Trp-16 in one IL-1ra monomer and Lys-93 in an adjacent monomer, again based on the X-ray diffraction structure.85, 86 Had cation–π interactions between Trp and Lys been present in IL-1ra, the Trp band frequencies in the Raman spectrum would have shifted to 763, 1235, 1370 and 1773 cm−1.
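The comparison made above, observed Trp bands against the normal and cation–π marker sets, can be sketched as follows. This Python example is an illustration under stated assumptions: the function name `trp_band_pattern` and the scoring rule (sum of absolute deviations over the supplied bands) are choices made here, not a published procedure. The reference frequencies are the Table 2 values.

```python
from typing import Mapping

# Reference Trp marker frequencies (cm-1) taken from Table 2.
NORMAL = {"W18": 757.0, "W10": 1240.0, "W7": 1360.0, "W18+W16": 1767.0}
CATION_PI = {"W18": 763.0, "W10": 1235.0, "W7": 1370.0, "W18+W16": 1773.0}


def trp_band_pattern(observed: Mapping[str, float]) -> str:
    """Report which reference set lies closer to the observed bands.

    Closeness is the summed absolute deviation over the supplied modes;
    this scoring rule is an assumption of the sketch.
    """
    if not observed:
        raise ValueError("no bands supplied")
    unknown = set(observed) - set(NORMAL)
    if unknown:
        raise ValueError(f"unknown modes: {sorted(unknown)}")
    dev_normal = sum(abs(observed[m] - NORMAL[m]) for m in observed)
    dev_cation = sum(abs(observed[m] - CATION_PI[m]) for m in observed)
    if dev_normal < dev_cation:
        return "normal"
    if dev_cation < dev_normal:
        return "cation-pi"
    return "ambiguous"


# IL-1ra solution frequencies quoted in the text.
il1ra = {"W18": 760.0, "W10": 1245.0, "W7": 1360.0, "W18+W16": 1770.0}
```

For the IL-1ra values the summed deviation from the normal set is smaller than that from the cation–π set, so `trp_band_pattern(il1ra)` returns "normal", in line with the conclusion drawn in the text.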
Site-Specific Protein Mutant
In addition to the information on protein backbone conformation and side chain structure provided by the various Raman techniques, the combination of site-specific mutagenesis with Raman spectroscopy provides an extremely powerful tool to decipher the detailed side chain interactions in proteins. The distinct cysteine profile of the P22 tailspike protein revealed by Raman spectroscopy together with site-specific mutagenesis illustrates the success of this strategy.74 The large P22 tailspike subunit has 666 amino acids, including eight cysteine residues. All of them are reduced in the native protein. Moreover, they are unreactive and inaccessible to water molecules.
The Raman spectrum of the SH bands is unusually complex. In order to determine the specific contribution of each cysteine sulfhydryl to the Raman profile and to correlate the SH hydrogen bonding environments with the Raman band frequencies, site-specific mutagenesis was employed to produce single cysteine → serine mutants.88 The mutants were then systematically analyzed for their respective Raman SH bands. Comparison of the spectra of the site-specific mutants with that of the wild-type tailspike protein subunit revealed that the eight cysteines can be categorized into four groups with different hydrogen bonding states. Three cysteines form robust hydrogen bonds, two modest, and two weak hydrogen bonds. One of them showed the strongest hydrogen bond of a sulfhydryl observed so far in any protein. These diverse cysteine profiles are interpreted as the outcome of the protein folding and assembly pathway of the tailspike subunit.
Protein–Protein Interactions
In a study of a protein–receptor complex by isotope-edited vibrational spectroscopy, Li et al.89 employed FT-Raman to show that the formation of a complex between brain-derived neurotrophic factor (BDNF) and its receptor produced a significant conformational change in the receptor. The Raman spectrum of the BDNF receptor exhibits several different SS band frequencies, since there are a total of six intra-molecular disulfide bonds in the receptor. However, the spectrum of BDNF showed only a single SS band at 516 cm−1, even though the protein has three disulfide bonds. This suggests that all three disulfide bridges in BDNF are in the same GGG conformation. Upon formation of the complex between BDNF and its receptor, the two SS markers of the receptor at 500 and 540 cm−1 disappear, which suggests that the conformation of the disulfide bonds of the BDNF receptor changed in the protein complex. Moreover, the disulfide bridges in the protein complex are in the GGT conformation based on the band frequency. Raman spectroscopy has also been applied to study avidin, biotin and the avidin–biotin complex.90 It was shown that upon formation of the avidin–biotin complex, the Trp Raman band intensities increase, including that of the 1360 cm−1 band, which indicates that Trp is directly involved in the avidin–biotin interaction. In addition, there is a red shift of the avidin Trp UV absorption band, which suggests that some of the previously exposed Trp residues are protected from solvent in the complex.
Other Applications
Tuma et al. employed both visible Raman and UVRR to characterize the solution conformation of pegylated human tumor necrosis factor (TNF) receptor.91 They showed that pegylation of the receptor has no effect on the conformation of the protein. However, the rotamers of the disulfide bridges of the protein in solution differ from those in the crystal structure. FT-Raman spectroscopy has recently been employed to characterize the lyophilized and spray dried forms of a therapeutic antibody, both to monitor long term stability and to compare the conformation of the protein in the solid state and in reconstituted solution.75 FT-Raman has also been employed to determine the secondary structure of proteins in E. coli inclusion bodies.92 In a study of the conformation of wild-type and mutant glial cell line-derived neurotrophic factor (GDNF) by FT-Raman, Li et al.93 showed that breaking the inter-chain disulfide bond of the dimer caused minimal changes in the secondary structure of the protein. However, difference Raman spectroscopy did detect changes, upon reduction of the inter-chain disulfide bond, in the local environments of the tyrosine and aliphatic residues located in the hydrophobic patch at the interface between the two monomers.
Raman optical activity spectroscopy has been exploited to probe the heat and acid denaturation of proteins.94, 95 In the study of the denaturation of α-lactalbumin by acid and heat, as the temperature was increased from 2°C to 60°C at pH 8, the two distinctive positive ROA Amide III bands of the native state coalesced into a single broad positive band, accompanied by the appearance of a prominent 1240 cm−1 band in the parent Raman spectrum. A similar structural transformation was observed upon acid denaturation by titration of the protein solution from pH 8 to pH 2 at 2°C. These ROA data were interpreted as the loss of tertiary structure of α-lactalbumin with the retention of partial β-sheet secondary structure in the unfolded state upon heat and acid denaturation. The ROA patterns of α-lactalbumin at pH 2, 2°C and at pH 8, 60°C both resemble that of the disordered conformation of model polypeptides, which is typical of the molten globule state of proteins. ROA has also been successfully employed to study the conformation of immunoglobulins, glycoproteins, and enzymes as well as virus assemblies.96
UVRR spectroscopy has been used extensively to probe the protein dynamics of hemoglobins7, 97, 98 and to determine the specific intermolecular interaction between side chains in proteins and complexes.36 It has also been applied to investigate the binding mode of Congo Red to Alzheimer's amyloid β-peptide99 and drug binding in enzymes as part of studies in drug–protein interactions.100 The ultra-sensitive SERS has been used to detect and quantify the neurotransmitters in brain extracts. High quality Raman spectra of dopamine and norepinephrine on colloidal silver clusters were obtained at concentrations between 5 × 10−6 and 5 × 10−9 M.24 The Raman spectrum of a single hemoglobin molecule has been successfully recorded on an optically “hot” silver nano-particle.24
CONCLUDING REMARKS
In the last two decades Raman spectroscopy has been transformed from an exclusive tool of academic research into a turnkey technique on par with other spectroscopic methods, such as FTIR, CD and fluorescence spectroscopy, that are routinely employed for protein studies. It is anticipated that its application to the characterization of protein pharmaceuticals will thrive as its potential to solve real problems is realized in biopharmaceutical science and industry. The future applications of Raman spectroscopy to protein pharmaceuticals will include, but not be limited to, the development of protein pharmaceutical formulations, screening for more stable therapeutic candidates, investigating the interactions of proteins and excipients in solution and in lyophilized products, monitoring protein drug stability, and differentiating product from placebo. It will also find applications in raw material and product identification, forensic analysis of contaminants in final products and investigations of counterfeit drugs.
The ever-evolving technology in optics, electronics, and spectroscopy will provide more powerful Raman spectrometers in the near future. Many other types of Raman spectrometers are already in development.2 Among them, the coherent anti-Stokes Raman scattering (CARS) microscope holds promise for chemical imaging in three dimensions,101 which could become a powerful tool for the characterization of protein pharmaceuticals in the solid state. Raman spectroscopic studies of protein–DNA complexes such as virus assemblies often provide surprising new insights into Raman markers of proteins that can be beneficial to other protein studies.36, 39, 64, 65 Raman spectroscopy aided by site-specific mutagenesis and isotope substitution should be particularly powerful for the study of protein–protein or ligand–receptor interactions, as these approaches can resolve overlapping Raman bands into their constituents. They could be used to screen protein therapeutic candidates in protein engineering. Time-resolved UV resonance Raman spectroscopy will enable the study of the kinetics of interactions between protein therapeutics and their targets on a very short time scale. Raman microscopy is well suited to chemical mapping of the surface of lyophilized products. Single molecule Raman spectroscopy based on resonance Raman and the surface enhanced effect will allow the behavior of a single protein molecule in solution to be studied.
Acknowledgements
I am grateful to Dr. Linda O. Narhi who read the manuscript and Dr. Xiaolin Cao who prepared the illustrations.