Chem. Rev. 1997, 97, 1251−1267

Protein Structure and the Energetics of Protein Stability


Andrew D. Robertson* and Kenneth P. Murphy*
Department of Biochemistry, The University of Iowa, Iowa City, Iowa 52242

Received March 3, 1997 (Revised Manuscript Received May 14, 1997)

Contents
I. Introduction
II. Determining the Thermodynamics of Unfolding for Globular Proteins
   A. Differential Scanning Calorimetry
   B. Optical Spectroscopy
   C. Precision and Accuracy of Thermodynamic Data
III. Correlation of Unfolding Thermodynamics with Protein Structure
   A. Database of Unfolding Thermodynamics for Proteins of Known Structure
   B. Relationships between Unfolding Thermodynamics and Features of Protein Structure
IV. Summary
V. Acknowledgments
VI. References

Andrew D. Robertson was born in Manhattan Beach, CA, in 1959. He received his B.A. in Biology from the University of California at San Diego in 1981 and his Ph.D. in Biochemistry from the University of Wisconsin, Madison, in 1988. After postdoctoral training at Stanford University, he joined the faculty in the Department of Biochemistry at the University of Iowa in 1991, where he is now an Associate Professor. His major research interest is the relationship between protein conformation and the energetics of protein stability and function. Current research is focused on the thermodynamics and kinetics of conformational interconversions in proteins at the level of individual amino acid residues.

Kenneth P. Murphy was born in Lafayette, IN, in 1963. He received his B.A. in Chemistry in 1986 from Metropolitan State College in Denver, CO, and his Ph.D. in Chemistry from the University of Colorado, Boulder, in 1990. Following three years of postdoctoral studies at the Johns Hopkins University, he was appointed Assistant Professor of Biochemistry at the University of Iowa College of Medicine in 1993. His research has focused on understanding the relationship between structure and energetics in protein stability and binding using calorimetry as a primary experimental technique. He was awarded the Stig Sunner Memorial Award by the 50th Calorimetry Conference for his contributions to this field.

S0009-2665(96)00383-4 CCC: $28.00 © 1997 American Chemical Society
Chemical Reviews, 1997, Vol. 97, No. 5

I. Introduction

The tendency of proteins to spontaneously adopt a well-defined conformation in solution has intrigued investigators for many decades.1 The key questions in the study of this intramolecular recognition reaction are the same as those driving research into intermolecular recognition: what are the molecular determinants of specificity and stability? The distinction between specificity and stability has a long history in studies of intermolecular recognition (e.g., ref 2). In the area of protein folding, this distinction has only recently been articulated in print.3 In the context of the protein folding reaction, specificity for a given polypeptide chain is reflected in the number of distinct and well-populated conformations adopted by the chain.4 The majority of native proteins studied to date adopt a specific well-defined conformation. The focus of this review is the relationship between the conformations of such proteins and the energetics of their stability.

The identities of the noncovalent interactions contributing to the stability of the native protein conformation have been established for some time,5 but considerable debate persists concerning whether and to what extent a given type of interaction favors the native conformation.6-12 Configurational entropy is widely accepted as the major phenomenon opposing protein stability, but the proposed values of this entropy range from about 17 J K⁻¹ mol⁻¹ per amino acid residue to about 50 J K⁻¹ mol⁻¹ per residue.6,13 In contrast, Honig and Yang propose that the major phenomenon opposing protein stability is desolvation of polar groups upon protein folding.8 Most researchers agree that the hydrophobic effect plays a key role in stabilizing proteins, but a clear consensus definition of the hydrophobic effect has not been reached.14-16 Nevertheless, many researchers agree that the hydrophobic effect contributes approximately 8 kJ mol⁻¹ per residue, on average, to the free energy
of unfolding of proteins at 25 °C.6,8,17 Hydrogen bonding in proteins has been proposed to be somewhat destabilizing,8 an indifferent or minor stabilizing force,11 and a principal contributor to the stability of the native state.6,9,12,18

Much of the disagreement derives from the necessity of using models to interpret the thermodynamic data for proteins in terms of specific features of protein structure.7,9 This follows from the fact that the number of experimental thermodynamic observables in proteins is vanishingly small relative to the thousands of interactions in a typical protein: in the best cases, the thermodynamic data consist of the enthalpy of unfolding (∆Hu), the entropy of unfolding (∆Su), and the heat capacity change upon unfolding (∆Cp). One can thus deconvolute the energetics of protein stability with respect to atomic-level structure in a number of fundamentally different ways, all of which will be compatible with the primary thermodynamic data.

One approach to increasing and simplifying the information content relative to the thermodynamic data has been to take advantage of the well-documented regularities in native protein structures.17,19-23 Data for many proteins of known structure have been used to derive empirical relationships between the energetics of protein stability and features of protein structure.24-27 Similar relationships have been established using thermodynamic data for model compounds, which have served as a basis for interpretation of and comparison with the protein data.12,28-37

All approaches to understanding the molecular basis of protein stability ultimately depend on reliable experimental determinations of the thermodynamics of protein unfolding for proteins of known structure. The number of proteins fulfilling this criterion as of late 1996 is more than three times that tabulated by either Privalov and Gill in 1988 (ref 38) or Spolar and co-workers in 1992 (ref 27). In seeking relationships between stability and structure, this expanded database presents an opportunity to test the generality of previous observations and the validity of conclusions derived from these observations and, perhaps, to identify trends that were not evident in the smaller collection of proteins.

The focus of this review is on relationships between protein stability and protein structure that can be established with the primary observables, the thermodynamic parameters derived from calorimetric and spectroscopic studies and the structural models derived from X-ray crystallography and NMR spectroscopy. This purely empirical approach will rely on coarse but regular features of structure such as solvent-exposed surface areas, secondary structure content, and numbers of disulfide bonds. The questions at hand are (1) how much information regarding the molecular origins of protein stability can be gleaned from the protein data alone and (2) can these data be used to resolve some of the controversies now in the literature?

II. Determining the Thermodynamics of Unfolding for Globular Proteins

The stability of a globular protein is quantified by the difference in Gibbs energy, ∆Gu, between the denatured state, D, and the native state, N. As the experimental data in this review deal with thermal denaturation, the denatured state is operationally defined as the state of the protein that exists after thermal denaturation. The characteristics of that state, in terms of residual structure, extent of hydration, etc., remain a source of significant speculation and inquiry (see, e.g., refs 39-42).

The equilibrium between the native and denatured states is defined as

$$K = \frac{[\mathrm{D}]}{[\mathrm{N}]} \tag{1}$$

and is related to the ∆Gu as

$$\Delta G_u = -RT \ln K \tag{2}$$

where R is the universal gas constant and T is the absolute temperature. Note that eqs 1 and 2 apply to the equilibrium between the native and denatured states of a protein regardless of the possible presence of intermediate states.

The difference in Gibbs energy is dependent on temperature according to

$$\Delta G_u(T) = \Delta H_u(T) - T\,\Delta S_u(T) \tag{3}$$

where ∆Hu and ∆Su are the differences in enthalpy and entropy at the same temperature at which ∆Gu is being evaluated.

The temperature dependence of ∆Hu and ∆Su is defined by the heat capacity change, ∆Cp, between the native and denatured states. The change in heat capacity reflects the fact that the amount of heat required to raise the temperature of a solution of unfolded protein is greater than that required for a solution of folded protein of the same concentration. This increase in heat capacity upon unfolding results primarily from restructuring of solvent.43,44 While ∆Cp is itself slightly temperature dependent,45 the assumption of a constant ∆Cp does not lead to significant errors in any other parameter.38 The ∆Gu can thus be described as

$$
\begin{aligned}
\Delta G_u(T) &= [\Delta H_u(T_R) + \Delta C_p (T - T_R)] - T[\Delta S_u(T_R) + \Delta C_p \ln(T/T_R)] \\
&= \Delta H_u(T_R) - T\,\Delta S_u(T_R) + \Delta C_p[(T - T_R) - T \ln(T/T_R)]
\end{aligned}
\tag{4}
$$

where TR is any convenient reference temperature. If TR is equal to Tm, the midpoint for thermal denaturation, then ∆Gu is equal to zero and ∆Su is just ∆Hu/Tm. Thus eq 4 can be rewritten as

$$\Delta G_u(T) = \Delta H_m\left(1 - \frac{T}{T_m}\right) + \Delta C_p\left[(T - T_m) - T \ln\frac{T}{T_m}\right] \tag{5}$$

where ∆Hm is the value of ∆Hu at Tm. Equation 5 is generally referred to as the modified Gibbs-Helmholtz equation.

Experimental data are often fit to a modified form of eq 5 in which both sides are divided by -RT. Experimental values of ln K as a function of temperature can thus be fitted to yield values for Tm, ∆Hm, and ∆Cp. It must be noted however that such a fit assumes that the experimental values are a true
measure of K, which is only true if there are no stable folding intermediates, as discussed below.

Figure 1. Simulated differential scanning calorimetry experiment for the two-state unfolding of a globular protein. The simulation assumed the following values: Tm = 60 °C, ∆Hm = 418 kJ mol⁻¹, and ∆Cp = 8.4 kJ K⁻¹ mol⁻¹.

A. Differential Scanning Calorimetry

Differential scanning calorimetry (DSC) is a powerful technique for obtaining data on the thermodynamics of unfolding of globular proteins. Excellent reviews on this technique are available.46,47 DSC measures the excess heat capacity, 〈Cp〉, of a protein solution relative to buffer as a function of temperature. The 〈Cp〉 function can be analyzed to provide the thermodynamic data. As seen in Figure 1, the maximum in 〈Cp〉 occurs near the Tm of the protein; it occurs directly at Tm only if the ∆Cp of the transition is zero. The area under the 〈Cp〉 curve gives the ∆Hm of the transition, and the shift in the baseline yields ∆Cp. Thus, in principle, DSC can provide all of the thermodynamics of unfolding for a globular protein in a single experiment.

In practice, it is difficult to obtain good data on the ∆Cp of unfolding from the baseline shift. Instead, several DSC experiments are performed in which the Tm of the protein is perturbed, usually by changing the pH. One then plots ∆Hm as a function of Tm and the slope of this line gives ∆Cp. This analysis assumes that changing pH has no effect on ∆Hu as a function of T; rather the effect of changing pH is assumed to be entirely on ∆Su.24 This assumption should be good at low pH because the enthalpies of ionizing acidic groups are generally quite small.48 The fact that ∆Su is dependent on pH has important implications for interpreting the unfolding data as discussed below.

One of the most important features of DSC data is that the analysis does not require any assumptions about the presence or absence of stable intermediates in the unfolding process. This is in contrast to optical methods. Consequently, ∆Hm can be readily determined. Additionally, the DSC data can be treated as a progress curve from which one can obtain the van't Hoff enthalpy, ∆HvH, using the same treatment as described for optical methods. Comparison of ∆HvH and the calorimetrically determined ∆Hm can be used to indicate the presence or absence of stable intermediates. The thermodynamic characteristics of such intermediates, if present, can also be deconvoluted from the DSC data.49

In spite of the many advantages of studying protein stability by DSC, the technique has several limitations. The sample concentration for typical DSC experiments has needed to be at least 1 mg/mL. With sample volumes of 1-2 mL, this requires that considerable protein be available for study. The high concentrations of protein may lead to difficulties arising from aggregation of the denatured protein or, possibly, self-association of the native state. Accurate DSC studies thus require an assessment of the concentration dependence of the thermodynamics.

Even with moderate concentrations of proteins, it is important to determine that the unfolding transition is reversible before extracting thermodynamic properties from the data. The usual test for reversibility is to perform two DSC scans on each protein and check that the second scan gives all (or most) of the endotherm observed in the first scan. However, the presence of an endotherm upon rescanning the sample is not a test of thermodynamic reversibility, but rather of repeatability. Thermodynamic reversibility requires that the system be at (or very near) equilibrium throughout the reaction. As the equilibrium is being perturbed by scanning in temperature, thermodynamic reversibility in the DSC experiment is better demonstrated by showing that the 〈Cp〉 function is independent of scan rate. Unfortunately, such tests of reversibility are rarely performed.

In summary, DSC is an excellent method for obtaining thermodynamic data on the unfolding of globular proteins and can provide unique information on the presence and characteristics of stable intermediates. The technique is limited, however, by the requirements for large quantities of protein and high concentrations. Commercial instruments just available within the last year have higher sensitivity and quality data can be obtained from samples at 1/10th the concentration previously required. Such instrumentation will greatly improve the utility of this important technique to protein scientists.

B. Optical Spectroscopy

The thermodynamic parameters for the unfolding of a number of the proteins considered in this review were determined by monitoring thermal and chemical denaturation with spectroscopic techniques.50-53 For the Arc repressor and HPr, ∆Hm, ∆Cp, and Tm were obtained by the method of Pace and Laurents (ref 54) or that of Chen and Schellman (ref 55). Both methods rely on detection of cold-induced denaturation or destabilization to obtain estimates for ∆Cp. Thermodynamic parameters for OMTKY3 and iso-1 cyt c were obtained in a manner paralleling the usual calorimetric approach: data from individual thermal denaturation experiments were fit to obtain ∆Hm and Tm and variation of pH was used to determine the temperature dependence of ∆Hm, which is described by ∆Cp.

The method of Pace and Laurents entails a combination of chemical and thermal denaturation experiments, with the aim of measuring ∆Gu over a wide range of temperatures.54 In both thermal and chemical denaturation experiments, ∆Gu is measured over a narrow range of values, ±6 kJ mol⁻¹, where the spectroscopic methods are able to detect changes
in the relative populations of native and denatured protein.54 The temperature dependence of ∆Gu is then fit to eq 5, the modified Gibbs-Helmholtz equation.

The approach of Chen and Schellman involves thermal denaturation over a sufficient range of temperature to detect heat- and cold-induced denaturation in a single thermal denaturation experiment.53,55 The data are also fit to the Gibbs-Helmholtz equation (eq 5). In the cases where this approach has been used, chemical denaturants were added in order to observe low- and high-temperature transitions in the same experiment. In principle, the fitted parameters thus reflect the thermodynamics of unfolding only in the presence of denaturant. For HPr, however, the ∆Cp obtained with this approach was identical to that obtained using other procedures.53 The ∆Cp for the mutant T4 lysozyme studied by Chen and Schellman was 9.1 kJ K⁻¹ mol⁻¹, similar to that obtained in the calorimetric study of wild-type protein (Table 1).

One major advantage in the use of spectroscopy over DSC to determine the thermodynamics of protein unfolding is that much less protein is needed in the spectroscopic experiments. Sample concentrations can be as low as 0.01 mg/mL and a wider range of concentrations can be examined, which can serve as a check for self-association reactions. Two significant disadvantages with spectroscopy are the lack of direct measures for intermediates in the unfolding process and the critical role of pre- and posttransition baselines in fitting to obtain the thermodynamic parameters.

The concern about baselines follows from the way in which progress through the unfolding transition is determined: pre- and posttransition baselines are extrapolated into the observable transition zone and the relative concentrations of native and denatured protein are determined from the distances between the observed and extrapolated spectral values.54 For proper evaluation of fitting errors, terms for baselines should be included in any equation used to fit the spectroscopic data.56

Nearly all spectroscopic studies rely on the assumption of a two-state unfolding reaction. Spectroscopic tests for intermediates involve using multiple probes to follow the unfolding reaction,57 but a negative result is only consistent with, and not proof of, the absence of stable intermediates. It should be noted that issues of repeatability and scan rate dependence discussed above in the context of DSC apply equally to spectroscopic techniques.

C. Precision and Accuracy of Thermodynamic Data

In DSC experiments with modern calorimeters, the least precise variable is probably protein concentration. The sources of uncertainty in determining protein concentration are the precision of a given method, the reproducibility of the method, and systematic deviations between different methods. The results of a recent investigation into various techniques for determining concentrations and extinction coefficients for proteins suggest that, in the best cases, the reproducibility in determining extinction coefficients is about 2%.58 Thus, the overall experimental precision of the calorimetric data can be no greater than 1 part in 50. In practice, the reproducibility in protein concentration is probably closer to 5%. Previous estimates for the minimum error in determining ∆Cp range from 4% to 10%.54,59 Reported errors in determining ∆Hm range from 2% to 10%.60,61

In principle, the spectroscopic studies of denaturation and van't Hoff analysis of calorimetric data do not depend on knowledge of protein concentration. What is lost in this type of analysis is valuable information concerning the possible presence of stable intermediates. The least precise variable in the spectroscopic studies is likely to be the spectroscopic observable. Although no systematic survey of precision in such measurements has been published, practical experience suggests that, at best, the precision for a given determination may be 1 part in 100; a more accurate value may be 1 part in 20. For both calorimetric and spectroscopic experiments, the overall precision for any determination is probably best assessed by evaluation of the fitting errors.62

The question of accuracy in the thermodynamic parameters of unfolding is perhaps best addressed by comparing multiple determinations for the same protein (Tables 1 and 2). To some extent, this will control for some of the systematic errors within laboratories that might be associated with, for example, determining protein concentrations. For nine of the 11 proteins for which there are multiple determinations, experiments have been performed in different laboratories, but usually under similar solution conditions. For the present discussion, relative differences in thermodynamic parameters have been evaluated by dividing the difference between reported values by the smaller of the reported values. Three determinations are available for hen lysozyme and RNase A, so relative differences have been calculated by dividing the standard deviation of the mean by the mean value.

The relative differences in ∆Cp values range from zero to about 80% for whale myoglobin, and the mean relative difference is 14 ± 22%. The relative difference for whale myoglobin is about four times the next largest difference, 19% for RNase A, and the mean relative difference excluding whale myoglobin is 7 ± 6%. This value is very similar to previous estimates for uncertainties in ∆Cp.54,59 Interestingly, whale myoglobin is the only protein for which the independent determinations have been made under very different solution conditions: one set of experiments were performed at acid pH while the second set were done at alkaline pH.

To facilitate comparison of ∆Hm values obtained at different temperatures, the reported values have been extrapolated to 60 °C and reported as ∆H(60) in Table 2. While this procedure propagates some of the deviations in ∆Cp values into ∆H(60), the contributions are generally small because the extrapolations are over a short range of temperature. For the 11 proteins for which multiple determinations have been made, the relative differences in ∆H(60) values range from 1% for OMTKY3 to 35% for α-lactalbumin. The mean relative difference for multiple determinations is 12 ± 10%, which is in the range of estimated experimental error.35
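The extrapolation to 60 °C and the two-value relative difference described in this section can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: it assumes a constant ∆Cp, so that ∆H(60) = ∆Hm + ∆Cp(60 − Tm) as in the bracketed enthalpy term of eq 4, and the function names `extrapolate_enthalpy` and `relative_difference` are hypothetical. The input numbers are entries from Table 1.

```python
def extrapolate_enthalpy(dH_m, dCp, T_m, T_ref=60.0):
    """Shift an unfolding enthalpy from T_m to T_ref, assuming constant dCp.

    dH_m in kJ/mol, dCp in kJ/(K mol), temperatures in degrees C
    (only the temperature difference enters, so C or K is equivalent).
    """
    return dH_m + dCp * (T_ref - T_m)


def relative_difference(a, b):
    """Difference between two reported values divided by the smaller value."""
    return abs(a - b) / min(a, b)


# Table 1 entries: (dHm, dCp, Tm)
hen_dd = extrapolate_enthalpy(435.0, 6.4, 64.05)    # lysozyme (hen), footnote dd
hen_ee = extrapolate_enthalpy(429.0, 6.7, 55.0)     # lysozyme (hen), footnote ee
omtky3_jj = extrapolate_enthalpy(207.0, 2.7, 72.5)  # OMTKY3, footnote jj
omtky3_kk = extrapolate_enthalpy(240.0, 2.6, 85.2)  # OMTKY3, footnote kk

# Relative difference between the two whale myoglobin dCp values in Table 1
myoglobin_dcp_diff = relative_difference(15.6, 8.8)

print(f"hen lysozyme dH(60): {hen_dd:.1f} and {hen_ee:.1f} kJ/mol")
print(f"OMTKY3 dH(60): {omtky3_jj:.1f} and {omtky3_kk:.1f} kJ/mol")
print(f"whale myoglobin dCp relative difference: {myoglobin_dcp_diff:.0%}")
```

For proteins with three determinations the text uses a different measure (standard deviation of the mean divided by the mean), which this sketch does not cover.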
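Section II.A describes obtaining ∆Cp as the slope of ∆Hm plotted against Tm when Tm is shifted by changing pH. The sketch below shows that slope calculation with an ordinary least-squares fit in plain Python. The helper name `least_squares_slope` is hypothetical, and the two OMTKY3 points (pH 3.0 and pH 4.51) are taken from Table 1 purely as an illustration; a two-point slope is not how the published ∆Cp values were derived, which rest on fuller data sets.

```python
def least_squares_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return s_xy / s_xx


# OMTKY3 entries from Table 1: Tm in degrees C, dHm in kJ/mol
tm_values = [72.5, 85.2]
dhm_values = [207.0, 240.0]

# Slope of dHm versus Tm, in kJ/(K mol); an illustrative estimate of dCp
dcp_estimate = least_squares_slope(tm_values, dhm_values)
print(f"dCp estimate from slope: {dcp_estimate:.2f} kJ/(K mol)")
```

The estimate can be compared with the ∆Cp values of 2.7 and 2.6 kJ K⁻¹ mol⁻¹ listed for OMTKY3 in Table 1. As the text notes, the approach assumes that pH affects only ∆Su and not ∆Hu as a function of temperature.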

Table 1. Thermodynamics of Unfolding for Globular Proteins of Known Structure

| name of protein | footnote | pH | Tm, °C | ∆Hm, kJ mol⁻¹ | ∆Cp, kJ K⁻¹ mol⁻¹ | ∆Sm, J K⁻¹ mol⁻¹ |
|---|---|---|---|---|---|---|
| α-chymotrypsin | a | unknown | 60 | 710 | 12.8 | 2573 |
| α-chymotrypsinogen | b | 5 | 62 | 619 | 14.5 | 1847 |
| α-lactalbumin | c,d | 5.2 | 25 | -2.5 | 7.5 | -8 |
| α-lactalbumin | e,i | 8 | 25 | 133 | 7.6 | 446 |
| acyl carrier protein (apo) | f | 6.1 | 52.7 | 160 | 3.3 | 492 |
| acyl carrier protein (holo) | f | 6.1 | 64.3 | 266 | 6.4 | 787 |
| arabinose binding protein | g | 7.4 | 59 | 840 | 13.2 | 2528 |
| arc repressor | h,i | 7.3 | 54 | 297 | 6.7 | 908 |
| B1 of protein G | j | 5.4 | 87.5 | 258 | 2.6 | 715 |
| B2 of protein G | j | 5.4 | 79.4 | 238 | 2.9 | 675 |
| barnase | k | 5.5 | 55.1 | 500 | 5.8 | 1523 |
| barnase | l | 5 | 53.7 | 546 | 6.8 | 1670 |
| barstar | m | 8 | 69.9 | 292 | 6.2 | 851 |
| BPTI | n | 4 | 104 | 317 | 2.0 | 841 |
| carbonic anhydrase B | o | unknown | 60 | 725 | 16.0 | 2218 |
| CI2 | p | 3.5 | 73.8 | 280 | 2.5 | 808 |
| cyt b5 (tryptic fragment) | q | 7 | 70 | 332 | 6.0 | 968 |
| cyt c (horse) | r | unknown | 60 | 393 | 5.0 | 1180 |
| cyt c (horse) | s | unknown | 60 | 307 | 5.3 | 922 |
| cyt c (yeast isozyme 1) | t,i | 5 | 55.4 | 360 | 5.7 | 1096 |
| cyt c (yeast isozyme 1) | u | 6 | 56.2 | 293 | 5.2 | 888 |
| cyt c (yeast isozyme 2) | u | 6 | 54.5 | 282 | 5.2 | 861 |
| GCN4 | v | 7 | 70 | 259 | 3.0 | 1512 |
| HPr | w,i | 7 (?) | 73.4 | 248 | 4.9 | 715 |
| IL-1β | x | 3 | 53 | 351 | 8.0 | 1076 |
| lac repressor headpiece | y | 8 | 65 | 118 | 1.3 | 349 |
| lysozyme (human) | z | 4.5 | 80.3 | 579 | 7.2 | 1638 |
| lysozyme (human) | aa | 2.8 | 68.8 | 503 | 6.6 | 1470 |
| lysozyme (apo equine; transition 1) | bb | 4.5 | 41.5 | 154 | 7.6 | 488 |
| lysozyme (apo equine; transition 2) | bb | 4.5 | 66.44 | 124 | 2.6 | 365 |
| lysozyme (holo equine; transition 1) | bb | 4.5 | 54.73 | 205 | 7.4 | 624 |
| lysozyme (holo equine; transition 2) | bb | 4.5 | 66.2 | 133 | 2.5 | 393 |
| lysozyme (hen) | cc | unknown | 60 | 427 | 6.3 | 1281 |
| lysozyme (hen) | dd | unknown | 64.05 | 435 | 6.4 | 1289 |
| lysozyme (hen) | ee | 2 | 55 | 429 | 6.7 | 1307 |
| lysozyme T4 | ff | 2.84 | 51.2 | 507 | 10.1 | 1562 |
| met repressor | gg | 7 | 53.2 | 505 | 8.9 | 1547 |
| myoglobin (horse) | hh | 11.2 | 62 | 409 | 7.6 | 1220 |
| myoglobin (whale) | hh | 9.5 | 85 | 837 | 15.6 | 2336 |
| myoglobin (whale) | ii | 4.75 | 80.1 | 575 | 8.8 | 1628 |
| OMTKY3 | jj | 3.0 | 72.5 | 207 | 2.7 | 599 |
| OMTKY3 | kk | 4.51 | 85.2 | 240 | 2.6 | 670 |
| papain | ll | 3.8 | 83.8 | 904 | 13.7 | 2532 |
| parvalbumin | mm | 7 | 90 | 500 | 5.6 | 1377 |
| pepsin | nn | 5.9 | 63 | 1126 | 18.8 | 3348 |
| pepsinogen | nn | 6 | 66 | 1134 | 24.1 | 3344 |
| plasminogen K4 domain | oo | 7.4 | 62 | 315 | 5.2 | 940 |
| RNase T1 | d,pp | unknown | 25 | 249 | 4.9 | 836 |
| RNase T1 | qq | 5 | 61.2 | 508 | 4.9 | 1519 |
| RNase A | l | 6 | 59 | 372 | 6.6 | 1121 |
| RNase A | rr | 5.5 | 61.9 | 457 | 4.8 | 1365 |
| RNase A | ss | 5.47 | 64 | 481 | 4.8 | 1360 |
| ROP | tt | 6 | 71 | 580 | 10.3 | 1685 |
| Sac7d | uu | 6 | 90.9 | 231 | 3.6 | 635 |
| SH3 spectrin | vv | 4 | 66 | 197 | 3.3 | 581 |
| Staphylococcus nuclease | ww | 7 | 54 | 337 | 9.3 | 1029 |
| stefin A | xx | 5 | 90.8 | 473 | 7.4 | 1300 |
| stefin B | xx | 5 | 50.2 | 293 | 6.7 | 906 |
| subtilisin inhibitor | yy | 3.07 | 50.2 | 313 | 8.5 | 966 |
| subtilisin BPN′ | zz | 8 | 58.5 | 370 | 20.1 | 1114 |
| tendamistat | aaa | ∼5 | 93 | 307 | 2.9 | 838 |
| thioredoxin | bbb | 7 | 87.1 | 411 | 7.0 | 1139 |
| thioredoxin | ccc | 6.5 | 86.4 | 444.0 | 7.4 | 1235 |
| trp repressor | ddd | 7.5 | 90.3 | 448 | 6.1 | 1232 |
| ubiquitin | eee | 4 | 90 | 308 | 3.3 | 848 |

a Tischenko, V. M.; Tiktopulo, E. I.; Privalov, P. L. Biofizika (USSR) 1974, 19, 400. b Privalov, P. L.; Khechinashvili, N. N.;

Atanasov, B. P. Biopolymers 1971, 10, 1865. c Griko, Y. V.; Freire, E.; Privalov, P. L. Biochemistry 1994, 33, 1889. d The
thermodynamics were obtained from a global fit of data and are reported at 25 °C. e Xie, D.; Bhakuni, V.; Freire, E. Biochemistry
1991, 30, 10673. f Horvath, L. A.; Sturtevant, J. M.; Prestegard, J. H. Protein Sci. 1994, 3, 103. g Fukada, H.; Sturtevant, J. M.;
Quiocho, F. A. J. Biol. Chem. 1983, 258, 13193. h Reference 50. i Determined from optically monitored thermal melts. j Alexander,
P.; Fahnestock, S.; Lee, T.; Orban, J.; Bryan, P. Biochemistry 1992, 31, 3597. k Griko, Y. V.; Makhatadze, G. I.; Privalov, P. L.;
Hartley, R. W. Protein Sci. 1994, 3, 669. l Martinez, J. C.; El Harrous, M.; Filimonov, V. V.; Mateo, P. L.; Fersht, A. R. Biochemistry
1994, 33, 3919. m Agashe, V. R.; Udgaonkar, J. B. Biochemistry 1995, 34, 3286. n Makhatadze, G. I.; Kim, K.-S.; Woodward, C.;
Privalov, P. L. Protein Sci. 1993, 2, 2028. o Tatunashvili, L. V.; Privalov, P. L. Biofizika (USSR) 1986, 31, 578. p Jackson, S. E.;
Moracci, M.; elMasry, N.; Johnson, C. M.; Fersht, A. R. Biochemistry 1993, 32, 11259. q Pfeil, W.; Bendzko, P. Biochim. Biophys.
Acta 1980, 626, 73. r Potekhin, S.; Pfeil, W. Biophys. Chem. 1989, 34, 55. s Hagihara, Y.; Tan, Y.; Goto, Y. J. Mol. Biol. 1994, 237,
336. t Reference 52. u Liggins, J. R.; Sherman, F.; Mathews, A. J.; Nall, B. T. Biochemistry 1994, 33, 9209. v Thompson, K. S.;
Vinson, C. R.; Shuman, J. D.; Freire, E. Biochemistry 1993, 32, 5491. w Reference 53. x Makhatadze, G. I.; Clore, G. M.; Gronenborn,
A. M.; Privalov, P. L. Biochemistry 1994, 33, 9327. y Hinz, H.-J.; Cossman, M.; Beyreuther, K. FEBS Lett. 1981, 129, 246. z Kuroki,
K.; Taniyama, Y.; Seko, C.; Nakamura, H.; Kikuchi, M.; Ikehara, M. Proc. Natl. Acad. Sci. U.S.A. 1989, 86, 6903. aa Herning, T.;
Yutani, K.; Inaka, K.; Kuroki, R.; Matsushima, M.; Kikuchi, M. Biochemistry 1992, 31, 7077. bb Griko, Y. V.; Freire, E.; Privalov,
G.; Van Dael, H.; Privalov, P. L. J. Mol. Biol. 1995, 252, 447. cc Cooper, A.; Eyles, S. J.; Radford, S. E.; Dobson, C. M. J. Mol. Biol.
1992, 225, 939. dd Schwarz, F. P. Thermochim. Acta 1989, 147, 71. ee Pfeil, W.; Privalov, P. L. Biophys. Chem. 1976, 4, 23. ff Connelly,
P. R.; Ghosaini, L.; Hu, C.-Q.; Kitamura, S.; Tanaka, A.; Sturtevant, J. M. Biochemistry 1991, 30, 1887. gg Johnson, C. M.; Cooper,
A.; Stockley, P. G. Biochemistry 1992, 31, 9717. hh Kelly, L.; Holladay, L. A. Biochemistry 1990, 29, 5062. ii Privalov, P. L.; Griko,
Y. V.; Venyaminov, S. Y.; Kutyshenko, V. P. J. Mol. Biol. 1986, 190, 487. jj Swint, L.; Robertson, A. D. Protein Sci. 1993, 2, 2037.
kk Swint-Kruse, L.; Robertson, A. D. Biochemistry 1995, 34, 4724. ll Tiktopulo, E. I.; Privalov, P. L. FEBS Lett. 1978, 91, 57.
mm Filimonov, V. V.; Pfeil, W.; Tsalkova, T. N.; Privalov, P. L. Biophys. Chem. 1978, 8, 117. nn Privalov, P. L.; Mateo, P. L.;

Khechinashvili, N. N.; Stepanov, V. M.; Revina, L. P. J. Mol. Biol. 1981, 152, 445. oo Novokhatny, V. V.; Kudinov, S. A.; Privalov,
P. L. J. Mol. Biol. 1984, 179, 215. pp Plaza del Pino, I. M.; Pace, C. N.; Freire, E. Biochemistry 1992, 31, 11196. qq Yu, Y.; Makhatadze,
G. I.; Pace, C. N.; Privalov, P. L. Biochemistry 1994, 33, 3312. rr Straume, M.; Freire, E. Anal. Biochem. 1992, 203, 259. ss Privalov,
P. L.; Tiktopulo, E. I.; Khechinashvili, N. N. Int. J. Pept. Protein Res. 1973, 5, 229. tt Steif, C.; Hinz, H.-J.; Cesareni, G. Proteins:
Struct., Funct., Genet. 1995, 23, 83. uu McCrary, B. S.; Edmondson, S. P.; Shriver, J. W. J. Mol. Biol. 1996, 264, 784. vv Viguera,
A. R.; Martinez, J. C.; Filimonov, V. V.; Mateo, P. L.; Serrano, L. Biochemistry 1994, 33, 2142. ww Tanaka, A.; Flanagan, J.;
Sturtevant, J. M. Protein Sci. 1993, 2, 567. xx Zerovnik, E.; Lohner, K.; Jerala, R.; Laggner, P.; Turk, V. Eur. J. Biochem. 1992,
210, 217. yy Tamura, A.; Kimura, K.; Takahara, H.; Akasaka, K. Biochemistry 1991, 30, 11307. zz Pantoliano, M. W.; Whitlow, M.;
Wood, J. F.; Dodd, S. W.; Hardman, K. D.; Rollence, M. L.; Bryan, P. N. Biochemistry 1989, 28, 7205. aaa Renner, M.; Hinz, H.-J.;
Scharf, M.; Engels, J. W. J. Mol. Biol. 1992, 223, 769. bbb Santoro, M. M.; Bolen, D. W. Biochemistry 1992, 31, 4901. ccc Ladbury,
J. E.; Wynn, R.; Hellinga, H. W.; Sturtevant, J. M. Biochemistry 1993, 32, 7526. ddd Bae, S. J.; Chou, W. Y.; Matthews, K.; Sturtevant,
J. M. Proc. Natl. Acad. Sci. U.S.A. 1988, 85, 6731. eee Wintrode, P. L.; Makhatadze, G. I.; Privalov, P. L. Proteins: Struct., Funct.,
Genet. 1994, 18, 246.
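Each row of Table 1 supplies the three parameters (Tm, ∆Hm, ∆Cp) needed to evaluate eq 5, and the ∆Sm column follows from ∆Sm = ∆Hm/Tm with Tm in kelvin. The sketch below works through one row, hen lysozyme at pH 2 (footnote ee). It is an illustration under stated assumptions: ∆Cp is treated as constant, the function names are hypothetical, and the choice of 25 °C as the evaluation temperature is ours rather than the table's.

```python
import math

R = 8.314e-3  # universal gas constant in kJ/(K mol)


def to_kelvin(t_celsius):
    """Convert degrees Celsius to kelvin."""
    return t_celsius + 273.15


def delta_g_unfold(T, T_m, dH_m, dCp):
    """Modified Gibbs-Helmholtz equation (eq 5).

    T and T_m in kelvin, dH_m in kJ/mol, dCp in kJ/(K mol); returns kJ/mol.
    """
    return dH_m * (1.0 - T / T_m) + dCp * ((T - T_m) - T * math.log(T / T_m))


def delta_s_m(T_m, dH_m):
    """Entropy of unfolding at Tm in J/(K mol), from dSm = dHm / Tm."""
    return 1000.0 * dH_m / T_m


# Table 1, lysozyme (hen), footnote ee: Tm = 55 C, dHm = 429, dCp = 6.7
T_m = to_kelvin(55.0)
dH_m, dCp = 429.0, 6.7

dS_m = delta_s_m(T_m, dH_m)                    # compare with the tabulated 1307
T_25 = to_kelvin(25.0)
dG_25 = delta_g_unfold(T_25, T_m, dH_m, dCp)   # stability at 25 C
K_25 = math.exp(-dG_25 / (R * T_25))           # eq 2 rearranged: K = exp(-dG/RT)

print(f"dSm = {dS_m:.0f} J/(K mol)")
print(f"dGu(25 C) = {dG_25:.1f} kJ/mol, K = {K_25:.2e}")
```

By construction `delta_g_unfold` returns zero at T = Tm, which is the condition used in the text to reduce eq 4 to eq 5.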

The ∆Sm values are dependent upon pH, as described above, and this should introduce an additional source of error when comparing ∆Sm or ∆S(60) values obtained from independent studies. Most sets of independent determinations were made at similar pH values (Table 1).

III. Correlation of Unfolding Thermodynamics with Protein Structure

A. Database of Unfolding Thermodynamics for Proteins of Known Structure

For this review, the minimal criteria for selection of a protein for consideration are (1) ∆Hm, ∆Cp, and Tm values have been published, (2) the unfolding reaction is reversible, and (3) a structural model for the protein, or a closely related protein, has been deposited in the Protein Data Bank (PDB).63,64 Thermodynamic parameters for the unfolding of 49 different proteins are assembled in Table 1. For 11 different proteins, at least two independent determinations either from different laboratories or made using alternative methods are included. The ∆Hm and Tm values generally correspond to values obtained under conditions of maximal stability and ∆Sm values have been calculated by dividing ∆Hm by Tm. This database is a work in progress and the authors invite corrections and additions to Table 1.

To put the thermodynamic parameters on a similar footing for correlation with features of protein structure, ∆Hm and ∆Sm at 60 °C (∆Hu(60) and ∆Su(60)) have been calculated using the experimental values and ∆Cp (Table 2). This temperature was chosen because it has been used in previous studies and because it is close to the mean and median Tm values, 65.5 (±2.0) °C and 62.5 °C, respectively, reported in Table 1. Adjustment of ∆Hm and ∆Sm from experimental Tm values to 60 °C means extrapolating over as much as 44 °C, but most experimental Tm values are much closer to 60 °C: the mean deviation of the experimental Tm values from 60 °C is 5.5°.

Table 2. Thermodynamic Parameters Used for Regression Analysisᵃ
name of protein ∆Cp ∆H(60) ∆S(60) ∆H* ∆S*
α-chymotrypsin 12.8 709 2570 1230 4420
α-chymotrypsinogen 14.5 590 1760 1180 3860
α-lactalbumin 7.5 260 824 564 1910
α-lactalbumin 7.6 400 1292 708 2400
acyl carrier protein (apo) 3.3 185 566 320 1050
acyl carrier protein (holo) 6.4 238 705 499 1640
arabinose binding protein 13.2 853 2568 1390 4480
arc repressor 6.7 337 1029 608 2000
B1 of protein G 2.6 187 509 292 886
B2 of protein G 2.9 182 511 299 932
barnase 5.8 528 1609 762 2450
barnase 6.8 589 1800 864 2790
barstar 6.2 230 669 483 1570
BPTI 2.0 229 592 310 882
carbonic anhydrase B 16.0 725 2218 1370 4530
CI2 2.5 246 706 347 1070
cyt b5 (tryp frag) 6.0 272 790 515 1660
cytochrome c (horse) 5.0 393 1180 596 1910
cytochrome c (horse) 5.3 307 922 523 1700
cytochrome c (yeast iso 1) 5.7 386 1180 617 2000
cytochrome c (yeast iso 1) 5.2 312 948 523 1700
cytochrome c (yeast iso 2) 5.2 311 947 521 1700
GCN4 3.0 230 668 350 1100
HPr 4.9 183 524 379 1230
IL-1β 8.0 407 1250 731 2410
lac repressor headpiece 1.3 112 330 164 518
lysozyme (human) 7.2 434 1220 724 2250
lysozyme (human) 6.6 444 1300 712 2250
lysozyme (apo equine)ᵇ 7.6 402 1610 709 2710
lysozyme (holo equine)ᵇ 7.4 361 1450 661 2530
lysozyme (hen) 6.3 427 1280 682 2190
lysozyme (hen) 6.4 409 1210 668 2140
lysozyme (hen) 6.7 462 1410 733 2380
lysozyme T4 10.1 595 1830 1000 3300
met repressor 8.9 566 1730 928 3030
myoglobin (horse) 7.6 394 1180 703 2280
myoglobin (whale) 15.6 447 1210 1080 3470
myoglobin (whale) 8.8 399 1120 754 2380
OMTKY3 2.7 173 500 283 891
OMTKY3 2.6 175 481 280 857
papain 13.7 578 1590 1130 3570
parvalbumin 5.6 332 894 559 1706
pepsin 18.8 1069 3180 1830 5910
pepsinogen 24.1 989 2910 1970 6410 When seeking patterns in diverse collections of
plasminogen K4 domain 5.2 305 909 516 1670 protein structures, two of the most widely used
RNase T1 4.9 419 1380 616 2080 regular features of protein structure are solvent-
RNase T1 4.9 502 1500 699 2210 accessible surface areas20,25,36,37,65-67 and secondary
RNaseA 6.6 379 1140 645 2090
RNaseA 4.8 462 1300 656 2000
structure.19,21,68 Tables 3 and 4 summarize these
RNaseA 4.8 448 1340 643 2040 structural features for the proteins whose thermo-
ROP 10.3 467 1350 884 2840 dynamic parameters are reported in Table 1 and 2.
Sac7d 3.6 120 316 265 837 All of the thermodynamic values reported in Table 2
SH3 spectrin 3.3 178 523 309 994 are used in the regression analyses discussed through-
Staphylococcus nuclease 9.3 392 1200 767 2540
stefin A 7.4 245 645 545 1720 out the remainder of the review. In cases where
stefin B 6.7 359 1110 630 2080 there are multiple thermodynamic entries in Table
subtilisin inhibitor 8.5 395 1220 738 2440 2, but a single structural entry in Table 3, each of
subtilisin BPN′ 20.1 400 1210 1214 4120 the experimental entries were regressed against the
tendamistat 2.9 212 565 329 985
thioredoxin 7.0 222 596 504 1600
same structural values. In those cases where mul-
thioredoxin 7.4 249 673 548 1740 tiple structure and thermodynamic entries are given,
trp repressor 6.1 263 701 510 1590 the thermodynamic entries were regressed against
ubiquitin 3.3 208 561 343 1040 structural entries in the same order in which they
a For cases in which the proteins are derived from different are given in Tables 2 and 3.
species, the order here is the same as in Table 1. ∆H(60) and For the proteins in Table 3, the reported surface
∆S(60) are the ∆H and ∆S of unfolding at 60 °C. ∆H* is the area is the sum of the differences (∆A) between the
∆H of unfolding at 100 °C and ∆S* is the ∆S of unfolding at surface of each residue in the native protein and the
112 °C. All units are as in Table 1. b Combined data for
transitions 1 and 2. solvent accessible surface area of the same type of
amino acid residue in an Ala-Xaa-Ala extended
fact, the range of relative differences in ∆S(60) tripeptide, corrected for the effects of termini. All
values, 4-36%, is similar to that for ∆H(60). The carbon atoms are classified as apolar, while all non-
mean relative difference is 15 ((9)%, which is again carbon atoms are classified as polar. Thus the total
quite similar to the mean and standard deviations change in accessible surface area, ∆Atot, is divided
seen for ∆H(60). The lack of significant additional into the change in apolar surface area, ∆Aap, and the
uncertainty in ∆S(60) may result from the fact that change in polar surface area, ∆Apol. For the native
Protein Structure and the Energetics of Protein Stability Chemical Reviews, 1997, Vol. 97, No. 5 1257

Table 3. Surface Area Changes for the Set of Proteins Used for the Regression Analysis†
name of protein (ref)  PDB file  Nres  ∆Aap, Å²  ∆Apol, Å²  ∆Atot, Å²
α-chymotrypsin (a)  5CHA  237  13808  8648  22456
α-chymotrypsinogen (b)  2CGA  245  14012  9127  23139
α-lactalbumin (c)  1HML (d)  123  7027  4719  11746
α-lactalbumin (e)  1ALC (f)  122  6773  4814  11586
acyl carrier protein (g)  1ACP  77  3346  2755  6101
arabinose binding protein (h)  1ABE  305  19374  12160  31534
arc repressor (i)  1ARR  106  5503  4633  10136
B1 of protein G (j)  1PGB  56  2712  1944  4655
B2 of protein G (k)  1PGX  56  2981  2117  5098
barnase (l)  1BNI  108  6190  4325  10515
barnase (l)  1BNJ  109  6137  4281  10417
barstar (m)  1BTA  89  5506  2835  8341
BPTI (n)  5PTI  58  2715  1956  4671
carbonic anhydrase B (o)  2CAB  256  15949  10591  26540
CI2 (p)  1COA  64  3368  2198  5566
cyt b5 (tryp frag) (q)  1CYO  88  4341  3109  7449
cytochrome c (horse) (r)  1HRC  104  5716  3788  9504
cytochrome c (yeast iso 1) (s)  1YCC  108  5669  4074  9743
cytochrome c (yeast iso 2) (t)  1YEA  112  5630  4320  9950
GCN4 (u)  2ZTA  62  2939  2364  5303
HPr (v)  2HPR  87  4555  3035  7590
IL-1β (w)  6I1B  153  8817  5165  13982
lac repressor headpiece (x)  1LCD  51  2291  1622  3913
lysozyme (human) (y)  1LZ1  130  7330  5548  12877
lysozyme (hen)  1LYS  129  7024  5315  12339
lysozyme (equine) (z)  2EQL  129  7147  5564  12711
lysozyme T4 (aa)  2LZM  164  9709  6709  16418
met repressor (bb)  1CMB  208  12030  8503  20533
myoglobin (horse) (cc)  1YMB  153  8884  5523  14407
myoglobin (whale) (dd)  4MBN  153  8873  5927  14800
myoglobin (whale)  1MBO  153  9143  5679  14822
OMTKY3 (ee)  2OVO  56  2162  1874  4036
papain (ff)  9PAP  212  13071  8692  21762
parvalbumin (gg)  5CPV  108  5750  4006  9756
pepsin (hh)  5PEP  326  19584  11717  31301
pepsinogen (ii)  3PSG  365  22811  14298  37108
plasminogen K4 domain (jj)  1PMK  78  3801  3408  7209
RNase T1 (kk)  9RNT  104  5049  3828  8878
RNase T1 (ll)  8RNT  104  5126  3812  8938
RNaseA (mm)  3RN3  124  5802  5468  11271
ROP (nn)  1RPR  126  6195  6737  12932
Sac7d (oo)  1SAP  66  3357  2509  5866
SH3 spectrin (pp)  1SHG  57  3284  1994  5278
Staphylococcus nuclease (qq)  1STN  136  8049  5173  13222
stefin A (rr)  1CYV  98  5120  3635  8755
stefin B (ss)  1STF (tt)  95  5217  3508  8725
subtilisin inhibitor (uu)  3SIC (vv)  107  4975  3568  8543
subtilisin BPN′ (ww)  2ST1  275  15672  10308  25980
tendamistat (xx)  3AIT  74  3338  2784  6122
thioredoxin (yy)  2TRX  108  6317  3464  9781
trp repressor (zz)  2WRP  105  6146  4122  10268
trp repressor (zz)  3WRP  101  5956  3953  9909
ubiquitin (aaa)  1UBQ  76  4112  2606  6717
† The PDB file identifiers are taken from the Brookhaven Protein Data Bank.58,59 Number of residues, Nres, and ∆A values were determined as described in the text.
a Blevins, R. A.; Tulinsky, A. J. Biol. Chem. 1985, 20, 4264. b Wang, D.; Bode, W.;
Huber, R. J. Mol. Biol. 1985, 185, 595. c Ren, J.; Acharya, K. R.; Stuart, D. I. J. Biol. Chem. 1993, 268, 19292. d X-ray structure
is for the human protein. Sequence of the human protein differs from the bovine protein at 31 out of 123 residues. e Acharya, K.
R.; Ren, J.; Stuart, D. I.; C., P. D.; Fenna, R. E. J. Mol. Biol. 1991, 221, 571. f X-ray structure is for the baboon protein. Sequence
of the baboon protein differs from the bovine protein at 37 out of 123 residues. g Kim, Y.; Prestegard, J. H. Proteins: Struct.,
Funct., Genet. 1990, 8, 377. h Vyas, N. K.; Quiocho, F. A. Nature 1984, 310, 381. i Bonvin, A. M. J. J.; Vis, H.; Burgering, M. J. M.;
Breg, J. N.; Boelens, R.; Kaptein, R. J. Mol. Biol. 1994, 236, 328. j Gallagher, T.; Alexander, P.; Bryan, P.; Gilliland, G. L.
Biochemistry 1994, 33, 4721. k Achari, A.; Hale, S. P.; Howard, A. J.; Clore, G. M.; Gronenborn, A. M.; Hardman, K. D.; Whitlow,
M. Biochemistry 1992, 31, 10449. l Buckle, A. M.; Henrick, K.; Fersht, A. R. J. Mol. Biol. 1993, 234, 847. m Lubienski, M. J.;
Bycroft, M.; Freund, S. M. V.; Fersht, A. R. Biochemistry 1994, 33, 8866. n Wlodawer, A.; Walter, J.; Huber, R.; Sjolin, L. J. Mol.
Biol. 1984, 180, 301. Wlodawer, A.; Nachman, J.; Gilliland, G. L.; Gallagher, W.; Woodward, C. J. Mol. Biol. 1987, 198, 469.
o Kannan, K. K.; Ramanadham, M.; Jones, T. A. Ann. N. Y. Acad. Sci. 1984, 429, 49. p Jackson, S. E.; Moracci, M.; elMasry, N.;

Johnson, C. M.; Fersht, A. R. Biochemistry 1993, 32, 11259. q Mathews, F. S.; Argos, P.; Levine, M. Cold Spring Harbor Symp.
Quant. Biol. 1972, 36, 387. r Bushnell, G. W.; Louie, G. V.; Brayer, G. D. J. Mol. Biol. 1990, 214, 585. s Louie, G. V.; Brayer, G.
D. J. Mol. Biol. 1990, 214, 527. t Murphy, M. E. P.; Nall, B. T.; Brayer, G. D. J. Mol. Biol. 1992, 227, 160. u O’Shea, E. K.; Klemm,
J. D.; Kim, P. S.; Alber, T. Science 1991, 254, 539. v Liao, D.-I.; Herzberg, O. Structure 1994, 2, 1203. w Clore, G. M.; Wingfield,
P. T.; Gronenborn, A. M. Biochemistry 1991, 30, 2315. x Chuprina, V. P.; Rullman, J. A. C.; Lamerichs, R. M. J. N.; Van Boom,
J. H.; Boelens, R.; Kaptein, R. J. Mol. Biol. 1993, 234, 446. y Artymiuk, P. J.; Blake, C. C. F. J. Mol. Biol. 1981, 152, 737. z Tsuge,
H.; Ago, H.; Noma, M.; Nitta, K.; Sugai, S.; Miyano, M. J. Biochem. 1992, 141, 111. aa Weaver, L. H.; Matthews, B. W. J. Mol.
Biol. 1987, 193, 189. bb Rafferty, J. B.; Somers, W. S.; Saint-Girons, I.; Phillips, S. E. V. Nature 1989, 341, 705. cc Evans, S. V.;
Brayer, G. D. J. Mol. Biol. 1990, 213, 885. dd Takano, T. In Methods and Applications in Crystallographic Computing; Oxford
University Press: Oxford, 1984. ee Bode, W.; Epp, O.; Huber, R.; Laskowski, M., Jr.; Ardelt, W. Eur. J. Biochem. 1985, 147, 387.
X-ray structure is for silver pheasant which differs from the turkey sequence at one residue. ff Kamphuis, I. G.; Kalk, K. H.;
Swarte, M. B. A.; Drenth, J. J. Mol. Biol. 1984, 179, 233. gg Swain, A. L.; Kretsinger, R. H.; Amma, E. L. J. Biol. Chem. 1989, 264,
16620. hh Cooper, J. B.; Khan, G.; Taylor, G.; Tickle, I. J.; Blundell, T. L. J. Mol. Biol. 1990, 214, 199. ii Hartsuck, J. A.; Koelsch,
G.; Remington, S. J. Proteins 1992, in press. jj Padmanabhan, K.; Wu, T.-P.; Ravichandran, K. G.; Tulinsky, A. Protein Sci. 1994,
3, 898. kk Martinez-Oyanedel, J.; Choe, H.-W.; Heinemann, U.; Saenger, W. J. Mol. Biol. 1991, 222, 335. ll Ding, J.; Choe, H.-W.;
Granzin, J.; Saenger, W. Acta Crystallogr., Sect. B 1992, 48, 185. mm Howlin, B.; Moss, D. S.; Harris, G. W. Acta Crystallogr.,
Sect. A 1989, 45, 851. nn Eberle, W.; Pastore, A.; Sander, C.; Roesch, P. J. Biomol. NMR 1991, 1, 71. oo Edmondson, S. P.; Qiu, L.;
Shriver, J. W. Biochemistry 1995, 34, 13289. pp Musacchio, A.; Noble, M.; Pauptit, R.; Wierenga, R.; Saraste, M. Nature 1992,
359, 851. qq Hynes, T. R.; Fox, R. O. Proteins: Struct., Funct., Genet. 1991, 10, 92. rr Tate, S.; Ushioda, T.; Utsunomiya-Tate, N.;
Shibuya, Y.; Ohyama, Y.; Nakano, Y.; Kaji, H.; Inagaki, F.; Samejima, T.; Kainosho, M. Biochemistry 1995, 34, 14637. ss Stubbs,
M. T.; Laber, B.; Bode, W.; Huber, R.; Jerala, R.; Lenarcic, B.; Turk, V. EMBO J. 1990, 9, 1939. tt Taken from the complex with
papain. uu Takeuchi, Y.; Noguchi, S.; Satow, Y.; Kojima, S.; Kumagai, I.; Miura, K.-I.; Nakamura, K. T.; Mitsui, Y. Protein Eng.
1991, 4, 501. vv Taken from the complex with subtilisin. ww Bott, R.; Ultsch, M.; Kossiakoff, A.; Graycar, T.; Katz, B.; Power, S. J.
Biol. Chem. 1988, 263, 7895. xx Billeter, M.; Schaumann, T.; Braun, W.; Wüthrich, K. Biopolymers 1990, 29, 695. yy Katti, S. K.;
LeMaster, D. M.; Eklund, H. J. Mol. Biol. 1990, 212, 167. zz Lawson, C. L.; Zhang, R.-G.; Schevitz, R. W.; Otwinowski, Z.; Joachimiak,
A.; Sigler, P. B. Proteins: Struct., Funct., Genet. 1988, 3, 18. aaa Vijay-Kumar, S.; Bugg, C. E.; Cook, W. J. J. Mol. Biol. 1987, 194,
531.
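
The surface-area columns of Table 3 lend themselves to a quick consistency check. The Python sketch below is illustrative only: it copies four rows from Table 3, confirms that ∆Aap and ∆Apol add up to ∆Atot within rounding, and computes the apolar percentage and the per-residue buried areas. The names `TABLE3_SAMPLE` and `summarize` are hypothetical helpers introduced here, not part of the original analysis.

```python
# Rows copied from Table 3: (name, N_res, dA_ap, dA_pol, dA_tot); areas in square angstroms.
TABLE3_SAMPLE = [
    ("BPTI", 58, 2715, 1956, 4671),
    ("ubiquitin", 76, 4112, 2606, 6717),
    ("lysozyme T4", 164, 9709, 6709, 16418),
    ("pepsinogen", 365, 22811, 14298, 37108),
]


def summarize(row):
    """Return simple bookkeeping quantities for one Table 3 row (hypothetical helper)."""
    name, n_res, dA_ap, dA_pol, dA_tot = row
    return {
        "name": name,
        # Difference between the summed components and the tabulated total (rounding only).
        "sum_residual": (dA_ap + dA_pol) - dA_tot,
        # Share of the buried surface that is apolar (carbon atoms), in percent.
        "percent_apolar": 100.0 * dA_ap / (dA_ap + dA_pol),
        # Buried area per residue, apolar and polar.
        "dA_ap_per_res": dA_ap / n_res,
        "dA_pol_per_res": dA_pol / n_res,
    }


SUMMARIES = [summarize(row) for row in TABLE3_SAMPLE]

if __name__ == "__main__":
    for s in SUMMARIES:
        print(
            f"{s['name']:12s} residual={s['sum_residual']:+d} "
            f"apolar={s['percent_apolar']:.1f}% "
            f"ap/res={s['dA_ap_per_res']:.1f} pol/res={s['dA_pol_per_res']:.1f}"
        )
```

Extending the list to every row of Table 3 would give the data-set averages directly; only four rows are included here to keep the sketch short.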

structure, the algorithm of Lee and Richards,65 as implemented in the program ACCESS (Scott R. Presnell, University of California at San Francisco), has been used to determine the solvent-accessible surface area using a probe radius of 1.4 Å and a slice width of 0.25 Å. The calculations use whole-atom atomic radii, i.e., hydrogen atoms are not considered explicitly but instead are included by using slightly increased atomic radii for atoms covalently bonded to hydrogens.65 Consequently, hydrogens from NMR-derived structures are ignored in the calculation.

The appropriate solvent-accessible surface area for the denatured protein is a subject of continuing discussion.69 The use of a single standard model for

the denatured state will control for systematic errors in the use of a model for the denatured state,69 but it will not account for any real differences in the extent to which the denatured forms of different proteins may vary in their relative solvent accessibilities.

Table 4. Secondary Structure and Disulfide Bonds in the Set of Globular Proteins (a)
name of protein  no. of disulfides  total helix, %  strand, %  turn, %  other, %
α-chymotrypsin  5  11.9  33.5  36.9  17.8
α-chymotrypsinogen  5  11.4  33.5  28.6  26.5
α-lactalbumin  4  47.2  8.9  21.1  22.8
α-lactalbumin  4  38.5  6.6  25.4  29.5
acyl carrier protein  0  26.0  0.0  42.9  31.2
arabinose binding protein  0  45.6  20.7  12.1  21.6
arc repressor  0  26.4  4.7  11.3  57.5
B1 of protein G  0  26.8  42.9  14.3  16.1
B2 of protein G  0  26.8  46.4  14.3  12.5
barnase  0  24.1  24.1  25.0  26.9
barnase  0  23.9  22.9  27.5  25.7
barstar  0  47.2  18.0  18.0  16.9
BPTI  3  20.7  24.1  13.8  41.4
carbonic anhydrase B  0  16.4  31.3  25.8  26.6
CI2  0  17.2  28.1  35.9  18.8
cyt b5 (tryp frag)  0  35.2  21.6  26.1  17.0
cytochrome c (horse)  0  35.6  3.8  28.8  31.7
cytochrome c (yeast iso 1)  0  34.3  3.7  26.9  35.2
cytochrome c (yeast iso 2)  0  39.3  3.6  22.3  34.8
GCN4  0  93.5  0.0  0.0  6.5
HPr  0  39.1  27.6  16.1  17.2
IL-1β  0  5.2  47.1  30.1  17.6
lac repressor headpiece  0  56.9  0.0  11.8  31.4
lysozyme (human)  4  43.8  10.8  33.8  11.5
lysozyme (hen)  4  41.1  9.3  35.7  14.0
lysozyme (equine)  4  43.4  9.3  31.0  16.3
lysozyme T4  0  66.5  8.5  5.5  19.5
met repressor  0  23.1  6.3  4.3  66.3
myoglobin (horse)  0  79.1  0.0  9.8  11.1
myoglobin (whale)  0  80.4  0.0  8.5  11.1
myoglobin (whale)  0  83.8  0.0  8.5  7.7
OMTKY3  3  19.6  17.9  25.0  37.5
papain  3  31.1  17.9  12.3  38.7
parvalbumin  0  54.6  0.0  25.9  19.4
pepsin  3  13.2  42.3  23.0  21.5
pepsinogen  3  20.8  38.4  21.4  19.5
K4 frag plasminogen  3  0.0  14.1  50.0  35.9
RNase T1  2  16.3  27.9  29.8  26.0
RNase T1  2  16.3  27.9  29.8  26.0
RNaseA  4  22.6  33.1  23.4  21.0
ROP  0  40.5  0.0  4.0  55.6
Sac7d  0  30.3  40.9  10.6  18.2
SH3 spectrin  0  5.3  47.4  19.3  28.1
Staphylococcus nuclease  0  29.4  30.1  20.6  19.9
stefin A  0  7.1  33.7  37.8  21.4
stefin B  0  22.1  38.9  14.7  24.2
subtilisin inhibitor  2  15.9  33.6  34.6  15.9
subtilisin BPN′  0  30.2  17.1  28.4  24.4
tendamistat  2  0.0  45.9  25.7  28.4
thioredoxin  1  35.2  26.9  24.1  13.9
trp repressor  0  80.0  0.0  0.0  20.0
trp repressor  0  83.2  0.0  5.9  10.9
ubiquitin  0  25.0  30.3  23.7  21.1
(a) The number of residues in a secondary structure class was calculated using STRIDE and converted to percentages. The total helix percentage includes both α and 3₁₀ helices. The PDB files and references are in the same order as listed in Table 3.

Figure 2. Correlation of surface area changes with protein size. (a) The total change in accessible surface area, ∆Atot, as well as the apolar, ∆Aap, and polar, ∆Apol, contributions are plotted vs the number of residues, Nres. The lines are the linear regressions. The slope, intercept, and R² values are 104, -1200, and 0.993 for ∆Atot, 64, -1120, and 0.989 for ∆Aap, and 39, -84, and 0.971 for ∆Apol. (b) The change in accessible surface area per residue is plotted vs the number of residues. The lines are the linear regressions. The slope, intercept, and R² values are 0.062, 84, and 0.376 for ∆Atot, 0.050, 47, and 0.379 for ∆Aap, and 0.012, 37, and 0.045 for ∆Apol.

The assignment of secondary structure in proteins is somewhat dependent on the choice of algorithm.70,71 Secondary structure is generally defined as a regularly repeating conformation of the polypeptide chain. All algorithms yield very similar results for any given protein but specific criteria for identifying regularities in polypeptide conformation vary from laboratory to laboratory.70,71 All secondary structure contents reported in Table 4 were assessed using the STRIDE algorithm of Frishman and Argos;71 use of a single algorithm minimizes variations resulting from the different criteria and algorithms used to derive secondary structure as reported in the PDB files.

The STRIDE algorithm was designed to more closely mimic the secondary structure assignments reported by investigators in the PDB files than the commonly used DSSP algorithm of Kabsch and Sander.72 In general, the two algorithms yield very similar results: a survey of 226 proteins shows the highest level of disagreement for an individual protein was 14% of the residues.71

B. Relationships between Unfolding Thermodynamics and Features of Protein Structure

1. General Structural Features

Before looking at correlations between energetic and structural features of proteins, it is worth examining the correlations of the structural features themselves. For example, it has long been known that the buried surface area correlates with the size of the protein.20 This is illustrated in Figure 2a in which ∆Atot, ∆Aap, and ∆Apol are plotted vs the number of residues in the protein. In addition to the increase in the total surface area buried in the

protein with increasing protein size, the surface area buried per residue also increases.17 As noted previously,20,73 and as seen in Figure 2b, this increase is mainly due to an increase in the apolar surface buried per residue, while the polar surface buried per residue remains fairly constant. Because the polar surface area buried per residue is nearly constant with protein size while the apolar area per residue increases, the fraction of the buried surface area that is apolar increases with size, but this trend is very weak. Ignoring this weak upward trend, the percentage of the total buried area that is apolar is 58.3 ± 3.4%. The percentage of the total buried area that is polar is 41.7 ± 3.4%.

2. Heat Capacity of Unfolding

The thermodynamic term which has been most often scaled to structural features is ∆Cp.26,32,35,74-78 The simplest approach is to assume that ∆Cp scales only with the size of the protein, that is with the number of amino acid residues, Nres. Regression of ∆Cp on Nres (Figure 3) gives a value of 58 ± 2 J K⁻¹ (mol res)⁻¹ with R² = 0.859 (Table 5).

Figure 3. Correlation of ∆Cp of unfolding with the number of residues. The line is the linear regression with slope, intercept, and R² of 0.062, -0.53, and 0.862.

Table 5. Results of Regression Analysis of Thermodynamics of Protein Unfolding
thermodynamic parameter  regression variables  regressed values  R²
∆Cp  Nres  58 ± 1 J K⁻¹ (mol res)⁻¹  0.859
∆Cp  ∆Atot  0.61 ± 0.02 J K⁻¹ (mol Å²)⁻¹  0.856
∆Cp  ∆Aap  0.66 ± 0.21 J K⁻¹ (mol Å²)⁻¹  0.856
     ∆Apol  0.52 ± 0.32 J K⁻¹ (mol Å²)⁻¹
∆H (60 °C)  Nres  2.92 ± 0.08 kJ (mol res)⁻¹  0.766
∆H (60 °C)  ∆Atot  30.2 ± 0.9 J (mol Å²)⁻¹  0.735
∆H (60 °C)  ∆Aap  -8 ± 11 J (mol Å²)⁻¹  0.775
            ∆Apol  86 ± 17 J (mol Å²)⁻¹
∆H (100 °C)  Nres  5.28 ± 0.09 kJ (mol res)⁻¹  0.918
∆S° (60 °C)  Nres  8.8 ± 0.3 J K⁻¹ (mol res)⁻¹  0.744
∆S° (60 °C)  ∆Atot  0.091 ± 0.003 J K⁻¹ (mol Å²)⁻¹  0.716
∆S° (60 °C)  ∆Aap  -0.03 ± 0.04 J K⁻¹ (mol Å²)⁻¹  0.757
             ∆Apol  0.27 ± 0.06 J K⁻¹ (mol Å²)⁻¹
∆S° (60 °C)  Nres  9.2 ± 4.6 J K⁻¹ (mol res)⁻¹  0.771
             ∆Aap  -0.11 ± 0.05 J K⁻¹ (mol Å²)⁻¹
             ∆Apol  0.15 ± 0.08 J K⁻¹ (mol Å²)⁻¹
∆S° (112 °C)  Nres  17.3 ± 0.3 J K⁻¹ (mol res)⁻¹  0.919

The next simplest assumption is that ∆Cp scales with the total change in ASA, ∆Atot. This is nearly the same as regression on Nres since Nres and ∆Atot are highly correlated (∆Atot = (96.2 ± 0.7)Nres; R² = 0.987; Table 5). Regression of ∆Cp on ∆Atot gives 0.61 ± 0.02 J K⁻¹ (mol Å²)⁻¹ with R² = 0.856. Results of model compound studies clearly demonstrate that polar and apolar ASA make different contributions to ∆Cp.27,30,79 However, simultaneous regression of ∆Cp on both ∆Aap and ∆Apol yields very similar values, 0.66 ± 0.21 and 0.52 ± 0.32 J K⁻¹ (mol Å²)⁻¹ respectively, with R² = 0.856. Thus, as noted by Myers et al.,77 the separate contributions of polar and apolar surface to ∆Cp are not evident in the protein data.

As noted above, the correlation between structural features and ∆Cp has been investigated previously. The values observed here for 49 different proteins represent a much larger data set than has been used previously. The analysis of Spolar et al.27 used the set of 12 proteins tabulated by Privalov and Gill,38 while the more recent analysis by Myers et al.77 used a set of 26 proteins.

Myers et al. also found that ∆Cp correlated equally well with Nres as with ∆Atot or ∆Aap and ∆Apol. Their value per residue was 59.4 J K⁻¹ (mol res)⁻¹, similar to the 58 J K⁻¹ (mol res)⁻¹ found here and the 59 J K⁻¹ (mol res)⁻¹ found by Privalov and Gill.38 The correlation of ∆Cp with ∆Atot observed by Myers et al. gave a value of 0.79 J K⁻¹ (mol Å²)⁻¹ compared to our 0.61 J K⁻¹ (mol Å²)⁻¹. The discrepancy between these values probably reflects the different algorithms used in calculating ASA.

The analysis here shows no significant difference in the contribution of apolar and polar surface to ∆Cp, although they have been observed previously to be of opposite sign.27,30,79 Murphy and Freire35 found a value of 1.9 J K⁻¹ (mol Å²)⁻¹ for apolar surface and -1.1 J K⁻¹ (mol Å²)⁻¹ for polar surface based on data for the dissolution of cyclic dipeptides.79 Spolar et al.27 found a value of 1.4 J K⁻¹ (mol Å²)⁻¹ for apolar surface and -0.67 J K⁻¹ (mol Å²)⁻¹ for polar surface from analysis of unfolding data on a set of 14 globular proteins. Finally, Myers et al.77 find a value of 1.2 J K⁻¹ (mol Å²)⁻¹ for apolar surface and -0.38 J K⁻¹ (mol Å²)⁻¹ for polar surface from their data set of 26 proteins.

In the data set presented here, the surface area buried by the average protein is 58.3% apolar and 41.7% polar. If the average contribution to ∆Cp per unit surface area is calculated from the apolar and polar contributions weighted by these percentages, all of the above treatments give similar values. The coefficients of Murphy and Freire give a weighted average of 0.65 J K⁻¹ (mol Å²)⁻¹; those of Spolar et al. give 0.54 J K⁻¹ (mol Å²)⁻¹; those of Myers et al. give 0.54 J K⁻¹ (mol Å²)⁻¹; and those from this study give 0.60 J K⁻¹ (mol Å²)⁻¹. These can be compared to the value of 0.61 J K⁻¹ (mol Å²)⁻¹ from the regression on ∆Atot. This observation points to the difficulty of obtaining coefficients for the apolar and polar contributions to ∆Cp from the protein data alone. Any number of combinations of coefficients can give equally good fits to the data. This is not in conflict with the model compound data, which clearly indicate different contributions of apolar and polar surface to ∆Cp, but merely points to the limitations of using the protein data by themselves.

3. Convergence Temperatures

In the mid-1970s Privalov and co-workers noted an interesting feature in the thermodynamics of unfolding of globular proteins; namely, the ∆Hu values,

when normalized to the molecular weight of the proteins (or to the number of residues), converge to a common value at some high temperature (designated TH*).80 Likewise, the normalized ∆Su values converge to a common value at some high temperature (TS*).80 It was originally surmised that the hydrophobic contributions to ∆Hu and ∆Su approach zero at their respective “convergence temperatures”,80 so that the convergence behavior could be used to determine the contributions of fundamental interactions to the stability of globular proteins. Thus the value of ∆Hu at TH* (∆H*) could be attributed to polar and van der Waals interactions, while the value of ∆Su at TS* (∆S*) could be attributed primarily to configurational entropy. The convergence behavior has been the source of significant interest and speculation since that time.13,27,33,34,38,81-85

In 1986, Baldwin noted that TS* for proteins occurs at the same temperature at which the ∆S° of dissolution for hydrophobic liquids extrapolates to zero,13 suggesting that at TS* the hydrophobic contribution to ∆Su was zero. Subsequently, it was shown that the convergence behavior could be analyzed by plotting the normalized ∆Su (or ∆Hu) vs the normalized ∆Cp.33 Consider the standard equation for ∆Su as a function of temperature:

∆Su = ∆S* + ∆Cp ln(T/TS*)    (6)

For a given protein, eq 6 describes the ∆Su as a function of temperature. However, for a set of proteins a plot of ∆Su vs ∆Cp will have a slope of ln(T/TS*) and an intercept of ∆S*.33

Similarly, for ∆Hu we have

∆Hu = ∆H* + ∆Cp(T - TH*)    (7)

Figure 4. Correlation of the residue normalized ∆Hu (a) and ∆Su (b) with ∆Cp at 25 °C. The solid lines are the linear regression. The slopes are related to the convergence temperatures, TH* and TS*, and the intercepts give the convergence values ∆H* and ∆S*. For ∆Hu, the slope, intercept, and R² values are -40.9, 3440, and 0.362, corresponding to a TH* value of 65.9 °C. The dotted line assumes the previously determined value of TH* of 100.5 °C. For ∆Su the slope, intercept, and R² values are -0.126, 10.1, and 0.330 corresponding to a TS* value of 65.0 °C. The dotted line assumes the previously determined value of TS* of 112 °C.
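
Equations 6 and 7 can be exercised numerically. The Python sketch below is a minimal illustration under stated assumptions, not the authors' code. It converts the Figure 4 regression slopes into convergence temperatures (slope = T - TH* for ∆Hu, slope = ln(T/TS*) for ∆Su, with T = 25 °C) and applies the same two equations to shift the BPTI entries of Table 2 from 60 °C to the reference temperatures. All function names are hypothetical. The units are assumed to be kJ K⁻¹ mol⁻¹ for ∆Cp, kJ mol⁻¹ for ∆H, and J K⁻¹ mol⁻¹ for ∆S, since Table 2 only refers to Table 1 for units. The 100.5 °C reference for ∆H* is taken from the Figure 4 caption, whereas the Table 2 footnote quotes 100 °C.

```python
import math

KELVIN_OFFSET = 273.15  # converts degrees Celsius to kelvin


def enthalpy_at(dH_ref, dCp, t_ref_c, t_c):
    """Eq 7 form: dH(T) = dH(T_ref) + dCp * (T - T_ref); dCp assumed temperature independent."""
    return dH_ref + dCp * (t_c - t_ref_c)


def entropy_at(dS_ref, dCp, t_ref_c, t_c):
    """Eq 6 form: dS(T) = dS(T_ref) + dCp * ln(T / T_ref), with temperatures in kelvin."""
    return dS_ref + dCp * math.log((t_c + KELVIN_OFFSET) / (t_ref_c + KELVIN_OFFSET))


def th_star_from_slope(slope, t_c=25.0):
    """Slope of dHu vs dCp equals (T - TH*), so TH* = T - slope (degrees Celsius)."""
    return t_c - slope


def ts_star_from_slope(slope, t_c=25.0):
    """Slope of dSu vs dCp equals ln(T / TS*), so TS* = T / exp(slope); returned in degrees Celsius."""
    return (t_c + KELVIN_OFFSET) / math.exp(slope) - KELVIN_OFFSET


if __name__ == "__main__":
    # Regression slopes quoted in the Figure 4 caption.
    print("TH* from slope -40.9:", round(th_star_from_slope(-40.9), 1), "C")
    print("TS* from slope -0.126:", round(ts_star_from_slope(-0.126), 1), "C")
    # BPTI row of Table 2: dCp = 2.0, dH(60) = 229, dS(60) = 592 (units assumed as noted above).
    print("BPTI dH at 100.5 C:", round(enthalpy_at(229.0, 2.0, 60.0, 100.5), 1))
    print("BPTI dS at 112 C:", round(entropy_at(592.0, 2.0 * 1000.0, 60.0, 112.0), 1))
```

The ∆Cp value is multiplied by 1000 in the entropy call so that it is in J K⁻¹ mol⁻¹, matching the assumed units of ∆S. With these assumptions the BPTI values land close to the ∆H* and ∆S* entries listed for BPTI in Table 2.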
Again, eq 7 describes the temperature dependence of ∆Hu for a single protein. A plot of ∆Hu vs ∆Cp for a set of proteins which show convergence will have a slope of (T - TH*) and an intercept of ∆H*.

Using these plots, TS* was found to be the same for transfer of hydrophobic compounds from the gas, liquid, and solid phases, as well as for protein unfolding.33 This confirmed the observation of Baldwin that the hydrophobic contribution to ∆Su approached zero at TS*. By analogy, it was argued that the hydrophobic contribution to ∆Hu also approached zero at TH*.33

Convergence behavior has also been seen for the aqueous dissolution of a homologous series of model compounds.79,85 From these, the requirements for observing convergence behavior also have been clarified.34,79,82 Convergence will be observed for a set of compounds if the following conditions hold: (1) the series is homologous, i.e., one functional group is constant throughout the series of compounds while another functional group varies (e.g., the normal alcohols); and (2) the contributions of the functional groups are independent and additive. If these two conditions are met, then both ∆H and ∆S for the entire set will converge to common values at some temperature. Furthermore, if the series is variable in the number of methylene groups, then convergence occurs at the temperature where the methylene contributions (i.e., the hydrophobic contribution) to the enthalpy or entropy are zero. The convergence value, ∆H* or ∆S*, is the contribution of the invariant group to ∆H or ∆S at the convergence temperature.79,85 Consequently, if convergence is observed, the thermodynamics can be parsed into two groups, those arising from apolar interactions and those arising from other contributions.34,79

The globular proteins appear to represent a homologous series of compounds when the surface areas are normalized per residue, as indicated in Figure 2b above. The polar surface area buried per residue is essentially constant with increasing protein size, whereas the apolar surface area buried per residue increases. However, the average value of ∆Apol per residue is 38.3 ± 3.9 Å² res⁻¹ while the average value of ∆Aap per residue is 53.6 ± 5.5 Å² res⁻¹. Looking at the standard deviations, the variability in buried apolar area is only slightly greater than that for polar surface area.

To determine if the proteins in Table 1 exhibit convergence behavior, plots of ∆Hu and ∆Su at 25 °C vs ∆Cp, all normalized per residue, were constructed (Figure 4). The solid line represents the linear regression of the data. The linear regression of the ∆Hu data gives a slope of -40.9 and an intercept of 3440 with R² = 0.36. This corresponds to TH* equal to 65.9 °C and ∆H* equal to 3.44 kJ (mol res)⁻¹. The dotted line corresponds to the analysis of a smaller data set by Murphy and Gill34 with TH* = 100.5 °C and ∆H* = 5.64 kJ (mol res)⁻¹. The set of Spolar et

surface and 131 J (mol Å2)-1 for polar surface, in


reasonable agreement with the results of our larger
data set. Values of -21.5 J (mol Å2)-1 for apolar
surface and 205 J (mol Å2)-1 are calculated from data
on the dissolution of cyclic dipeptides.12,79
As discussed above, the apolar and polar contribu-
tions to ∆Hu have also been estimated from the
convergence temperatures.33,80,85 In this analysis, it
is assumed that the apolar contribution to ∆Hu is zero
at TH*. The ∆Hu observed at that temperature, ∆H*,
can then be normalized to the change in polar surface
area to give the polar contribution to ∆Hu.
From the data set compiled by Privalov and Gill,38
Murphy and Gill determined a TH* of 100.5 °C at
which ∆H* equals 5.64 kJ (mol res)-1.34 This corre-
sponds to 146 J (mol Å2)-1 when normalized to
surface area.35 From this and the ∆Cp values used
in the convergence model35 the calculated contribu-
tions at 60 °C are -76 J (mol Å2)-1 for apolar surface
and 190 J (mol Å2)-1 for polar surface.
The current data set yields a TH* of 65.9 °C.
However, if the ∆Hu values are extrapolated to the
original TH* of 100.5 °C, the correlation between ∆Hu
and Nres is much improved (Figure 5b), with a value
al.27 gives TH* = 84 °C and ∆H* = 4.62 kJ (mol res)⁻¹.

The linear regression of the ∆Su data gives a slope of -0.126 and an intercept of 10.1 with R² = 0.33. This corresponds to TS* = 64.9 °C and ∆S* = 10.1 J K⁻¹ (mol res)⁻¹. The dotted line again corresponds to the analysis of Murphy and Gill34 with TS* = 112 °C and ∆S* = 18.1 J K⁻¹ (mol res)⁻¹.

Overall, the convergence behavior for this larger protein data set is not very compelling. This is not surprising given the above discussion. The correlation coefficients suggest that only about 30% of the variation in ∆Hu or ∆Su of unfolding at 25 °C can be accounted for by variation in ∆Cp. In contrast, regression coefficients from the data set of 12 proteins originally analyzed by Murphy and Gill suggest that the variation in ∆Cp accounts for >90% of the variation in ∆Hu and ∆Su of unfolding at 25 °C.

4. Enthalpy of Unfolding

The ∆Hu at 60 °C can also be treated as a function of Nres, ∆Atot, or ∆Aap and ∆Apol. Regression on Nres (Figure 5a) yields 2.92 ± 0.08 kJ (mol res)⁻¹ with R² = 0.766, while regression on ∆Atot yields 30.2 ± 0.9 J (mol Å²)⁻¹ with R² = 0.735 (Table 5).

As with ∆Cp, results of model compound studies show that apolar and polar ASA make different contributions to ∆Hu.27,30,35,79,86,87 Regression of the ∆Hu of unfolding at 60 °C on ∆Aap and ∆Apol yields values of -8 ± 11 and 86 ± 17 J (mol Å²)⁻¹ respectively with R² = 0.775. While the regression in terms of both ∆Aap and ∆Apol is statistically better than for ∆Atot, the confidence in the regressed parameters is much less.

Analysis of a smaller protein data set by Xie and Freire88 gave values of -35.3 J (mol Å²)⁻¹ for apolar

of 5.28 ± 0.09 kJ (mol res)⁻¹ and R² = 0.919. Thus, even though convergence behavior is not very evident in the protein data set, the value of ∆Hu is well predicted at the original TH* of 100.5 °C.

As with ∆Cp we find that all of the analyses give values for the average ∆Hu per unit surface area at 60 °C, when weighted by the percentage of apolar and polar surface, that are very similar. The values from the “convergence temperature” analysis give an average ∆Hu of 34.9 J (mol Å²)⁻¹; the values of Xie and Freire give 34.0 J (mol Å²)⁻¹; and the values from the current analysis give 31.2 J (mol Å²)⁻¹. These compare well with the value of 30.2 J (mol Å²)⁻¹ obtained when the current data set is regressed against ∆Atot. Overall, this comparison again illustrates the difficulty of obtaining precise coefficients from the protein data alone.

5. Entropy of Unfolding

Both ∆Cp and ∆Hu are expected to scale with changes in accessible surface area because these quantities result primarily from changes in solvation and changes in noncovalent interactions. The entropy change, on the other hand, includes additional contributions from changes in the configurational entropy of side chains and backbone upon unfolding.

Regression of the ∆Su at 60 °C on the number of residues (Figure 6a) gives a value of 8.8 ± 3 J K⁻¹ (mol res)⁻¹ with R² = 0.744. The correlation of ∆Su at 60 °C with the total buried surface area is somewhat poorer with a value of 0.091 ± 0.003 J K⁻¹ (mol Å²)⁻¹ and R² = 0.716. The best correlation of ∆Su at 60 °C is with both ∆Aap and ∆Apol, giving values of -0.03 ± 0.04 J K⁻¹ (mol Å²)⁻¹ and 0.27 ± 0.06 J K⁻¹ (mol Å²)⁻¹ with R² = 0.757 (Table 5).

Upon the basis of model compound data, one would expect the apolar contribution to ∆Su at 60 °C to be about -0.2 to -0.3 J K⁻¹ (mol Å²)⁻¹. This value is based on the typical ∆Cp and TS* = 112 °C. The magnitude of the regressed value is significantly less than this. The estimated polar contribution to ∆Su is small85,89 or negative,6 but the regressed value is

Figure 5. Correlation of ∆Hu with the number of residues at 60 °C (a) and 100.5 °C (b). The lines are the linear regressions. The slope, intercept, and R² values are 2.53, 63.2, and 0.789 at 60 °C, and 5.03, 41.6, and 0.922 at 100.5 °C.
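
The convergence temperatures quoted in this section can be recovered from the regression slopes and intercepts. The following is a minimal sketch, not the authors' code. It assumes the usual constant-∆Cp forms ∆Su(T) = ∆S* + ∆Cp ln(T/TS*) and ∆Hu(T) = ∆H* + ∆Cp (T - TH*), with both quantities regressed against ∆Cp at a reference temperature of 25 °C. The function names are hypothetical.

```python
import math

T_REF_C = 25.0               # reference temperature of the regression, in °C
T_REF_K = T_REF_C + 273.15   # the same temperature in kelvin


def entropy_convergence(slope, intercept):
    """Return (TS* in °C, ∆S*) from a regression of ∆Su(25 °C) on ∆Cp.

    Assumed model: ∆Su(T) = ∆S* + ∆Cp * ln(T / TS*), so the slope equals
    ln(T_ref / TS*) and the intercept equals ∆S*.
    """
    ts_star_k = T_REF_K / math.exp(slope)
    return ts_star_k - 273.15, intercept


def enthalpy_convergence(slope, intercept):
    """Return (TH* in °C, ∆H*) from a regression of ∆Hu(25 °C) on ∆Cp.

    Assumed model: ∆Hu(T) = ∆H* + ∆Cp * (T - TH*), so the slope (in K)
    equals T_ref - TH* and the intercept equals ∆H*.
    """
    return T_REF_C - slope, intercept


# Slope and intercept of the ∆Su regression reported in the text.
ts_star_c, ds_star = entropy_convergence(-0.126, 10.1)
print(f"TS* is about {ts_star_c:.1f} °C, ∆S* = {ds_star} J K^-1 (mol res)^-1")
```

With the rounded slope of -0.126 this gives a convergence temperature close to the 64.9 °C quoted in the text; the small difference comes from rounding of the reported slope.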
1262 Chemical Reviews, 1997, Vol. 97, No. 5 Robertson and Murphy

Table 6. Comparison of Calculated and Experimental Values of ∆Cp (a)
protein  calculated ∆Cp  % error
α-chymotrypsin  13.6  6.6
α-chymotrypsinogen  14.0  -3.4
α-lactalbumin  7.1  -5.2
α-lactalbumin  7.0  -7.9
acyl carrier protein (apo)  3.7  10.6
acyl carrier protein (holo)  3.7  -42.6
arabinose binding protein  19.1  44.6
arc repressor  6.1  -8.3
B1 of protein G  2.8  8.4
B2 of protein G  3.1  6.4
barnase  6.4  10.3
barnase  6.3  -7.2
barstar  5.1  -18.9
BPTI  2.8  41.4
carbonic anhydrase B  16.1  0.7
CI2  3.4  34.8
cyt b5 (tryp frag)  4.5  -24.8
cytochrome c (horse)  5.8  15.1
cytochrome c (horse)  5.8  7.8
cytochrome c (yeast iso 1)  5.9  3.5
cytochrome c (yeast iso 1)  5.9  13.4
cytochrome c (yeast iso 2)  6.0  15.8
GCN4  3.2  8.5
HPr  4.6  -5.3
IL-1β  8.5  5.8
lac repressor headpiece  2.4  82.2
lysozyme (human)  7.8  9.0
lysozyme (human)  7.8  18.1
lysozyme (apo equine)  7.5  18.6
lysozyme (holo equine)  7.7  1.3
lysozyme (hen)  7.7  4.0
lysozyme (hen)  7.5  16.9
lysozyme (hen)  7.5  11.7
lysozyme T4  9.9  -1.6
met repressor  12.4  39.1
myoglobin (horse)  8.7  14.3
myoglobin (whale)  9.0  -42.6
myoglobin (whale)  9.0  2.4
OMTKY3  2.4  -5.6
papain  13.2  -3.8
parvalbumin  5.9  5.5
pepsin  19.0  0.6
pepsinogen  22.5  -6.8
plasminogen K4 domain  4.4  -16.4
RNase T1  5.4  10.8
RNase T1  5.4  11.1
RNaseA  6.8  4.0
RNaseA  6.8  42.2
RNaseA  6.8  41.9
ROP  7.8  -24.0
Sac7d  3.6  -1.1
SH3 spectrin  3.2  -1.7
Staphylococcus nuclease  8.0  -13.5
stefin A  5.3  -28.4
stefin B  5.3  -21.2
subtilisin inhibitor  5.2  -38.8
subtilisin BPN′  15.7  -21.7
tendamistat  3.7  28.3
thioredoxin  5.9  -14.8
thioredoxin  5.9  -19.5
trp repressor (b)  6.2  1.9
ubiquitin  4.1  22.1
(a) ∆Cp (kJ K⁻¹ mol⁻¹) values were calculated as a function of Nres using the regression coefficients listed in Table 5. Errors are calculated in comparison to the experimental values in Table 2 as 100 × (calculated - experimental)/experimental.
(b) Using the 2WRP structure.

Figure 6. Correlation of ∆Su with the number of residues at 60 °C (a) and 112 °C (b). The lines are the linear regressions. The slope, intercept, and R² values are 7.8, 162, and 0.759 at 60 °C, and 16.8, 85, and 0.920 at 112 °C.

large and positive. The discrepancies between the regressed values for the ∆Su contributions of apolar and polar surface and the expectation from model compound and theoretical studies are due to the neglect of the configurational entropy in the regression analysis. If a regression is performed against the number of residues and the changes in apolar and polar surface areas, the resulting coefficients are 9.2 ± 4.6 J K⁻¹ (mol res)⁻¹, -0.11 ± 0.5 J K⁻¹ (mol Å²)⁻¹, and 0.15 ± 0.08 J K⁻¹ (mol Å²)⁻¹ respectively with R² = 0.771. These values are in much better agreement with expectation.

As with ∆Hu, the convergence behavior of ∆Su for this set of proteins is quite weak, but gives a convergence temperature of 64.9 °C. However, if the values of ∆Su are extrapolated to the more commonly observed value of TS* = 112 °C, a much improved correlation with the number of residues is again observed (Figure 6b), giving 17.3 ± 0.3 J K⁻¹ (mol res)⁻¹ with R² = 0.919. This is very similar to the value of 18.1 J K⁻¹ (mol res)⁻¹ observed in the smaller data set.33

6. Comparison of Regressed and Experimental Values

Coefficients obtained from the regressions described above (Table 5) can be used to calculate the thermodynamics of unfolding, which can then be compared to the experimental values. The regression parameters describe the average thermodynamics of these proteins, so comparisons of the calculated and experimental values provide some information on how much a protein deviates from the average behavior of globular proteins.

The ∆Cp values are calculated using the parameter for ∆Atot given in Table 5. The calculated values and the percentage error are given in Table 6. In this and subsequent tables, the structural data are taken from Table 3, and the calculations are compared to the experimental values in Table 2. In cases where there are multiple thermodynamic entries in Table 2, but a single structural entry in Table 3, the

calculated values are compared to each of the experimental entries. In those cases where multiple structure and thermodynamic entries are given, the comparison is made between structures and thermodynamics in the same order in which they are given in Tables 2 and 3. The percentage error is calculated as 100 × (calculated - experimental)/experimental. The average error in calculating ∆Cp is 4 ± 22%. The average is expected to be small as overpredictions and underpredictions cancel each other, but the standard deviation indicates that the error in the prediction is larger than the estimated experimental error.

The ∆Hu values at 60 °C are calculated using the parameters for both ∆Aap and ∆Apol given in Table 5 and are summarized in Table 7. The average error is again small, -2.8 ± 22%, but the standard deviation is large. Table 7 also lists the error in calculating ∆H* (at TH* = 100 °C) which has an average error of 2 ± 16%. Thus, as evident in the regression coefficients, ∆H* is better predicted than ∆Hu at 60 °C.

Finally, the ∆Su values at 60 °C are calculated using the parameters for Nres, ∆Aap, and ∆Apol given in Table 5 and are summarized in Table 8. The average error is 5 ± 26%. The calculated values of ∆S* (at TS* = 112 °C) are also given in the table and have an average error of 2 ± 17%. Again, the values of ∆S* are better predicted than the values of ∆Su at 60 °C.

One possible explanation for error in the predictions is deviations from the mean structural characteristics of the proteins. For example, greater numbers of disulfide bonds are expected to lead to decreases in ∆Su, so that one might expect ∆Su to be overpredicted for proteins with a greater than average number of disulfides. In fact, no such correlation is seen between the number of disulfides and either ∆Su at 60 °C or ∆S* (Figure 7). The correlation coefficients, R², are less than 0.05 for both cases. In fact, no correlation of the error in either ∆Su at 60 °C or ∆S* is observed with any of the structural features considered here, including the fraction of the buried surface area which is polar or apolar, the percentage of the residues in any secondary structure type (i.e., α-helix, β-sheet, β-turn, or the sum of all three), and the number of residues. There is also no correlation with the experimental parameters such as pH or Tm.

The same lack of correlation of the error in prediction with any structural or experimental features is observed for ∆Hu at 60 °C, ∆H*, and ∆Cp. It is somewhat surprising that no correlation of error in predicting ∆Hu is found with the percentage of residues in any secondary structural type as such a correlation has previously been noted for a smaller data set.90 In fact, the only significant correlation we have observed is between the error in ∆Hu and the error in ∆Su. This is illustrated in Figure 8a. The line is the linear least-squares fit which has a slope of 1 and an intercept of 7 with R² = 0.756. This correlation is even more evident between the error in ∆H* and the error in ∆S*, as seen in Figure 8b in which the slope is 1, the intercept is 0.4 and R² = 0.926.

IV. Summary

What conclusions can be drawn from the regression analyses? The first conclusion is that, from a purely empirical standpoint, the primary determinant of

Table 7. Comparison of Calculated and Experimental Values of ∆Hu (a)
name of protein  ∆Hu (60 °C)  error, %  ∆H*  error, %
α-chymotrypsin  640  -9.7  1252  2.2
α-chymotrypsinogen  680  15.3  1294  9.9
α-lactalbumin  353  35.9  650  15.3
α-lactalbumin  363  -9.1  645  -9.0
acyl carrier protein (apo)  212  14.9  407  27.2
acyl carrier protein (holo)  212  -10.9  407  -18.5
arabinose binding protein  901  5.6  1611  16.1
arc repressor  358  6.1  560  -7.9
B1 of protein G  147  -21.2  296  1.4
B2 of protein G  160  -12.1  296  -1.1
barnase  326  -38.3  571  -25.1
barnase  322  -45.3  576  -33.4
barstar  202  -12.1  470  -2.6
BPTI  148  -35.4  306  -1.2
carbonic anhydrase B  792  9.2  1352  -1.4
CI2  164  -33.3  338  -2.6
cyt b5 (tryp frag)  235  -13.6  465  -9.7
cytochrome c (horse)  283  -28.0  549  -7.8
cytochrome c (horse)  283  -7.8  549  5.0
cytochrome c (yeast iso 1)  308  -20.3  571  -7.6
cytochrome c (yeast iso 1)  308  -1.3  571  9.1
cytochrome c (yeast iso 2)  330  6.1  592  13.5
GCN4  181  -21.0  328  -6.3
HPr  227  24.2  460  21.2
IL-1β  378  -7.1  808  10.6
lac repressor headpiece  122  9.8  269  64.1
lysozyme (human)  423  -2.6  687  -5.1
lysozyme (human)  423  -4.9  687  -3.5
lysozyme (apo equine)  425  5.9  682  -3.9
lysozyme (holo equine)  425  17.7  682  3.1
lysozyme (hen)  405  -5.1  682  0.0
lysozyme (hen)  405  -1.0  682  2.1
lysozyme (hen)  405  -12.5  682  -7.1
lysozyme T4  505  -15.3  866  -13.7
met repressor  642  13.4  1099  18.4
myoglobin (horse)  408  3.7  808  15.0
myoglobin (whale)  443  -0.7  808  -25.1
myoglobin (whale)  420  5.3  808  7.2
OMTKY3  145  -17.0  296  5.7
papain  650  12.5  1120  -1.1
parvalbumin  302  -9.2  571  2.1
pepsin  861  -19.5  1722  -6.0
pepsinogen  1059  7.0  1928  -1.9
plasminogen K4 domain  265  -13.0  412  -20.1
RNase T1  292  -30.4  549  -10.8
RNase T1  290  -42.3  549  -21.4
RNaseA  427  12.8  655  1.6
RNaseA  427  -7.5  655  -0.2
RNaseA  427  -4.6  655  1.9
ROP  534  14.4  666  -24.7
Sac7d  191  58.9  349  31.3
SH3 spectrin  147  -17.2  301  -2.6
Staphylococcus nuclease  385  -1.9  718  -6.3
stefin A  274  12.0  518  -5.0
stefin B  263  -26.7  502  -20.3
subtilisin inhibitor  270  -31.8  565  -23.4
subtilisin BPN′  769  92.5  1453  19.7
tendamistat  215  1.4  391  18.9
thioredoxin  250  12.8  571  13.3
thioredoxin  250  0.4  571  4.2
trp repressor  309  17.4  555  8.8
ubiquitin  193  -7.1  402  17.1
(a) ∆Hu (kJ mol⁻¹) at 60 °C was calculated as a function of ∆Aap and ∆Apol, and ∆H* (i.e., ∆H at 100 °C) was calculated as a function of Nres, using the regression coefficients in Table 5. Errors are calculated in comparison to the experimental values in Table 2.
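
The percentage errors in Tables 6-8 and the "average ± standard deviation" summaries quoted in the text follow a simple recipe. The sketch below illustrates it under stated assumptions: the function names are hypothetical, the calculated/experimental pairs are made-up placeholders rather than values from Table 2, and the sample standard deviation is assumed because the text does not say which form was used.

```python
from statistics import mean, stdev


def percent_error(calculated, experimental):
    """Percentage error as defined in the text:
    100 x (calculated - experimental) / experimental."""
    return 100.0 * (calculated - experimental) / experimental


def summarize(errors):
    """Return (mean, sample standard deviation) of a list of percentage errors."""
    return mean(errors), stdev(errors)


# Hypothetical (calculated, experimental) pairs, for illustration only.
pairs = [(110.0, 100.0), (90.0, 100.0), (210.0, 200.0)]
errors = [percent_error(calc, exp) for calc, exp in pairs]
avg, sd = summarize(errors)
print(f"average error = {avg:.1f} ± {sd:.1f}%")
```

Because overpredictions and underpredictions cancel in the mean, the standard deviation is the more informative measure of how well the regression predicts an individual protein.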

Table 8. Comparison of Calculated and Experimental ∆Su (a)
name of protein  ∆Su (60 °C)  error, %  ∆S*  error, %
α-chymotrypsin  1985  -22.9  4095  -7.4
α-chymotrypsinogen  2108  19.8  4233  9.6
α-lactalbumin  1079  31.0  2125  11.2
α-lactalbumin  1111  -14.0  2108  -12.1
acyl carrier protein (apo)  759  34.1  1330  26.7
acyl carrier protein (holo)  759  7.6  1330  -18.8
arabinose binding protein  2537  -1.2  5270  17.6
arc repressor  1074  4.3  1831  -8.4
B1 of protein G  513  0.8  968  9.2
B2 of protein G  510  -0.2  968  3.9
barnase  973  -39.5  1866  -23.7
barnase  981  -45.5  1883  -32.4
barstar  650  -2.8  1538  -2.2
BPTI  533  -10.1  1002  13.6
carbonic anhydrase B  2220  0.1  4423  -2.4
CI2  554  -21.6  1106  3.4
cyt b5 (tryp frag)  806  2.0  1520  -8.4
cytochrome c (horse)  991  -15.7  1797  -5.7
cytochrome c (horse)  907  -23.2  1797  5.9
cytochrome c (yeast iso 1)  907  -1.6  1866  -6.8
cytochrome c (yeast iso 1)  991  4.6  1866  9.6
cytochrome c (yeast iso 2)  1068  12.8  1935  13.7
GCN4  606  -9.3  1071  -2.4
HPr  763  45.7  1503  22.5
IL-1β  1230  -1.3  2644  9.9
lac repressor headpiece  816  -10.2  881  70.1
lysozyme (human)  464  40.8  2246  -0.3
lysozyme (human)  1235  1.6  2246  -0.4
lysozyme (apo equine)  1235  -4.8  2229  -17.8
lysozyme (holo equine)  1223  -4.5  2229  -11.7
lysozyme (hen)  1247  -22.5  2229  1.6
lysozyme (hen)  1247  -14.1  2229  4.2
lysozyme (hen)  1223  0.9  2229  -6.3
lysozyme T4  1223  -13.1  2834  -14.1
met repressor  1465  -20.0  3594  18.7
myoglobin (horse)  1888  9.0  2644  15.9
myoglobin (whale)  1276  20.8  2644  -23.8
myoglobin (whale)  1337  10.7  2644  10.8
OMTKY3  1271  14.1  968  12.9
papain  561  16.6  3663  2.5
parvalbumin  1841  16.0  1866  9.4
pepsin  972  8.7  5633  -4.7
pepsinogen  2641  -16.9  6306  -1.6
plasminogen K4 domain  3038  4.3  1348  -19.1
RNase T1  1331  16.7  1797  -13.6
RNase T1  984  -28.5  1797  -18.6
RNaseA  973  -35.2  2142  2.4
RNaseA  1331  2.2  2142  7.2
RNaseA  1331  -0.5  2142  5.3
ROP  1496  10.8  2177  -23.5
Sac7d  620  96.1  1140  36.3
SH3 spectrin  469  -10.4  985  -0.9
Staphylococcus nuclease  1157  -3.4  2350  -7.5
stefin A  893  38.3  1693  -1.5
stefin B  836  -24.4  1641  -21.0
subtilisin inhibitor  2381  97.6  1849  -24.4
subtilisin BPN′  981  -19.5  4751  15.3
tendamistat  736  30.2  1279  29.8
thioredoxin  831  39.4  1866  16.3
thioredoxin  831  23.5  1866  7.2
trp repressor  920  31.2  1814  14.4
ubiquitin  645  15.0  1313  25.8
(a) ∆Su (J K⁻¹ mol⁻¹) at 60 °C was calculated as a function of ∆Aap and ∆Apol, and ∆S* (i.e., ∆Su at 112 °C) was calculated as a function of Nres, using the regression coefficients in Table 5. Errors are calculated in comparison to the experimental values in Table 2.

Figure 7. Correlation of the percentage error in calculating ∆Su at 112 °C (a) and 60 °C (b) with the number of disulfide bonds in the protein. The lines are the linear regressions. The slope, intercept, and R² are -1.5, 4.0, and 0.021 at 112 °C, and -3.0, 8.8, and 0.032 at 60 °C. There is no improvement in the correlation if the proteins with zero disulfides are excluded from analysis.

Figure 8. Correlation of the percentage error in calculating ∆Su and ∆Hu. (a) At 60 °C, the line is the linear regression with slope, intercept, and R² of 1.0, 7.3, and 0.756. (b) ∆Su at 112 °C and ∆Hu at 100.5 °C, the line is the linear regression. The slope, intercept, and R² are 1.0, 0.4, and 0.927.

protein unfolding thermodynamics is the size of the protein, although the number of residues by itself does not give the best regression. The regression analysis indicates also that at least 75% of the variation in unfolding energetics can be accounted for by the variation in simple structural features such as buried surface area. Thus, this simple approach

captures most of the important features which determine protein energetics. Further evidence in favor of analyses based on surface areas is the recent work of Hilser and Freire:91 calculations of protein energetics based on surface areas were successfully used to predict amide hydrogen exchange behavior in a number of proteins.

It is also interesting to note that the thermodynamics at the convergence temperatures observed previously in a much smaller set of proteins are better predicted than the thermodynamics at 60 °C, even though the current set of proteins does not convincingly show convergence behavior. This is perhaps not surprising for the entropy, as the value of TS* seems to be fairly universal.33 It is not clear why it should also occur for TH*.

Calculated values of ∆Cp range from 57% to 182% of the experimental values (Table 6). The range for ∆Hu at 60 °C is 55% to 159% of the experimental values while that for ∆H* is 67% to 164% (Table 7). Similar distributions are observed in the differences between calculated and experimental values of ∆Sm (Table 8). The extent to which calculated values of ∆Hm and ∆Sm are under- or overestimated relative to experimental values is highly correlated, which probably reflects the fact that experimental ∆Hu values are used to calculate ∆Su values: experimental errors in ∆Hu are thus manifested in the relative errors in ∆Su. Interestingly, distributions of differences between calculated and experimental values are broader, ±16-27% at one standard deviation, than might be anticipated on the basis of experimental error alone, which is about 10% on average (7% for ∆Cp, 12% for ∆H(60), and 15% for ∆S(60) as noted above in section II.C).

A likely explanation for the broad distribution in the differences between the calculated and observed parameters is inaccuracies in the model used in the regression analysis, which is based primarily on surface area differences. Moreover, the calculations rely on convergence temperatures and the protein data show considerable scatter in this regard (Figure 6). Overall, empirical correlations of energetics with “regular” features of protein structure give rise to errors that appear to exceed experimental error. The model is thus either inappropriate or incomplete. Because the model based on surface areas appears to capture much, but not all, of the relationship between protein structure and the energetics of protein stability, the simplest explanation is that the model is incomplete. Inclusion of information about secondary structure and disulfide bonds provides no insight into the origin of discrepancies between calculated and observed energetics.

What is missed when the energetics of protein stability are decomposed in terms of changes in solvent-exposed surface areas? Some possible answers to this question are (1) nonadditivity of energetic contributions from the various groups that make up polar and nonpolar surfaces, (2) long-range interactions in proteins, and (3) heterogeneity in the extent to which the denatured states for different proteins are exposed to solvent. The possibility of nonadditivity in protein energetics is the subject of considerable discussion.7,92,93 The principle of additivity is that the observed thermodynamics of protein stability result from a simple sum of independent contributions from individual interactions. Deconvolution of protein stability in terms of polar and nonpolar surface areas is predicated on the assumption that the contributions from such surfaces are linear functions of surface area. Nonadditivity may well contribute to the scatter in the calculated vs observed energetics, but no straightforward approach is available yet for evaluating its role.

Electrostatic interactions are the principal long-range interactions in proteins. With a database consisting of many different proteins, differences in the extent to which electrostatic interactions contribute to stability in different proteins are going to contribute to the error in parameters derived from regression analysis of the database.

No direct experimental data are available for assessing the amount of new surface area that is exposed when a protein unfolds. Analyses of protein stability with respect to solvent-exposed surface areas typically rely on the assumption that solvent exposure in the denatured state is modeled accurately by an extended polypeptide chain or by summing calculated surface areas for tripeptides.26,94 The use of different algorithms for these calculations leads to significant differences in surface areas, but use of a single algorithm in deconvoluting energetics in terms of structure is only expected to lead to systematic deviations with respect to results obtained using other algorithms.69 If, however, proteins differ in the extent to which their denatured states are exposed to solvent, then considerable error will be introduced into the analysis regardless of the algorithm used to calculate surface area.

A number of investigators have argued that the denatured state is not accurately modeled by an extended or random-coil polypeptide chain.41,42,95-99 Moreover, the extent of solvent exposure is proposed to be sensitive to solution conditions, so no one value for solvent-exposed surface area in the denatured state is applicable to any protein. Is the proposed heterogeneity in the extent of unfolding a reasonable explanation for the disagreement between calculated and experimental values for the energetics of protein stability?

One intriguing observation in this regard is the underestimated ∆Hu and ∆Su values for barnase and RNase T1 (Table 7); these two proteins fall at the extreme low end for both parameters. Confidence in the experimental determinations is high because at least two independent determinations have been made for each protein. Interestingly, barnase and RNase T1 have very similar three-dimensional structures in spite of the fact that their amino acid sequences are only 14% identical. Finally, the extent of unfolding in the denatured state of barnase appears to be high relative to other proteins.100

If the extent of solvent exposure for the denatured states of barnase and RNase T1 is indeed greater than the average for all proteins in the database then one would expect that, for barnase and RNase T1, the thermodynamic parameters calculated from the mean behavior for all proteins would be lower than the true values, as is observed (Table 7). However, this behavior is not observed for ∆Cp (Table 6). In addition, the extent of unfolding for RNase T1 appears to be close to that for other proteins.95,100 Nevertheless, at least some of the experimental data tabulated here and presented elsewhere are consistent with variability in the extent of unfolding for different proteins.

The possibility that the denatured state of barnase is more unfolded than the average protein suggests that the average extent of unfolding for all proteins is overestimated with the current algorithms. A low estimate for the average extent of unfolding can be obtained by using barnase as a reference for denatured protein that is completely exposed to solvent. In conjunction with the observation that calculated ∆Hu and ∆Su values are ≤75% of the predicted values (Tables 7 and 8), this suggests that the average extent of unfolding is ≤75% of the values calculated with the model-based algorithms. This value is similar to those suggested by Lee82 and Brandts101 and is consistent with the conclusions of a recent computational study,69 where alternative models for the denatured state yielded surface areas that averaged about 80% of the values obtained with tripeptides.

An important conclusion from this analysis is that additional refinement of the calculations and a molecular interpretation of the regression coefficients are unlikely to come from the protein data themselves. The inability to obtain unique coefficients which relate structural features to unfolding energetics may reflect variability in the quality of the data or variability in the validity of the assumptions across the data set; the latter appears to be likely. Rather than simply compile additional protein unfolding thermodynamics for a wide variety of proteins, it may be more promising to pursue systematic structural and calorimetric studies of single-site mutations or structurally homologous proteins. The idea here is that the differences between the proteins in such studies would more closely conform to those of a homologous series.

More data concerning the denatured state are essential for progress in understanding the energetics of protein stability. In this regard, calorimetric experiments appear to offer some promise.39,69 Additionally, data on protein-protein interactions,102 in which the structures of both the initial and final states are well determined, will probably provide less ambiguous regression values. Finally, model compound studies will continue to be the principal means by which precise thermodynamic values for specific interactions can be determined. These studies provide a rich framework to guide design and interpretation of the protein studies.

V. Acknowledgments

The authors thank the reviewers and Professor Ken A. Dill for critical reading and helpful comments. We also thank Dr. Wesley Stites for providing a copy of his contribution to this volume prior to publication. The authors are grateful to the National Institutes of Health, National Science Foundation, American Chemical Society-Petroleum Research Fund, and the University of Iowa for support of this work.

VI. References

(1) Anfinsen, C. B. Science 1973, 181, 223.
(2) The Biology of Nonspecific DNA-Protein Interactions; Revzin, A., Ed.; CRC Press: Boca Raton, 1990.
(3) Lattman, E. E.; Rose, G. D. Proc. Natl. Acad. Sci. U.S.A. 1993, 90, 439.
(4) Yue, K.; Dill, K. A. Proc. Natl. Acad. Sci. U.S.A. 1995, 92, 146.
(5) Kauzmann, W. Adv. Protein Chem. 1959, 14, 1.
(6) Makhatadze, G. I.; Privalov, P. L. Adv. Protein Chem. 1995, 47, 307.
(7) Lazaridis, T.; Archontis, G.; Karplus, M. Adv. Protein Chem. 1995, 47, 231.
(8) Honig, B.; Yang, A.-S. Adv. Protein Chem. 1995, 46, 27.
(9) Rose, G. D.; Wolfenden, R. Annu. Rev. Biophys. Biomol. Struct. 1993, 22, 381.
(10) Creighton, T. E. Curr. Opin. Struct. Biol. 1991, 1, 5.
(11) Dill, K. A. Biochemistry 1990, 29, 7133.
(12) Habermann, S. M.; Murphy, K. P. Protein Sci. 1996, 5, 1229.
(13) Baldwin, R. L. Proc. Natl. Acad. Sci. U.S.A. 1986, 83, 8069.
(14) Dill, K. A. Science 1990, 250, 297.
(15) Herzfeld, J. Science 1991, 253, 88.
(16) Privalov, P. L.; Gill, S. J.; Murphy, K. P. Science 1990, 250, 297.
(17) Chothia, C. Nature 1975, 254, 304.
(18) Myers, J. K.; Pace, C. N. Biophys. J. 1996, 71, 2033.
(19) Levitt, M.; Chothia, C. Nature 1976, 261, 552.
(20) Chothia, C. J. Mol. Biol. 1976, 105, 1.
(21) Richardson, J. Adv. Protein Chem. 1981, 34, 167.
(22) Chothia, C. Annu. Rev. Biochem. 1984, 53, 537.
(23) Thornton, J. M. In Protein Folding; Creighton, T. E., Ed.; W. H. Freeman and Co.: New York, 1992; pp 59.
(24) Privalov, P. L. Adv. Protein Chem. 1979, 33, 167.
(25) Spolar, R. S.; Ha, J.-H.; Record, M. T., Jr. Proc. Natl. Acad. Sci. U.S.A. 1989, 86, 8382.
(26) Livingstone, J. R.; Spolar, R. S.; Record, M. T., Jr. Biochemistry 1991, 30, 4237.
(27) Spolar, R. S.; Livingstone, J. R.; Record, M. T., Jr. Biochemistry 1992, 31, 3947.
(28) Nozaki, Y.; Tanford, C. J. Biol. Chem. 1971, 246, 2211.
(29) Privalov, P. L.; Makhatadze, G. I. J. Mol. Biol. 1992, 224, 715.
(30) Makhatadze, G. I.; Privalov, P. L. J. Mol. Biol. 1993, 232, 639.
(31) Privalov, P. L.; Makhatadze, G. I. J. Mol. Biol. 1993, 232, 660.
(32) Murphy, K. P.; Gill, S. J. J. Chem. Thermodyn. 1989, 21, 903.
(33) Murphy, K. P.; Privalov, P. L.; Gill, S. J. Science 1990, 247, 559.
(34) Murphy, K. P.; Gill, S. J. J. Mol. Biol. 1991, 222, 699.
(35) Murphy, K. P.; Freire, E. Adv. Protein Chem. 1992, 43, 313.
(36) Ooi, T.; Oobatake, M.; Némethy, G.; Scheraga, H. A. Proc. Natl. Acad. Sci. U.S.A. 1987, 84, 3086.
(37) Eisenberg, D.; McLachlan, A. D. Nature 1986, 319, 199.
(38) Privalov, P. L.; Gill, S. J. Adv. Protein Chem. 1988, 39, 191.
(39) Privalov, P. L.; Tiktopulo, E. I.; Yenyaminov, S. Y.; Griko, Y. V.; Makhatadze, G. I.; Khechinashvili, N. N. J. Mol. Biol. 1989, 205, 727.
(40) Robertson, A. D.; Baldwin, R. L. Biochemistry 1991, 30, 9907.
(41) Dill, K. A.; Shortle, D. Annu. Rev. Biochem. 1991, 60, 795.
(42) Shortle, D. FASEB J. 1996, 10, 27.
(43) Edsall, J. T. J. Am. Chem. Soc. 1935, 57, 1506.
(44) Madan, B.; Sharp, K. J. Phys. Chem. 1996, 100, 7713.
(45) Gill, S. J.; Dec, S. F.; Olofsson, G.; Wadsö, I. J. Phys. Chem. 1985, 89, 3758.
(46) Privalov, P. L.; Potekhin, S. A. Methods Enzymol. 1986, 131, 4.
(47) Freire, E. In Protein Stability and Folding; Shirley, B. A., Ed.; Humana Press: Totowa, NJ, 1995; Vol. 40, pp 191.
(48) Christensen, J. J.; Hansen, L. D.; Izatt, R. M. Handbook of Proton Ionization Heats and Related Thermodynamic Quantities; John Wiley and Sons: New York, 1976.
(49) Freire, E.; Biltonen, R. L. Biopolymers 1978, 17, 463.
(50) Bowie, J. U.; Sauer, R. T. Biochemistry 1989, 28, 7139.
(51) Swint, L.; Robertson, A. D. Protein Sci. 1993, 2, 2037.
(52) Cohen, D. S.; Pielak, G. J. Protein Sci. 1994, 3, 1253.
(53) Scholtz, J. M. Protein Sci. 1995, 4, 35.
(54) Pace, C. N.; Laurents, D. V. Biochemistry 1989, 28, 2520.
(55) Chen, B.-l.; Schellman, J. A. Biochemistry 1989, 28, 685.
(56) Santoro, M. M.; Bolen, D. W. Biochemistry 1988, 27, 8063.
(57) Kim, P. S.; Baldwin, R. L. Annu. Rev. Biochem. 1982, 51, 459.
(58) Pace, C. N.; Vajdos, F.; Fee, L.; Grimsley, G.; Gray, T. Protein Sci. 1995, 4, 2411.
(59) Becktel, W. J.; Schellman, J. A. Biopolymers 1987, 26, 1859.
(60) Carra, J. H.; Anderson, E. A.; Privalov, P. L. Protein Sci. 1994, 3, 944.
(61) DeKoster, G. T.; Robertson, A. D. Biochemistry 1997, 36, 2323.
(62) Bevington, P. R.; Robinson, D. K. Data Reduction and Error Analysis for the Physical Sciences, 2nd ed.; McGraw-Hill: New York, 1992; pp 328.
(63) Bernstein, F. C.; Koetzle, T. F.; Williams, G. J. B.; Meyer, E. F., Jr.; Brice, M. D.; Rodgers, J. R.; Kennard, O.; Shimanouchi, T.; Tasumi, M. J. Mol. Biol. 1977, 112, 535.
(64) Abola, E. E.; Bernstein, F. C.; Bryant, S. H.; Koetzle, T. F.; Weng, J. In Crystallographic Databases - Information Content, Software Systems, Scientific Applications; Allen, F. H., Bergerhoff, G., Sievers, R., Eds.; Data Commission of the International Union of Crystallography: Bonn/Cambridge/Chester, 1987; p 107.
(65) Lee, B.; Richards, F. M. J. Mol. Biol. 1971, 55, 379.
(66) Russell, R. B.; Barton, G. J. J. Mol. Biol. 1994, 244, 332.

(67) Flores, T. P.; Orengo, C. A.; Moss, D. S.; Thornton, J. M. Protein Sci. 1993, 2, 1811.
(68) Chou, P. Y.; Fasman, G. D. Annu. Rev. Biochem. 1978, 47, 251.
(69) Creamer, T. P.; Srinivasan, R.; Rose, G. D. Biochemistry 1995, 34, 16245.
(70) Colloc’h, N.; Etchebest, C.; Thoreau, E.; Henrissat, B.; Mornon, J.-P. Protein Eng. 1993, 6, 377.
(71) Frishman, D.; Argos, P. Proteins: Struct., Funct., Genet. 1995, 23, 566.
(72) Kabsch, W.; Sander, C. Biopolymers 1983, 22, 2577.
(73) Murphy, K. P.; Bhakuni, V.; Xie, D.; Freire, E. J. Mol. Biol. 1992, 227, 293.
(74) Gill, S. J.; Wadsö, I. Proc. Natl. Acad. Sci. U.S.A. 1976, 73, 2955.
(75) Makhatadze, G. I.; Privalov, P. L. J. Mol. Biol. 1990, 213, 375.
(76) Gómez, J.; Hilser, V. J.; Xie, D.; Freire, E. Proteins 1995, 22, 404.
(77) Myers, J. K.; Pace, C. N.; Scholtz, J. M. Protein Sci. 1995, 4, 2138.
(78) Graziano, G.; Barone, G. J. Am. Chem. Soc. 1996, 118, 1831.
(79) Murphy, K. P.; Gill, S. J. Thermochim. Acta 1990, 172, 11.
(80) Privalov, P. L.; Khechinashvili, N. N. J. Mol. Biol. 1974, 86, 665.
(81) Doig, A. J.; Williams, D. H. Biochemistry 1992, 31, 9371.
(82) Lee, B. K. Proc. Natl. Acad. Sci. U.S.A. 1991, 88, 5154.
(83) Baldwin, R. L.; Muller, N. Proc. Natl. Acad. Sci. U.S.A. 1992, 89, 7110.
(84) Yang, A.-S.; Sharp, K. A.; Honig, B. J. Mol. Biol. 1992, 227, 889.
(85) Murphy, K. P. Biophys. Chem. 1994, 51, 311.
(86) Barone, G.; Del Vecchio, P.; Giancola, C.; Graziano, G. Int. J. Biol. Macromol. 1995, 17, 251.
(87) Hilser, V. J.; Gómez, J.; Freire, E. Proteins 1996, 26, 123.
(88) Xie, D.; Freire, E. Proteins 1994, 19, 291.
(89) D’Aquino, J. A.; Gómez, J.; Hilser, V. J.; Lee, K. H.; Amzel, L. M.; Freire, E. Proteins 1996, 25, 143.
(90) Makhatadze, G. I.; Clore, G. M.; Gronenborn, A. M.; Privalov, P. L. Biochemistry 1994, 33, 9327.
(91) Hilser, V. J.; Freire, E. J. Mol. Biol. 1996, 262, 756.
(92) Dill, K. A. J. Biol. Chem. 1997, 272, 701.
(93) Mark, A. E.; van Gunsteren, W. F. J. Mol. Biol. 1994, 240, 167.
(94) Shrake, A.; Rupley, J. A. J. Mol. Biol. 1973, 79, 351.
(95) Pace, C. N.; Laurents, D. V.; Thomson, J. A. Biochemistry 1990, 29, 2564.
(96) Evans, P. A.; Topping, K. D.; Woolfson, D. N.; Dobson, C. M. Proteins: Struct., Funct., Genet. 1991, 9, 248.
(97) Sosnick, T. R.; Trewhella, J. Biochemistry 1992, 31, 8329.
(98) Neri, D.; Billeter, M.; Wider, G.; Wüthrich, K. Science 1992, 257, 1559.
(99) Fink, A. L.; Calciano, L. J.; Goto, Y.; Kurotsu, T.; Palleros, D. R. Biochemistry 1994, 33, 12504.
(100) Pace, C. N.; Laurents, D. V.; Erickson, R. E. Biochemistry 1992, 31, 2728.
(101) Brandts, J. F. J. Am. Chem. Soc. 1964, 86, 4302.
(102) Stites, W. E. Chem. Rev. 1997, 97, 1233 (accompanying article in this issue).

CR960383C