METHODS IN MOLECULAR BIOLOGY™
Series Editor
John M. Walker
School of Life Sciences
University of Hertfordshire
Hatfield, Hertfordshire, AL10 9AB, UK
For further volumes:
http://www.springer.com/series/7651
Structure-Based Drug Discovery
Edited by
Leslie W. Tari
Trius Therapeutics, San Diego, CA, USA
ISSN 1064-3745 e-ISSN 1940-6029
ISBN 978-1-61779-519-0 e-ISBN 978-1-61779-520-6
DOI 10.1007/978-1-61779-520-6
Springer New York Dordrecht Heidelberg London
Library of Congress Control Number: 2011944430
© Springer Science+Business Media, LLC 2012
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the
publisher (Humana Press, c/o Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA),
except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or
hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified
as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.
Printed on acid-free paper
Humana Press is part of Springer Science+Business Media (www.springer.com)
Preface
The potential utility of atomic resolution structures of protein drug targets in drug discovery
has long been acknowledged. Without structure, medicinal chemists must rely on the costly,
time-consuming endeavor of screening large libraries of compounds for hits, and are often
forced to live with high-molecular-weight, ligand-inefficient inhibitor scaffolds that must
be blindly decorated with thousands of groups to generate structure–activity relationships (SAR)
and to improve potency and properties. Knowledge of the shape and chemical composition of the
ligand-binding pocket of the drug target enables the de novo design of ligand-efficient inhibitor
scaffolds. In addition, iterative structure-guided ligand optimization can be used to rationally
improve early leads in a few steps rather than with thousands of analogs. However, despite
its promise, structure-based drug design (SBDD) did not live up to expectations in its early
days: only a limited range of protein targets were tractable to crystallographic studies, crystal
structures took months or years to solve, and limitations in computing power and unrealistic
expectations of the capabilities of molecular modeling methods reduced the scope and
effectiveness of SBDD.
The last decade has seen the confluence of several enabling technologies that have
allowed protein crystallographic methods to live up to their true potential. Off-the-shelf
systems exist that allow the rapid cloning, and recombinant expression and isolation of large
quantities of protein in a wide range of prokaryotic or eukaryotic hosts. Low-cost nanovolume
liquid-handling robotic systems are available for the automated screening of vast arrays of
diverse solution conditions to find crystallization conditions for a protein target using minimal
quantities of protein. Latest-generation synchrotron radiation sources allow for the
collection of high-resolution X-ray diffraction data on microcrystals in minutes. Continuing
improvements in computing power and advances in crystallographic software have made it
possible to go from X-ray dataset to refined crystal structure in less than an hour on a laptop
computer. Taken together, these advances have made it possible to tackle difficult biological
targets with a high probability of success: intact bacterial ribosomes have been structurally
elucidated, as well as eukaryotic trans-membrane proteins like the potassium channel and
GPCRs. Of additional importance is the impact the above-mentioned advances have had on
the throughput of crystallographic structure determinations: it is now possible for medicinal
chemists to have access to structural information on their latest small molecule candidates
bound to the therapeutic target within days of compound synthesis, allowing structure-
guided ligand optimization to occur in “real time.” Also, using fragment screening, crystal
structures of hundreds of small molecule cores complexed with the protein target can be
utilized to construct novel inhibitor scaffolds.
The goal of this book is to provide scientists interested in adding SBDD to their arsenal
of drug discovery methods with a practical guide to the methods used to generate crystal
structures of biological macromolecules, how to leverage the structural information to
design new inhibitor classes de novo, and how to iteratively optimize hits and convert them
to leads. Where possible, specific protocols are described. Some examples highlighting the
utility of structural biology in the discovery and development of small molecule and protein
therapeutic agents are provided in the later chapters.
I am deeply grateful to all contributors who agreed to share their experiences in the
development and application of methodologies that support SBDD. I believe their patience
and hard work will be rewarded by the impact this volume has on scientists involved in drug
discovery. I would like to extend special thanks to John Walker for his guidance, inspiration
and patience in the preparation of this volume. Also, I am grateful to Les Tari Sr. for his
critical evaluation of this volume and sharp editorial eye.
San Diego, CA, USA Leslie W. Tari
Contents
Preface
Contributors
1 The Utility of Structural Biology in Drug Discovery
Leslie W. Tari
2 Genetic Construct Design and Recombinant Protein Expression for Structural Biology
Suzanne C. Edavettal, Michael J. Hunter, and Ronald V. Swanson
3 Purification of Proteins for Crystallographic Applications
Daniel C. Bensen
4 Protein Crystallization for Structure-Based Drug Design
Isaac D. Hoffman
5 X-Ray Sources and High-Throughput Data Collection Methods
Gyorgy Snell
6 The Use of Molecular Graphics in Structure-Based Drug Design
Paul Emsley and Judit É. Debreczeni
7 Crystallographic Fragment Screening
John Badger
8 The Role of Enzymology in a Structure-Based Drug Discovery Program: Bacterial DNA Gyrase
Mark L. Cunningham
9 Leveraging Structural Information for the Discovery of New Drugs: Computational Methods
Toan B. Nguyen, Sergio E. Wong, and Felice C. Lightstone
10 Chemical Informatics: Using Molecular Shape Descriptors in Structure-Based Drug Design
Andy Jennings
11 Accounting for Solvent in Structure-Based Drug Design
Leslie W. Tari
12 Structure-Based Drug Design on Membrane Protein Targets: Human Integral Membrane Protein 5-Lipoxygenase-Activating Protein
Andrew D. Ferguson
13 Application of SBDD to the Discovery of New Antibacterial Drugs
John Finn
14 Leveraging SBDD in Protein Therapeutic Development: Antibody Engineering
Gary L. Gilliland, Jinquan Luo, Omid Vafa, and Juan Carlos Almagro
15 A Medicinal Chemistry Perspective on Structure-Based Drug Design and Development
Shawn P. Maddaford
Index
Contributors
JUAN CARLOS ALMAGRO • Centocor R&D Inc., Radnor, PA, USA
JOHN BADGER • Zenobia Therapeutics, San Diego, CA, USA
DANIEL C. BENSEN • Trius Therapeutics, San Diego, CA, USA
MARK L. CUNNINGHAM • Trius Therapeutics, San Diego, CA, USA
JUDIT É. DEBRECZENI • Structure and Biophysics, Discovery Sciences, AstraZeneca,
Alderley Park, Macclesfield, UK
SUZANNE C. EDAVETTAL • Centocor R&D Inc., San Diego, CA, USA
PAUL EMSLEY • Department of Biochemistry, University of Oxford, Oxford, UK
ANDREW D. FERGUSON • Discovery Sciences, AstraZeneca Pharmaceuticals, Waltham,
MA, USA
JOHN FINN • Trius Therapeutics, San Diego, CA, USA
GARY L. GILLILAND • Centocor R&D Inc., Radnor, PA, USA
ISAAC D. HOFFMAN • Takeda San Diego, San Diego, CA, USA
MICHAEL J. HUNTER • Centocor R&D Inc., San Diego, CA, USA
ANDY JENNINGS • Takeda San Diego, San Diego, CA, USA
FELICE C. LIGHTSTONE • Lawrence Livermore National Laboratory, Physical and Life
Sciences Directorate, Livermore, CA, USA
JINQUAN LUO • Centocor R&D Inc., Radnor, PA, USA
SHAWN P. MADDAFORD • NeurAxon Inc., Mississauga, ON, Canada L5K 1B3
TOAN B. NGUYEN • Lawrence Livermore National Laboratory, Physical and Life
Sciences Directorate, Livermore, CA, USA
GYORGY SNELL • Takeda San Diego, San Diego, CA, USA
RONALD V. SWANSON • Centocor R&D Inc., San Diego, CA, USA
LESLIE W. TARI • Trius Therapeutics, San Diego, CA, USA
OMID VAFA • Centocor R&D Inc., Radnor, PA, USA
SERGIO E. WONG • Lawrence Livermore National Laboratory, Physical and Life
Sciences Directorate, Livermore, CA, USA
Chapter 1
The Utility of Structural Biology in Drug Discovery
Leslie W. Tari
Abstract
Access to detailed three-dimensional structural information on protein drug targets can streamline many
aspects of drug discovery, from target selection and target product profile determination, to the discovery
of novel molecular scaffolds that form the basis of potential drugs, to lead optimization. The information
content of X-ray crystal structures, as well as the utility of structural methods in supporting the different
phases of the drug discovery process, are described in this chapter.
Key words: X-ray crystallography, Structure-based drug design, Fragment screening,
Structural bioinformatics, Lead optimization
1. Introduction
The discovery of new drugs is a time- and labor-intensive process.
On average, the discovery of a new drug requires the preparation
and evaluation of approximately 10,000 compounds over 12 years
at a cost of more than $350 million (1). Once in the marketplace,
many drugs fail to recover their development costs (as many as
30%, according to data from the 1980s (2)), and many others are
ultimately withdrawn from the market. These facts coupled with
limits on patent lifetime, escalating global competition, and increas-
ingly stringent government regulations for drug approval have
demanded more efficient and accelerated approaches to drug dis-
covery. Conventional “brute force” methods of lead discovery via
high-throughput screening (HTS) of proprietary synthetic, com-
binatorial, or natural product libraries, while effective in many
cases, are expensive and have limitations; they require access to
large compound libraries (sometimes over 1,000,000 compounds),
often yield hits with high molecular weight, poor ligand efficiency,
Leslie W. Tari (ed.), Structure-Based Drug Discovery, Methods in Molecular Biology, vol. 841,
DOI 10.1007/978-1-61779-520-6_1, © Springer Science+Business Media, LLC 2012
limited or no potential for optimization, and provide no information
to guide ligand optimization.
Advances in crystallographic methods, computational power,
molecular biology, and recombinant protein expression systems
over the last 30 years have provided researchers with rapid and reli-
able access to three-dimensional structural information on a wide
variety of protein drug targets. Structural information on protein–
ligand complexes can eliminate much of the complexity involved in
the discovery and optimization of prospective drug leads. Indeed,
structure-guided drug design efforts have led to the discovery of
high profile drugs in multiple therapeutic areas, including the pep-
tidomimetic HIV protease inhibitors for the treatment of HIV, the
neuraminidase inhibitor Tamiflu™ for the treatment of influenza,
the carbonic anhydrase inhibitor dorzolamide for the treatment of
glaucoma, and the thrombin inhibitor ximelagatran, an oral anti-
coagulant (3). Access to structural information on the target of
interest can streamline all aspects of drug discovery, from target
selection to lead discovery and optimization, using methods that
are summarized in this chapter.
2. The Information Content of Protein Crystal Structures
Protein crystals, like any crystalline substance, are regular,
three-dimensionally periodic arrays of identical molecules or molecular
complexes (see Fig. 1). A common misconception regarding pro-
tein crystal structures is that they are not representative of the pro-
tein in solution due to the influence of extensive intermolecular
interactions present in the crystalline state. The idea that protein
crystal structures are heavily biased by “solid state” artifacts arises
from inaccurate comparisons made between protein crystals and
crystals of small molecular weight compounds. Crystals of small
molecules and proteins differ in ways that extend beyond the prop-
erties of their component molecules. Small-molecule crystals typi-
cally only comprise the small molecule, while protein crystals
contain 25–90% solvent by volume, depending on the protein. The
remaining volume is occupied by protein molecules, so a protein
crystal is analogous to an ordered gel with large interstitial
spaces between protein molecules. In protein crystals, the number of
intermolecular contacts relative to the molecular mass of the protein
is smaller by orders of magnitude than it is in small-molecule
crystals. As a result, the mechanical stability and integrity of
protein crystals are much poorer than those of small-molecule
crystals. The high solvent content and tenuous thermodynamic
stability of protein crystals complicate the subsequent
steps in X-ray diffraction experiments, since these properties result
in crystal handling difficulties, susceptibility to temperature changes
Fig. 1. A view of crystal packing in a Haemophilus influenzae dihydrofolate reductase crystal. Boundaries for a single unit
cell within the crystal are shown. The view is perpendicular to the c-axis of the unit cell. The unit cell is the fundamental
building block of the crystal, a translationally periodic substance comprising trillions of unit cells that extend in three
dimensions. The unit cell is an arbitrary construction that describes the smallest “box” with the highest metric symmetry.
and dehydration, weaker diffraction, and greater sensitivity to radiation
damage. However, the key role played by solvent in protein crys-
tallization is a double-edged sword; while it adversely affects dif-
fraction, it is the very element that makes protein crystal structures
valuable. The high solvent content of protein crystals is essential
for maintaining the structures of the macromolecules in their solu-
tion states. Therefore, to a large extent, proteins in crystals possess
the structural, enzymatic, and functional properties of their coun-
terparts in solution. Protein crystal structures must be regarded
with care, however. In the hands of the uninformed, the danger
exists that crystallographic structural data will be misinterpreted,
or overreaching conclusions drawn. An understanding of the
parameters derived from crystallographic experiments is essential if
structural information from crystallographic experiments is to be
used effectively to support drug discovery.
X-ray crystallography and light microscopy share the same
basic principle; electromagnetic radiation scattered by the object to
be imaged is recombined and focused by a lens to reform the image
of the object. Theoretically, the resolving power of any imaging
technique is equal to one half of the wavelength of the radiation
used for imaging. To resolve the atomic details of protein struc-
tures, crystallographic experiments involve the exposure of protein
crystals to high-energy monochromatic X-rays (wavelengths on
the order of 1 Å). Imaging using X-rays is complicated by the fact
Fig. 2. A schematic outlining the steps in a crystallographic structure determination. Crystals are systematically exposed to
monochromatic X-rays in multiple orientations, and the diffraction patterns are captured with electronic detectors. Since
crystals are three-dimensionally periodic substances, the diffraction pattern comprises a series of spots rather than a
continuous function. Each spot represents a family of diffracted waves that map to discrete spatial periodicities in the unit
cell of the crystal. The diffraction pattern is a summation of waves of electromagnetic radiation and can thus be described
by a Fourier series, and the diffraction pattern and disposition of the atomic contents of the unit cell are related mathematically
by a Fourier transform. An image of the atomic contents of the unit cell of the crystal is derived by applying a mathematical
lens (inverse Fourier transform, equation shown on the lower left) to the diffracted X-rays. The image reconstruction
process is complicated by the fact that only intensities of the diffracted X-rays are measurable (F(h) terms in the equation
shown), but not the relative phase shifts between each family of diffracted waves. The missing information is referred to
as the crystallographic phase problem. The missing phases are obtained using other experimental or computational meth-
ods described in the text. Since the diffraction of X-rays is caused by the interaction of the X-rays with electrons, the
resulting image obtained in a crystallographic experiment is of the electron density distribution in the unit cell of the crystal.
Interactive model building software is used to build the final atomic model into electron density.
that X-rays interact very weakly with matter, so that no lenses exist
which are able to reconstruct the image from the scattered X-rays.
Hence, the scattered X-rays from crystals must be captured with
electronic detectors and the function of a lens must be simulated
mathematically. A schematic describing the steps involved in the
solution of a crystal structure is shown in Fig. 2.
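The idea of a "mathematical lens" can be illustrated with a toy one-dimensional unit cell. The sketch below is not taken from the chapter: the atom positions, electron counts, grid size, and function names are all hypothetical, and the atoms are treated as points. It computes the diffracted waves from the atoms, then recombines them with an inverse Fourier sum, once with the phases and once with the phases discarded.

```python
import cmath
import math

def structure_factors(atoms, h_max):
    """Return {h: F(h)} for a 1D unit cell of point atoms.

    atoms: list of (fractional_coordinate, electron_count) pairs.
    """
    factors = {}
    for h in range(-h_max, h_max + 1):
        factors[h] = sum(
            z * cmath.exp(2j * math.pi * h * x) for x, z in atoms
        )
    return factors

def fourier_synthesis(factors, n_grid):
    """Inverse Fourier sum: density (arbitrary scale) on n_grid points."""
    density = []
    for i in range(n_grid):
        x = i / n_grid
        rho = sum(
            f * cmath.exp(-2j * math.pi * h * x) for h, f in factors.items()
        )
        density.append(rho.real)
    return density

# Hypothetical toy cell: a 6-electron atom at x = 0.20 and a
# 16-electron atom at x = 0.65 (fractional coordinates).
atoms = [(0.20, 6), (0.65, 16)]
factors = structure_factors(atoms, h_max=20)

# Amplitudes and phases together: the highest peak sits on the heavier atom.
density = fourier_synthesis(factors, n_grid=100)
peak = max(range(100), key=lambda i: density[i])

# Amplitudes only (all phases set to zero): the atom positions are lost
# and the highest peak falls at the origin of the cell.
amplitudes_only = {h: abs(f) for h, f in factors.items()}
density_no_phase = fourier_synthesis(amplitudes_only, n_grid=100)
peak_no_phase = max(range(100), key=lambda i: density_no_phase[i])
```

With phases, the synthesis peaks at grid point 65 (x = 0.65); without them it peaks at grid point 0, which is why the measured intensities alone are not enough to form an image.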
Mathematical reconstruction of the structure of the atomic
contents of the crystal is complicated by the fact that one of the
two key pieces of information describing the diffracted X-ray waves,
the relative phase shifts between the different families of diffracted
waves, cannot directly be measured (see Fig. 2). Three methods
are commonly employed to overcome the phase problem, as sum-
marized below.
(a) Molecular replacement. When an approximate model of the
unknown crystal structure is available, it can be used to over-
come the phase problem. The principle is simple; the model is
first oriented and then positioned in the unit cell of the target
crystal structure using rotation and translation functions. The
correctly oriented model is subsequently used to calculate
approximate phases and electron density maps. Alternate cycles
of interactive correction and rebuilding of the model into elec-
tron density and model refinement are used to improve the
quality of the phases and to transform the model structure into
the real structure. The success of molecular replacement
depends critically on two factors: the fraction of the asymmet-
ric unit for which suitable models exist, and the r.m.s. devia-
tion (after optimal superposition) between the model and
target structures. Generally, r.m.s. deviation increases with
decreasing sequence identity, or in cases where the target struc-
ture undergoes significant conformational changes with respect
to the model structure (e.g., movement of protein domains).
In the latter case, the model structure can be separated into
individual fragments that are sequentially oriented and posi-
tioned in the unit cell. Newer maximum-likelihood molecular
replacement algorithms, such as those implemented in the program
Phaser (4), are more discriminating and have been successful
in solving difficult molecular replacement problems that
were previously intractable.
(b) Isomorphous replacement methods. This is a classical approach
used to solve protein structures with unknown folds. Crystals
are soaked in multiple solutions containing salts of heavy atoms
such as Hg, Pt, Pb, and Au until conditions are found where
a small number of heavy atoms incorporate in well-defined
positions on the crystallized protein molecule (without alter-
ing the structure of the underlying protein). By analyzing the
differences in the intensities of diffraction patterns from the
native and heavy atom derivatized protein crystals, it is possible
to determine the locations of the heavy atoms in the unit cell
and to use the scattering “signal” from the heavy atoms to
calculate phases and an electron density map (reviewed in refs.
(5–7)).
(c) Anomalous scattering methods. For heavier elements, some
inner shell electrons have absorption edges in the range of the
X-ray wavelengths used in diffraction experiments. The heavy
atoms in the protein crystal cause absorption of the impinging
radiation, and impart small phase shifts on the radiation scat-
tered from the crystal. This phenomenon is used to determine
the positions of the heavy atoms in the unit cell, and subsequently
to extract phase information to allow electron density map
generation. Anomalous scattering can be used to supplement
the phase information obtained from isomorphous heavy atom
derivatives, or to independently obtain complete phase infor-
mation. A very powerful de novo phase determination method
utilizes anomalous scattering from proteins that are homogeneously
labeled with selenomethionine, a selenium-containing amino acid
that is incorporated during recombinant expression of the protein
in Escherichia coli. Independent diffraction
experiments are carried out (on the same crystal, if
possible) at multiple X-ray wavelengths on the high and low
energy sides of the selenium absorption edge that maximize
the anomalous diffraction signal. This method requires a tun-
able X-ray source, which is present only at synchrotrons
(reviewed in refs. (5–7)).
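For the molecular replacement criterion in item (a), the r.m.s. deviation between a search model and a target structure can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the coordinates are hypothetical, the two atom lists are assumed to be in one-to-one correspondence, and the optimal superposition is assumed to have been carried out already (the superposition step itself is not shown).

```python
import math

def rms_deviation(model, target):
    """R.m.s. deviation (in Å) between equivalent atoms of two structures.

    model, target: equal-length lists of (x, y, z) coordinates in Å,
    assumed to be already optimally superimposed.
    """
    if len(model) != len(target) or not model:
        raise ValueError("need two non-empty, equal-length coordinate lists")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(model, target):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(model))

# Hypothetical coordinates: the target is the model shifted by 1 Å along y.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
target = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
rmsd = rms_deviation(model, target)
```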
X-ray diffraction is caused by the interaction of the electric
field vector of monochromatic X-rays with electrons in a protein
crystal. These details, coupled with the fact that crystals are made
up of three-dimensionally periodic lattices of molecules, have sev-
eral important consequences (for excellent reviews see refs. (5–7)):
(1) X-ray diffraction experiments generate three-dimensional
images of the electron density distribution of the molecular components
of the crystal. Heavier atoms therefore generate a proportionally
stronger signal, and hydrogen atoms are generally not discernible
in protein crystal structures. (2) The short-wavelength radiation
used in X-ray diffraction experiments allows for the resolution of
macromolecular structures at an exquisite level of detail (typical
protein crystal structures are determined at resolutions between
1.5 and 3.0 Å). (3) In a crystallographic experiment, the
structure of the molecular contents of the unique portion of a crystal
(called the asymmetric unit of the unit cell, which is the microscopic
building block of the crystal) is obtained, and the entire
crystal can be built by the application of crystallographic symmetry
operators to the contents of the asymmetric unit, as shown in
Fig. 1. Since the diffraction signal from a crystal arises from con-
structive interference from trillions of crystallographic asymmetric
units, the resulting crystal structure comprises a time- and space-
averaged picture of the contents of the copies of asymmetric units
that are sampled. Hence, components of the asymmetric unit with
a large degree of random spatial heterogeneity, i.e., disordered
protein loops or side chains and the bulk solvent occupying the
spaces between protein molecules, fade into the background and
cannot be modeled. However, in cases where a molecular compo-
nent of a crystal, such as a protein side chain, occupies a finite
number of distinct, low energy conformations in different asym-
metric units, it is possible to simultaneously characterize each alter-
native conformation.
Examination of the equation relating diffracted X-rays to the
crystal structure provides insight into the structural parameters
that are modeled in a crystallographic experiment (see Eq. 1).
$$F_{hkl} = \sum_{j=1}^{N} f_j \, e^{-(B \sin^2\theta)/\lambda^2} \, e^{2\pi i (hx + ky + lz)} \qquad (1)$$
Equation 1 is one of the explicit forms of the structure factor
equation (8). Each Fhkl term represents a unique family of diffracted
X-ray waves from the crystal (diffracted waves from crystals con-
structively interfere to form patterns of spots, as shown in Fig. 2,
which can each be assigned integer indices h, k, and l ), which cor-
respond to discrete spatial periodicities in the crystal lattice. The
intensity and phase of each family of diffracted waves is derived via
a summation of the scattering contributions from all of the atoms
in the asymmetric unit of the crystal. The second exponential term
in Eq. 1 computes the net phase shift relative to an arbitrary origin
of the scattered wave with index h, k, l due to the relative positions
of the individual atoms in the unit cell (with fractional coordinates
x, y and z). The fj term corresponds to the scattering factor for each
atom in the summation, and is directly proportional to the number
of electrons in the atom in question. The first exponential B sin²θ/λ²
term (θ is the angle of the scattered radiation with respect to the
source X-ray beam, and λ is the wavelength of the X-rays) accounts
for the reduction in the intensity of the scattered radiation with
scattering angle due to interference between scattered waves from
different parts of the electron cloud surrounding each atom. X-ray
scattering is attenuated further by smearing of the electron clouds
surrounding each atom due to thermal motion of the atoms.
Atomic thermal motion is modeled using the extra B term in the
structure factor equation. As a first approximation it is assumed
that the thermal motion of atoms is isotropic (spherically symmetric),
with B = 8π²μ², where μ is the root mean square amplitude of
atomic vibration. Using this relationship, for a B-factor of
15 Å², the r.m.s. displacement of an atom from its equilibrium position is
approximately 0.44 Å, and it is as much as 0.87 Å for a B-factor of
60 Å². Thus, analysis of B-factors is very important during any
structural analysis to provide insight into the dynamics and struc-
tural integrity of different regions of a protein molecule. However,
one must exercise caution before interpreting B-factors too quan-
titatively. In addition to measuring dynamic disorder caused by
temperature dependent vibration of atoms, the B-factor is also
influenced by subtle structural differences between protein mole-
cules in different unit cells throughout the crystal (which spatially
smears the atom positions), steric constraints from intermolecular
lattice contacts, and certain systematic experimental errors, such as
absorption of the X-ray beam during X-ray data collection.
Advanced mathematical models can be used to provide more
detailed information on atomic thermal motions. For example, the
relative motions of entire protein domains can be characterized
using TLS refinement (9). Also, when high-quality X-ray data are
available from crystals that diffract to high resolution (typically
better than 1.2 Å, rare in protein structure determinations), the
isotropic thermal correction can be replaced by a tensor, which
corrects not only for the extent of thermal motion of the atoms but
also for spatial anisotropy in their motions (10).
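The terms of Eq. 1 and the B-factor relationship quoted above can be checked numerically. The sketch below is illustrative only: the function names and the two-atom example are hypothetical, each atom is given its own isotropic B-factor and its own fractional coordinates, and the scattering factor f is simplified to a constant equal to the electron count.

```python
import cmath
import math

def rms_displacement(b_factor):
    """R.m.s. vibration amplitude mu (Å) from an isotropic B-factor (Å^2),
    obtained by inverting B = 8 * pi^2 * mu^2."""
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))

def structure_factor(hkl, atoms, theta, wavelength):
    """Evaluate the sum in Eq. 1 for one reflection (h, k, l).

    atoms: list of (f, B, (x, y, z)) tuples, where f is the scattering
    factor (simplified here to the electron count), B is the isotropic
    B-factor in Å^2, and x, y, z are fractional coordinates.
    theta is in radians and wavelength in Å.
    """
    h, k, l = hkl
    s2 = (math.sin(theta) / wavelength) ** 2
    total = 0j
    for f, b, (x, y, z) in atoms:
        damping = math.exp(-b * s2)  # thermal attenuation term
        phase = cmath.exp(2j * math.pi * (h * x + k * y + l * z))
        total += f * damping * phase
    return total

# Hypothetical two-atom asymmetric unit: a well-ordered carbon (B = 15 Å^2)
# and a more mobile oxygen (B = 60 Å^2).
atoms = [(6, 15.0, (0.10, 0.20, 0.30)), (8, 60.0, (0.40, 0.10, 0.70))]

f_120 = structure_factor((1, 2, 0), atoms, math.radians(10.0), 1.0)
amplitude = abs(f_120)          # related to the measured intensity
phase = cmath.phase(f_120)      # not measurable directly (phase problem)

mu_15 = rms_displacement(15.0)
mu_60 = rms_displacement(60.0)
```

Rounded to two decimals, `mu_15` and `mu_60` reproduce the 0.44 Å and 0.87 Å values given in the text.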
Based on the mathematical description of X-ray diffraction
provided above, four parameters are optimized in a single crystal
X-ray diffraction experiment for each atom in a protein crystal
structure: the x, y, and z coordinates of each atom and the B-factor
describing the thermal motion of each atom. The quality of
resulting electron density maps and the accuracy of refined para-
meters in protein crystal structures are largely dependent on the
resolution of the X-ray diffraction data (equivalent to the pixel size
of electron density sections). Examples of the effects of diffraction
resolution on electron density map quality are shown in Fig. 3.
The model is generally manually built (or refit) into electron density
by a crystallographer, using two types of electron density maps,
|2Fo − Fc|αc maps, and |Fo − Fc|αc difference maps, described below.
Fig. 3. Representative electron density maps contoured around tyrosine residues (using |2Fo − Fc|αc coefficients) from three
refined crystal structures: (a) A 2.8 Å resolution structure of Francisella tularensis topoisomerase IV, (b) A 2.2 Å structure of
Escherichia coli topoisomerase IV, and (c) A 1.4 Å structure of Enterococcus faecalis DNA gyrase B (all from D. Bensen and
L. Tari, unpublished results). The electron density maps were contoured using the electron density visualization software COOT
(see ref. (11), Chapter 6). At better than 3.0 Å resolution, amino-acid side chains can be recognized with the help of protein
sequence information, while at better than 2.5 Å resolution solvent molecules can be observed and added to the structural
model with some confidence. As the resolution improves to better than 2.0 Å resolution, fitting of individual atoms may be
possible and most of the amino-acid side chains can be readily assigned even in the absence of sequence information.
|Fo − Fc|αc maps. |Fo − Fc|αc maps, or difference maps, are generated
by subtracting the calculated structure factor amplitudes (Fc, from
the best current model structure) from the observed structure
factor amplitudes (Fo), using phase information (αc) calculated
from the available model structure. To a good approximation, this
operation is equivalent to subtracting the electron density calcu-
lated from the model from the “real” electron density in the crystal.
What is left behind is the electron density for ordered components
of the crystal structure that have not been accounted for by the
model, or that have not been modeled correctly. Features that are
present in the true structure that have not been accounted for in
the model structure appear as positive peaks, while atoms that have
been incorrectly placed in the model structure (i.e., that do not
exist in the real structure) appear as holes or negative peaks. These
maps are used to fix improperly modeled side-chains and/or entire
polypeptide chains, as well as to fit substrates, inhibitors, and ordered
solvent molecules into the structure. A special type of difference
map called an omit map can be used to confirm the presence of
important features in a protein structure. An omit map is calcu-
lated by removing the feature of interest (say, an inhibitor) from
the model, refining the structure in the absence of that feature, and
calculating a new difference map. If the feature of interest is still
observed in a difference density map, then it is real, and not an
artifact caused by model bias present in the calculated phases. An
example of a difference map is shown in Fig. 4.
|2Fo − Fc|αc maps. |2Fo − Fc|αc maps are the maps most commonly
used for model fitting. They are used instead of |Fo|αc maps, which
suffer from model bias, and tend to show only electron density that
is associated with the model. As described above, |Fo − Fc|αc maps
reveal everything in the |Fo|αc map that has not been modeled. The
|2Fo − Fc|αc map essentially superposes an |Fo|αc map over an
|Fo − Fc|αc difference map, so that it simultaneously shows both the
electron density for the model and the electron density for features
that have not been accounted for by the model. Several weighting
schemes are used to further diminish the effects of model bias,
including figure-of-merit and σA weighting schemes (reviewed in
refs. (5–7)). An example of a |2Fo − Fc|αc electron density map is
shown in Fig. 4.
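The two map types can be illustrated with a one-dimensional toy crystal. The sketch below is not how crystallographic packages compute maps (real maps are three-dimensional and normally use weighted coefficients). It assumes Gaussian "atoms" at made-up positions on a periodic grid and uses NumPy's FFT as a stand-in for the structure-factor calculation. One atom is deliberately left out of the model, so the "observed" amplitudes carry information that the model phases do not.

```python
import numpy as np

N = 512        # grid points along the 1D "unit cell"
SIGMA = 2.0    # width of each Gaussian "atom", in grid points

def density(positions):
    """Periodic 1D electron density: one unit-height Gaussian per atom."""
    x = np.arange(N)
    rho = np.zeros(N)
    for p in positions:
        d = np.abs(x - p)
        d = np.minimum(d, N - d)      # shortest distance around the periodic cell
        rho += np.exp(-0.5 * (d / SIGMA) ** 2)
    return rho

# Made-up atom positions; the model lacks the atom at OMITTED.
TRUE_ATOMS = [15, 52, 97, 121, 178, 205, 263, 301, 340, 391, 436, 487]
OMITTED = 263
MODEL_ATOMS = [p for p in TRUE_ATOMS if p != OMITTED]

# "Experiment": amplitudes only, phases are lost.
f_obs = np.abs(np.fft.fft(density(TRUE_ATOMS)))

# Model: amplitudes |Fc| and phases alpha_c are both available.
fc_complex = np.fft.fft(density(MODEL_ATOMS))
f_calc = np.abs(fc_complex)
phase_calc = np.exp(1j * np.angle(fc_complex))

# (|Fo| - |Fc|) and (2|Fo| - |Fc|) syntheses, both phased with alpha_c.
diff_map = np.fft.ifft((f_obs - f_calc) * phase_calc).real
two_fo_fc_map = np.fft.ifft((2.0 * f_obs - f_calc) * phase_calc).real

peak = int(np.argmax(diff_map))
print(f"largest positive difference peak at grid point {peak} (omitted atom at {OMITTED})")
print(f"difference density at omitted atom: {diff_map[OMITTED]:.2f}")
print(f"2Fo-Fc density at omitted atom:     {two_fo_fc_map[OMITTED]:.2f}")
```

In this toy setting the omitted atom reappears as the dominant positive feature of the difference synthesis, although at reduced height because the phases come from the incomplete model. The 2Fo − Fc synthesis shows density for both the modeled atoms and the missing one.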
In addition to providing a more detailed picture of the elec-
tron density, higher resolution X-ray data correlates with a greater
number of experimental observations to support structure refine-
ment. For a typical protein structure from a crystal with a solvent
content of about 50%, the number of experimental observations
and refinement parameters will be about the same at 2.8 Å resolution.
The paucity of experimental data compared with the number
of parameters that need to be defined makes unrestrained least-squares
model optimization intractable. Additionally, at resolutions
lower than 2.8 Å, individual atomic B-factors have a very limited
Fig. 4. Examples of |Fo − Fc|αc and |2Fo − Fc|αc electron density maps. The electron density maps in all panels are drawn as
thin chicken-wire representations. In (a) an |Fo − Fc|αc map contoured at 3σ is used to fit an incorrectly modeled glutamic
acid side chain in an E. faecalis GyrB crystal structure. In the model structure, part of the side chain is in a negative electron
density peak, while a positive difference density peak on the left-hand side of the figure reveals the correct position for the
side chain from the experimental data. The correctly positioned glutamic acid side chain is shown in (b). In (c), an |Fo − Fc|αc
difference electron density map contoured at 3.5σ was used to fit a small-molecule inhibitor into the substrate-binding
pocket of E. faecalis gyrase B. The difference map was calculated in the absence of the inhibitor, indicating that the differ-
ence density shown arises entirely from the experimental X-ray data. Panel (d) shows a representative section of a
|2Fo − Fc|αc electron density map contoured at 1σ for an E. faecalis GyrB crystal structure. The map displays electron
density both for regions of the model that have been correctly fit and for regions that have not been accounted for by
the model. Because they comprise a superposition of an |Fo|αc map and an |Fo − Fc|αc map, |2Fo − Fc|αc maps are less subject
to the effects of model bias than |Fo|αc maps. During model fitting, crystallographers generally utilize |2Fo − Fc|αc and
|Fo − Fc|αc simultaneously to trace the polypeptide chain and correct errors in the existing model.
physical meaning. The problem of statistical underdetermination
is overcome by augmenting the X-ray diffraction data with structural
parameters of proteins and peptides derived from small-molecule
crystallography and spectroscopic data. The resulting function
that is minimized in a crystallographic structure refinement incor-
porates the experimental X-ray data and a molecular mechanics
function (which restrains bond lengths, angles, stereochemistry,
planarity of peptide bonds and aromatic groups, etc. to reasonable
values). The quality of structures refined in this fashion is excellent,
even for structures determined at modest resolutions. Properly
refined protein crystal structures generated from carefully mea-
sured X-ray data yield atomic positions that are precise to within
one fifth to one tenth of the stated experimental resolution. Once
a structure is fully refined, multiple criteria are used to judge the
quality of the model, as described below.
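The statement that observations and refinement parameters roughly balance near 2.8 Å at about 50% solvent content can be checked with a back-of-envelope estimate. The sketch below assumes a protein partial specific volume of 1.23 Å³/Da, an average of about 14 Da per non-hydrogen atom, a complete dataset, and Friedel-related reflections counted once. None of these values come from the text, and the function name is illustrative.

```python
import math

PROTEIN_SPECIFIC_VOLUME = 1.23  # Å^3 per Da (assumed typical value)
DALTONS_PER_ATOM = 14.0         # assumed mean mass per non-hydrogen protein atom
PARAMS_PER_ATOM = 4             # x, y, z and an isotropic B-factor

def observations_per_parameter(resolution: float, solvent_fraction: float = 0.5) -> float:
    """Estimate unique reflections per refined parameter at a resolution limit (Å)."""
    # Crystal volume available per protein atom, including its share of solvent.
    volume_per_atom = PROTEIN_SPECIFIC_VOLUME * DALTONS_PER_ATOM / (1.0 - solvent_fraction)
    # Reciprocal-lattice points inside a sphere of radius 1/d number (4/3)*pi*V/d^3;
    # halving for Friedel pairs leaves (2/3)*pi*V/d^3 unique reflections.
    reflections_per_atom = (2.0 * math.pi / 3.0) * volume_per_atom / resolution ** 3
    return reflections_per_atom / PARAMS_PER_ATOM

for d in (3.5, 2.8, 2.0, 1.2):
    print(f"{d:.1f} Å: about {observations_per_parameter(d):.1f} observations per parameter")
```

With these assumptions the ratio comes out at roughly 0.8 at 2.8 Å and a little over 2 at 2.0 Å, which is consistent with the need for stereochemical restraints at modest resolution.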
R-factor. The R-factor is the average fractional disagreement (usually quoted in percent) between
the observed structure-factor amplitudes (the experimentally measured
Fhkl values) and the calculated structure-factor amplitudes
(Fhkl,calc) from the refined model of the contents of the crystal, that is,
R = Σ||Fo| − |Fc|| / Σ|Fo| summed over the measured reflections. The
ultimate value of the R-factor in a well-refined structure depends
on a number of variables, including the proportion of the contents
of the unit cell that can be correctly modeled, the relative weights
assigned to the molecular mechanics restraints vs. the experimental
X-ray data during refinement, the experimental resolution of the
diffraction experiment and the accuracy and overall quality of the
measured experimental X-ray intensities. In protein structures with
numerous dynamically disordered loops or domains that cannot be
modeled, the R-factor will not converge to low values. However,
as a general rule of thumb a correctly refined protein structure
should have an R-factor around 20%.
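As a concrete illustration, the conventional R-factor, R = Σ||Fo| − k|Fc|| / Σ|Fo|, can be computed in a few lines. The sketch below assumes a single overall scale factor k chosen so that the amplitude sums match, which is a simplification of what refinement programs do. The amplitudes are invented for the example.

```python
import numpy as np

def r_factor(f_obs, f_calc, scale=None):
    """Conventional crystallographic R-factor as a fraction (multiply by 100 for percent)."""
    f_obs = np.asarray(f_obs, dtype=float)
    f_calc = np.asarray(f_calc, dtype=float)
    if scale is None:
        # Simplifying assumption: one overall scale that equalizes the amplitude sums.
        scale = f_obs.sum() / f_calc.sum()
    return float(np.abs(f_obs - scale * f_calc).sum() / f_obs.sum())

# Invented amplitudes for four reflections.
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 50.0, 45.0]
print(f"R = {100.0 * r_factor(f_obs, f_calc):.1f}%")
```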
Free R-factor (Rfree). The function that is minimized during a
protein structure refinement is extremely complex, with multiple
false minima. Hence, when not used with care, modern refinement
algorithms can converge on convincing R-factors for incorrect
structures. The Rfree (12) statistic is an extremely simple and pow-
erful independent validation tool used in modern protein structure
refinement. The Rfree function is identical to the R-factor; the only
difference is that it is calculated using a small (5–10%) randomly
sampled subset of the X-ray diffraction data that is excluded from
structure refinement throughout the refinement process. In a cor-
rectly refined structure, Rfree will track with the R-factor to within
5–10%. For incorrect structures, Rfree will remain at a value near the
limit observed for random atomic models fit to an X-ray dataset
(~57%). In addition to Rfree, the geometric quality of the refined
protein structure should be used to evaluate the model. The aver-
aged bond lengths and angles of the final model should not deviate
much from ideal values (r.m.s. deviations from ideality should be
within 0.02 Å for bond lengths and 3° for bond angles), and the
majority of the protein residues should possess “allowed” combi-
nations of φ, ψ main-chain dihedral angles. It is important to note
that protein folding can force some residues into disallowed φ, ψ
values, which can have important functional significance (13). All
residues in disallowed regions must be carefully checked to ensure
that they are well described by experimental electron density.
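The bookkeeping behind Rfree can be sketched as follows: a random subset of reflections is flagged once, and the same R formula is then evaluated separately on the working and free sets. The code below only demonstrates that bookkeeping. It assumes a 5% test set, simulated amplitudes, and hypothetical function names, and it performs no refinement, so the two values it prints are expected to be similar.

```python
import numpy as np

def r_value(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|), returned as a fraction."""
    return float(np.abs(f_obs - f_calc).sum() / f_obs.sum())

def flag_free_set(n_reflections, fraction=0.05, seed=0):
    """Boolean mask marking a random test set that would be withheld from refinement."""
    rng = np.random.default_rng(seed)
    n_free = max(1, int(round(fraction * n_reflections)))
    mask = np.zeros(n_reflections, dtype=bool)
    mask[rng.choice(n_reflections, size=n_free, replace=False)] = True
    return mask

def r_work_and_r_free(f_obs, f_calc, free_mask):
    """Evaluate the same R formula on the working set and on the free set."""
    r_work = r_value(f_obs[~free_mask], f_calc[~free_mask])
    r_free = r_value(f_obs[free_mask], f_calc[free_mask])
    return r_work, r_free

# Simulated amplitudes: a "model" that disagrees with the data by about 20%.
rng = np.random.default_rng(1)
f_obs = rng.uniform(50.0, 500.0, size=2000)
f_calc = np.abs(f_obs * (1.0 + rng.normal(0.0, 0.2, size=2000)))

free_mask = flag_free_set(len(f_obs), fraction=0.05)
r_work, r_free = r_work_and_r_free(f_obs, f_calc, free_mask)
print(f"test-set reflections: {int(free_mask.sum())}")
print(f"R-work = {100 * r_work:.1f}%, R-free = {100 * r_free:.1f}%")
```

In an actual refinement the flagged reflections are excluded from the minimized target throughout, so a widening gap between the two values signals overfitting.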
Identification and refinement of ordered solvent molecules
becomes more reliable when data are available to at least 2.5 Å
resolution. Even then, before a water molecule is used in mecha-
nistic or computational analysis, it is always wise to check its
B-factor and to see if there exists at least one hydrogen bond to
hold the water to the protein or a nearby solvent molecule.
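The water check described above can be written as a simple filter. The thresholds in this sketch (a B-factor ceiling of 60 Å² and a donor-acceptor distance window of 2.4 to 3.4 Å) are assumptions chosen for illustration, not values from the text, and should be tuned to the structure in question. Coordinates are plain (x, y, z) tuples in Å.

```python
import math

def has_hbond_partner(water_xyz, polar_atoms, min_dist=2.4, max_dist=3.4):
    """True if any polar atom lies within the assumed hydrogen-bond distance window."""
    return any(min_dist <= math.dist(water_xyz, xyz) <= max_dist for xyz in polar_atoms)

def water_is_plausible(water_xyz, b_factor, polar_atoms, max_b=60.0):
    """Keep a water only if its B-factor is moderate and it has a hydrogen-bond partner.

    max_b and the distance window are illustrative assumptions.
    """
    return b_factor <= max_b and has_hbond_partner(water_xyz, polar_atoms)

# Hypothetical coordinates: one water and two nearby polar protein atoms.
water = (0.0, 0.0, 0.0)
neighbours = [(2.8, 0.0, 0.0), (6.0, 1.0, 0.0)]
print(water_is_plausible(water, b_factor=25.0, polar_atoms=neighbours))
```

A filter like this only flags candidates for inspection; the electron density itself remains the final arbiter.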
Unless the structure has been determined at very high resolu-
tion, electron density and refinement do not discriminate between
the oxygen and nitrogen atoms of asparagines and glutamines, or
the alternative conformations of histidine side chains. In a detailed
structural analysis, it is always necessary to check alternative con-
formations of Asn, Gln, or His side chains to decide which one
makes more sense chemically (i.e., by analyzing available H-bonding
networks). Also, great care has to be exercised when fitting dynam-
ically disordered protein side chains that are not fully described by
electron density. The crystallographer knows they are present from
the amino-acid sequence, and incorporates them in conformations
commonly observed for that side chain from databases of high-
resolution structures. The final refined conformation of the side
chain must ultimately be decided using the crystallographer’s
knowledge of chemistry and side-chain conformational prefer-
ences, in conjunction with the refinement program’s force field. In
many structures, entire loops or even domains are too disordered
to show any observable electron density. In such cases, the offending
loops/domains are not included in the final model.
When analyzing crystal structures, an additional point of caution
must be noted regarding potential artifacts that can arise from contacts
between adjacent molecules in a crystal lattice. In the ideal scenario,
the protein of interest crystallizes in a lattice that leaves the
active-site/receptor pocket solvent exposed, with no lattice con-
tacts preventing the motion of functionally important mobile
structural elements surrounding the drug-binding site (i.e., the lat-
tice should not impede ligand-induced conformational changes in
the protein). However, protein crystallization does not allow for
control of lattice contacts, and the ideal situation does not always
exist. Hence, before a new protein crystal form is nominated as a
potential candidate for supporting structure-based drug design, a
careful analysis of the crystal lattice contacts between neighboring
molecules related by crystallographic or noncrystallographic sym-
metry must be carried out to assess the steric accessibility of the
receptor pocket and the solvent space around it, as well as the
nature and quantity of lattice contacts in the vicinity. This sort of
analysis is particularly important if the crystals are produced for the
purpose of ligand soaking experiments to support fragment screen-
ing or high throughput structure determination. If multiple crystal
forms are available, the crystal forms that approach the ideal crite-
ria should be chosen. Cocrystallization experiments usually cir-
cumvent problems related to lattice constraints, since the protein
and ligand are mixed in solution, allowing the system to reach a
low energy conformational state before crystallization occurs.
Additional important parameters to consider when analyzing crys-
tal structures are the solution conditions used in crystallization.
Some proteins undergo significant structural changes in different
solution conditions. A classic example is ribonuclease A, which
undergoes large, pH-dependent conformational changes that have
been characterized crystallographically (14).
3. Using Structure in Target Selection and Product Profile Development
In addition to supporting lead discovery and lead optimization,
structural information can be used at a very early stage in a drug
discovery program to evaluate the viability of a protein as a drug
target. Does the protein possess a binding pocket with suitable
properties for potent inhibitor development? In a large, structur-
ally related protein family, such as eukaryotic protein kinases, is it
possible to develop selective inhibitors against a kinase of interest?
More generally, what are the prospects for the development of spe-
cific inhibitors against a protein target while avoiding off-target
binding? In an antibiotic program, do the protein orthologs
encompassed by the proposed target product profile possess suffi-
cient structural homology to allow for the development of a small-
molecule agent with the desired spectrum? Careful analysis of the
structures of the protein target(s) of interest coupled with struc-
tural bioinformatics and molecular modeling can be used to address
questions such as those posed above. Such an analysis is important
to expose liabilities in target selection or the proposed drug prod-
uct profile early in a drug discovery program, before a substantial
investment of time, money and manpower has been made to pur-
sue a flawed hypothesis.
For example, in the antibacterial arena, the emergence of
genomics and proteomics has profoundly changed the approach
used for the identification of new targets essential for the survival
of bacteria (15). To highlight how this information is used to facili-
tate target selection, the analysis that led to the selection of bacte-
rial topoisomerases as prospective drug targets at the author’s
company is summarized below. To pursue a drug discovery pro-
gram, we sought essential bacterial targets with the following prop-
erties: (1) Novel proteins that are not targets of marketed antibiotics,
to avoid issues of cross-resistance with existing antibiotics. (2)
Targets possessing recessed ligand-binding pockets with mixed
polar/lipophilic character, the potential for solvent sheltered
“anchoring interactions” and no closely related human counter-
parts. (3) A high degree of sequence/structure conservation in the
ligand-binding pockets of the protein target(s) across bacterial spe-
cies commonly implicated in bacterial infections. (4) If possible,
the option to inhibit multiple bacterial targets with a single
therapeutic agent to minimize the threat of resistance emergence.
A detailed structural bioinformatics analysis of proteins in several
key bacterial pathways revealed the bacterial topoisomerases DNA
gyrase and topoisomerase IV as prospective drug targets that met
the criteria listed above. DNA gyrase is a type II topoisomerase
that plays an essential role in bacterial DNA replication with no
direct mammalian counterpart. The enzyme catalyzes the intro-
duction of negative supercoils into DNA using the free energy of
ATP hydrolysis (16). DNA gyrase consists of two subunits, GyrA
and GyrB, that form a functional A2B2 heterotetramer. GyrA is involved
in DNA cleavage and religation, while the GyrB subunit contains
the ATP-binding site and mediates the passage of the uncut DNA
strand through the strand that is cleaved by GyrA (16). A closely
related bacterial enzyme from the topoisomerase II family is
topoisomerase IV (topo IV), which also forms a heterotetramer, C2E2,
consisting of two ParC subunits and two ParE subunits (17). Despite
possessing a high degree of sequence identity with DNA gyrase,
topo IV is involved in different aspects of DNA replication than
gyrase. The two topoisomerase complexes are well established drug
targets. Fluoroquinolone antibiotics, such as ciprofloxacin, exert
their antimicrobial activity via inhibition of the GyrA and ParC
subunits (18). However, no commercial antibiotics have yet
reached the market which target the ATP binding domains of the
respective topoisomerase complexes (GyrB and ParE), despite the
fact that GyrB and/or ParE inhibition has been shown to effec-
tively kill bacteria (19). A sequence alignment of the ATP-binding
domains of DNA gyrase and topo IV from key pathogens involved
in community acquired pneumonia mapped on to the crystal structure
of one of the enzymes (see Fig. 5), suggests that the development
Fig. 5. A solvent accessible surface representation of the ATP-binding pocket of GyrB from
the crystal structure of E. faecalis GyrB complexed with a benzimidazole inhibitor
(D. Bensen and L. Tari, unpublished results). The surface is colored by the degree of sequence
conservation observed in the underlying residues for GyrB and ParE enzymes from the
major pathogens implicated in community acquired pneumonia. Amino-acid sequences
for the relevant proteins were extracted from the KEGG database (20) and sequence align-
ments were performed with CLUSTALW (21). The high degree of overall sequence conser-
vation (not shown) and the remarkable degree of sequence conservation in the ATP-binding
pockets of the selected GyrB and ParE orthologs suggest that the geometries and compo-
sitions of the active sites of the enzymes from the different pathogens possess sufficient
similarity to allow for the development of dual targeting, broad spectrum inhibitors.
Subsequent generation of homology models and crystal structures of several of the
orthologs listed on the figure confirmed this hypothesis.
of broad spectrum, dual-targeting inhibitors against these enzymes
is feasible. As the above example demonstrates, structural bioinfor-
matics can be an important component in the target selection pro-
cess and drug product profile determination early in a drug
discovery program.
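The conservation analysis summarized in Fig. 5 rests on scoring each column of a multiple sequence alignment. A minimal sketch of one such score (the fraction of sequences sharing the most common residue at each position) is shown below. The aligned strings are invented placeholders, not real GyrB or ParE sequences, and the scoring scheme is an assumption; the text does not state which conservation measure was used.

```python
from collections import Counter

def column_conservation(aligned_sequences):
    """Per-column fraction of sequences carrying the most common residue (gaps count against)."""
    if len({len(seq) for seq in aligned_sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    scores = []
    for column in zip(*aligned_sequences):
        residues = [c for c in column if c != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores

def pocket_conservation(scores, pocket_columns):
    """Mean conservation over the alignment columns that line the binding pocket."""
    return sum(scores[i] for i in pocket_columns) / len(pocket_columns)

# Invented toy alignment (four "orthologs", six columns).
alignment = ["GLHHLV", "GLHHMV", "GLHALV", "GIH-LV"]
scores = column_conservation(alignment)
print(scores)
print(pocket_conservation(scores, pocket_columns=[0, 2, 5]))
```

Mapping the per-column scores onto the residues of a reference crystal structure then gives a surface coloring of the kind shown in Fig. 5.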
4. Using Crystallographic Methods to Initiate a Drug Discovery Program
The likelihood of success in a small-molecule drug discovery program
is greatly enhanced by the availability of multiple molecular
scaffolds that bind to and elicit the desired effects on the protein
target, while offering prospects for optimization into drug leads.
However, the discovery of viable molecular scaffolds for SBDD
and medicinal chemistry optimization is not trivial. HTS, when
successful, often delivers hits with high molecular weights and poor
potential for optimization. The probability of a small-molecule
ligand matching the shape and chemistry of a protein target
decreases as the complexity and size of the ligand increases, since
there exists a greater chance that some part of the ligand will pos-
sess features that do not complement those of the protein target.
Theoretically, the probability that a small molecule will bind to a
protein target decreases exponentially with increasing ligand com-
plexity (22). Thus, there is an advantage to screening for hits using
less complex, lower molecular weight compounds (called frag-
ments, with molecular weights ranging from 100 to 250 Da),
which interact with only a small number of sites on the protein and
possess a greater chance of achieving favorable steric and chemical
complementarity with the protein target. However, the advantage
of screening with fragments is offset by the fact that fragments
generally bind with much lower affinities than the larger com-
pounds typically screened in HTS. Most biophysical techniques
perform poorly at detecting weak binding, limiting their utility in
screening fragment libraries. X-ray crystallography, however, is an
extremely sensitive technique, capable of detecting compounds
with binding constants in the low millimolar range. The extension
of crystallographic methods into the high-throughput realm over
the past decade has led to the adoption of crystallographic frag-
ment screening in many industrial and academic centers as a drug
discovery tool. In this section, the two flavors of crystallographic
fragment screening are reviewed: random fragment screening and
pharmacophore-based fragment screening.
4.1. Random and Pharmacophore-Based Fragment Screening
The basic premise of crystallographic fragment screening is simple.
A protein target is screened against a small library (typically <1,000
molecules) of structurally diverse, highly soluble low molecular weight
compounds. The library is screened in one of two ways: pregrown
crystals of the protein of interest are soaked with concentrated
solutions (in aqueous solution or dimethyl sulfoxide) of individual
compounds or mixtures of compounds, or, the protein is crystal-
lized in the presence of compounds/compound mixtures. The lat-
ter method has the advantage of allowing the protein to undergo
compound induced conformational changes that may be precluded
in a preformed crystal lattice. However, cocrystallization generally
involves a scan of multiple crystallization conditions to generate
usable crystals and can lead to multiple crystal forms, so it is more
labor intensive, and requires much larger quantities of protein and
fragment solutions for screening. Once putative protein-fragment
complex crystals are created, X-ray data collection, crystal structure
solution and electron density map interpretation can proceed in a
high throughput manner, using high-flux laboratory or synchro-
tron X-ray sources equipped with sample handling robotics
(described in Chapter 5), and automated software for structure
solution, refinement and electron density map generation (described
in Chapter 6). A schematic representation of random fragment
screening and optimization paths from initial fragment hits is
shown in Fig. 6. Fragment screening methods are described in
more detail in Chapter 7.
An absolute requirement for the application of crystallographic
fragment screening is a target protein that is amenable to crystal-
lization (in its apo-form if crystal soaking experiments are used to
introduce fragments to the target). Moreover, the crystals must
routinely diffract to an adequate resolution (beyond 2.5 Å) to pro-
vide a detailed picture of the targeted binding pocket in the pro-
tein and the binding modes of bound fragments, and optimally, to
resolve ordered waters. When crystal soaking methods are
employed, the crystals must possess sufficient mechanical stability
to withstand the osmotic pressure generated during exposure to
concentrated fragment solutions, and provide unblocked access to
the target site. Once these prerequisites are met, crystallographic
fragment screening provides insights not offered by other screen-
ing techniques. Crystallographic experiments generate electron
density maps showing the binding mode of the fragment to the
protein target in three dimensions. A detailed knowledge of the
Fig. 6. A schematic representation of crystallographic fragment screening with random fragments. Fragment screening
libraries typically contain <1,000 structurally diverse compounds that meet the following criteria: (1) molecular
weights <300 Da, (2) fewer than three H-bond donors/acceptors, polar surface area <60 Å², fewer than three rotatable bonds, (3)
cLogP <3. The libraries can be screened using mixtures of 3–5 compounds, or using individual fragment solutions. (a) Two
fragments (the square and triangle) are shown binding to distinct regions of a target receptor binding-pocket. Based on
crystal structures of the fragment-receptor complexes, several optimization strategies can be employed. Fragments bound
to spatially distinct regions of the receptor can be linked as in (b) to form a more potent inhibitor, with an inhibition constant
(Ki) proportional to the product of the Ki values of the individual fragments. Or, as in (c), individual fragments can be optimized
to improve the steric and chemical fit with the receptor pocket. Additionally, fragments can be elaborated or “grown” into
adjacent pockets in the receptor site, as shown in (d).
key binding interactions, available analoging vectors off the bound
fragment(s) and accessible space in the protein defines the spatial
and chemical constraints on fragment optimization, streamlining
the optimization process. Additionally, crystallographic fragment
screening facilitates the identification of false positive hits and frag-
ments that bind to nonproductive sites on the protein target.
Conversely, fragment screening can identify novel binding sites
that impact protein function with therapeutic development poten-
tial. The weakness of X-ray crystallography as a screening method
is that it does not provide information on binding affinity; these
data must be obtained with a different technique (e.g., a solution-based
assay, as described in Chapter 8). However, obtaining the
binding affinity of initial small fragment hits may be intractable and
testing for potency may only become feasible once elaborated
follow-on compounds are available.
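The library criteria listed in the Fig. 6 caption, and the idealized fragment-linking relationship mentioned there, can be expressed as a short filter. The sketch below assumes that the molecular descriptors have already been computed elsewhere, reads "fewer than three H-bond donors/acceptors" as applying to donors and acceptors separately, and treats the linked inhibition constant as the simple product of the fragment constants (in molar units, ignoring linker strain and entropy). The fragment names and descriptor values are invented.

```python
from dataclasses import dataclass

@dataclass
class Fragment:
    name: str
    mol_weight: float          # Da
    h_bond_donors: int
    h_bond_acceptors: int
    polar_surface_area: float  # Å^2
    rotatable_bonds: int
    clogp: float

def passes_fragment_filter(frag: Fragment) -> bool:
    """Apply the Fig. 6 caption criteria (one interpretation; see lead-in)."""
    return (
        frag.mol_weight < 300.0
        and frag.h_bond_donors < 3
        and frag.h_bond_acceptors < 3
        and frag.polar_surface_area < 60.0
        and frag.rotatable_bonds < 3
        and frag.clogp < 3.0
    )

def idealized_linked_ki(ki_a_molar: float, ki_b_molar: float) -> float:
    """Idealized Ki of two linked fragments: product of the individual Ki values (molar)."""
    return ki_a_molar * ki_b_molar

# Invented fragments with made-up descriptor values.
library = [
    Fragment("frag_A", 152.2, 1, 2, 46.0, 1, 1.2),
    Fragment("frag_B", 342.4, 2, 4, 88.0, 5, 3.6),
]
print([f.name for f in library if passes_fragment_filter(f)])
print(idealized_linked_ki(1e-3, 5e-4))  # two weak fragments, Ki in mol/L
```

The product rule is an upper bound on what linking can deliver; in practice the gain is usually smaller because the linker rarely holds both fragments in their optimal poses.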
By utilizing custom designed chemical fragment libraries based
on a known target pharmacophore model (as summarized in
Fig. 7), pharmacophore-based fragment screening differs philo-
sophically from the use of random chemical fragments in screen-
ing. Small molecules are designed or selected from commercial
libraries, to key in on specific H-bonding, electrostatic, lipophilic,
or π–stacking interactions in the receptor pocket of the target. The
same criteria used in random fragment library design for fragment
Fig. 7. Schematic outline of pharmacophore-based fragment screening. In this example, a simplified receptor-binding
pocket is shown that contains a closely spaced pair of H-bond donor/acceptor moieties comprising the pharmacophore
used to guide fragment library design. (a) Owing to steric incompatibilities, larger molecules frequently cannot bind to the
target, despite containing a potentially useful core. (b) To circumvent the problems observed in (a), and to find novel inhibi-
tor cores to form the basis for potent inhibitors, pharmacophore-based screening methods are effective. Using cheminfor-
matics software such as MOE™ (23), novel molecules can be designed and synthesized that contain the desired
pharmacophore, or commercial libraries can be searched for small-molecule entities with the target pharmacophore.
Crystallographic screening methods are then used to screen potential candidates for hits. (c) Using the three-dimensional
structural information describing the binding modes of fragment hits, the fragments are modified and elaborated with new
chemical groups to improve the fit between inhibitor and receptor, and to engage additional pockets in the receptor with
potency increasing interactions.
size, solubility, etc. are applied to the custom designed libraries
used in pharmacophore-based screening. Starting with small frag-
ment units with known binding interactions, chemical lead series
can be rapidly discovered and optimized for drug-like properties.
The main advantages of designing fragments around a defined
pharmacophore are the creation of highly ligand efficient molecu-
lar scaffolds that target the most energetically rewarding interac-
tions (i.e., nonsolvent exposed, available polar interactions) of the
target receptor pocket, and the ability to design molecules that
achieve the desired selectivity profile by targeting a selected region
of the target receptor pocket. For example, if the product of inter-
est is a broad-spectrum antibiotic against a specific bacterial pro-
tein target, fragment libraries can be designed to engage only the
most conserved regions of the protein target across the different
bacterial species.
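Ligand efficiency, mentioned above, is commonly quantified as the binding free energy per non-hydrogen atom, LE = −ΔG / N_heavy with ΔG = RT ln Kd. That definition is not given in the text and is assumed here, along with a temperature of 298.15 K. The two example compounds are hypothetical.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal mol^-1 K^-1

def ligand_efficiency(kd_molar: float, heavy_atoms: int, temperature: float = 298.15) -> float:
    """Binding free energy per heavy atom (kcal/mol/atom), assuming LE = -RT ln(Kd) / N."""
    delta_g = GAS_CONSTANT * temperature * math.log(kd_molar)  # negative for Kd < 1 M
    return -delta_g / heavy_atoms

# Hypothetical compounds: a weak fragment and a potent, larger lead.
print(f"fragment (Kd = 1 mM, 12 heavy atoms): LE = {ligand_efficiency(1e-3, 12):.2f}")
print(f"lead (Kd = 10 nM, 30 heavy atoms):    LE = {ligand_efficiency(1e-8, 30):.2f}")
```

On this measure a millimolar fragment can be about as efficient as a nanomolar lead, which is why weak but well-placed fragments are considered good starting points.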
5. Using X-Ray Crystallography in Lead Optimization
The simplistic view of structure-guided lead optimization is that
structural information from crystallographic structures of complexes
of lead candidates with the protein target is used as an
in vitro assay of sorts, to optimize the potency of the lead against
the drug target. However, drug discovery requires optimization of
a number of properties, including solubility, intestinal absorption,
tissue distribution, metabolic stability, plasma protein binding,
elimination, toxicology, and cost of synthesis. To highlight the
importance of using structure-based methods in the broader con-
text of a drug discovery program, it is instructive to follow the
trajectory of neuraminidase inhibitor development that led to the
discovery of Tamiflu™. Influenza virus neuraminidase has long
been recognized as a potential target in the treatment of influenza.
Molecular modeling studies based on crystal structures of
neuraminidase inhibitor complexes suggested that substitution of
the 4-hydroxyl group (structure 1 in Fig. 8) in a compelling lead
molecule with a charged basic group would yield a more potent
inhibitor (25). Indeed, replacement of the hydroxyl group by a
basic guanidine (structure 2 in Fig. 8) resulted in a 5,000-fold
increase in potency. Ultimately, this compound (zanamivir) was
developed by GlaxoSmithKline and led to the first marketed
neuraminidase inhibitor used in the treatment of influenza,
Relenza™. However, Relenza™ is not absorbed orally due to its
high polarity and basicity, necessitating the development of a dry
powder inhaler to topically dose the compound in the lung (26).
Substitution of the dihydro-2H-pyran scaffold by cyclohexene and
replacement of the polar glycerol and basic guanidinyl moieties
with a 1-ethylpropoxy and a primary amine moiety, respectively,
Fig. 8. Structures of neuraminidase inhibitors: (1), lead molecule, (2), Zanamivir (Relenza™), (3), Oseltamivir (Tamiflu™).
The ester prodrug is cleaved hepatically to form the carboxylic acid. Semitransparent solvent accessible surface represen-
tations of the structures of both drugs bound to H5N1 avian influenza virus neuraminidase (24) (PDB codes 2HTQ and
2HT8) are shown to illustrate the interaction between the zanamivir guanidine and the active-site pocket, and to highlight
the conformational changes induced by the carbocyclic scaffold and ethylpropoxy group in oseltamivir upon drug binding
to the enzyme.
generate oseltamivir (sold as Tamiflu™, structure 3 in Fig. 8), a
smaller, less polar inhibitor than Relenza™ that retains sufficient
potency for efficacy. The improved ligand efficiency observed for
Tamiflu™ is due in part to the ethylpropoxy group, which induces
a conformational change in key active-site pocket residues and participates
in lipophilic interactions (24, 27). Conversion of the zwitterionic
parent compound to the ethyl ester prodrug allows the
compound to be administered orally, making Tamiflu™ the first
neuraminidase inhibitor used as an oral anti-influenza drug. The
improved physical property profile of Tamiflu™ vs. Relenza™
translated to considerable commercial success. In 2008, Tamiflu™
outsold Relenza™ by a ratio of 5:1 (www.marketresearchmedia.com).
This example highlights the difference between good inhibitors
and good drugs. When applying structure-based methods,
absorption, distribution, metabolism, and elimination (ADME)
properties need to be addressed during the quest for potency to
1 The Utility of Structural Biology in Drug Discovery 21
avoid complications later in the drug discovery process. In the case
of the neuraminidase example above, crystallographic studies
revealed a ligand-induced active-site conformation that allowed for
the design of a small, moderately polar, less charged molecule with
superior drug-like properties to the first generation drug. When
used in such a manner, structural information can play a powerful
role in drug discovery. X-ray crystallographic methods can provide
information about active-site or ligand-binding pocket architec-
ture and its relationship to the functional state of a protein, bind-
ing pocket plasticity and small-molecule binding modes that can
dramatically streamline lead optimization. Additionally, crystal
structures can play a key role in resolving unexpected
structure–activity relationships (SAR) arising from incorrect small-molecule
structure assignments, unanticipated small-molecule binding
modes or receptor plasticity. Additional examples highlighting
the utility of X-ray crystallography in lead optimization are pro-
vided below.
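The ligand efficiency mentioned in the neuraminidase example can be made concrete with a short calculation. The sketch below assumes the common definition of ligand efficiency as binding free energy per non-hydrogen atom, with the free energy estimated from a dissociation constant; the dissociation constant and atom counts are hypothetical values chosen for illustration and are not measurements for zanamivir or oseltamivir.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def ligand_efficiency(kd_molar: float, heavy_atoms: int, temperature_k: float = 298.15) -> float:
    """Binding free energy per non-hydrogen atom, in kcal/mol per heavy atom."""
    delta_g = R_KCAL * temperature_k * math.log(kd_molar)  # negative for Kd < 1 M
    return -delta_g / heavy_atoms


# Hypothetical inhibitors with the same affinity but different sizes.
larger = ligand_efficiency(kd_molar=1e-9, heavy_atoms=25)
smaller = ligand_efficiency(kd_molar=1e-9, heavy_atoms=20)
print(f"larger: {larger:.2f}, smaller: {smaller:.2f} kcal/mol per heavy atom")
```

At equal affinity, the smaller molecule scores higher, which is the sense in which a compact inhibitor that retains sufficient potency is the more efficient ligand.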
Once an experimental atomic structure of a protein–small mol-
ecule lead complex is in hand, available analoging vectors off the
lead molecule can be identified by the medicinal chemist and used
to guide the synthesis of the next molecule, as described in Fig. 7.
Vectors pointing toward empty pockets in the receptor can be filled
by complementary groups to increase potency, while solvent-facing
vectors can be used to generate analogs with improved bulk prop-
erties or metabolic stability. Knowledge of the structural and
chemical landscape of the receptor pocket also focuses optimiza-
tion efforts on analogs that add potency mainly via enthalpic (i.e.,
polar) interactions vs. analogs that add potency via entropic (i.e.,
lipophilic) interactions, improving the prospects for selective inhib-
itor binding and reducing the probability of off-target mediated
toxicity. Structure-based methods can also play a key role in more
complex systems, where the protein target must be captured in a
specific functional and structural state to achieve efficacy or to
improve prospects for designing selective inhibitors. For example,
stem cell factor receptor, c-Kit, is a receptor protein-tyrosine kinase
that initiates cell growth and proliferation signal transduction cas-
cades in response to stem cell factor binding (28). The kinase is
activated and transphosphorylates via dimer formation mediated
by the binding of stem cell factor dimers to its extracellular domain.
Mutations that constitutively activate c-Kit in the absence of the
stem cell factor are implicated in several highly malignant human
cancers, making it a validated target for the development of anti-
cancer drugs (29). Detailed analysis of the crystal structures of
c-Kit in multiple functional states (30), including an autoinhibited
form, an activated form, and a drug-bound form, reveals that the
kinase adopts discrete structural states when transitioning from an
autoinhibited to an activated state (see Fig. 9). The structural
results provide a detailed molecular basis for understanding the
Fig. 9. Ribbon representations of crystal structures of c-Kit kinase in three forms; (a) activated, in complex with ADP and
Mg2+, (b) unphosphorylated, containing the entire juxtamembrane region (autoinhibited state), and (c) in complex with the
anticancer drug Gleevec™. The mobile activation loop is colored black in each panel. Gleevec™ is a fairly selective inhibi-
tor that binds to few kinases, including Abl kinase and platelet-derived growth factor kinase (31, 32). The basis for this
selectivity stems from the fact that the inhibitor targets a kinase conformation that resembles the inactive state (compare
the activation loop conformations in (b) and (c)). However, Gleevec™ binding disrupts the fully autoinhibited state by pre-
venting the association of the juxtamembrane domain with the kinase domain. These results demonstrate that selective
inhibitors of type III protein-tyrosine kinases can be developed to exploit the unique autoinhibited conformations of these
kinases.
mechanism of c-Kit kinase autoinhibition, and snapshots of unique
structural states that are exploitable for the structure-based design
of specific and potent inhibitors targeting the activated or autoin-
hibited conformations of c-Kit kinase, as exemplified by the struc-
ture of c-Kit bound to the anticancer drug Gleevec™ described in
the study.
Many examples exist in the literature that demonstrate the
power of crystallographic methods in revealing receptor plasticity
in protein drug targets resulting in surprising ligand-induced con-
formational changes and inhibitor SAR. A case in point is a mem-
ber of the human histone deacetylase (HDAC) protein family, a
series of validated oncology targets. In eukaryotes, HDACs modu-
late the acetylation of histones and hence play a key role in the
regulation of gene expression (33). HDAC deregulation has been
linked to several types of cancer, and recently, the HDAC inhibitor
suberoylanilide hydroxamic acid (SAHA) was approved for the
treatment of cutaneous T-cell lymphoma (34). Crystal structures
of several inhibitor bound complexes of human HDAC8 (35)
reveal that the surface of the active-site pocket contains flexible
Fig. 10. Solvent accessible surface representations around the active-site pockets of the structures of complexes of HDAC8
with (a) trichostatin A (TSA) and (b) the anticancer drug suberoylanilide hydroxamic acid (SAHA). TSA induces dramatic
conformational changes in several surface elements of HDAC8, creating a second deep pocket adjacent to the active-site
pocket. A second TSA molecule occupies the newly formed pocket.
elements that can adopt diverse conformations in response to
inhibitor binding. In one of the complexes (see Fig. 10), a loop on
the protein surface moves, revealing a deep pocket adjacent to the
active-site pocket. This work suggests that HDAC8 inhibitors
could be designed with isoform selectivity, despite the highly con-
served nature of HDAC active-site pockets (35). The HDAC
example highlights the importance of using crystallographic meth-
ods for the characterization of novel, low energy conformational
states of protein drug targets that can be exploited for the design
of selective inhibitors. Such insights would not be possible without
the detailed information provided by X-ray crystallography.
In addition to the characterization of receptor plasticity and
the correlation of protein functional states with their underlying
structures, careful application of crystallographic methods can be
used to resolve very detailed questions relating to small-molecule
inhibitor structure, binding mode, and, in many cases, ionization
state. When high-resolution (typically <2.2 Å) X-ray data are avail-
able, cases of mistaken ligand identity can be resolved, or the exact
stereochemistry of a protein-bound small molecule (from a mix-
ture of isomers) can be determined unambiguously, revealing the
stereochemical preferences of the receptor pocket. In favorable
cases, small-molecule ligands can even be fit to electron density
without prior knowledge of the structure of the small molecule.
Additionally, the experimentally observed conformations of inhibi-
tors bound to protein targets can be subjected to in silico confor-
mational analysis to reveal cases where ligand binding to the target
incurs a significant energetic penalty, resulting in reduced inhibitor
potency. Based on the results, molecular modeling can be used to
design optimized inhibitors that “preorganize” into competent
binding conformations in solution, allowing for the development
Fig. 11. Demonstration of how crystallographic structural information can be used to deduce the protonation states of
ionizable moieties and the hydrogen bonding networks between inhibitors and their protein targets. (a) A close-up view of
the binding of a naphthylsulphonyl-amidino-phenylalanine inhibitor to bovine β-trypsin from the high-resolution crystal
structure (36). For clarity, only the interactions between the side chains for the serine protease catalytic triad and the inhibi-
tor are shown. Potential hydrogen bonds are depicted as dotted lines. As shown in (b), the carboxylic acid moiety of the
inhibitor must be protonated, since the hydroxyl proton of Ser195 engages the imidazole ring of His57. The protonation state
of His57 is locked as shown by its interaction with Asp102.
of inhibitors with improved potency, often without increasing
molecular weight.
Small-molecule lead optimization is hampered if the protona-
tion states of key acidic and basic amino-acid side chains and/or
the tautomeric states of ionizable groups on protein-bound small-
molecule inhibitors are not understood. Assuming that the param-
eters for amino-acid side chains in a folded protein are similar to
their counterparts in solution is not always correct; the pKa values
for basic and acidic amino-acid side chains on a protein interior can
shift dramatically (>2 units) from their value in aqueous solution,
based on the microenvironment created around the residue by the
protein structure (36). Furthermore, the pKa values of amino-acid
side chains can change upon complexation by an inhibitor, as can
the pKa values of the ionizable groups on the inhibitor. Although
impossible to determine experimentally from a protein crystal
structure, the positions (or presence, in the case of groups with
exchangeable protons) of hydrogen atoms can usually be inferred
from the molecular mechanics force fields used in structure refine-
ment and a careful analysis of the environment surrounding the
group in question. An example of how X-ray structural informa-
tion can be used to deconvolute complex hydrogen-bonding
networks and assign protonation states to ionizable groups is
shown in Fig. 11. Crystallographic structures and isothermal titra-
tion calorimetry were used to map the network of hydrogen bonds
of several thrombin and trypsin inhibitors, as well as the protona-
tion states of ionizable inhibitor moieties and active-site side
chains (37).
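The practical consequence of a shifted pKa can be illustrated with the Henderson-Hasselbalch relationship. The following is a minimal sketch, not a method taken from the cited studies; the pKa values and pH are hypothetical and serve only to show how a shift of two units changes the protonated population of an acidic group.

```python
def fraction_protonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch: fraction of an ionizable group in its protonated form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical carboxylic acid: pKa 4.0 in solution, shifted to 6.0 in a binding pocket.
in_solution = fraction_protonated(pka=4.0, ph=7.0)
in_pocket = fraction_protonated(pka=6.0, ph=7.0)
print(f"protonated in solution: {in_solution:.4f}, in pocket: {in_pocket:.4f}")
```

Under these assumed values the protonated fraction rises from roughly 0.1% to roughly 9%, so a group treated as fully ionized in solution may contribute a meaningful protonated population when bound.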
6. Summary
Structure-based drug design is now a staple in the pharmaceutical
industry and has contributed to the discovery of many marketed
drugs and late-stage clinical candidates. Access to detailed three-
dimensional structural information on protein drug targets can
streamline many aspects of drug discovery, from target selection
and target product profile determination, to the discovery of
novel molecular scaffolds that form the basis of potential drugs,
to lead optimization. Structural biology is currently in its golden
era; the advent of high-throughput methods for all of the steps
involved in the generation of protein crystal structures allows
empirically derived structural information to drive iterative lead
optimization efforts in real time for a wide range of protein
targets, avoiding many of the limitations that plague molecular
modeling techniques. Crystallographic methods are useful for
characterizing the structures correlated with specific functional
states of protein targets or alternative conformations of receptor
pockets that lead to unique structural states. Such information
can be leveraged to develop exquisitely selective small-molecule
ligands that target specific proteins, even in closely related pro-
tein families. When used carefully in conjunction with ADME
data during the lead optimization process, X-ray crystallographic
methods are an extremely powerful tool in the drug discovery
arsenal that will continue to contribute to the invention of new
medicines in diverse therapeutic areas.
References
1. Pharmaceutical Manufacturers Association (1993) ‘Facts at a Glance’, Washington DC.
2. Grabowski, H. J. G. and Vernon, J. M. (1994) Returns to R&D on new drug introductions in the 1980s. J. Health Econ. 13, 282–406.
3. Gustafsson, D., Byland, R., Antonsson, T., Nilsson, I., Nystrom, J.-E., Eriksson, E., Bredberg, U. and Teger-Nilsson, A.-C. (2004) A new oral anticoagulant: the 50-year challenge. Nature Rev. Drug. Discov. 3, 649–659.
4. McCoy, A. J., Grosse-Kunstleve, R. W., Adams, P. D., Winn, M. D., Storoni, L. C. and Read, R. J. (2007) Phaser crystallographic software. J. Appl. Cryst. 40, 658–674.
5. Blundell, T. L. and Johnson, L. N. (1976) In Protein Crystallography. Academic Press, New York.
6. Stout, G. H. and Jensen, L. H. (1989) In X-ray Structure Determination: A Practical Guide. 2nd ed. Wiley, New York.
7. Drenth J. (1999) In Principles of protein x-ray crystallography. 2nd ed. Springer, New York.
8. Stout, G. H. and Jensen, L. H. (1989) In X-ray Structure Determination: A Practical Guide. 2nd ed. Wiley, New York, Chapters 7–9.
9. Winn, M. D., Murshudov G. N. and Papiz, M. Z. (2003) Macromolecular TLS refinement in REFMAC at moderate resolutions. Methods Enzymol. 374, 300–321.
10. Drenth J. (1999) In Principles of protein x-ray crystallography. 2nd ed. Springer, New York, pp. 89–90.
11. Emsley, P. and Cowtan K. (2004) Coot: model-building tools for molecular graphics. Acta Cryst. D60, 2126–2132.
12. Brünger, A. T. (1992) Free R value: a novel statistical quantity for assessing the accuracy of crystal structures. Nature 355, 472–475.
13. Jia, Z., Vandonselaar, M., Quail, J. W. and Delbaere, L. T. J. (1993) Active-center torsion-angle strain revealed in 1.6 Å-resolution structure of histidine-containing phosphocarrier protein. Nature 361, 94–97.
14. Bersio, R., Lazmin, V. S., Sica, F., Wilson, K. S., Zagari, A. and Mazzarella, L. (1999) Protein titration in the crystal state. J. Mol. Biol. 292, 845–854.
15. Payne, D. J., Gwynn, M. N., Holmes, D. J. and Pompliano, D. L. (2007) Drugs for bad bugs: confronting the challenges of antibacterial discovery. Nat. Rev. Drug. Discov. 6, 29–40.
16. Champoux, J. J. (2001) DNA topoisomerases: structure, function and mechanism. Annu. Rev. Biochem. 70, 369–413.
17. Peng, H. and Marians, K. J. (1993) Escherichia coli topoisomerase IV. Purification, characterization, subunit structure and subunit interactions. J. Biol. Chem. 268, 24481–24490.
18. Wolfson, J. S. and Hooper, D. C. (1985) The fluoroquinolones: structures, mechanisms of action and resistance, and spectra of activity in vitro. Antimicrob. Agents Chemother. 28, 581–586.
19. Oblak, M., Kotnik, M. and Solmajer, T. (2007) Discovery and Development of ATPase Inhibitors of DNA Gyrase as Antibacterial Agents. Curr. Med. Chem. 14, 2033–2047.
20. Kanehisha, M., Goto, S., Kawashima, S., Okuno, Y. and Hattori, M. (2004) The KEGG resource for deciphering the genome. Nucleic Acids Res. 32, 277–280.
21. Thompson, J. D., Higgins, D. G. and Gibson, T. J. (1994) CLUSTALW: improving the sensitivity of progressive multiple sequence alignments through sequence weighting, position specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.
22. Hann, M. M., Leach, A. R. and Harper, G. (2001) Molecular complexity and its impact on the probability of finding leads for drug discovery. J. Chem. Inf. Comput. Sci. 41, 856–864.
23. Labute, P. and Clark A. M. (2007) 2D Depiction of Protein-Ligand Complexes. J. Chem. Inf. Model 47, 1933–1944.
24. Russell, R. J., Haire, L. F., Stevens, D. J., Collins, P. J., Lin, Y. P., Blackburn, G. M., Hay, A. J., Gamblin, S. J. and Skehel, J. J. (2006) The structure of avian flu neuraminidase suggests new opportunities for drug design. Nature. 443, 45–49.
25. von-Itzstein, M., Wu, W. Y., Kok, G. B., Pegg, M. S., Dyason, J. C., Jin, B., Van Phan, T., Smythe, M. L., White, H. F., Oliver, S. W., Colman, P. M., Varghese, J. N., Ryan, D. M., Woods, R. C., Bethell, R. C., Hotham, V. J., Cameron, J. M and Penn, C. R. (1993) Rational design of potent sialidase-based inhibitors of influenza virus replication. Nature. 363, 418–423.
26. (2001) In Physicians’ Desk Reference. 55th ed. Medical Economics Company Inc. Montvale, NJ, p. 1454.
27. Kim, C. U. Lew, W., Williams, M. A., Liu, H., Zhang, L., Swaminathan, S., Bischofberger, N., Chen, M. S., Mendel, D. B., Tai, C. Y., Laver, W. G. and Stevens, R. C. (1997) Influenza neuraminidase inhibitors possessing a novel hydrophobic interaction in the enzyme active-site: design, synthesis, and structural analysis of carbocyclic sialic acid analogues with potent anti-influenza activity. J. Am. Chem. Soc. 119, 681–690.
28. Linnekin, D. (1999) Early signaling pathways activated by c-Kit in hematopoietic cells. Int. J. of Biochem. Cell Biol. 31, 1053–1074.
29. Hirota, S., Isozaki, K., Moriyama, Y., Hashimoto, K., Nishida, T., Ishiguro, S., Kawano, K., Hanada, M., Kurata, A., Takeda, M., Tunio, G. M., Matsuzawa, Y., Kanakura, Y., Shinomura, Y. and Kitamura, Y. (1998) Gain of function mutations of c-Kit in human gastrointestinal stromal tumors. Science. 279, 577–580.
30. Mol, C. D., Dougan, D. R., Schneider, T. R., Skene, R. J., Krause, M. L., Schiebe, D. N., Snell, G. P., Zou, H., Sang, B.-C. and Wilson, K. P. (2004) Structural Basis for the autoinhibition and ST-571 inhibition of c-Kit tyrosine kinase. J. Biol. Chem. 279, 31655–31663.
31. O’Dwyer, M. E., Mauro, M. J., and Druker, B. J. (2003) STI571 as a targeted therapy for CML. Cancer Investig. 3, 429–438.
32. Buchdunger, E., Cioffi, C. L., Law, N., Stover, D., Ohno-Jones, S., Druker, B. J. and Lydon, N. B. (2000) Abl protein-tyrosine kinase inhibitor STI571 inhibits in vitro signal transduction mediated by c-kit and platelet-derived growth factor receptors. J. Pharmacol. Exp. Ther. 295, 139–145.
33. Khochbin, S., Verdel, A., Lemercier, C. and Seigneurin-Berny, D. (2001) Functional significance of histone deacetylase diversity. Curr. Opin. Genet. Dev. 11, 162–166.
34. Marks, P. A. and Xu, W. S. (2009) Histone-deacetylase inhibitors: potential in cancer therapy. J. Cell Biochem. 107, 600–608.
35. Somoza, J. R., Skene, R. J., Katz, B. A., Mol, C. D., Ho, J. D., Jennings, A. J., Luong, C., Arvai, A., Buggy, J. J., Chi, E., Tang, J., Sang, B. C., Verner, E., Wynands, R., Leahy, E. M., Dougan, D. R., Snell, G., Navre, M., Knuth, M. W., Swanson, R. V., McRee, D. E. and Tari, L. W. (2004) Structural snapshots of human HDAC8 provide insights into the class I histone deacetylases. Structure 12, 1–20.
36. Harris, T. K. and Turner, G. J. (2002) Structural basis of perturbed pKa values of catalytic groups in enzyme active sites. IUBMB Life 53, 85–98.
37. Dullweber, F., Stubbs, M. T., Musil, D., Sturzebecher, J. and Klebe, G. (2001) Factorising ligand affinity: A combined thermodynamic and crystallographic study of trypsin and thrombin inhibition. J. Mol. Biol. 313, 593–614.
Chapter 2
Genetic Construct Design and Recombinant Protein
Expression for Structural Biology
Suzanne C. Edavettal, Michael J. Hunter, and Ronald V. Swanson
Abstract
Obtaining diffraction quality crystals is frequently an iterative process which traditionally has involved
screening large numbers of crystallization conditions. Due to advances in high-throughput gene engi-
neering, recombinant expression, and purification, the protein of interest has now become one of the
many variables routinely investigated during crystallization trials. As such, construct design is a critical step
in the path toward successful crystallization. In this chapter we will address construct design strategies
frequently employed to improve the solution and crystallization behavior of proteins. Topics covered
include choosing a recombinant expression system and reducing disorder through truncations and surface
mutagenesis. Also covered are strategies to reduce heterogeneity from posttranslational modifications,
impurities, and aggregation.
Key words: Protein Expression Constructs, Recombinant Protein Expression, X-ray crystallography,
protein crystallization
1. Introduction
A protein crystal represents a homogeneous population of three-
dimensionally arrayed protein molecules. To maximize the likeli-
hood of crystallizing a protein from a solution, it is important to
minimize the heterogeneity of the sample. A common misconception
is that homogeneity is synonymous with purity at the level of
protein contaminants. For protein crystallization, equating
purity with homogeneity is an oversimplification.
While purity is fundamentally important, the advent of recombi-
nant methods for overexpression, coupled with affinity chroma-
tography tags, has made achieving highly pure protein samples less
problematic than in the past. In addition, there are other, perhaps
Leslie W. Tari (ed.), Structure-Based Drug Discovery, Methods in Molecular Biology, vol. 841,
DOI 10.1007/978-1-61779-520-6_2, © Springer Science+Business Media, LLC 2012
30 S.C. Edavettal et al.
underappreciated, protein characteristics where heterogeneity
arises. At the conformational level, loops and/or the termini of the
protein may exhibit disorder, and in larger multi-domain proteins
the individual domains may be flexible relative to one another lead-
ing to many conformational isomers of the protein in solution and
representing a barrier to crystallization due to structural heteroge-
neity. Poor crystal formation can also arise from aggregation or
changes in monodispersity representing higher order solution state
heterogeneity. Heterogeneity may also be present at the chemical
level of the protein from posttranslational modifications, such as
phosphorylation or glycosylation, which are often incomplete or
heterogeneous. Proteolysis during either purification or expression
can also introduce heterogeneity. In addition, nonenzymatic chem-
ical modifications such as oxidation can occur, creating subspecies
of closely related, difficult to distinguish, contaminants. These
sources of heterogeneity can be addressed by choosing an appropriate
expression system, designing proper construct boundaries,
introducing advantageous mutations, and applying an effective
purification strategy, the last of which is addressed in Chapter 3.
Frequently, generating a truly homogeneous crystallizable protein
solution will require exploring several constructs and variables. In
this chapter we explore the considerations used in determining
construct design as well as vector–host combinations aimed at
minimizing heterogeneity and generating homogeneous protein
solutions that will lead to well-diffracting protein crystals.
2. Construct Design
Protein engineering is a powerful tool for improving protein
physicochemical properties, leading to proteins that are more stable,
more soluble, and more likely to crystallize. The concept of
the protein as a variable in protein crystallization (1) is important
and has been enabled by modern molecular biology techniques.
Downstream success hinges primarily on rational construct
design, which can be undertaken independently of vector/host
choice. The overall goal in construct design is to produce large
quantities of homogeneous, soluble protein with a high likelihood
of crystallizing. An important decision in this effort is the choice of
expression boundaries as all other changes take place within these
confines. The degree of difficulty in determining domain boundar-
ies depends on the degree of similarity of the protein of interest to
other proteins and, in particular, to proteins of known structure.
Side chain, loop, or termini flexibility (i.e., changes in entropy) can
also be addressed in construct design. Alterations of surface exposed
residues, both hydrophilic and hydrophobic, can lead to better
behaved protein solutions and protein crystals. Posttranslational
2 Genetic Construct Design and Recombinant Protein… 31
modifications, whether added or removed, can have
dramatic results with respect to protein stability and good crystal
formation. Even the isolation of protein domains out of the con-
text of the larger protein can lead to better crystal formation when
inherent inter-domain flexibility hinders crystallization efforts. In
this section, we will address these considerations individually and
give the reader a better sense of how the researcher addresses, and
ultimately strives to overcome, these issues.
3. Boundaries
Boundary determination often represents the crucial choice for
producing sufficient quantities of soluble, high-quality protein for
crystallographic purposes. For small prokaryotic proteins the native
termini are often ideal. However, for more complex eukaryotic
proteins, crystallization of truncated protein or of individual
domains for multi-domain proteins is often more appropriate. It is
difficult to predict the boundaries of a domain of interest or the
proper truncation of a protein’s termini from the primary sequence
in isolation. Analysis of the similarity of the protein of interest to
other family members and in particular to proteins of known struc-
ture provides the template for proper boundary choices. The first
step is to query the sequence of interest against the Protein
Databank (http://www.pdb.org/pdb/home/home.do). This
allows a quick assessment of the closest homologs with published
structure. If reasonable hits are found, a close approximation of
proper boundaries is already at hand through design of a construct
with similar termini. Refinement of boundary choice can be com-
pleted by analyzing whether the termini used in the database struc-
tures are ordered. Multiple sequence alignments also represent an
important analytical approach, as primary sequence similarity is a
strong predictor of structural similarity. From a structural biology
viewpoint, their utility comes in the ability to visually illustrate
conserved and variable sites within a protein family that typically
correspond to structurally important and dispensable features,
respectively. Commonly used alignment tools include algorithms
such as ClustalW2, Tree-based Consistency Objective Function for
alignment Evaluation (T-Coffee), or multiple sequence compari-
son by log-expectation (MUSCLE) (2, 3).
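The idea of reading conserved and variable sites from a multiple sequence alignment can be sketched in a few lines. The example below is an illustrative assumption rather than the scoring used by any of the tools named above: it scores each column by normalized Shannon entropy, treating the gap character as one more symbol, and the three aligned sequences are invented for the purpose of the example.

```python
import math
from collections import Counter


def column_conservation(aligned: list[str]) -> list[float]:
    """Score each alignment column from 0 (maximally variable) to 1 (invariant)."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    max_entropy = math.log2(21)  # 20 amino acids plus the gap character
    n = len(aligned)
    scores = []
    for i in range(length):
        counts = Counter(seq[i] for seq in aligned)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores


# Hypothetical pre-aligned sequences; '-' marks an insertion/deletion gap.
toy_alignment = ["MK-LVAG", "MKTLVSG", "MR-LIAG"]
scores = column_conservation(toy_alignment)
for position, score in enumerate(scores, start=1):
    print(position, round(score, 2))
```

Columns that score close to 1 are candidates for structurally important positions, while low-scoring or gapped columns point to regions where truncation or loop modification is more likely to be tolerated.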
A more sophisticated approach entails homology or compara-
tive protein structural modeling (4). This method produces an all-
atom model of a sequence based on its alignment to one or more
related protein structures. Either sequential or simultaneous mod-
eling of the core of the protein as well as loops and side chains can
greatly facilitate the identification of not only end terminal boundar-
ies, but also the identification of surface exposed side chain residues
and potential flexible loops. It should be noted, however, that many
loops are readily predictable from insertion/deletion gaps in
the multiple sequence alignments. Templates for comparative
model building are often found by sequence alignment methods
such as BLAST, PSI-BLAST, FASTA or SALIGN. Once identified,
the atomic coordinates of the templates and a short script file are
fed into a computer program for comparative protein structure
modeling such as MODELLER. This program implements com-
parative protein structure modeling by the satisfaction of spatial
restraints that are input by the user. MODELLER can also per-
form a number of auxiliary tasks including the calculation of phy-
logenetic trees, alignment of two protein sequences or their profiles,
multiple alignments of protein sequences or their profiles, multiple
alignments of protein sequence and/or structures, and de novo
modeling of loops in protein structures. Comparative modeling
remains the most reliable method for predicting the
three-dimensional structure of a protein.
Recently, Mooij et al. presented a web-based tool, ProteinCCD
(CCD: Crystallographic Construct Design) which consolidates
common tools in structural biology into a single platform that
enables comparative analysis of the sequence and allows the design
of oligonucleotides for PCR amplification of the chosen protein
constructs (5). This suite is divided into four groups of sequence
analysis tools, predicting secondary structure, disordered regions,
structural motifs, and flexible domains. Secondary structure pre-
diction uses primary sequence information to predict stretches of
sequences that are likely to be beta-strands or alpha-helices in the
three-dimensional structure and, thus, should not be disrupted.
The second group of tools employs algorithms that aim to predict
disordered regions in the protein primary sequences. Rigidifying
or deleting loop structures predicted to be flexible can improve
crystal formation. Predicting specific features of a protein sequence
including transmembrane topology, signal peptides, or regions of
coiled-coils will also aid in construct design by identifying regions
which should have structural rigidity and therefore should not be
mutated. The Simple Modular Architecture Research Tool (SMART)
and the domain Linker Predictor are used to analyze domain struc-
ture and identify genetically mobile domains. The information
output displays a condensed view of all results against the protein
sequence where the researcher can analyze the data and choose,
interactively, possible construct boundaries.
These computational methods normally yield multiple possible
termini for any given protein. Even in the best-case scenario
where a crystal structure of a close homolog exists as a guide, the
choice of the precise starting or ending residues may be difficult. In
some cases, one terminus may be better defined than the other. It
is often advisable to bring forward multiple constructs to test
different hypotheses. The expression system, the throughput of
the lab, and the ambiguity of the alignment all impact the number
of clones that should be generated and evaluated. Ideally, high-
throughput expression and purification techniques can be
employed to efficiently assess the quality of each construct. High-
throughput melting temperature analysis has become a method of
choice for ranking the propensity for crystallization of such
constructs, as thermal stability has been correlated with
crystallizability (6). However, expression level very often provides a good
surrogate assessment of the behavior of the protein, with better expressers
often proving to be better crystallizers. The number of constructs can
usually be limited by focusing on the most aggressively truncated
candidates; in most cases, less is more.
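When several candidate N- and C-termini are carried forward, the resulting construct panel is simply every valid pairing of a start and an end residue. The sketch below is a hypothetical helper, not part of ProteinCCD or any other tool described here; the sequence and boundary positions are invented, and residue numbering is assumed to be 1-based and inclusive.

```python
from itertools import product


def enumerate_constructs(sequence: str, n_starts: list[int], c_ends: list[int]):
    """Return (start, end, subsequence) for every valid terminus pair, shortest first."""
    constructs = []
    for start, end in product(n_starts, c_ends):
        if 1 <= start <= end <= len(sequence):
            constructs.append((start, end, sequence[start - 1:end]))
    # Shortest constructs first, reflecting a preference for aggressive truncation.
    return sorted(constructs, key=lambda c: c[1] - c[0])


# Hypothetical 30-residue sequence with two candidate boundaries at each terminus.
example_sequence = "MSEQNLKTGAVLDPRYFHWCIEMSKQTGAV"
constructs = enumerate_constructs(example_sequence, n_starts=[1, 5], c_ends=[25, 30])
for start, end, subsequence in constructs:
    print(f"{start}-{end}: {subsequence}")
```

Two candidate boundaries at each terminus already give four constructs, which shows how quickly the panel grows and why throughput limits the number of hypotheses that can be tested.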
In instances of truly novel sequences where computational or
comparative methods do not provide guidance, empirical
approaches may be employed. Boundaries can be based on infor-
mation from limited proteolysis/mass spectrometry (LPMS) where
a time course of digestion with a protease such as chymotrypsin,
subtilisin, or endo-Glu-C is followed by mass spectrometry to
identify stable proteolytic fragments. This powerful method can
identify exposed flexible termini or loop structures that can be
problematic for good crystal formation. Information about cleav-
age sites gained from limited proteolysis can be translated into new
construct design to eliminate inherently flexible regions creating a
more minimalist rigid structure and increasing the likelihood of
successful structure determination.
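Mapping a stable proteolytic fragment back onto the sequence amounts to finding the residue span whose calculated mass matches the observed mass. The following is a minimal sketch under several stated assumptions: the fragment is unmodified, approximate average residue masses are used, the tolerance is arbitrary, and the sequence is hypothetical. Real analyses would also account for protease specificity, modifications, and instrument accuracy.

```python
# Approximate average residue masses in Da (assumed values for illustration).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # mass added for the free N- and C-termini


def fragment_mass(fragment: str) -> float:
    """Average mass of an unmodified peptide fragment."""
    return sum(AVG_RESIDUE_MASS[residue] for residue in fragment) + WATER


def match_fragments(sequence: str, observed_mass: float, tol: float = 1.0):
    """Return 1-based inclusive (start, end) spans within tol Da of observed_mass."""
    hits = []
    for start in range(len(sequence)):
        mass = WATER
        for end in range(start, len(sequence)):
            mass += AVG_RESIDUE_MASS[sequence[end]]
            if abs(mass - observed_mass) <= tol:
                hits.append((start + 1, end + 1))
            elif mass > observed_mass + tol:
                break  # extending the span further only increases the mass
    return hits


# Hypothetical protein and a fragment mass calculated for residues 5-25.
protein = "MSEQNLKTGAVLDPRYFHWCIEMSKQTGAV"
observed = fragment_mass(protein[4:25])
print(match_fragments(protein, observed, tol=0.5))
```

More than one span can match a given mass within tolerance, so candidate boundaries from such a search are best treated as hypotheses to be confirmed, for example by N-terminal sequencing or by comparison across the digestion time course.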
A second empirical method, based on enhanced hydrogen/
deuterium exchange mass spectrometry, can be employed when
working with novel sequences (7). This method allows one to
identify regions of disorder in a protein through the enhanced
exchange rate of backbone amide hydrogens. Slower exchange
rates would suggest more highly structured regions whereas faster
exchange rates would be indicative of domain boundaries, flexible
loops, or disordered termini that could be adjusted or deleted in
subsequent construct design. Several examples have been pub-
lished outlining the utility of this approach (8). It is now routinely
employed by both the NESG and JCSG (9, 10).
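
As a rough illustration of how exchange data might feed back into construct design, the sketch below trims fast-exchanging termini from a per-residue exchange profile. It is a simplification under stated assumptions: real hydrogen/deuterium exchange data are typically reported per peptide rather than per residue, and the profile values and the 0.5 threshold used here are hypothetical.

```python
def ordered_core(exchange, threshold=0.5):
    """Return (start, end) as 1-based inclusive residue numbers of the region
    that remains after trimming fast-exchanging residues from both termini.

    `exchange` holds one relative exchange value per residue (0 = slow, 1 = fast).
    Returns None if no residue exchanges more slowly than `threshold`.
    """
    slow = [i for i, value in enumerate(exchange, start=1) if value < threshold]
    if not slow:
        return None
    return slow[0], slow[-1]


# Hypothetical profile: disordered termini flanking a structured core
# that contains one flexible internal loop residue (position 7).
PROFILE = [0.9, 0.8, 0.85, 0.2, 0.1, 0.3, 0.7, 0.2, 0.1, 0.9, 0.95]

print(ordered_core(PROFILE))
```

Only the termini are trimmed in this sketch; a fast-exchanging internal loop is kept and would have to be judged separately, since deleting it changes the fold rather than just the boundaries.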
4. Choosing an Expression System
Several expression systems are routinely used to generate recombi-
nant protein suitable for crystallization purposes, including bacte-
ria, insect cells, yeast, and mammalian cells. Initially, the choice of
expression system may be a balance of cost, ease of use, or the
complexity of the system. Since most cDNAs can be expressed in
many different systems, choosing a host is generally based on
expressed protein yields, desired posttranslational modifications,
34 S.C. Edavettal et al.
Table 1
Protein classes

Protein class (a)                                 Protein size (AA)   Expression system(s)
Peptides                                          <80                 E. coli (generally as fusion proteins)
Secreted proteins                                 80–500              All (proven track record in yeast and mammalian)
Large secreted proteins/cell surface receptors    >500                Mammalian
Non-secreted proteins                             >80                 All, based on individual nature of protein

(a) Arbitrary classification of proteins

and relevant purity. The choice of expression system begins with
arbitrarily assigning the target protein to one of four broad classes
(see Table 1) (11). The first class is small proteins and peptides that
are less than ~80 amino acids in length. These are generally best
expressed in bacteria with fusion partners that are enzymatically
removed post-purification. The second class comprises secreted
proteins that range in size from approximately 80–500 amino acids.
This class of proteins can be expressed in all expression systems but
is generally targeted for yeast, insect cell, and mammalian cell
expression. The third class is composed of secreted proteins that
are large (>500 amino acids). These are best expressed in mam-
malian cells as these cells contain the complex machinery needed
for processing and adding posttranslational modifications. Finally,
the fourth class is cytosolic proteins that are generally larger than
80 amino acids. The choice of expression system for this class of
protein depends on the nature of the protein to be expressed, as will
be discussed later where we address the advantages and disadvantages
of each system with respect to the specific class of protein.
While membrane proteins are not described here in detail, they are
rapidly becoming more common targets for crystallization studies.
The eukaryotic hosts are most commonly used for this purpose,
although several novel techniques have been described to express
membrane proteins in bacterial systems (12, 13). Finally, desired
posttranslational modifications, or lack thereof, can influence the
choice of expression system, with bacteria being the most limiting
host (see Table 2). However, the primary driver of choice of
expression system should always be the probability of success.
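
The four-class scheme of Table 1 can be written as a small lookup, sketched below. The function and dictionary names are hypothetical, and treating a protein of exactly 80 residues as belonging to the larger classes is an assumption, since the text gives the boundary only approximately.

```python
def protein_class(length_aa, secreted):
    """Assign a target to one of the four broad classes of Table 1."""
    if length_aa < 80:
        return "peptide"
    if secreted:
        return "large secreted" if length_aa > 500 else "secreted"
    return "non-secreted"


# Starting points taken from the text; the final choice still depends on
# the nature of the individual protein and the probability of success.
SUGGESTED_HOSTS = {
    "peptide": ["E. coli (as fusion protein)"],
    "secreted": ["yeast", "insect cells", "mammalian cells"],
    "large secreted": ["mammalian cells"],
    "non-secreted": ["E. coli", "yeast", "insect cells", "mammalian cells"],
}

for length, is_secreted in [(60, False), (300, True), (750, True), (300, False)]:
    cls = protein_class(length, is_secreted)
    print(length, is_secreted, cls, SUGGESTED_HOSTS[cls])
```

Such a lookup only narrows the initial candidates; required posttranslational modifications (Table 2) can still rule out a host that the size class would allow.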
4.1. Prokaryotic Expression Systems
Several factors may direct one toward a prokaryotic system, including
target proteins which are cytosolic, prokaryotic in origin, or
lacking in relative complexity, as well as the desire for no or limited
Table 2
Common posttranslational modifications in different host systems

Posttranslational modification    E. coli                    Insect cells   Yeast   Mammalian cells
Disulfide bond formation          Possible (a)               Yes            Yes     Yes
Proteolytic processing            Signal sequence removal    Yes            Yes     Yes
Phosphorylation                   Yes                        Yes            Yes     Yes
N-linked glycosylation            No                         Yes            Yes     Yes
O-linked glycosylation            No                         Yes            Yes     Yes
N-terminal methionine removal     Yes                        Yes            Yes     Yes

(a) Possible when expressed in host cells with thioredoxin reductase (trxB) and glutathione reductase (gor) mutations

posttranslational modifications. For most research labs, the choice
of bacterial cell host for recombinant protein expression is Escherichia
coli. This system is often chosen for simple economic considerations,
ease of use, and a large selection of vector/host combinations,
allowing one to tackle most protein expression situations. Hosts are
available that promote disulfide bond formation in the cytosol, that
allow titration of IPTG for more uniform expression, or that provide
rare-codon tRNAs for non-codon-optimized cDNAs.
A variety of expression vectors are available offering different
antibiotic resistances, induction protocols, and fusion partners.
Popular fusion partners include poly-histidine for immobilized
metal affinity chromatography purification; thioredoxin (Trx) for
disulfide bond formation in an oxidizing cytosol; signal peptides
or disulfide bond isomerase (Dsb) variants for periplasmic expression
and disulfide bond formation; and glutathione S-transferase
(GST), N utilization substance A (NusA), maltose-binding protein
(MBP), or small ubiquitin-related modifier (SUMO) for soluble
cytosolic expression of proteins and peptides (see Table 3). Also
offered, or engineered directly, are many choices for fusion partner
removal by proteolytic digestion. These include enterokinase,
thrombin, factor Xa, tobacco etch virus protease (TEV), and human
rhinovirus 3C protease (HRV3C). The choice of which protease to
use is often dictated by the desired mature N-terminus; however,
TEV usually represents a robust choice.
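
To see why the desired mature N-terminus constrains the choice of protease, consider the sketch below, which assembles a His-tagged fusion with a TEV site and simulates the cleavage. It assumes the canonical TEV recognition sequence ENLYFQ/G, cut between Gln and Gly; the target sequence and the helper names are hypothetical.

```python
HIS6 = "HHHHHH"
TEV_SITE = "ENLYFQG"  # canonical site; cleavage occurs between Q and G


def build_fusion(target, tag=HIS6, site=TEV_SITE):
    """Assemble Met + affinity tag + protease site + target sequence."""
    return "M" + tag + site + target


def cleave_tev(fusion, site=TEV_SITE):
    """Return (tag_fragment, mature_protein), or None if the site is absent."""
    index = fusion.find(site)
    if index == -1:
        return None
    cut = index + len(site) - 1  # index of the residue that follows Gln
    return fusion[:cut], fusion[cut:]


TARGET = "AKVLTDEQ"  # hypothetical target sequence
fusion = build_fusion(TARGET)
tag_fragment, mature = cleave_tev(fusion)
print(fusion)
print(tag_fragment, mature)
```

In this sketch the released protein retains one extra N-terminal glycine from the recognition site, which is the kind of remnant that must be weighed against the desired mature N-terminus.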
Direct cytosolic expression in bacteria is usually the method
of choice for a heterologous protein from the first and fourth
classes of proteins, as long as the target protein does not contain an
inordinate number of cysteines involved in native disulfide bonds.
The reducing environment of the bacterial cytosol does not allow
disulfide bond formation, and overexpression often leads to the
Table 3
Fusion tags and partner proteins

Fusion partner    Placement of tag   Approx. size (AA)   Advantages
Poly-histidine    N, C, I            6–10                Affinity purification
Trx               N                  110                 Disulfide formation
Signal peptide    N                  20                  Periplasmic expression, native folding
Dsb               N                  220                 Periplasmic expression, native folding, disulfide formation
SUMO              N                  100                 Cytoplasmic solubility, native N-termini
GST               N                  220                 Cytoplasmic solubility
NusA              N, I               500                 Cytoplasmic solubility
MBP               N                  400                 Cytoplasmic solubility

N, N-terminus; C, C-terminus; I, internal

formation of insoluble inclusion bodies that require solubilization
and refolding to yield the desired product. However, as mentioned
earlier, there are several vector choices that not only facilitate the
expression of a soluble fusion construct but also the formation of
disulfide bonds when used in conjunction with a host cell carrying
the thioredoxin reductase (trxB) and glutathione reductase (gor)
mutations that result in an oxidizing cytosol. There are several
examples in the literature where these combinations, along with
refolding chaperones, led to native disulfide bond formation
(14–16) and the mature protein was acquired following enzymatic
removal of the fusion partner.
Intracellular expression in E. coli yields a protein containing
the initiating methionine residue. Endogenous proteins are normally
processed by E. coli N-terminal methionine aminopeptidase
(MAP). For highly expressed recombinant proteins, this processing
can be rate-limiting, leading to N-terminal heterogeneity.
However, the “N-end rule” is often a good guide in construct
design when looking to enhance protein expression or to opti-
mize, or minimize, N-terminal methionine processing (17, 18).
Effective methionine processing has been shown to be directly
related to the radius of gyration of the penultimate residue.
Methionine cleavage decreases proportionally to the increase of
the minimal side chain length. Smaller residues such as Gly, Ala,
Ser, or Cys result in more efficient processing; intermediate
residues such as Thr, Pro, Val, Gln, or Glu result in less efficient
processing; and all other residues in the penultimate position result
in little to no methionine processing.
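
The residue grouping above lends itself to a simple lookup when screening construct designs. The sketch below encodes exactly the three groups given in the text; the function name and the category labels are hypothetical, and "penultimate" is taken to mean the residue immediately following the initiating methionine.

```python
EFFICIENT = set("GASC")       # Gly, Ala, Ser, Cys
INTERMEDIATE = set("TPVQE")   # Thr, Pro, Val, Gln, Glu


def met_processing(sequence):
    """Classify expected N-terminal Met removal from the residue after Met."""
    sequence = sequence.upper()
    if len(sequence) < 2 or sequence[0] != "M":
        raise ValueError("sequence must start with Met and have a second residue")
    penultimate = sequence[1]
    if penultimate in EFFICIENT:
        return "efficient"
    if penultimate in INTERMEDIATE:
        return "intermediate"
    return "little or none"


# Hypothetical N-terminal sequences, for illustration only.
for seq in ["MAKT", "MTKL", "MKLV"]:
    print(seq, met_processing(seq))
```

A design aiming for a homogeneous N-terminus would favor either the first group (methionine reliably removed) or the last (methionine reliably retained) over the intermediate one.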