Lecture 9: Molecular descriptors
BTT-516: Drug Designing and Development
Topics to be covered
1. Introduction
2. Types of molecular descriptors
3. Tools for descriptor calculations
4. Homework
Introduction
Molecular descriptors can be defined as mathematical representations of molecular
properties that are generated by algorithms.
The numerical values of molecular descriptors are used to describe the physical and
chemical information of molecules quantitatively.
An example of a molecular descriptor is logP, a quantitative measure of the
lipophilicity of a molecule. It is obtained by measuring how the molecule partitions
between an aqueous phase and a lipophilic phase, usually water and n-octanol.
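The definition of logP can be written out as a short calculation. The sketch below assumes the usual convention, logP = log10(concentration in n-octanol / concentration in water), for the neutral solute at equilibrium; the function name and the example concentrations are hypothetical and only illustrate the arithmetic.

```python
import math


def log_p(conc_octanol: float, conc_water: float) -> float:
    """Return log10 of the octanol/water concentration ratio.

    Both concentrations must be in the same units and refer to the
    neutral solute at equilibrium (assumed convention).
    """
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


# Hypothetical measurement: 100 times more solute in octanol than in water.
print(log_p(2.0, 0.02))  # close to 2, i.e. a lipophilic molecule
```

A positive value means the molecule prefers the octanol phase; a negative value means it prefers water.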
Molecular descriptors are useful for similarity searches in molecular libraries:
molecules with similar physical or chemical properties can be found by comparing
their descriptor values.
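A descriptor-based similarity search can be sketched as a nearest-neighbour ranking. The example below is only an illustration: the library, the molecule names and the descriptor values are hypothetical, and min-max scaling followed by Euclidean distance is one simple choice among many similarity measures.

```python
import math

# Hypothetical descriptor table: name -> (logP, molecular weight, H-bond donors).
LIBRARY = {
    "mol_A": (1.2, 180.2, 1),
    "mol_B": (1.4, 194.2, 1),
    "mol_C": (4.8, 410.5, 0),
}


def scale_columns(rows):
    """Min-max scale each descriptor column to [0, 1] so no unit dominates."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return [tuple((v - lo) / s for v, lo, s in zip(row, lows, spans))
            for row in rows]


def rank_by_similarity(query, library):
    """Return library names sorted from most to least similar to the query."""
    names = list(library)
    scaled = scale_columns([query] + [library[n] for n in names])
    scaled_query, scaled_library = scaled[0], scaled[1:]
    distance = {n: math.dist(scaled_query, row)
                for n, row in zip(names, scaled_library)}
    return sorted(names, key=distance.get)


print(rank_by_similarity((1.3, 185.0, 1), LIBRARY))
# ['mol_A', 'mol_B', 'mol_C']
```

Scaling matters here: without it, molecular weight (hundreds of units) would swamp logP and the donor count in the distance.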
Molecular descriptors are used in ADMET prediction models to capture
structure–property relationships, so that the ADMET properties of molecules can be
predicted from their descriptor values (Khan and Sylte, 2007).
The molecular descriptors used in ADMET models can be classified by the level of
molecular representation required to calculate them:
• One-dimensional (1D)
• Two-dimensional (2D)
• Three-dimensional (3D)
One-dimensional (1D)
The 1D descriptors are the simplest type of molecular descriptor. They represent
information calculated from the molecular formula, which includes the count and type
of atoms in the molecule and the molecular weight.
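As an illustration of descriptors derived from the molecular formula alone, the sketch below counts atoms and sums atomic weights. It assumes a simple formula string without brackets, charges or isotopes, and the small atomic-weight table (rounded standard values) covers only a few elements; the function names are hypothetical.

```python
import re

# Rounded standard atomic weights for a few common elements only.
ATOMIC_WEIGHT = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}


def atom_counts(formula: str) -> dict:
    """Count atoms in a simple formula such as 'C9H8O4'."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + int(number or 1)
    return counts


def molecular_weight(formula: str) -> float:
    """Sum atomic weights; raises KeyError for elements not in the table."""
    return sum(ATOMIC_WEIGHT[s] * n for s, n in atom_counts(formula).items())


aspirin = "C9H8O4"
print(atom_counts(aspirin))                  # {'C': 9, 'H': 8, 'O': 4}
print(round(molecular_weight(aspirin), 2))   # 180.16
```

Because only the formula is used, isomers such as n-butane and isobutane receive identical values, which is the main limitation of this descriptor level.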
2D descriptors
The 2D descriptors are more complex than the 1D descriptors. They usually represent
molecular information regarding the size, shape, and electronic distribution of the molecule.
The accuracy of calculated 2D descriptors depends mainly on the size of the underlying
database: if data are missing for part of a molecule, the calculated value can be
seriously wrong.
3D descriptors
The 3D descriptors mainly describe properties related to the 3D conformation of the
molecule, such as intramolecular hydrogen bonding.
Examples of descriptors obtained from calculations on the 3D structure of the molecule
are the polar and nonpolar surface areas (PSA and NPSA, respectively). More advanced
calculations, such as quantum mechanical calculations, can be used to obtain 3D descriptors
that describe the valence electron distribution in the molecule (Bergström, 2005).
• 0D - bond counts, molecular weight, atom counts
• 1D - fragment counts, H-bond acceptors/donors, Crippen, PSA, SMARTS
• 2D - topological descriptors (Balaban, Randic, Wiener, BCUT, kappa, chi)
• 3D - geometrical descriptors (3D WHIM, 3D autocorrelation, 3D-MoRSE) + surface
properties + CoMFA
• 4D - 3D coordinates + conformations (JChem conformer, CORINA, gold set,
CrystalEye)
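To make the topological (2D) level concrete, here is a minimal sketch of the Wiener index, one of the descriptors named in the list: the sum of shortest-path bond counts over all pairs of heavy atoms. The adjacency-list representation and the atom numbering are assumptions made for the example; hydrogens are omitted.

```python
from collections import deque


def wiener_index(adjacency: dict) -> int:
    """Sum of topological distances over all pairs of heavy atoms."""
    total = 0
    for start in adjacency:
        # Breadth-first search gives the bond-count distance from `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            atom = queue.popleft()
            for neighbour in adjacency[atom]:
                if neighbour not in dist:
                    dist[neighbour] = dist[atom] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
    return total // 2  # every pair was counted from both ends


# Carbon skeletons of two C4H10 isomers (hydrogen-suppressed graphs).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
print(wiener_index(n_butane), wiener_index(isobutane))  # 10 9
```

The two isomers share a molecular formula but differ in their Wiener index, which shows what the 2D level adds over formula-based descriptors.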
Tools for descriptor calculations
A selection of commercial and free descriptor calculation utilities is collected in the
molecular descriptor software collection and the CompChem list; new programs are also
posted to CCL.
• alvaDesc - new visual descriptor suite from Kode solutions covering 4000 descriptors
(developed by Alvascience)
• CDK descriptor GUI (free and open source, using open-source CDK and JOELib code)
• BlueDesc - molecular descriptor calculator (free and open source, using CDK and JOELib
code; requires Java 1.6)
• ChemAxon JChem - descriptor package using the Marvin Java API (free academic license)
• ISIDA/QSPR - free fragment-based QSPR descriptor package
• E-Dragon (VCCLab) - free (150 molecules), now with GSFRAG, GSFRAG-L, ETState; more than
3000 descriptors
• MOLD2 - (FDA) a free 2D molecular descriptor package
• Toxicity Estimation Software Tool (T.E.S.T.) - (EPA) contains more than 790 two-dimensional
descriptors
• Open3DQSAR - pharmacophore modelling using molecular interaction fields (MIFs)
• Dragon - 5,270 molecular descriptors for Linux and Windows (Todeschini/Talete/Kode)
• PaDEL-Descriptor - based on CDK but includes an additional 737 2D and 3D descriptors
(NUS/Singapore)
• ADMEWORKS ModelBuilder - 400 descriptors (Jurs) and MOPAC (Stewart) (Fujitsu/Poland)
• QuBiLS-MIDAS - highly parallel software for three-dimensional molecular descriptor
calculation
Concepts for descriptor calculations and QSAR/QSPR modeling
• You need a large dataset with the molecular property to be modeled (e.g. logP or
boiling point). The larger the number of data points, the better. There are QSAR models
with 20 or fewer points; however, for broad applications one needs to cover a large
diversity space. Hundreds or thousands of such values can be collected from databases or
are now available from high-throughput (HT) screening methods.
• You need the molecular structures themselves (as SMILES, or as SDF with 2D or optimized
3D structures). Handling the molecules together with all their descriptors can be
challenging, so software that can do this is highly preferred.
• You need a descriptor package for descriptor calculation.
• You need to apply feature selection (a statistical process) to discard unimportant
(invariant) or sometimes highly correlated descriptors (orthogonalization).
• You need to divide your molecule set into three parts: a training set (70%), a validation
set (30%) and an additional external validation set that is not used in either step.
(Sometimes the validation set is called the test set, or vice versa.) Cross-validation
(n-fold or v-fold) techniques or other resampling tests (Monte Carlo sampling, jackknifing,
bootstrapping) need to be applied, especially if not enough molecules are available.
• You need to apply regression or classification methods (including meta-learning approaches).
• You need to make sure that future predictions are not made for compound classes that
were not covered during model development, which usually results in wrong predictions.
This can be done by reporting error values, by fingerprint or substructure matches, or by
a simple dimension reduction method (PCA, PLS). For example, a logP method developed only
on alkanes will almost certainly fail on complex drug molecules or molecules with multiple
-OH, -NH or -SH groups. Furthermore, a complete statistical description of the regression
or classification performance needs to be included.
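Two of the steps above, feature selection and data splitting, can be sketched in plain Python. This is an illustrative outline under stated assumptions, not a complete QSAR pipeline: the descriptor table is hypothetical, the correlation cut-off of 0.95 is an arbitrary choice, and the function names are invented for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def select_descriptors(table, max_corr=0.95):
    """Drop invariant columns, then columns highly correlated with a kept one."""
    kept = []
    for name, values in table.items():
        if max(values) == min(values):
            continue  # invariant: carries no information
        if any(abs(pearson(values, table[k])) > max_corr for k in kept):
            continue  # redundant with a descriptor already kept
        kept.append(name)
    return kept


def train_validation_split(n_molecules, train_fraction=0.7, seed=0):
    """Shuffle molecule indices and split them, e.g. 70% / 30%."""
    indices = list(range(n_molecules))
    random.Random(seed).shuffle(indices)
    cut = round(train_fraction * n_molecules)
    return indices[:cut], indices[cut:]


def k_fold(indices, k=5):
    """Yield (train, held_out) index lists for k-fold cross-validation."""
    for i in range(k):
        held_out = indices[i::k]
        train = [j for j in indices if j not in held_out]
        yield train, held_out


# Hypothetical descriptor table for six molecules.
descriptors = {
    "mol_weight": [58.1, 72.2, 86.2, 100.2, 114.2, 128.3],
    "heavy_atoms": [4, 5, 6, 7, 8, 9],   # nearly collinear with mol_weight
    "n_rings": [0, 0, 0, 0, 0, 0],       # invariant
    "h_bond_donors": [0, 1, 0, 2, 1, 0],
}
print(select_descriptors(descriptors))   # ['mol_weight', 'h_bond_donors']
train, validation = train_validation_split(6)
print(len(train), len(validation))       # 4 2
```

The external set mentioned in the text should be put aside before any of these steps, so that it influences neither descriptor selection nor model fitting. With very few molecules, `k_fold` needs `k` no larger than the number of training indices, otherwise some folds are empty.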
Utility of molecular descriptors
• Molecular descriptors are calculated properties that serve as numerical
descriptions or characterizations of molecules in other calculations, such as
QSAR models, diversity analysis or combinatorial library design.
Thank you
Er. Rajan Rolta
Faculty of Applied Sciences and Biotechnology
Shoolini University,
Village Bhajol, Solan (H.P)
+91-7018792621 (Mob No.)
rajanrolta@shooliniuniversity.com
