Bioinformatics
An overview
Soumitra Nath
mail: nath.soumitra1@gmail.com
Bioinformatics = Biological Data + Computer Calculations
What is Bioinformatics?
“The field of science in which biology,
computer science, and information technology
merge to form a single discipline”
Central Dogma in Molecular Biology
Gene (DNA) → mRNA → Protein
21st century: Genome → Transcriptome → Proteome
The Human Genome Project
Formally launched in 1990 (planning began in 1986); completed in 2003
Project goals were to
• identify all the genes in human DNA,
• determine the sequences of the 3 billion chemical base pairs
that make up human DNA,
• store this information in databases,
• improve tools for data analysis and develop new tools
• address the ethical, legal, and social issues that may arise
from the project.
What makes us human?
CHIMP GENOME
Chimpanzees are similar to humans in so many
ways: they are socially complex, sensitive and
communicative, and yet indisputably on the animal
side of the man/beast divide. Scientists have now
sequenced the genetic code of our closest living
relative, showing the striking concordances and
divergences between the two species, and perhaps
holding up a mirror to our own humanity.
Perhaps not surprising!
Comparison between the full drafts of the human and chimp genomes revealed
that they differ by only 1.23% at the level of single-nucleotide substitutions.
How human are chimps?
Annotation
Open reading frames
Functional sites
Structure, function
CCTGACAAATTCGACGTGCGGCATTGCATGCAGACGTGCATG
CGTGCAAATAATCAATGTGGACTTTTCTGCGATTATGGAAGAA
CTTTGTTACGCGTTTTTGTCATGGCTTTGGTCCCGCTTTGTTC
AGAATGCTTTTAATAAGCGGGGTTACCGGTTTGGTTAGCGAGA
AGAGCCAGTAAAAGACGCAGTGACGGAGATGTCTGATG CAA
TAT GGA CAA TTG GTT TCT TCT CTG AAT ......
.............. TGAAAAACGTA
Annotated features of the sequence:
• Transcription factor binding site
• Promoter
• Transcription start site
• Ribosome binding site
• ORF = Open Reading Frame
• CDS = Coding Sequence
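To make the idea of an open reading frame concrete, here is a minimal Perl sketch that scans the three forward reading frames for a start codon (ATG) followed, in frame, by a stop codon (TAA, TAG or TGA). The subroutine name find_orfs, its minimum-length parameter and the short test sequence are illustrative assumptions, not part of the original slides; real annotation tools also scan the reverse strand and use statistical models.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return every ORF (ATG ... in-frame stop codon) found on the forward strand.
# Each hit is a hash reference with the 0-based start, the frame, and the sequence.
# $min_codons is the minimum ORF length in codons, counting start and stop codons.
sub find_orfs {
    my ($dna, $min_codons) = @_;
    $dna = uc $dna;
    $dna =~ s/\s+//g;                 # drop spaces used to mark codons
    my @orfs;
    for my $frame (0 .. 2) {
        my $start;                    # position of the currently open ATG, if any
        for (my $i = $frame; $i + 3 <= length($dna); $i += 3) {
            my $codon = substr($dna, $i, 3);
            if (!defined $start) {
                $start = $i if $codon eq 'ATG';
            }
            elsif ($codon =~ /^(?:TAA|TAG|TGA)$/) {
                my $seq = substr($dna, $start, $i + 3 - $start);
                push @orfs, { start => $start, frame => $frame, seq => $seq }
                    if length($seq) / 3 >= $min_codons;
                undef $start;         # look for the next ATG in this frame
            }
        }
    }
    return @orfs;
}

# Made-up example sequence containing one complete ORF
my @orfs = find_orfs('CCATGCAATATGGATGAAA', 3);
printf "frame %d, start %d: %s\n", $_->{frame}, $_->{start}, $_->{seq} for @orfs;
```

An ATG that is never followed by an in-frame stop codon is ignored, which is a deliberate simplification for sequences that are cut off.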
Molecular biology data types
• Organisms
• Genome maps
• DNA sequences, e.g. ...AATGGTACCGATGACCTGGAGCTTGGTTCGA...
• DNA motifs
• RNA sequences
• RNA structures
• RNA expression
• Protein sequences, e.g. ...TRLRPLLALLALWPPPPARAFVNQHLCGSHLVEA...
• Protein structures
• Protein motifs
Lei Liu
Bioinformatics: Sequence Analysis
What do we want to know about a sequence?
• Is this sequence similar to any known genes?
How close is the best match? Significance?
• What do we know about that gene?
– Genomic (chromosomal location, allelic information,
regulatory regions, etc.)
– Structural (known structure? structural domains? etc.)
– Functional (molecular, cellular & disease)
• Evolutionary information:
– Is this gene found in other organisms?
– What is its taxonomic tree?
Larry Hunter
Biological databases
• Data is of different types
– Raw data (DNA, RNA, protein sequences)
– Curated data (DNA, RNA and protein
annotated sequences and structures,
expression data)
EMBL / GenBank / DDBJ
• Serve as archives/ storage containing all sequences
(single genes, ESTs, complete genomes, etc.) derived from:
– Genome projects
– Sequencing centers
– Individual scientists
– Patent offices (e.g. European Patent Office, EPO)
• Non-confidential data are exchanged daily
• Currently: 18 × 10^6 sequences, over 20 × 10^9 bp
• Over the last 12 months the database size has tripled
• Sequences from > 50,000 different species
• The three databases contain essentially the same information within 2-3
days (with a few differences in format and syntax)
www.ncbi.nlm.nih.gov
• Created in 1988 as part of the
National Library of Medicine at NIH
–Establish public databases
–Research in computational biology
–Develop software tools for sequence
analysis
–Disseminate biomedical information
NCBI and Entrez
• NCBI provides interesting summaries, browsers
for genome data, and search tools
• Entrez is their database search interface
http://www.ncbi.nlm.nih.gov/Entrez
• Can search on gene names, sequences,
chromosomal location, diseases, keywords, ...
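Entrez can also be queried from a program through NCBI's E-utilities. The sketch below only builds a search URL for the esearch.fcgi endpoint and does not contact the network; the subroutine name esearch_url and the example query are assumptions made for illustration, and the NCBI documentation should be checked for current parameters and usage limits.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build an E-utilities ESearch URL for a database name, a search term,
# and a maximum number of results. Nothing is sent over the network here.
sub esearch_url {
    my ($db, $term, $retmax) = @_;
    my $base = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi';
    # Percent-encode every character that is not safe inside a URL query value
    (my $escaped = $term) =~ s/([^A-Za-z0-9\-_.~])/sprintf('%%%02X', ord($1))/ge;
    return "$base?db=$db&term=$escaped&retmax=$retmax";
}

# Hypothetical query: human insulin gene records in the nucleotide database
print esearch_url('nucleotide', 'insulin[Gene Name] AND human[Organism]', 5), "\n";
```

The resulting URL can be pasted into a browser or fetched with any HTTP client; the response lists the identifiers of matching records.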
Sequence Comparison
• DNA is the blueprint for living organisms
⇒ Evolution is related to changes in DNA
⇒ By comparing DNA sequences we can
infer evolutionary relationships between
the sequences
Copyright 2004 Limsoon Wong
Sequence Alignment
[Figure: alignment of Sequence U with Sequence V, with match, mismatch and indel positions labelled]
• A key aspect of sequence comparison is sequence alignment
• A sequence alignment maximizes the number of positions that are in
agreement in two sequences
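The three kinds of alignment position (match, mismatch, indel) can be counted with a few lines of Perl. This sketch only scores an alignment that has already been made, with gaps written as "-"; it does not search for the best alignment, which needs a dynamic-programming method. The subroutine name alignment_stats and the two toy sequences are illustrative assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count matches, mismatches and indels in two already-aligned sequences.
# Gaps are written as '-' and both strings must have the same length.
sub alignment_stats {
    my ($u, $v) = @_;
    die "aligned sequences must have equal length\n"
        unless length($u) == length($v);
    my %n = (match => 0, mismatch => 0, indel => 0);
    for my $i (0 .. length($u) - 1) {
        my ($x, $y) = (substr($u, $i, 1), substr($v, $i, 1));
        if    ($x eq '-' || $y eq '-') { $n{indel}++ }
        elsif ($x eq $y)               { $n{match}++ }
        else                           { $n{mismatch}++ }
    }
    return %n;
}

# Made-up aligned pair: one gap and one substitution
my %s = alignment_stats('ATGC-AGT', 'ATGCTAAT');
print "match=$s{match} mismatch=$s{mismatch} indel=$s{indel}\n";
printf "identity: %.1f%%\n", 100 * $s{match} / 8;
```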
Copyright 2004 limsoon wong
Multiple Alignment: An Example
Conserved sites
• Multiple seq alignment maximizes number of
positions in agreement across several seqs
• seqs belonging to same “family” usually have
more conserved positions in a multiple seq
alignment
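Conserved sites can be read off a finished multiple alignment column by column. The following sketch assumes the sequences are already aligned to equal length with "-" for gaps; the subroutine name conserved_columns and the three toy sequences are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the 0-based column indices at which every aligned sequence
# carries the same residue (columns containing a gap are not counted).
sub conserved_columns {
    my @seqs = @_;
    my @cols;
    for my $i (0 .. length($seqs[0]) - 1) {
        my %seen;
        $seen{ substr($_, $i, 1) } = 1 for @seqs;
        push @cols, $i if keys(%seen) == 1 && !exists $seen{'-'};
    }
    return @cols;
}

# Made-up alignment: the fourth column differs between the sequences
my @conserved = conserved_columns('ATGCAT', 'ATGGAT', 'ATG-AT');
print "conserved columns: @conserved\n";
```

Sequences from the same family would typically yield a longer list of conserved columns than unrelated sequences of the same length.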
Copyright 2004 limsoon wong
Phylogeny: An Example
• By looking at the extent of conserved positions in the
multiple seq alignment of different groups of seqs,
we can infer when they last shared an ancestor
⇒ Construct a “family tree” or phylogeny
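One simple way to quantify how far two aligned sequences have diverged is the proportion of differing sites (often called the p-distance). The sketch below computes it for every pair in a small alignment; pairs with smaller distances are assumed to share a more recent ancestor. The subroutine name p_distance and the three sequences are invented toy data, and building an actual tree from such distances needs a further method (for example neighbour joining) that is not shown.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Proportion of differing sites between two aligned sequences,
# ignoring any column in which either sequence has a gap.
sub p_distance {
    my ($u, $v) = @_;
    my ($diff, $sites) = (0, 0);
    for my $i (0 .. length($u) - 1) {
        my ($x, $y) = (substr($u, $i, 1), substr($v, $i, 1));
        next if $x eq '-' || $y eq '-';
        $sites++;
        $diff++ if $x ne $y;
    }
    return $sites ? $diff / $sites : 0;
}

# Toy alignment (made-up sequences, not real genomic data)
my %aln = (
    human => 'ATGCATGC',
    chimp => 'ATGCATGA',
    mouse => 'ATCCTTGA',
);

my @names = sort keys %aln;
for my $i (0 .. $#names) {
    for my $j ($i + 1 .. $#names) {
        printf "%s vs %s: %.3f\n", $names[$i], $names[$j],
            p_distance($aln{ $names[$i] }, $aln{ $names[$j] });
    }
}
```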
Visualizing the 3D structure of proteins
From: Branden & Tooze, “Introduction to Protein Structure”
primary (1º) secondary (2º) tertiary (3º) quaternary (4º)
X-ray crystallography
• X-ray source: small-scale source in the lab, or a national synchrotron facility
• Crystals: getting crystals of proteins or nucleic acids is no small feat!
• Diffraction pattern: there is no way to “focus” the diffracted X-rays, so the phases need to be determined
• Computers aid in model building, phase determination and visualization
Cn3D
Cn3D is a visualization tool for
macromolecules.
It allows you to view 3-D
structures from NCBI's Entrez
retrieval service.
Cn3D is able to correlate
structure and sequence
information; for example, you can
find the residues in a crystal
structure that correspond to
known disease mutations.
Software for 3d structure visualization
RasMol
RasMol is a molecular graphics
program
Intended for the visualization of
proteins, nucleic acids, and small
molecules
Aimed at display, teaching, and
generation of publication quality
images.
Swiss-Pdb Viewer
Swiss-Pdb Viewer is used to calculate the distances
and angles between atoms.
It allows browsing a rotamer library in order to
change amino acid side chains.
This can be very useful to quickly evaluate the
likely effect of a mutation before actually doing the
lab work.
It allows altering the torsion angles of amino acids
and hetero-atoms, as well as the backbone omega,
phi and psi angles.
CADD (Computer-Aided Drug Design)
What is a drug target?
A drug target may be a native protein (or sometimes DNA/RNA)
in the body whose activity is modified by a drug resulting in a
desirable therapeutic effect.
Drug Targets may be:
Enzymes
Hormone Receptors
Ion Channel Proteins
sometimes, DNA or RNA
The Drug Designing Pathway:
Disease → Drug Target
Ligand sources (Database, Natural Product, Combinatorial Library) → Ligand
Drug Target + Ligand → Docking (in silico binding study)
• -ve docking result → Side chain modification of the ligand → Docking again
• +ve docking result → Synthesis → In vitro screening → In vivo screening → Clinical Trials
There are two major types of drug design.
Ligand (analog) based drug design
1. Receptor structure is not known
2. Mechanism is known/unknown
3. Ligands and their biological activities are known
Target (structure) based drug design
1. Receptor structure is known
2. Mechanism is known
3. Ligands and their biological activities are known/unknown
Computational tools are used to:
• Identify and study drug targets of various diseases
• Identify and study suitable ligands that bind to the drug target
• Predict the toxicity and drug-likeness of small molecules
(Lipinski filters & ADMET screening)
• Generate combinatorial libraries
Prerequisites of a docking experiment:
• 3D structure of the protein (drug target)
– Download from the Protein Data Bank (www.rcsb.org/pdb), a macromolecular structure database
– If not available in PDB, predict the structure
(homology modeling, ab initio prediction, threading, etc.)
• 3D structure of the small molecule (ligand)
– 2D structures of small molecules are available in databases like PubChem,
KEGG-Ligand etc. The structure of an isolated natural product or synthetic
compound may also be determined using NMR spectroscopy and/or X-ray crystallography (XRC).
– Convert the 2D small molecule to its 3D
structure using software like CORINA (the name stands
for CoORdINAtes)
Lipinski's Rule of Five
• Molecular weight ≤ 500
• C logP ≤ 5 (octanol/water partition coefficient)
• H-bond donors ≤ 5
• H-bond acceptors (sum of N and O atoms) ≤ 10
• No. of rotatable bonds ≤ 10 (a later extension, not one of Lipinski's original four criteria)
Lipinski's Rule of Five is applicable to orally active
compounds.
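The thresholds listed above translate directly into a simple filter. This is a minimal sketch: the subroutine name lipinski_violations, the hash keys and the descriptor values of the two example compounds are all hypothetical, and in practice the descriptors would come from a cheminformatics package or a database such as PubChem. A common convention, assumed here only in the comment, is to tolerate at most one violation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the list of rule violations for one compound.
# Expected keys (hypothetical names): mw, logp, hbd, hba, rotb.
sub lipinski_violations {
    my (%m) = @_;
    my @v;
    push @v, 'molecular weight > 500' if $m{mw}   > 500;
    push @v, 'logP > 5'               if $m{logp} > 5;
    push @v, 'H-bond donors > 5'      if $m{hbd}  > 5;
    push @v, 'H-bond acceptors > 10'  if $m{hba}  > 10;
    push @v, 'rotatable bonds > 10'   if $m{rotb} > 10;   # as listed on the slide
    return @v;
}

# Illustrative descriptor values only, not measured data
my %small = (mw => 320.4, logp => 2.1, hbd => 2, hba => 5,  rotb => 4);
my %large = (mw => 812.9, logp => 6.3, hbd => 4, hba => 12, rotb => 9);

for my $c ([small => \%small], [large => \%large]) {
    my ($name, $desc) = @$c;
    my @v = lipinski_violations(%$desc);
    # Many screens accept a compound with at most one violation
    printf "%s: %d violation(s)%s\n", $name, scalar @v,
        @v ? ' (' . join('; ', @v) . ')' : '';
}
```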
ADME-Tox Screening
• Absorption: the compound must be easily absorbed by the body.
• Distribution: the compound needs to be easily
transferred and distributed to its target site.
• Metabolism: the compound should be metabolized by the body.
• Excretion: byproducts need to be excreted
from the body.
• Toxicity: toxic effects must be minimal.
Examples:
Tubulin as a Cancer Drug Target
The tubulin heterodimer (α + β) is the basic structural
unit of the microtubule. The drug molecule (Taxol) binds to
β-tubulin in the assembled microtubule and stabilizes it, so that
the microtubule cannot disassemble. As a result, cell division ceases.
Tubulin-α + Tubulin-β → Heterodimer → Microtubule
Benefits of Bioinformatics
To the patient:
Better drug, better treatment
To the pharma:
Save time, save cost, make more $
To the scientist:
Better science
Programme
Designing
PERL: Practical Extraction and Report Language
Perl 1.0 was released by Larry Wall in 1987
http://www.perl.org/
Perl is a programming language
that is offered at no cost.
Why Perl?
Fairly easy to learn the basics
Many powerful functions for working with
text: search & extract, modify, combine
Can control other programs
Free and available for all operating systems
One of the most widely used languages in bioinformatics
Many pre-built “modules” are available that
do useful things
Get Perl
• You can install Perl on any type of computer.
Download and install Perl on your own
computer: www.perl.org
• Windows version:
http://www.activestate.com/Products/ActivePerl/
• On your desktop, set up a shortcut to the
Command Prompt (Programs/Accessories/Command Prompt)
• Edit the properties of the Command Prompt shortcut to
set the "Start in" field to be blank
Extension and Path
• On Windows systems, it's usual to associate the
filename extension .pl with Perl.
• This is done as part of the Perl installation process,
which modifies the registry settings to include this
file association. You can then launch a program by typing its name,
for instance this_program.pl
• In MS-DOS, type perl followed by the complete pathname to the
program, for instance perl
c:\windows\desktop\my_program.pl
• Notepad works satisfactorily as an editor.
(Computers are VERY dumb: they do
exactly what you tell them to do, so be
careful what you ask for...)
Program details
• Perl programs conventionally start with the line:
#!/usr/bin/perl
• This tells Linux that this is a Perl program and
where to find the Perl interpreter.
• On Windows this is not needed, since the .pl extension is
enough, but it is a good idea to include this line.
• All other lines that start with # are considered
comments and are ignored by Perl.
• Lines that are Perl commands end with a ;
The simplest program
#!/usr/bin/perl
# Print a greeting followed by a newline
print "Hello\n";
Length
#!/usr/bin/perl
use strict;
use warnings;
# Store a DNA sequence and print its length
my $dna    = "ATGCTGATGCGT";
my $length = length($dna);
print "$length\n";
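Reverse
#!/usr/bin/perl
use strict;
use warnings;
# Reverse a DNA sequence (assigning to a scalar reverses the string)
my $dna      = "ATGCAGC";
my $reversed = reverse($dna);
print "$reversed\n";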
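Reverse Complement
#!/usr/bin/perl
use strict;
use warnings;
# Reverse the sequence, then swap each base for its complement
my $DNA    = "ATGCAGTCAGT";
my $revcom = reverse $DNA;
$revcom =~ tr/ATGC/TACG/;
print "$revcom\n";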
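Transcription (DNA to RNA)
#!/usr/bin/perl
use strict;
use warnings;
# Transcribe DNA to RNA by replacing every T with U
my $DNA = 'ATGTGCGTGACGTGCAGT';
my $RNA = $DNA;
$RNA =~ s/T/U/g;
print "$RNA\n\n";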
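The snippet above turns a DNA string into RNA by replacing T with U. The next step of the central dogma, from RNA to protein, can be sketched with a codon lookup table. The example below builds the standard genetic code from a 64-letter string ordered by the bases T, C, A, G; the subroutine name translate and the choice to stop at the first stop codon and to write unknown codons as X are assumptions made for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Standard genetic code: amino acids for codons TTT, TTC, TTA, TTG, TCT, ...
# with each codon position cycling through the bases in the order T, C, A, G.
my @bases = qw(T C A G);
my $aas   = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon_table;
my $k = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codon_table{"$b1$b2$b3"} = substr($aas, $k++, 1);
        }
    }
}

# Translate an RNA or DNA sequence, reading frame 1, until the first stop codon.
sub translate {
    my ($seq) = @_;
    $seq = uc $seq;
    $seq =~ tr/U/T/;                        # accept RNA as well as DNA
    my $protein = '';
    for (my $i = 0; $i + 3 <= length($seq); $i += 3) {
        my $aa = $codon_table{ substr($seq, $i, 3) };
        $aa = 'X' unless defined $aa;       # codon with an unexpected character
        last if $aa eq '*';                 # stop codon ends the protein
        $protein .= $aa;
    }
    return $protein;
}

# RNA produced by the transcription example
print translate('AUGUGCGUGACGUGCAGU'), "\n";
```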
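Using <STDIN>
#!/usr/bin/perl
use strict;
use warnings;
# Read a DNA fragment typed by the user and report its length
print "TYPE THE DNA FRAGMENT: ";
my $DNA = <STDIN>;
chomp($DNA);    # remove the trailing newline
my $L = length($DNA);
print "The length of the sequence is $L\n";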