Jalormi Parekh
M.Sc. Final Medical Biotechnology
• Genomics is the study of genomes.
• The field of genomics comprises two
main areas:
1. Comparative genomics
2. Functional genomics
• Comparative genomics is a large-scale, holistic approach that
compares two or more genomes to discover the similarities
and differences between the genomes and to study the
biology of the individual genomes
• The subject of comparative genomics impinges on
– Evolutionary biology and phylogenetic reconstructions of the tree of
life,
– Drug discovery programs,
– Function predictions of hypothetical proteins
– Identification of genes, regulatory motifs and other non-coding
DNA motifs
– Genome flux and dynamics
Methods for comparative genomics
• Comparative analysis of genome structure
• Comparative analysis of coding regions
• Comparative analysis of non-coding regions
• The structure of different genomes can be compared at
three levels:
– Overall nucleotide statistics,
– Genome structure at DNA level, and
– Genome structure at gene level
• Overall nucleotide statistics, such as
– Genome size,
– Overall (G+C) content,
– Regions of different (G+C) content,
– Genome signatures such as codon usage biases and
amino acid usage biases, and
– The ratio of the observed dinucleotide frequency to
the frequency expected given a random nucleotide
distribution
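The two simplest statistics above can be sketched in a few lines. This is a minimal illustration, not a full genome-signature tool; the function names and the toy sequence are hypothetical, and the expected frequency is taken as the product of the two single-base frequencies.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def dinucleotide_odds_ratio(seq, dinuc):
    """Observed dinucleotide frequency divided by the frequency expected
    if the two bases occurred independently of each other."""
    seq = seq.upper()
    base_freq = {base: n / len(seq) for base, n in Counter(seq).items()}
    n_pairs = len(seq) - 1
    observed = sum(1 for i in range(n_pairs) if seq[i:i + 2] == dinuc) / n_pairs
    expected = base_freq.get(dinuc[0], 0.0) * base_freq.get(dinuc[1], 0.0)
    # The ratio is undefined when one of the bases never occurs
    return observed / expected if expected else float("nan")


toy_sequence = "ATGCGCGATATCGCGC"  # hypothetical sequence, for illustration only
print(gc_content(toy_sequence))
print(dinucleotide_odds_ratio(toy_sequence, "CG"))
```

A ratio well below 1 suggests the dinucleotide is under-represented, and a ratio well above 1 that it is over-represented, relative to random expectation.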
• Chromosomal breakage and exchange of chromosomal
fragments are a common mode of gene evolution. They can be
studied by comparing genome structures at the DNA level:
– Identification of conserved synteny and genome
rearrangement events
– Analysis of breakpoints
– Analysis of the content and distribution of DNA repeats
• Chromosomal breakage and exchange
of chromosomal fragments cause
disruption of gene order
• Therefore the extent to which gene order is conserved
reflects the evolutionary distance between genomes:
closely related genomes share more of their gene order
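One simple way to put a number on gene order disruption is to count breakpoints, i.e. neighbouring gene pairs in one genome that are no longer neighbours in the other. The sketch below assumes both genomes carry the same genes, each once, on a single linear chromosome, and it ignores gene orientation; the gene names are hypothetical.

```python
def adjacencies(order):
    """Set of unordered neighbouring gene pairs in a linear gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def count_breakpoints(order_a, order_b):
    """Number of adjacencies in order_a that are absent from order_b."""
    return len(adjacencies(order_a) - adjacencies(order_b))


# Hypothetical gene orders: genome_b carries an inversion of g3-g4
genome_a = ["g1", "g2", "g3", "g4", "g5"]
genome_b = ["g1", "g2", "g4", "g3", "g5"]
print(count_breakpoints(genome_a, genome_b))
```

More breakpoints between two genomes indicate more rearrangement events since their common ancestor.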
• Identification of gene-coding regions
• Comparison of gene content
• Comparison of protein content
• Comparative genome-based function prediction
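Comparative genome-based function prediction can rest on co-conservation: genes that are present or absent together across genomes are candidates for a functional link. The sketch below assumes presence/absence calls are already available as one set of gene names per genome; the genome and gene names are hypothetical, and exact profile matching is a simplification.

```python
def profile(gene, presence):
    """Presence/absence vector of a gene across genomes (1 = present)."""
    return tuple(int(gene in genes) for genes in presence.values())


def co_conserved_pairs(presence):
    """Gene pairs with identical, informative presence/absence profiles."""
    genes = sorted(set().union(*presence.values()))
    profiles = {gene: profile(gene, presence) for gene in genes}
    pairs = []
    for i, gene_a in enumerate(genes):
        for gene_b in genes[i + 1:]:
            p = profiles[gene_a]
            # Genes found in every genome carry no signal, so skip them
            if p == profiles[gene_b] and 0 < sum(p) < len(p):
                pairs.append((gene_a, gene_b))
    return pairs


# Hypothetical presence/absence data
presence = {
    "genomeA": {"x", "y", "z"},
    "genomeB": {"x", "y"},
    "genomeC": {"z"},
}
print(co_conserved_pairs(presence))
```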
• Noncoding regions of the genome have gained a lot
of attention in recent years because of their
predicted roles in the regulation of transcription,
DNA replication, and other biological
functions
• Comparative analysis of noncoding regions is based on the
presumption that selective pressure causes regulatory elements
to evolve at a slower rate than non-regulatory
sequences in the noncoding
regions
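The slower evolution of regulatory elements shows up as stretches of high identity in an alignment of noncoding sequence. A minimal sketch, assuming two sequences that have already been aligned to equal length with "-" for gaps; the window size and threshold are arbitrary illustrative values.

```python
def window_identity(aln_a, aln_b, window=10):
    """Fraction of identical, non-gap columns in each sliding window."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    matches = [int(a == b and a != "-")
               for a, b in zip(aln_a.upper(), aln_b.upper())]
    return [sum(matches[i:i + window]) / window
            for i in range(len(matches) - window + 1)]


def conserved_windows(aln_a, aln_b, window=10, threshold=0.9):
    """Start positions of windows whose identity meets the threshold."""
    scores = window_identity(aln_a, aln_b, window)
    return [i for i, score in enumerate(scores) if score >= threshold]


# Hypothetical aligned noncoding sequences
seq_a = "ACGTACGTAAAA"
seq_b = "ACGTACGTTTTT"
print(conserved_windows(seq_a, seq_b, window=4, threshold=1.0))
```

Windows that stay highly conserved between distantly related genomes are candidate regulatory elements and would still need experimental confirmation.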
• Comparative genomic studies shed light
on the pathogenesis of organisms, opening up
opportunities for therapeutic intervention and
helping to understand and identify disease genes
• One of the most important outcomes of comparative
analyses at a genome-wide scale is the ability to
identify and develop novel drug targets
• If one is looking for antibacterial, antifungal, or
antiprotozoal proteins to be used as targets,
comparative genome analysis can reveal virulence
genes, uncharacterized essential genes, and
species-specific or organism-specific genes, while ensuring
that the chosen genes have no homologues in humans
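The target-selection logic above reduces to a set filter once the gene lists exist. The sketch assumes the essential-gene list and the list of genes with human homologues have been produced elsewhere (for example by knockout data and similarity searches); all gene identifiers are hypothetical.

```python
def candidate_targets(pathogen_genes, essential_genes, human_homologues):
    """Pathogen genes that are essential and have no human homologue."""
    essential = set(pathogen_genes) & set(essential_genes)
    return sorted(essential - set(human_homologues))


# Hypothetical gene identifiers
pathogen_genes = ["geneA", "geneB", "geneC", "geneD"]
essential_genes = {"geneA", "geneB", "geneC"}
human_homologues = {"geneC"}
print(candidate_targets(pathogen_genes, essential_genes, human_homologues))
```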
• Functional genomics is largely experiment-based, with a focus on
gene functions at the whole-genome level
using high-throughput approaches.
• The high-throughput analysis of all expressed
genes is also termed transcriptome
analysis
• Transcriptome analysis can be conducted by
two approaches:
1) sequence-based approaches
2) microarray-based approaches
• Expressed sequence tags (ESTs) are short
sequences of cDNA, typically 200-400 nucleotides in
length.
• They are obtained from either the 5’ end or the 3’ end of cDNA inserts of a cDNA
library.
Advantages of ESTs:
• Provide a rough estimate of genes that are actively
expressed in a genome under a particular physiological
condition.
• Help in discovering new genes, due to random
sequencing of cDNA clones.
• EST libraries can be easily generated.
Drawbacks of using ESTs:
• They are generated automatically without verification and thus
contain high error rates.
• There is often contamination by vector
sequences, introns, ribosomal RNA, and
mitochondrial RNA.
• Weakly expressed genes are rarely found in EST
sequencing surveys.
• ESTs represent only partial sequences of genes.
Major EST index sequence databases
are:
• UniGene, an NCBI EST cluster
database.
• TIGR Gene Indices
• Serial analysis of gene expression (SAGE)
• Another high-throughput, sequence-based
approach for gene expression profile analysis.
• More quantitative in determining mRNA
expression in a cell.
• Short fragments (tags) of DNA, excised from
cDNA sequences, act as unique markers of gene
transcripts
• SAGE was invented at Johns Hopkins University in
the USA (Oncology Center) by Dr. Victor Velculescu in 1995.
• A short sequence tag (10-14bp) contains
sufficient information to uniquely identify a
transcript.
• Sequence tags can be linked together to
form long serial molecules that can be
cloned and sequenced.
• The number of times a particular tag is
observed provides the expression level of
the corresponding transcript
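Tag counting can be sketched as follows. The assumptions are loud ones: ditags in a sequenced concatemer are separated by the anchoring enzyme site CATG (NlaIII), every tag is exactly 10 bp, the second tag of each ditag is read as its reverse complement, and no tag contains CATG itself. The function names and test tags are hypothetical, and real SAGE software handles many more edge cases.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # NlaIII site, assumed to separate ditags
TAG_LEN = 10          # assumed tag length, within the 10-14 bp range

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_tags(concatemer):
    """Split a concatemer at anchoring sites; return both tags of each ditag."""
    tags = []
    for ditag in concatemer.upper().split(ANCHOR_SITE):
        if len(ditag) < 2 * TAG_LEN:
            continue  # skip empty or truncated fragments
        tags.append(ditag[:TAG_LEN])
        tags.append(reverse_complement(ditag[-TAG_LEN:]))
    return tags


def tag_counts(concatemers):
    """Count how often each tag is observed across sequenced concatemers."""
    counts = Counter()
    for concatemer in concatemers:
        counts.update(extract_tags(concatemer))
    return counts
```

The count for each tag, usually normalised to the total number of tags sequenced, serves as the expression estimate for the matching transcript.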
1. Trapping of RNA with beads
2. cDNA synthesis
3. Enzymatic cleavage of cDNA
4. Ligation of linkers to bound cDNA
5. Cleaving with tagging enzyme
6. Formation of ditags
7. PCR amplification of ditags
8. Isolation and concatemerization of ditags
• Software tools for SAGE analysis:
• SAGEmap
• SAGE xProfiler
• SAGE Genie
• Advantages over EST analysis:
a. It can detect weakly expressed genes
b. It uses a short nucleotide tag and allows
sequencing of multiple tags in a single clone
 A microarray is a pattern of ssDNA probes which are
immobilized on a surface called a chip or a slide.
• Microarrays use hybridization to detect a specific DNA
or RNA in a sample.
• A DNA microarray can carry up to a million or more different probes, fixed
on a solid surface.
• Microarray technology evolved from Southern
blotting.
• To analyze the expression of thousands of
genes in a single reaction, quickly and
efficiently.
• To understand the genetic causes for
the abnormal functioning of the
human body.
• To understand which genes are active
and which genes are inactive in
different cell types.
• Oligonucleotides are typically 25-70 bases
long
• The oligonucleotides are called probes
• A probe should not form a stable internal
secondary structure.
• All probes should have approximately equal melting temperatures (Tm).
• OligoWiz and OligoArray are two programs used for
designing probes for microarray spotting.
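Two of the probe criteria above, length and matched Tm, can be checked with a rough sketch. The Tm formula used here is a common base-composition approximation for oligonucleotides longer than about 13 bases; it is an assumption of this example and not the method implemented by OligoWiz or OligoArray. The 5 degree spread is likewise an arbitrary illustrative cut-off.

```python
def melting_temp(probe):
    """Approximate Tm in degrees C from base composition."""
    probe = probe.upper()
    gc = probe.count("G") + probe.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(probe)


def length_ok(probe, min_len=25, max_len=70):
    """True if the probe length lies in the 25-70 base range."""
    return min_len <= len(probe) <= max_len


def tm_matched(probes, max_spread=5.0):
    """True if the Tm values of all probes lie within max_spread degrees."""
    temps = [melting_temp(p) for p in probes]
    return max(temps) - min(temps) <= max_spread


# Hypothetical 25-base probes, for illustration only
probes = ["G" * 13 + "A" * 12, "G" * 12 + "A" * 13]
print([length_ok(p) for p in probes], tm_matched(probes))
```

Secondary-structure screening is not covered here, since it needs a thermodynamic folding model.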
1. Sample preparation and labeling
2. Hybridisation
3. Washing
4. Image processing and data analysis
• Software programs to perform microarray
image analysis:
• ArrayDB
• TIGR Spotfinder
THANK YOU

Comparative and functional genomics


Editor's Notes

  • #10 1) The analysis and comparison of coding regions starts with a gene identification algorithm, which infers what portions of the genomic sequence code for genes, based on transcription evidence, gene homology, and genome similarity. 2) The first statistic to compare is the estimated total number of genes in a genome. Other statistics that elucidate the similarities and differences between genomes include the percentage of the genome that codes for genes, the distribution of coding regions across the genome (a.k.a. gene density), average gene length, and codon usage. BLASTN or TBLASTX is used. 3) It is important to compare the protein contents in critical pathways and important functional categories across genomes. Two widely used resources for pathways and functional categories are the KEGG pathway database and the Gene Ontology (GO) hierarchy. 4) Functional assignment of genes in a non-similarity-based manner relies on the basic premise that functionally related genes are closely associated across genomes in some form. This includes three methods: co-conservation across genomes; conservation of gene clusters and genomic context across species; and physical fusion of functionally linked genes across species (domain fusion analysis).
  • #25 Isolate total RNA containing mRNA that ideally represents the genes expressed at the time of sample collection. Prepare cDNA from the mRNA using a reverse transcriptase enzyme; a short primer is required to initiate cDNA synthesis. Each cDNA (sample and control) is labelled with a fluorescent cyanine dye (i.e. Cy3 or Cy5). The labelled cDNA is competitively hybridized against the cDNA molecules spotted on a glass slide.
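After competitive hybridization of the two labelled cDNAs, each spot is commonly summarised as a log ratio of the two dye intensities. The sketch below assumes the sample was labelled with Cy5 and the control with Cy3 (the notes do not say which dye goes with which), and that per-spot intensities and local backgrounds have already been extracted by image analysis; the function name is hypothetical.

```python
import math


def log2_ratio(cy5, cy3, background_cy5=0.0, background_cy3=0.0):
    """log2 of the background-corrected Cy5/Cy3 intensity for one spot."""
    sample = cy5 - background_cy5    # assumption: sample labelled with Cy5
    control = cy3 - background_cy3   # assumption: control labelled with Cy3
    if sample <= 0 or control <= 0:
        return None  # spot cannot be scored after background correction
    return math.log2(sample / control)


print(log2_ratio(800.0, 200.0))
```

Under these assumptions a positive value means higher expression in the sample than in the control, and a negative value means lower expression.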