
Introduction to Bioinformatics

  • 1. Introduction to Bioinformatics: A tale of myths and legends [Freevector]. June 16, 2011
  • 2. “Bioinformatics is the field of science in which biology, computer science, and information technology merge to form a single discipline.” National Center for Biotechnology Information (NCBI)
  • 3. Areas where bioinformatics is applied
      - Genomics: genomic feature prediction, sequencing data analysis
      - Proteomics: protein 3D structure modeling, drug design
      - Systems biology: gene set enrichment, pathway analysis
      - Phenotype: image analysis
      - Integration
  • 4. Approach (each step paired with its example)
      - Biological question: genes regulated by protein X
      - Generate data: ChIP-Seq data
      - Translate into a computer-solvable task: “Align reads and identify clusters in the genome”
      - Develop an algorithm: choose data structures
      - Implement algorithm: write source code
      - Run algorithm: align reads
      - Condense result in human-readable form: write script to summarize results genome-wide
      - Answer biological question: report protein’s binding sites
      “Bioinformatics is the field of science in which biology, computer science, and information technology merge to form a single discipline.” NCBI
  • 5. The challenges in bioinformatics
      - Acceptance by biological collaborators when all that matters for the publication is the biology
      - Retaining quality work
          - Workflows poorly annotated in papers
          - Programs poorly written
          - No reproducibility
      - Keeping up-to-date
          - New programs are published every week
          - New formats because no time to evaluate existing standards
          - New databases because existing ones are full of noise
  • 6. Bioinformatics: a mythical creature?
      Christos Ouzounis, Head of the Computational Genomics Group at the European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
  • 7. Myth #1: Anybody can do it!
      Assumption: Most bioinformatics analysis can be done by using web applications and commercial programs with GUIs.
      Examples shown: http://www.broadinstitute.org/cancer/software/genepattern/index.html and http://main.g2.bx.psu.edu/
      Reference: Ouzounis C. Two or three myths about bioinformatics. Bioinformatics. 2000 Mar;16(3):187-9. PubMed PMID: 10869011.
  • 8. Customized answers require expert input
      Anyone can do predefined analysis with web pages and off-the-shelf programs; however:
      - Using tools without understanding the methodology is dangerous.
          - “Anyone” needs to understand algorithmic papers.
          - E.g. a program might produce output that has a certain bias; not knowing this, the researchers could publish this artificial bias as a biological result.
      - A standard bioinformatics tool that works well for general tasks does not exist.
          - E.g. Local-Alignment Algorithm (1981) vs. PCR (1983) in NGS
      - Only novel tools/pipelines can provide customized answers.
          - “Anyone” needs to be proficient in programming to write the required algorithms/scripts.
          - E.g. you might have to settle for comparing your features with known genes because the program is not able to compare to novel transcripts.
      Reference: Smith, Temple F.; and Waterman, Michael S. (1981). "Identification of Common Molecular Subsequences". Journal of Molecular Biology 147: 195–197.
  • 9. Myth #2: Bioinformatics is a service
      Assumption: Bioinformatics merely supports the experimental research and can be a disconnected service.
      Diagram labels, Traditional Biology: Hypothesis, Experiment, Eyeballing, Experimental Design, Evaluation, Biology
      Diagram labels, High Throughput Biology (assumption): Experiment, Biology, Hypothesis, Experimental Design, Evaluation, Data analysis, Bioinformatics
  • 10. Interdisciplinary analysis requires an interdisciplinary team throughout!
      Standard data analysis can be a service task; however:
      - Having a service performed without knowing the methodology is dangerous.
          - The “service” needs to make scientific decisions that take into account the assumptions under which the data was produced.
          - E.g. repeating a statistical test for all genes requires an E-value to be calculated.
      - Producing data not suitable for the planned analysis is wasteful.
          - The “service” needs to have scientific input in the experimental design to ensure the data can be analyzed.
          - E.g. comparing the distribution of mapped reads of runs with different read lengths will result in a difference that is due to the mapping bias of different read lengths.
      Diagram labels, High Throughput Biology: Experiment, Analysis, Evaluation, Experimental Design, Hypothesis
  • 11. Myth #4: Bioinformatics is quick
      Assumption: Bioinformatics analysis can be done quickly because computers are involved.
      Image source: http://www.ads-links.com/images/wp/keyboard-fast.gif
  • 12. Bioinformatics analysis is a scientific experiment in itself
      Bioinformatics is faster than manual work; however:
      - Quick tasks, accumulated, take a long time.
          Task: map 15 million reads of 76 bp length against the complete human genome (hg18)
          - Manual: couple of decades
          - Brute force: couple of years
          - BLAST (1990): couple of days
          - Modern aligners: BWA ~ 4 h
      - Bioinformatics is a proper scientific experiment in itself and requires time for experimental design, development of controls, parameter tuning, evaluation, and summarizing.
      Reference: Bao S, Jiang R, Kwan W, Wang B, Ma X, Song YQ. Evaluation of next-generation sequencing software in mapping and assembly. J Hum Genet. 2011 Apr 28. PubMed PMID: 21525877.
  • 13. Myth #5: All dry-lab research does the same
      Assumption: Bioinformatics is interchangeable with other dry-lab research areas because they all “analyze data”.
      Assumption: All biological research areas are interchangeable because they all “work with samples”.
  • 14. Three things to remember
      - Bioinformatics requires dedication and continuity.
      - Bioinformatics data analysis is a full research experiment in itself.
      - We get the most out of our research if we work as an interdisciplinary research team throughout.
      Diagram labels: Experiment, Analysis, Evaluation, Experimental Design, Hypothesis
  • 15. Next week
      Abstract: An introduction to second generation sequencing will be given with a focus on the production informatics. The basic approach of read mapping and feature extraction will be introduced, and challenges associated with sequencing errors will be discussed.
      http://web.qbi.uq.edu.au/labs/gseq/analysis/bioinformatics-seminar-series/
  • 16. TIP (ImageMagick)
      Puts several images in one file (one image per page):
          convert -adjoin unicorn.png unicorn.png unicorn.png adjoin.pdf
      Joins several images into one image:
          convert -append unicorn.png unicorn.png unicorn.png append.pdf
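
The computer-solvable task on slide 4 (“Align reads and identify clusters in the genome”) can be illustrated with a small sketch. It assumes the reads are already aligned to one chromosome and reduced to their start coordinates; the function name `find_read_clusters` and the parameters `max_gap` and `min_reads` are hypothetical and chosen only for illustration. Real ChIP-Seq peak callers also model background signal and strand information, which this sketch ignores.

```python
from typing import List, Tuple


def find_read_clusters(
    positions: List[int], max_gap: int = 100, min_reads: int = 5
) -> List[Tuple[int, int, int]]:
    """Group read start positions into clusters.

    Consecutive reads (after sorting) belong to the same cluster when they
    are at most `max_gap` bases apart. Only clusters with at least
    `min_reads` reads are reported, as (start, end, read_count).
    Both thresholds are illustrative assumptions, not recommended values.
    """
    clusters: List[Tuple[int, int, int]] = []
    if not positions:
        return clusters

    ordered = sorted(positions)
    start = prev = ordered[0]
    count = 1
    for pos in ordered[1:]:
        if pos - prev <= max_gap:
            count += 1  # read extends the current cluster
        else:
            if count >= min_reads:
                clusters.append((start, prev, count))
            start = pos  # open a new cluster at this read
            count = 1
        prev = pos
    if count >= min_reads:  # close the final cluster
        clusters.append((start, prev, count))
    return clusters


if __name__ == "__main__":
    # Made-up coordinates: two dense groups and one isolated read.
    reads = [100, 110, 120, 130, 140, 5000, 9000, 9010, 9020, 9030, 9050]
    for start, end, n in find_read_clusters(reads):
        print(f"cluster {start}-{end}: {n} reads")
```

Even this toy version forces decisions (gap size, minimum support) that change which binding sites are reported, which is the kind of choice slide 8 argues needs expert input.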
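
Slide 10 notes that repeating a statistical test for all genes requires an E-value. One common reading, assumed here, is that the E-value is the expected number of false positives: the per-test p-value multiplied by the number of tests. The sketch below shows that arithmetic together with the related Bonferroni threshold; the gene count of 20,000 is a hypothetical round number, not a figure from the slides.

```python
def expected_false_positives(p_value: float, num_tests: int) -> float:
    """E-value under the assumption E = p * N (expected hits by chance)."""
    return p_value * num_tests


def bonferroni_threshold(alpha: float, num_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate <= alpha."""
    return alpha / num_tests


if __name__ == "__main__":
    num_genes = 20_000  # hypothetical number of genes tested
    p = 0.001           # a p-value that looks convincing for a single test

    e_value = expected_false_positives(p, num_genes)
    cutoff = bonferroni_threshold(0.05, num_genes)

    print(f"Expected false positives at p={p}: about {e_value:.0f}")
    print(f"Bonferroni per-gene cutoff for alpha=0.05: {cutoff:.2e}")
```

A p-value that would be persuasive for one gene is expected to occur by chance for roughly 20 genes at this scale, which is why the analysis cannot be treated as a methodology-free service.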
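
The runtime gap on slide 12 between brute force and modern aligners comes largely from indexing the reference instead of scanning it for every read. The sketch below contrasts the two ideas for exact matches only. It uses a simple hash-based k-mer index as a stand-in; BWA itself relies on a Burrows-Wheeler-transform index and tolerates mismatches, neither of which is shown here. All function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def map_brute_force(read: str, reference: str) -> List[int]:
    """Try every reference position; cost grows with len(reference) per read."""
    hits = []
    for i in range(len(reference) - len(read) + 1):
        if reference[i:i + len(read)] == read:
            hits.append(i)
    return hits


def build_kmer_index(reference: str, k: int) -> Dict[str, List[int]]:
    """Record every start position of every k-mer once, up front."""
    index: Dict[str, List[int]] = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def map_with_index(
    read: str, reference: str, index: Dict[str, List[int]], k: int
) -> List[int]:
    """Look up the read's first k-mer, then verify only those candidates."""
    if len(read) < k:
        return map_brute_force(read, reference)  # read too short to seed
    hits = []
    for i in index.get(read[:k], []):
        if reference[i:i + len(read)] == read:
            hits.append(i)
    return hits


if __name__ == "__main__":
    reference = "ACGTACGTTAGCACGT"  # toy sequence, not real genomic data
    k = 3
    index = build_kmer_index(reference, k)
    for read in ("ACGT", "TTAG", "GGGG"):
        print(read, map_brute_force(read, reference),
              map_with_index(read, reference, index, k))
```

With 15 million reads and a reference of roughly 3 billion bases, the brute-force variant would attempt on the order of 4.5 × 10^16 candidate positions, whereas the indexed variant pays the scan once when building the index and then checks only a handful of candidates per read.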

Editor's Notes

  • #2: http://www.freevector.com/site_media/preview_images/FreeVector-Mythical-Creatures.jpg
  • #6: “An experiment is reproducible until another laboratory tries to repeat it.” (Alexander Kohn)
  • #7: Discuss some of the points he raised
  • #15: Your research has more impact when you team up with a bioinformatician.