Showing posts with label phyloinformatics. Show all posts

Friday, February 10, 2012

BLAST a sequence and get a tree and a map

I've updated the "BLAST a sequence and get a tree" tool described in a previous post to output additional details, such as a list of the sequences used to build the tree and some basic metadata (the taxon name, the name of any associated host, the publication, and geographic coordinates). If the sequences are geotagged, you will also see a little map showing the localities. As ever, all this relies on SVG, so if your browser doesn't support that you won't see much.

The example below is for the sequence EU399074, which falls in a cluster of "dark taxa"; in this case, DNA barcode sequences that haven't been properly labelled.

Blastmap
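For anyone wanting to try something similar, here is a minimal sketch of turning a geotag into map coordinates. It assumes the coordinates arrive as GenBank `lat_lon`-style strings such as `47.94 N 28.12 W` (the post does not say where the tool gets them from), and `parse_lat_lon` is a made-up helper name, not part of the tool.

```python
import re

# Assumed input format: "<degrees> N|S <degrees> E|W", e.g. "47.94 N 28.12 W".
_LAT_LON = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s*([NS])\s+(\d+(?:\.\d+)?)\s*([EW])\s*$"
)


def parse_lat_lon(value):
    """Return (latitude, longitude) in signed decimal degrees, or None."""
    match = _LAT_LON.match(value)
    if match is None:
        return None  # not geotagged, or in a format this sketch ignores
    lat = float(match.group(1)) * (1 if match.group(2) == "N" else -1)
    lon = float(match.group(3)) * (1 if match.group(4) == "E" else -1)
    return lat, lon


print(parse_lat_lon("47.94 N 28.12 W"))  # (47.94, -28.12)
print(parse_lat_lon("unknown"))          # None
```

Sequences for which this returns `None` would simply be left off the map.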

Thursday, February 02, 2012

Browsing TreeBASE using a genome browser-like interface

One of the things I find frustrating about TreeBASE is that there's no easy way to get an overview of what it contains. What is its taxonomic coverage like? Is it dominated by plants and fungi, or are there lots of animal trees as well? Are there obvious gaps in our phylogenetic knowledge, or do the phylogenies it contains pretty much span the tree of life?

As part of my phyloinformatics course I've put together a simple browser to navigate through TreeBASE. The inspiration comes from genome browsers (e.g., the UCSC Genome Browser) where the genome is treated as a linear set of co-ordinates, and features of the genome are displayed as "tracks".


For my browser, I've used the order in which nodes appear in the NCBI tree as you go from left to right as the set of co-ordinates (actually, from top to bottom as my browser displays the co-ordinate axis vertically).
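As a rough sketch of how such co-ordinates might be computed (this is not the code behind the browser), one option is to number the nodes in a depth-first, left-to-right walk and record for each taxon the range of numbers it covers. The function name `assign_spans` and the toy classification are invented for illustration.

```python
def assign_spans(children, root):
    """Map each node to (left, right) positions from a depth-first walk.

    `children` maps a node name to the ordered list of its child names.
    A node's span covers its own position and those of all descendants.
    """
    spans = {}
    counter = 0

    def visit(node):
        nonlocal counter
        left = counter
        counter += 1
        for child in children.get(node, []):
            visit(child)
        spans[node] = (left, counter - 1)

    visit(root)
    return spans


# Toy classification (hypothetical, not real NCBI data).
toy_tree = {
    "root": ["Fungi", "Metazoa"],
    "Fungi": ["Ascomycota", "Basidiomycota"],
    "Metazoa": ["Arthropoda", "Chordata"],
}
spans = assign_spans(toy_tree, "root")
print(spans["Fungi"])    # (1, 3)
print(spans["Metazoa"])  # (4, 6)
```

With numbering like this, "drilling down" into a taxon amounts to zooming the axis to that taxon's span.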

Browser

I then place each TreeBASE tree within this classification by taking the TreeBASE → NCBI mapping provided by TreeBASE and finding the "majority rule" taxon for each tree (in a sense, the taxon that summarises what the tree is about). Each tree is represented by a vertical line depicting the span of the corresponding NCBI taxon (corresponding to a "track" in a genome browser). Taking the majority-rule taxon rather than, say, the span of the tree makes it possible to pack the vertical lines tightly together so that they take up less space (the ordering from left to right is determined by the NCBI taxonomy).
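The post doesn't spell out how the "majority rule" taxon is computed. One plausible reading, sketched below purely as an assumption, is "the most specific NCBI taxon that contains more than half of the tree's mapped taxa". The function name and the example lineages are made up.

```python
from collections import Counter


def majority_rule_taxon(lineages):
    """Return the deepest taxon found in more than half of the lineages.

    Each lineage is a list of taxon names ordered from root to tip,
    one lineage per taxon in the tree. Returns None for an empty input.
    """
    counts = Counter()
    depth = {}
    for lineage in lineages:
        for level, taxon in enumerate(lineage):
            counts[taxon] += 1
            depth[taxon] = level
    majority = [t for t, c in counts.items() if c > len(lineages) / 2]
    if not majority:
        return None
    # Taxa held by a majority all lie on one root-to-tip path,
    # so the deepest one is the most specific summary.
    return max(majority, key=lambda t: depth[t])


lineages = [
    ["Eukaryota", "Metazoa", "Arthropoda", "Insecta"],
    ["Eukaryota", "Metazoa", "Arthropoda", "Insecta"],
    ["Eukaryota", "Metazoa", "Arthropoda", "Arachnida"],
    ["Eukaryota", "Metazoa", "Chordata", "Mammalia"],
]
print(majority_rule_taxon(lineages))  # Arthropoda
```

Here Insecta covers only half of the tips, so the summary backs off to Arthropoda, which covers three of the four.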

If you mouse-over a vertical bar you can see the title of the study that published the tree. If you click on the vertical bar you'll see the tree displayed on the right (if your web browser understands SVG, that is). If you click on the background you will drill down a level in the NCBI classification. To go back up the classification, click on the arrow at the top left of the browser.

This is all very preliminary, but you can take it for a spin at http://iphylo.org/~rpage/phyloinformatics/treebase/.

Below is a short video walking you through some examples.

Monday, January 30, 2012

BLAST a sequence and get a tree

For this week's sessions of my phyloinformatics course I'm developing some phylogeny tools. The first is a simple AJAX-based BLAST tool. I've always wanted a quick way to see a GenBank sequence in its phylogenetic context, so I've built a simple tool that takes a GenBank accession number or GI number, submits a BLAST job, retrieves the sequences, aligns them using CLUSTALW, builds a quick and dirty neighbour-joining tree using PAUP*, then displays the tree using SVG (if your browser doesn't support this you won't see the tree). One use for this is to quickly get a sense of whether an unnamed ("dark") taxon is related to sequences that have been identified.
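To give a feel for the tree-building step, here is a minimal sketch of textbook neighbour joining on a small distance matrix. It is not what PAUP* does internally and not the tool's code; the function name, the input layout (a dict keyed by `frozenset` pairs) and the toy distances are all assumptions made for illustration.

```python
def neighbour_joining(names, dist):
    """Build a Newick string by neighbour joining.

    `names` is a list of labels; `dist` maps frozenset({a, b}) to the
    distance between a and b. The tree is unrooted; the Newick string
    is simply rooted at the final join.
    """
    nodes = list(names)
    d = dict(dist)

    def get(a, b):
        return d[frozenset((a, b))]

    while len(nodes) > 2:
        n = len(nodes)
        # Total distance from each node to all the others.
        r = {a: sum(get(a, b) for b in nodes if b != a) for a in nodes}
        # Find the pair minimising the Q criterion.
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * get(a, b) - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        dab = get(a, b)
        # Branch lengths from the new internal node to a and to b.
        la = 0.5 * dab + (r[a] - r[b]) / (2 * (n - 2))
        lb = dab - la
        new = f"({a}:{la:g},{b}:{lb:g})"
        # Distances from the new node to every remaining node.
        for c in nodes:
            if c not in (a, b):
                d[frozenset((new, c))] = 0.5 * (get(a, c) + get(b, c) - dab)
        nodes = [c for c in nodes if c not in (a, b)] + [new]

    a, b = nodes
    return f"({a}:{get(a, b):g},{b});"


# Toy distances generated from the tree ((A:1,B:2):3,(C:4,D:5)).
pairs = {("A", "B"): 3, ("A", "C"): 8, ("A", "D"): 9,
         ("B", "C"): 9, ("B", "D"): 10, ("C", "D"): 9}
dist = {frozenset(p): v for p, v in pairs.items()}
tree = neighbour_joining(["A", "B", "C", "D"], dist)
print(tree)  # ((A:1,B:2):3,(C:4,D:5));
```

In the real pipeline the distances would come from the CLUSTALW alignment of the BLAST hits rather than from a hand-made table.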

Nothing fancy, but it was a chance to display the whole process in the browser without opening new windows or refreshing the page. Here's an example for the GenBank sequence FJ559186:



For the technically-minded, the calls to BLAST and the alignment and tree construction tools all use AJAX, and there's a simple JavaScript timer that counts down the seconds that the NCBI BLAST web service estimates the BLAST job will take, before we poll NCBI to see if the job has in fact finished. The code is in GitHub.
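The wait-then-poll logic can be sketched as follows. The actual tool does this in the browser with JavaScript; the Python below is only an illustration. It assumes the submission response carries `RID = ...` and `RTOE = ...` lines (the request ID and the estimated seconds to completion) and that a status check reports `WAITING` or `READY`. The function names, the `check_status` callback and the canned response are hypothetical, and no network calls are made.

```python
import re
import time


def parse_submission(text):
    """Extract (request ID, estimated seconds) from a submission response."""
    rid = re.search(r"RID\s*=\s*(\S+)", text)
    rtoe = re.search(r"RTOE\s*=\s*(\d+)", text)
    if rid is None or rtoe is None:
        raise ValueError("no RID/RTOE found in response")
    return rid.group(1), int(rtoe.group(1))


def wait_for_job(rid, rtoe, check_status, sleep=time.sleep,
                 poll_interval=5, max_polls=60):
    """Wait the estimated time, then poll until the job is ready.

    `check_status(rid)` is a caller-supplied function returning
    "WAITING" or "READY". Returns True if the job finished and
    False if we gave up after `max_polls` checks.
    """
    sleep(rtoe)  # the countdown: trust the server's estimate first
    for _ in range(max_polls):
        status = check_status(rid)
        if status == "READY":
            return True
        if status != "WAITING":
            raise RuntimeError(f"unexpected job status: {status}")
        sleep(poll_interval)
    return False


# Usage with a canned response and a fake status checker.
response = "<!--QBlastInfoBegin\n    RID = ABC123\n    RTOE = 12\nQBlastInfoEnd\n-->"
rid, rtoe = parse_submission(response)
statuses = iter(["WAITING", "READY"])
naps = []  # record the requested sleeps instead of actually sleeping
finished = wait_for_job(rid, rtoe, lambda _rid: next(statuses),
                        sleep=naps.append)
print(rid, rtoe, finished, naps)  # ABC123 12 True [12, 5]
```

Passing `sleep` and `check_status` in as arguments keeps the waiting logic separate from the HTTP calls, which also makes it easy to test without hitting NCBI.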

Monday, January 23, 2012

Open course on phyloinformatics

As part of a postgraduate course here at the University of Glasgow I'm teaching five sessions on "phyloinformatics", which I've decided to define broadly enough to encompass most of biodiversity informatics.

Given that this module is being developed on the fly, and will make use of lots of little "toys" I've developed and discussed on this blog, I've decided to put the course notes online, along with the interactive demos and the source code. So, if you want to follow along for the next couple of weeks, here are the links:



Each course page supports comments (see the bottom of the page), so feel free to add comments or suggestions. The notes are at a crude stage, and will be developed over the duration of the course (2 weeks). I'm also endeavouring to get all the source code for the demonstration apps into GitHub. None of these demos is polished, but they will hopefully provide some ideas for taking them further. There will be iSpecies-like mashups, iPad webapps, classification visualisations, TreeBASE search tools, geophylogenies and other phylogeny viewers.

Thursday, November 15, 2007

Phyloinformatics workshop online


Slides from the recent Phyloinformatics workshop in Edinburgh are now online at the e-Science Institute. In case the e-Science Institute site disappears I've posted the slides on slideshare.


Heiko Schmidt has also posted some photos of the proceedings, demonstrating how distraught the participants were that I couldn't make it.

Monday, October 22, 2007

Phyloinformatics workshop - primal scream

Argh!!! The phyloinformatics workshop at Edinburgh's eScience Centre is underway (program of talks available here as an iCalendar file), and I'm stranded in Germany for personal reasons I won't bore readers with. The best and brightest gather less than an hour from my home town to talk about one of my favourite subjects, and I can't be there. Talk about frustration!

How can they possibly proceed without yours truly to interject "it sucks" at regular intervals? What, things are going just fine? Next, you'll be suggesting that Systematic Biology can function without me as editor … wait, what's that you say? Jack's running the show without a hitch … gack, I'm redundant.