Sunday, September 11, 2011
Sunday Spinelessness - Visualising fungal communities
If you read the Sunday post last week you'll remember that I started a little "when my brains are completely destroyed by thesis wrangling and I need a break" project, and that I'm the sort of person that takes a break from one science project by playing around with another science project. That's just how sad I am. Anyway, last week I set out to use existing records in GenBank to compare the diversity of fungal species living on the roots of various species of tree in New Zealand. And I failed. I only managed to find records for one species, silver beech, so all I could really say was that there seemed to be a lot of different fungal species on this tree.
Inevitably, I found myself needing a break from thinking about my snails this week, so I had another crack at comparing fungal diversity by host. In particular, I've filtered through hundreds of records of fungi collected from New Zealand to isolate those collected from natural southern beech (Nothofagus) forests or plantation pine forests. The mycorrhizal fungi I talked about last week are generally considered to be highly host-specific and unable to form relationships with off-host species. If that's true we should be able to see that the community of fungi recorded for each forest type is quite distinct. But how can we see that phenomenon? Last week I used a graph of the frequency of different taxonomic families to show how diverse the community living on silver beech was, but taxonomic ranks above species don't represent anything real about biology or biodiversity. I have argued species are natural units of biodiversity (even if we can struggle mightily to identify those units), but most of the sequences I've found aren't annotated down to this level (in fact, most probably represent undescribed species). So, I gave up on a 'unit of biodiversity' and instead only included sequences for a particular stretch of DNA loved by fungal geneticists called the Internal Transcribed Spacer (ITS). Using just these sequences, I can make a phylogenetic tree, which attempts to relate DNA sequences to each other based on their similarity.
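To make "relate DNA sequences to each other based on their similarity" a bit more concrete, here is a minimal Python sketch of the step most distance-based tree methods start from: a matrix of pairwise differences between aligned sequences. It is an illustration rather than the code behind this post's tree. The function names and the toy alignment are made up, it assumes the sequences have already been aligned to the same length, and it uses the simplest possible measure (the proportion of differing sites) where a real analysis would more likely use a substitution model.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    return differences / compared if compared else 0.0


def distance_matrix(alignment):
    """Build a symmetric dict-of-dicts distance matrix from {name: sequence}."""
    names = sorted(alignment)
    dist = {name: {name: 0.0} for name in names}
    for a, b in combinations(names, 2):
        d = p_distance(alignment[a], alignment[b])
        dist[a][b] = d
        dist[b][a] = d
    return dist


# Made-up toy alignment, NOT real ITS data.
toy_alignment = {
    "beech_1": "ACGTACGTAC",
    "beech_2": "ACGTACGTAA",
    "pine_1": "ACGAACTTGC",
}
toy_distances = distance_matrix(toy_alignment)
for name, row in toy_distances.items():
    print(name, {other: round(d, 2) for other, d in row.items()})
```

In the toy data the two "beech" sequences differ at one site in ten, while the "pine" sequence differs from each of them at three or four sites, which is the sort of signal a tree-building method turns into branches.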
So here's the tree, drawn as a big circle. Each tip represents a single DNA sequence and is shaded according to the forest it comes from - brown for pine, green for beech. As you can see, the pine and the beech forests have very different fungal communities. There are whole swathes of the tree that are unique to beech forests (although, of course, that could be an artifact of how much effort people have put into sampling each forest type) and whenever you see a brown branch within a predominantly green section of the tree, that branch is substantially distinct from its beech-living relatives. (click on the image to be taken to an interactive version, where you can add information on host species or change the shape of the tree):
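The shading itself is just a lookup from each tip to a colour. As a rough Python sketch, assuming each sequence ID has already been tagged with its forest type: the IDs, the hex colours and the two-column CSV layout below are all made up for illustration, and the exact file format a tree viewer expects should be checked against its own documentation.

```python
import csv
import io

# Hypothetical colour choices: brown for pine, green for beech.
FOREST_COLOURS = {"pine": "#8B4513", "beech": "#228B22"}


def tip_colour_rows(tip_forests, colours=FOREST_COLOURS):
    """Return (tip_id, hex_colour) pairs, sorted by tip ID."""
    return [(tip, colours[forest]) for tip, forest in sorted(tip_forests.items())]


def write_colour_csv(rows, handle):
    """Write one 'tip_id,colour' line per tip to an open text handle."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerows(rows)


# Made-up sequence IDs, NOT real GenBank accessions.
toy_tips = {"seq_002": "pine", "seq_001": "beech", "seq_003": "beech"}

buffer = io.StringIO()
write_colour_csv(tip_colour_rows(toy_tips), buffer)
colour_csv = buffer.getvalue()
print(colour_csv)
```

Swapping `io.StringIO()` for a file opened with `open("colours.csv", "w", newline="")` would write the same lines to disk.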
So that's fun. There's still a lot more that could be done with the data set. I haven't included much data about the fungi themselves - it would be interesting, for instance, to see if the fungi living in the roots of trees showed more or less specificity than those living elsewhere. It might also be possible to use these sequences to estimate the number of species they represent using some of the new-fangled species delimitation methods the DNA Barcoders have come up with. There are also other natural forest types in New Zealand that might be interesting to include. Both Manuka and Kanuka have mycorrhizal fungi, and including those species might help us to understand if the differences displayed above are about natural v plantation forests or about host-specificity in the fungal species.
I didn't include any code snippets today, because I've set up a GitHub repository as an 'open record book' instead. If you're interested in the process or the code that went into this you can check it out there (though I should warn you, there's nothing very clever going on).
Labels: biopython, environment and ecology, fungi, geekery, python, sci-blogs, science, sunday spinelessness
4 Comments:
I like the graph, what package is that from?
Hi Pete,
It's from iTOL, which is written explicitly for phylogenetic trees. But I'll bet you could save any cluster diagram as a tree file using write.tree() in ape. You just need csv files for the shading after that (and there are lots of different plot types).
Is that the "plot(tr, ...)" bit? Is "tr" an instance of some sort of tree class?
Ah, yes. I see the "mess about interactively and occasionally copy something into a text file" method doesn't make the most reproducible research :)
Yeah, 'tr' is an instance of ape's phylogenetic tree class, which was made with nj() (neighbor joining on a distance matrix). I saved the 'tr' object and uploaded it to iTOL to make the graph. Reading ape's documentation, it can take hclust instances and write them in the formats iTOL deals in.
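For anyone curious what "neighbor joining on a distance matrix" involves, here is a minimal pure-Python sketch of the algorithm that writes its result as a Newick string (the plain-text tree format that tree files usually use). It is not ape's nj() and not the code used for the tree in the post. The function name and the toy distances are made up, tip names are assumed not to contain Newick punctuation, and on real, noisy distances the method can return negative branch lengths, which this sketch makes no attempt to handle.

```python
def neighbor_joining(names, dist):
    """Return a Newick string for an unrooted neighbor-joining tree.

    names: list of at least three tip labels
    dist:  dict of dicts with dist[a][b] == dist[b][a]
    """
    if len(names) < 3:
        raise ValueError("need at least three tips")
    # Nodes are labelled by their own Newick fragment, so joined
    # clusters carry their subtree around with them.
    nodes = list(names)
    d = {a: {b: dist[a][b] for b in names if b != a} for a in names}

    while len(nodes) > 3:
        n = len(nodes)
        totals = {a: sum(d[a].values()) for a in nodes}
        # Pick the pair minimising Q(a, b) = (n - 2) d(a, b) - r_a - r_b.
        best = None
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                q = (n - 2) * d[a][b] - totals[a] - totals[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Branch lengths from the joined pair to their new parent node.
        len_a = 0.5 * d[a][b] + (totals[a] - totals[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        new = f"({a}:{len_a:.4f},{b}:{len_b:.4f})"
        others = [c for c in nodes if c not in (a, b)]
        # Distances from the new node to every remaining node.
        d[new] = {}
        for c in others:
            d_new = 0.5 * (d[a][c] + d[b][c] - d[a][b])
            d[new][c] = d_new
            d[c][new] = d_new
            del d[c][a], d[c][b]
        del d[a], d[b]
        nodes = others + [new]

    # Three nodes left: join them at a central trifurcation.
    a, b, c = nodes
    len_a = 0.5 * (d[a][b] + d[a][c] - d[b][c])
    len_b = 0.5 * (d[a][b] + d[b][c] - d[a][c])
    len_c = 0.5 * (d[a][c] + d[b][c] - d[a][b])
    return f"({a}:{len_a:.4f},{b}:{len_b:.4f},{c}:{len_c:.4f});"


# Made-up distances that fit a tree exactly: A and B are neighbours,
# as are C and D, with an internal branch of length 3 between the pairs.
toy_names = ["A", "B", "C", "D"]
toy_dist = {
    "A": {"B": 3, "C": 5, "D": 6},
    "B": {"A": 3, "C": 6, "D": 7},
    "C": {"A": 5, "B": 6, "D": 3},
    "D": {"A": 6, "B": 7, "C": 3},
}
toy_newick = neighbor_joining(toy_names, toy_dist)
print(toy_newick)
```

Because the toy distances were built from a known tree, the output recovers it exactly: A and B pair up with branches of 1 and 2, C and D do the same, and the two pairs are separated by a branch of length 3.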