This guide teaches you one component of metagenomic analysis: how to plot abundances at the “phylum” level for each metagenomics sample.
An instance of the R editor RStudio is already running on AWS and waiting for you to use. To access it, open the URL provided in the Google Sheet in a web browser.
Log in to your assigned RStudio instance.
In this section, you will learn how to use R to generate two plots that help visualize the similarities and differences between a subset of the Metasub samples and compare them to metagenomics samples from the Human Microbiome Project.
We have provided a subset of metagenomics data (taxa_table.csv) and one with, for your convenience.
First, we will plot our samples as a stacked bar plot. A stacked bar plot shows a set of numbers as a series of segments stacked on top of each other in one column, with each segment colored by its label. In our case, each sample gets one column: its taxonomic profile is a set of numerical abundances, each labeled by the microbial taxon it belongs to.
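To make the idea of a taxonomic profile concrete, here is a minimal sketch of the table layout the plotting code expects. The column names (sample, taxon, rank, percent_abundance) are the ones used in the guide's code; the sample names and abundance values are invented for illustration and do not come from taxa_table.csv.

```r
# Toy taxonomic table in long format: one row per (sample, taxon) pair.
# Column names match the guide's plotting code; all values are invented.
toy <- data.frame(
  sample = c("s1", "s1", "s1", "s2", "s2", "s2"),
  taxon = c("Firmicutes", "Proteobacteria", "Actinobacteria",
            "Firmicutes", "Proteobacteria", "Actinobacteria"),
  rank = "phylum",
  percent_abundance = c(50, 30, 20, 10, 65, 25),
  stringsAsFactors = FALSE
)

# Cross-tabulate: one row per sample, one column per phylum.
# Each row corresponds to one stacked column in the plot.
profile <- xtabs(percent_abundance ~ sample + taxon, data = toy)
print(profile)

# The height of each stacked column is the sum of that sample's abundances.
column_heights <- rowSums(profile)
print(column_heights)
```

Each row of `profile` is what one column of the stacked bar plot displays: the segments are the per-phylum abundances, and their sum is the total height of the column.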
Here’s how we make this plot in R:
library(ggplot2) # This line loads the ggplot2 plotting library into our environment
taxa = read.csv('taxa_table.csv', header=TRUE, sep=',') # Read our taxonomic table from a file into a data frame
taxa = taxa[taxa$rank == 'phylum',] # This filters our taxonomic table to a specific taxonomic rank. The rank is one of Kingdom, Phylum, Class, Order, Family, Genus, Species (written as it appears in the table's rank column, e.g. 'phylum'). Play around with a few other ranks.
ggplot(taxa, aes(x=sample, y=percent_abundance, fill=taxon)) + # this creates a ggplot object. Can you figure out what the aes(...) section is doing?
geom_bar(stat="identity") + # this tells ggplot how we want our data to be displayed
xlab('Sample') + # These lines tell ggplot what our axis labels should be
ylab('Abundance') +
labs(fill='Phylum') +
theme(axis.text.x = element_text(angle = 90)) # this rotates the x-axis text 90 degrees
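If you experiment with other ranks, a misspelled rank silently produces an empty table and an empty plot. The sketch below shows one way to guard against that. The helper `filter_rank` and the example data are hypothetical and not part of the workshop materials; only the column names are taken from the code above.

```r
# Hypothetical helper (not part of the guide): filter a taxonomic table
# to one rank, stopping with a readable message if the rank is absent.
filter_rank <- function(taxa, wanted_rank) {
  available <- unique(taxa$rank)
  if (!(wanted_rank %in% available)) {
    stop("rank '", wanted_rank, "' not found; available ranks: ",
         paste(available, collapse = ", "))
  }
  taxa[taxa$rank == wanted_rank, ]
}

# Invented example data with two ranks mixed together.
example_taxa <- data.frame(
  sample = c("s1", "s1", "s1", "s2", "s2", "s2"),
  taxon = c("Firmicutes", "Proteobacteria", "Bacilli",
            "Firmicutes", "Proteobacteria", "Bacilli"),
  rank = c("phylum", "phylum", "class", "phylum", "phylum", "class"),
  percent_abundance = c(60, 40, 55, 30, 70, 25),
  stringsAsFactors = FALSE
)

phyla <- filter_rank(example_taxa, "phylum")

# Total abundance per sample at the chosen rank, as a quick sanity check
# on what the stacked columns will add up to.
totals <- aggregate(percent_abundance ~ sample, data = phyla, FUN = sum)
print(totals)
```

With the real data, you could call `filter_rank(taxa, 'phylum')` in place of the filtering line, assuming `taxa` has been read with `read.csv` as shown above.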
Here is a video of this step being performed: https://youtu.be/WABocs_Gu4Y.
After this is complete: