Another day, another round of new stuff learned about R. I've written a function that will calculate the means of a vector (y) aggregated by factor (z).
> avg = function(y,z) {
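# aggregate() returns a data frame; the per-group results are in column x
var1 = aggregate(y, by = list(z), FUN = mean, na.rm=T)
var1$x   # last expression is the return value (visible, unlike an assignment)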
}
The function returns a vector with one mean per level of z, in the order of the factor levels. Likewise, I wrote a function for the calculation of the standard error of the mean (SEM):
> sem = function(y,z) {
var1 = aggregate(y, by = list(z), FUN = sd, na.rm=T)
# count only non-missing values so that n matches what sd() actually used
var2 = aggregate(y, by = list(z), FUN = function(v) sum(!is.na(v)))
var1$x / sqrt(var2$x)
}
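As a quick sanity check, here is a small sketch of both helpers run on made-up data (the vectors y and z below are invented for illustration). The sem version here counts only non-missing values so that the sample size matches what sd(..., na.rm = TRUE) actually used; that choice is my assumption about the intended behaviour.

```r
# Made-up data: six measurements in two groups, one value missing
y <- c(2, 4, 6, 10, NA, 14)
z <- factor(c("a", "a", "a", "b", "b", "b"))

# Group means, returned as a plain numeric vector (one value per level of z)
avg <- function(y, z) {
  aggregate(y, by = list(z), FUN = mean, na.rm = TRUE)$x
}

# Standard error of the mean per group, using the count of non-missing values
sem <- function(y, z) {
  sds <- aggregate(y, by = list(z), FUN = sd, na.rm = TRUE)$x
  n   <- aggregate(y, by = list(z), FUN = function(v) sum(!is.na(v)))$x
  sds / sqrt(n)
}

group.means <- avg(y, z)   # group "a": 4, group "b": 12
group.sems  <- sem(y, z)   # group "a": 2 / sqrt(3), group "b": 2
```

Using length() for n would count the NA in group "b" as an observation and make that group's SEM look smaller than it should.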
So with these functions, I can easily calculate the means + SEM for a large dataset. For example:
> data.means = data.frame(avg(y1,z),avg(y2,z),avg(y3,z),avg(y4,z),avg(y5,z),avg(y6,z),avg(y7,z))
> data.sems = data.frame(sem(y1,z),sem(y2,z),sem(y3,z),sem(y4,z),sem(y5,z),sem(y6,z),sem(y7,z))
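With more than a handful of variables, typing out each avg() call gets tedious. One possible shortcut, sketched below on invented data, is to keep the measurements as columns of a data frame and let sapply() apply the helper to each column. The data frame dat, its column names, and its values are all hypothetical.

```r
# Hypothetical data: two measured variables and a grouping factor
dat <- data.frame(y1 = c(1, 3, 5, 7), y2 = c(10, 20, 30, 50))
z   <- factor(c("a", "a", "b", "b"))

# Same idea as the helper in the post: group means as a numeric vector
avg <- function(y, z) {
  aggregate(y, by = list(z), FUN = mean, na.rm = TRUE)$x
}

# Apply avg() to every column; the result is a matrix with one row per
# level of z and one column per variable
means <- sapply(dat, avg, z = z)
rownames(means) <- levels(z)
```

The column names carry over from dat, and the result is already a matrix rather than a data frame.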
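I can give the rows and columns appropriate headers like so, using the same names for both data frames:
> rownames(data.means) = c("Hibernation", "Post-hibernation", "Pre-hibernation")
> colnames(data.means) = c("endof", "dsb", "dmcbc", "dfreef", "csb", "cfreef", "cmcbc")
> rownames(data.sems) = rownames(data.means)
> colnames(data.sems) = colnames(data.means)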
So the formatted data frame will look like so:
> data.means
                    endof      dsb    dmcbc    dfreef      csb    cfreef    cmcbc
Hibernation      51.87500 49.06250 0.596250 22.851250 82.16667 11.875000 0.950000
Post-hibernation 33.86667 82.37778 1.233333  9.591111 85.81429  3.020000 1.335714
Pre-hibernation  13.60000 45.46250 0.551250  4.858750 84.30000  1.038333 1.058333
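I want to rearrange the data so that Pre-hibernation comes first, then Hibernation and Post-hibernation. I'll first assign a variable with my desired order, then rearrange both data frames with the same row index so that the means and SEMs stay aligned.
> state.order = c("Pre-hibernation", "Hibernation", "Post-hibernation")
> idx = match(state.order, rownames(data.means))
> data.means = data.means[idx, , drop=F]
> data.sems = data.sems[idx, , drop=F]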
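The rows come out alphabetically in the first place because aggregate() orders the groups by the levels of the factor, and factor levels default to sorted order. An alternative sketch, assuming z holds exactly these three state labels, is to set the level order up front so that no reshuffling is needed afterwards. The values below are invented.

```r
# Invented example values, one per state, listed in arbitrary order
y <- c(5, 1, 3)
z <- c("Post-hibernation", "Pre-hibernation", "Hibernation")

# Fix the level order explicitly instead of relying on alphabetical sorting
z <- factor(z, levels = c("Pre-hibernation", "Hibernation", "Post-hibernation"))

res <- aggregate(y, by = list(z), FUN = mean, na.rm = TRUE)
# res$Group.1 now follows the level order set above
```

The trade-off is that the ordering lives in the data rather than in the plotting code, so every later aggregate() call on z picks it up automatically.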
Now I want to plot all these data in a single bar plot. I'll have to convert the data frames into matrices because barplot() expects a vector or a matrix and can't use data frames.
> data.means = data.matrix(data.means, rownames.force = T)
> data.sems = data.matrix(data.sems, rownames.force = T)
Now to plot the means, add the SEM as error bars (error.bar() is a user-defined helper, not a base R function), and add a legend to the top right of the plot:
> means.bp = barplot(data.means, main = "FreeF By State", ylab = "FreeF", xlab = "State", ylim = c(0, 120), beside = T, names.arg = colnames(data.means), axis.lty = 1, col = gray.colors(3))
> error.bar(means.bp,data.means,data.sems)
> legend("topright", rownames(data.means), fill=gray.colors(3))
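Since error.bar() is not part of base R, it has to be defined before the call above will run. The post does not show its definition; a common way to write such a helper, and only my assumption about what was used here, is a thin wrapper around arrows() that draws flat-capped segments above and below each bar. The argument names and the matrices below are invented for illustration.

```r
# Hypothetical helper: draw error bars at bar midpoints x, centred on
# heights y, extending `upper` above and `lower` below each bar top
error.bar <- function(x, y, upper, lower = upper, cap.length = 0.05, ...) {
  if (length(x) != length(y) || length(y) != length(upper) ||
      length(upper) != length(lower)) {
    stop("x, y, upper and lower must have the same length")
  }
  # angle = 90 with code = 3 gives a flat cap at both ends of each segment
  arrows(x, y + upper, x, y - lower, angle = 90, code = 3,
         length = cap.length, ...)
}

# Invented means and SEMs: 2 groups (rows) by 3 variables (columns)
m <- matrix(c(5, 8, 3, 6, 9, 4), nrow = 2)
s <- matrix(0.5, nrow = 2, ncol = 3)

pdf(NULL)                                   # draw off-screen
bp <- barplot(m, beside = TRUE, ylim = c(0, 12))
error.bar(bp, m, s)
invisible(dev.off())
```

The helper relies on barplot() returning the bar midpoints, which is why the plot is assigned to a variable before the error bars are drawn.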
I've added the argument beside = T to the bar plot so that R draws the bars side by side, grouped by the columns of the data matrix, instead of stacking them. The arguments col and fill do practically the same thing in barplot() and legend(), respectively: they designate the colour scheme of the bars. In this case, I'm using R's preset grayscale colour scheme, gray.colors().
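To make the effect of beside a bit more concrete, the sketch below (on an invented matrix with made-up group labels) compares what barplot() returns in the two modes. With beside = TRUE it returns a matrix of midpoints with one entry per bar, which is what lets error bars line up with individual bars; with the default stacked layout it returns one midpoint per column.

```r
m <- matrix(1:6, nrow = 3)   # invented: 3 groups (rows) by 2 variables (columns)
shades <- gray.colors(3)     # three grey shades, one per row of m

pdf(NULL)                    # draw off-screen so no window or file is needed
mid.beside  <- barplot(m, beside = TRUE,  col = shades)
mid.stacked <- barplot(m, beside = FALSE, col = shades)
legend("topright", legend = c("g1", "g2", "g3"), fill = shades)
invisible(dev.off())

dim(mid.beside)      # same shape as m: one midpoint per bar
length(mid.stacked)  # one midpoint per column of m
```

Passing the same colour vector to col and fill is what keeps the legend swatches in step with the bars.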