Step-by-Step Barplots for One Factor in R

In this tutorial we are going to build barplots to show the results of a one-factor experiment that measured the release of radon in showers with different aperture diameters. The data were published in the journal Environment International [1].

We are going to use a table with summarised data: the shower diameter, the mean and standard deviation of the radon released, and the compact letter display indicating the significant differences by Tukey’s test.

You can download the summarised table here, or you can go to the tutorial on One-Way ANOVA to see how to create it.

We are going to start by loading the appropriate libraries: readr to load the data from a CSV file, and ggplot2 to build the plots.

# loading the appropriate libraries
library(readr)
library(ggplot2)

# loading and checking the data
radon_summary <- read_csv("radon_summary.csv")
print(radon_summary)
## # A tibble: 6 x 4
##       D  mean    sd Tukey
##   <dbl> <dbl> <dbl> <chr>
## 1  0.37  82.8  2.06 a    
## 2  0.51  77    2.31 ab   
## 3  0.71  75    1.83 b    
## 4  1.02  71.8  3.30 b    
## 5  1.4   65    3.56 c    
## 6  1.99  62.8  2.75 c

The data file has columns for the shower diameter (D), the mean, the standard deviation (sd), and the compact letter display indicating the significant differences (Tukey).
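
If you do not have the CSV file at hand, one way to follow along is to type the summary table in directly. This is only a sketch: the values below are copied from the printed output above, so they are rounded to three significant digits and may differ slightly from the file.

```r
# Rebuild the summary table by hand.
# Assumption: values are the rounded numbers shown in the printed tibble.
radon_summary <- data.frame(
  D     = c(0.37, 0.51, 0.71, 1.02, 1.40, 1.99),
  mean  = c(82.8, 77, 75, 71.8, 65, 62.8),
  sd    = c(2.06, 2.31, 1.83, 3.30, 3.56, 2.75),
  Tukey = c("a", "ab", "b", "b", "c", "c"),
  stringsAsFactors = FALSE
)

# Check the structure: 6 rows, 4 columns, Tukey stored as character
str(radon_summary)
```

A plain data.frame is used here instead of a tibble so that no extra package is needed; ggplot2 should accept either.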

Basic barplot

We are going to use the function ggplot to build the barplots. The first argument is the data frame, radon_summary, and the second argument is the aesthetics aes, where we define the x and y variables, D and mean. D is wrapped in factor() so that each diameter is treated as a category with its own evenly spaced bar. However, if we run only this code, we will get a blank plot. We also need to define the geom, in this case geom_bar(stat = "identity"), which draws the bars using the values in the data as they are.

# Gray-scale barplot
ggplot(radon_summary, aes(factor(D), mean)) + 
  geom_bar(stat = "identity")
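
As a side note, ggplot2 also provides geom_col(), which plots the values in the data as they are and should therefore give the same bars as geom_bar(stat = "identity"). The sketch below is not part of the original workflow; the data frame is typed in from the printed table (rounded values) only to keep the example self-contained.

```r
library(ggplot2)

# Assumption: rounded values copied from the printed summary table
radon_summary <- data.frame(
  D     = c(0.37, 0.51, 0.71, 1.02, 1.40, 1.99),
  mean  = c(82.8, 77, 75, 71.8, 65, 62.8),
  sd    = c(2.06, 2.31, 1.83, 3.30, 3.56, 2.75),
  Tukey = c("a", "ab", "b", "b", "c", "c")
)

# Bars built with geom_bar(stat = "identity"), as in the tutorial
p_bar <- ggplot(radon_summary, aes(factor(D), mean)) +
  geom_bar(stat = "identity")

# The same bars built with geom_col()
p_col <- ggplot(radon_summary, aes(factor(D), mean)) +
  geom_col()

# Compare the bar heights computed for the first layer of each plot
all.equal(layer_data(p_bar)$y, layer_data(p_col)$y)
```

The rest of the tutorial keeps geom_bar(stat = "identity"); either form can take the width and alpha arguments used in the next steps.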

The resulting plot is a bit “aggressive”, with wide bars and a strong grey colour. To make it smoother, we can define the bar width as 80% of the full width (width=0.8) and add some slight transparency (alpha=0.8).

# Gray-scale barplot
ggplot(radon_summary, aes(factor(D), mean)) + 
  geom_bar(stat = "identity", width=0.8, alpha=0.8)

Adding error bars

Now let’s add the error bars using the geom_errorbar function. We must define the upper and lower limits. In this case I am using the mean ± standard deviation (sd), both from the radon_summary data set. The argument width=0.2 defines the error bar width as 20 % of the full column width.

# Gray-scale barplot
ggplot(radon_summary, aes(factor(D), mean)) + 
  geom_bar(stat = "identity", width=0.8, alpha=0.8) +
  geom_errorbar(aes(ymin=mean-sd, ymax=mean+sd), width=0.2)
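
To see what geom_errorbar() receives, the limits can be computed by hand. The sketch below uses the rounded values from the printed table; the column names lower and upper are hypothetical and only serve this illustration.

```r
# Assumption: rounded values copied from the printed summary table
radon_summary <- data.frame(
  D    = c(0.37, 0.51, 0.71, 1.02, 1.40, 1.99),
  mean = c(82.8, 77, 75, 71.8, 65, 62.8),
  sd   = c(2.06, 2.31, 1.83, 3.30, 3.56, 2.75)
)

# Lower and upper ends of each error bar: mean minus / plus one sd
radon_summary$lower <- radon_summary$mean - radon_summary$sd
radon_summary$upper <- radon_summary$mean + radon_summary$sd

print(radon_summary[, c("D", "lower", "upper")])
```

For the 0.37 diameter, for example, the error bar runs from 82.8 - 2.06 = 80.74 to 82.8 + 2.06 = 84.86.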

Customizing x and y titles

Let’s now customise the x and y titles using the function labs.

# Gray-scale barplot
ggplot(radon_summary, aes(factor(D), mean)) + 
  geom_bar(stat = "identity", width=0.8, alpha=0.8) +
  geom_errorbar(aes(ymin=mean-sd, ymax=mean+sd), width = 0.2) + 
  labs(x="Diameter (mm)", y="Radon Released (%)")

The plot could be used as it is, but there is still some space for improvement.

Formatting the overall visualisation

The next step is to change the overall theme of the plot. I have chosen theme_bw. Additionally, I will remove the major and minor grid lines, which are not normally used in scientific plots, with the code theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()).

# Gray-scale barplot
ggplot(radon_summary, aes(factor(D), mean)) + 
  geom_bar(stat = "identity", width=0.8, alpha=0.8) +
  geom_errorbar(aes(ymin=mean-sd, ymax=mean+sd), width = 0.2) + 
  labs(x="Diameter (mm)", y="Radon Released (%)") +
  theme_bw() + 
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())

Adding the compact letter display from Tukey’s test

And finally, we can add the compact letter display to the plot using the geom_text function. The label is the column Tukey in the data file.

# Gray-scale barplot
ggplot(radon_summary, aes(factor(D), mean)) + 
  geom_bar(stat = "identity", width=0.8, alpha=0.8) +
  geom_errorbar(aes(ymin=mean-sd, ymax=mean+sd), width = 0.2) + 
  labs(x="Diameter (mm)", y="Radon Released (%)") +
  theme_bw() + 
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  geom_text(aes(label=Tukey))

As we can see, the labels from Tukey’s test were placed at the top of each bar, right on top of the error bars. To correct this, we are going to shift them relative to that point using the arguments nudge_x and nudge_y. I am also going to decrease the size of the letters.

# Gray-scale barplot
ggplot(radon_summary, aes(factor(D), mean)) + 
  geom_bar(stat = "identity", width=0.8, alpha=0.8) +
  geom_errorbar(aes(ymin=mean-sd, ymax=mean+sd), width = 0.2) + 
  labs(x="Diameter (mm)", y="Radon Released (%)") +
  theme_bw() + 
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  geom_text(aes(label=Tukey), nudge_x = 0.25, nudge_y = 5, size = 3)
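
A possible alternative to nudging, sketched here and not used in the rest of the tutorial, is to anchor each letter at the top of its error bar (mean + sd) and lift it slightly with vjust. This keeps the gap between letter and error bar the same for every bar. The data frame is typed in from the printed table (rounded values) to keep the example self-contained.

```r
library(ggplot2)

# Assumption: rounded values copied from the printed summary table
radon_summary <- data.frame(
  D     = c(0.37, 0.51, 0.71, 1.02, 1.40, 1.99),
  mean  = c(82.8, 77, 75, 71.8, 65, 62.8),
  sd    = c(2.06, 2.31, 1.83, 3.30, 3.56, 2.75),
  Tukey = c("a", "ab", "b", "b", "c", "c")
)

p <- ggplot(radon_summary, aes(factor(D), mean)) +
  geom_bar(stat = "identity", width = 0.8, alpha = 0.8) +
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd), width = 0.2) +
  labs(x = "Diameter (mm)", y = "Radon Released (%)") +
  theme_bw() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  # y = mean + sd puts the label at the top of the error bar;
  # a negative vjust moves the text upwards from that point
  geom_text(aes(y = mean + sd, label = Tukey), vjust = -0.5, size = 3)

print(p)
```

Depending on the figure size, the highest letter may sit close to the panel border; extending the y axis, for example with expand_limits(y = 90), gives it more room.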

And here we have a gray-scale barplot suitable for scientific reports and presentations.

Adding colors to the barplots

To create a more attractive plot, we can add some colours. In the next example, using the code from the gray-scale barplot, I have defined the arguments fill = "steelblue" for the geom_bar() function, and color = "steelblue4" for the geom_errorbar() and geom_text() functions.

# Blue barplot
ggplot(radon_summary, aes(factor(D), mean)) + 
  geom_bar(stat = "identity", width=0.8, alpha=0.8, fill = "steelblue") +
  geom_errorbar(aes(ymin=mean-sd, ymax=mean+sd), width = 0.2, color = "steelblue4") + 
  labs(x="Diameter (mm)", y="Radon Released (%)") +
  theme_bw() + 
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  geom_text(aes(label=Tukey), nudge_x = 0.25, nudge_y = 5, size = 3, color = "steelblue4")

Barplots colored according to Tukey’s test results

Another very interesting alternative is to colour the bars according to the results of Tukey’s test, so that bars sharing the same compact letter display get the same colour. Starting again from the last code of the gray-scale barplot, we need to define fill=Tukey in the aesthetics of the ggplot() function (first line).

# colored barplot according to Tukey's test results
ggplot(radon_summary, aes(factor(D), mean, fill = Tukey)) + 
  geom_bar(stat = "identity", width=0.8, alpha=0.8) +
  geom_errorbar(aes(ymin=mean-sd, ymax=mean+sd), width = 0.2) + 
  labs(x="Diameter (mm)", y="Radon Released (%)") +
  theme_bw() + 
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  geom_text(aes(label=Tukey), nudge_x = 0.25, nudge_y = 5, size = 3)

Now the colours of the barplot indicate significant differences by Tukey’s test. As a last improvement, we can get rid of the legend, as it is not necessary in this case, using show.legend = FALSE in the geom_bar() arguments, and change the colours to a more interesting palette using the scale_fill_brewer() function. I have chosen the BrBG palette.

# colored barplot according to Tukey's test results
ggplot(radon_summary, aes(factor(D), mean, fill = Tukey)) + 
  geom_bar(stat = "identity", width=0.8, alpha=0.8, show.legend = FALSE) +
  scale_fill_brewer(palette = "BrBG") +
  geom_errorbar(aes(ymin=mean-sd, ymax=mean+sd), width = 0.2) + 
  labs(x="Diameter (mm)", y="Radon Released (%)") +
  theme_bw() + 
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  geom_text(aes(label=Tukey), nudge_x = 0.25, nudge_y = 5, size = 3)
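
Before settling on a palette, it can help to check how many colours the plot needs and whether the palette offers that many. The sketch below assumes the RColorBrewer package is available (scale_fill_brewer() takes its colours from the ColorBrewer palettes) and uses the Tukey letters from the printed table.

```r
library(RColorBrewer)

# Compact letter display as shown in the summary table
tukey_letters <- c("a", "ab", "b", "b", "c", "c")

# Number of distinct letter groups, i.e. number of fill colours needed
n_groups <- length(unique(tukey_letters))
n_groups

# Maximum number of colours the BrBG palette provides
brewer.pal.info["BrBG", "maxcolors"]

# Hex codes of a BrBG palette with one colour per group
fill_colours <- brewer.pal(n_groups, "BrBG")
fill_colours
```

Note that "ab" counts as its own group, so four colours are needed here, not three.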

Saving the final figure

The final look of a specific ggplot object depends on the size and aspect ratio used. The plots shown in this tutorial were built for a figure size of 4 × 3 inches (width × height). I suggest saving the final plot as a png file with 1000 dpi resolution, as shown in the code below.

# saving the final figure
ggsave("barplot.png", width = 4, height = 3, dpi = 1000)
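
By default, ggsave() saves the last plot that was displayed. A slightly more explicit variant, sketched below, stores the plot in an object and passes it through the plot argument. The object name p and the temporary output path are assumptions made for this example; the data frame is typed in from the printed table (rounded values).

```r
library(ggplot2)

# Assumption: rounded values copied from the printed summary table
radon_summary <- data.frame(
  D     = c(0.37, 0.51, 0.71, 1.02, 1.40, 1.99),
  mean  = c(82.8, 77, 75, 71.8, 65, 62.8),
  sd    = c(2.06, 2.31, 1.83, 3.30, 3.56, 2.75),
  Tukey = c("a", "ab", "b", "b", "c", "c")
)

# Store the final coloured barplot in an object
p <- ggplot(radon_summary, aes(factor(D), mean, fill = Tukey)) +
  geom_bar(stat = "identity", width = 0.8, alpha = 0.8, show.legend = FALSE) +
  scale_fill_brewer(palette = "BrBG") +
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd), width = 0.2) +
  labs(x = "Diameter (mm)", y = "Radon Released (%)") +
  theme_bw() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  geom_text(aes(label = Tukey), nudge_x = 0.25, nudge_y = 5, size = 3)

# Hypothetical output location; replace with "barplot.png" to save
# into the working directory as in the tutorial
out_file <- file.path(tempdir(), "barplot.png")

# Save the stored plot explicitly, with the size given in inches
ggsave(out_file, plot = p, width = 4, height = 3, units = "in", dpi = 1000)

# Resulting image size in pixels: inches multiplied by dpi
c(width_px = 4 * 1000, height_px = 3 * 1000)
```

At 1000 dpi, a 4 × 3 inch figure corresponds to 4000 × 3000 pixels, which is worth keeping in mind if file size matters.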

  1. Data source: Environment International, 1992, 18(4): 363-369. https://doi.org/10.1016/0160-4120(92)90067-E

