Customising the Compact Letter Display Position

 

In this R tutorial, you are going to learn how to add letters indicating significant differences among means to a barplot, and how to control where those letters are placed.

If you prefer a video tutorial, you can watch it on the YouTube channel.

Additionally, we will:

  • perform analysis of variance and Tukey’s test
  • obtain the compact letter display to indicate significant differences among means
  • build a table with the mean, the standard deviation and the compact letter display

We are going to use the results of a one-factor experiment conducted to measure and compare the effectiveness of various feed supplements on the growth rate of chickens.[1] The data set (chickwts) is available in R's datasets package.

We are going to start by loading the required libraries: ggplot2 and ggthemes for the plots, multcompView to obtain the compact letter display, and dplyr to build a table with the summarised data.

# loading the appropriate libraries
library(ggplot2)
library(ggthemes)
library(multcompView)
library(dplyr)

 

Data analysis and table with the mean, the standard deviation and the compact letter display

The first step is to perform the analysis of variance, commonly known as ANOVA, using the aov function.

Then the means comparison by Tukey’s test can be run on the object resulting from the analysis of variance.

The use of letters to indicate significant differences in pairwise comparisons is called compact letter display, and it can simplify the visualisation and discussion of significant differences among means: means that share a letter are not significantly different. We are going to use the multcompLetters4 function from the multcompView package. Its arguments are the object returned by the aov function and the object returned by the TukeyHSD function.

Finally, we are going to build a table with the mean, the standard deviation and the letters for each treatment (feed). The data in this table will be used to build the barplot.

# analysis of variance
anova <- aov(weight ~ feed, data = chickwts)

# Tukey's test
tukey <- TukeyHSD(anova)

# compact letter display
cld <- multcompLetters4(anova, tukey)

# table with the mean and standard deviation per feed, sorted by decreasing mean
dt <- group_by(chickwts, feed) %>%
  summarise(w=mean(weight), sd = sd(weight)) %>%
  arrange(desc(w))

# extracting the compact letter display and adding it to the dt table
cld <- as.data.frame.list(cld$feed)
dt$cld <- cld$Letters

print(dt)
## # A tibble: 6 x 4
##   feed          w    sd cld  
##   <fct>     <dbl> <dbl> <chr>
## 1 sunflower  329.  48.8 a    
## 2 casein     324.  64.4 a    
## 3 meatmeal   277.  64.9 ab   
## 4 soybean    246.  54.1 b    
## 5 linseed    219.  52.2 bc   
## 6 horsebean  160.  38.6 c

You can see the tutorial on One-Way ANOVA if you need a more detailed explanation on the code above.
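The table above is built by position: the letters are copied into dt row by row, which works because both are sorted by decreasing mean. As a minimal sketch, assuming the Letters element returned by multcompLetters4 is a character vector named by feed level, the letters can instead be looked up by name, so the result does not depend on row order. The object names fit, tk, feed_letters and tab are illustrative and not part of the tutorial.

```r
library(multcompView)

# same model and test as in the tutorial; object names are illustrative
fit <- aov(weight ~ feed, data = chickwts)
tk  <- TukeyHSD(fit)

# assumed: a character vector of letters whose names are the feed levels
feed_letters <- multcompLetters4(fit, tk)$feed$Letters

# mean and standard deviation per feed, using base R only
tab <- data.frame(
  feed = levels(chickwts$feed),
  w    = as.numeric(tapply(chickwts$weight, chickwts$feed, mean)),
  sd   = as.numeric(tapply(chickwts$weight, chickwts$feed, sd))
)

# look up each letter by feed name rather than by row position
tab$cld <- unname(feed_letters[as.character(tab$feed)])

# sort by decreasing mean, as in the tutorial's table
tab <- tab[order(-tab$w), ]
print(tab)
```

Matching by name costs one extra line and keeps the letters attached to the right feed even if the table is later re-sorted or filtered.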

 

Barplot with error bars

We are going to use the function ggplot combined with the geom_bar() for the bars and the geom_errorbar() for the error bars. The x and y variables are the feed and the average weight (w) in the dt data set. The error bars are the average weight (w) plus or minus the standard deviation (sd). I am also using the theme_few() from the ggthemes package. Additionally, I am coloring the bars according to the mean weight value using aes(fill = w) in the geom_bar() function.

ggplot(dt, aes(feed, w)) + 
  geom_bar(stat = "identity", aes(fill = w), show.legend = FALSE) +
  geom_errorbar(aes(ymin = w-sd, ymax=w+sd), width = 0.2) +
  labs(x = "Feed Type", y = "Average Weight Gain (g)") +
  theme_few()
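In recent versions of ggplot2, geom_col() is a shorthand for geom_bar(stat = "identity"). The sketch below, which assumes a ggplot2 version that provides geom_col(), draws the same bars and error bars. The summary table dt_demo is rebuilt with base R so the block runs on its own, and the default theme is used instead of theme_few().

```r
library(ggplot2)

# summary table with the same column names as the tutorial's dt
dt_demo <- data.frame(
  feed = levels(chickwts$feed),
  w    = as.numeric(tapply(chickwts$weight, chickwts$feed, mean)),
  sd   = as.numeric(tapply(chickwts$weight, chickwts$feed, sd))
)

# geom_col() uses the y values as bar heights, like stat = "identity"
p_col <- ggplot(dt_demo, aes(feed, w)) +
  geom_col(aes(fill = w), show.legend = FALSE) +
  geom_errorbar(aes(ymin = w - sd, ymax = w + sd), width = 0.2) +
  labs(x = "Feed Type", y = "Average Weight Gain (g)")

print(p_col)
```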

 

Adding the letters indicating significant differences

To add the letters indicating significant differences, we are going to add geom_text(aes(label = cld)) to the ggplot code.

ggplot(dt, aes(feed, w)) + 
  geom_bar(stat = "identity", aes(fill = w), show.legend = FALSE) +
  geom_errorbar(aes(ymin = w-sd, ymax=w+sd), width = 0.2) +
  labs(x = "Feed Type", y = "Average Weight Gain (g)") +
  geom_text(aes(label = cld)) +
  theme_few()

As we can see in the resulting plot, the letters are centered exactly at the top of each bar, which corresponds to the x and y coordinates set in the aesthetics of the ggplot call, ggplot(dt, aes(feed, w)).

 

Letter position: just above the bars and beside the error bars

To adjust the position we can use the vjust and the hjust arguments.

The vjust argument adjusts the vertical position: negative values move up while positive values move down.

The hjust argument adjusts the horizontal position: negative values move to the right while positive values move to the left.

Let’s use vjust = -0.5, hjust = -0.5 to move the letters up and to the right.

ggplot(dt, aes(feed, w)) + 
  geom_bar(stat = "identity", aes(fill = w), show.legend = FALSE) +
  geom_errorbar(aes(ymin = w-sd, ymax=w+sd), width = 0.2) +
  labs(x = "Feed Type", y = "Average Weight Gain (g)") +
  geom_text(aes(label = cld), vjust = -0.5, hjust = -0.5) +
  theme_few()
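In ggplot2, hjust and vjust are expressed relative to the size of the label itself: 0.5 centers the label on its anchor point, 0 and 1 align one of its edges with the anchor, and values outside that range push the label clear of it. A two-letter label such as "ab" is therefore shifted further sideways than a single letter for the same hjust. The small demonstration below is only a sketch; it anchors nine labels at the same point and varies nothing but the two justification values. The names just_demo and p_just are illustrative.

```r
library(ggplot2)

# every label is anchored at the same point (0, 0); only hjust/vjust differ
just_demo <- expand.grid(hjust = c(-0.5, 0.5, 1.5), vjust = c(-0.5, 0.5, 1.5))
just_demo$label <- paste0("h=", just_demo$hjust, ",v=", just_demo$vjust)

p_just <- ggplot(just_demo, aes(x = 0, y = 0)) +
  geom_point(colour = "red") +                      # the shared anchor point
  geom_text(aes(label = label, hjust = hjust, vjust = vjust)) +
  theme_void()

print(p_just)
```

Labels with negative values end up above and to the right of the red point, which is the placement used for the letters in the plot above.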

 

Letter position: just above the error bars

Method 1

Another possibility is to position the labels above the error bars. To do this, we can define the y coordinate for the labels as the mean plus the standard deviation, y = w + sd, and again use vjust = -0.5 to move them up. The hjust argument should not be used here, since we want the letters centered over each bar.

Additionally, I am expanding the limits of the vertical axis using the ylim() function so that the letters are not cut off at the top of the plot.

ggplot(dt, aes(feed, w)) + 
  geom_bar(stat = "identity", aes(fill = w), show.legend = FALSE) +
  geom_errorbar(aes(ymin = w-sd, ymax=w+sd), width = 0.2) +
  labs(x = "Feed Type", y = "Average Weight Gain (g)") +
  geom_text(aes(label = cld, y = w + sd), vjust = -0.5) +
  ylim(0,410) +
  theme_few()
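The upper limit of 410 in ylim() is specific to this data set. As a hedged alternative, assuming ggplot2 3.3.0 or later (where expansion() is available), the axis can be given proportional padding above the highest label anchor instead of a hard-coded limit. The sketch rebuilds the summary table so that it runs on its own; dt_demo, feed_letters and p_expand are illustrative names, and the default theme is used instead of theme_few().

```r
library(ggplot2)
library(multcompView)

# rebuild a summary table like the tutorial's dt
fit <- aov(weight ~ feed, data = chickwts)
feed_letters <- multcompLetters4(fit, TukeyHSD(fit))$feed$Letters
dt_demo <- data.frame(
  feed = levels(chickwts$feed),
  w    = as.numeric(tapply(chickwts$weight, chickwts$feed, mean)),
  sd   = as.numeric(tapply(chickwts$weight, chickwts$feed, sd))
)
dt_demo$cld <- unname(feed_letters[dt_demo$feed])   # match letters by feed name

p_expand <- ggplot(dt_demo, aes(feed, w)) +
  geom_bar(stat = "identity", aes(fill = w), show.legend = FALSE) +
  geom_errorbar(aes(ymin = w - sd, ymax = w + sd), width = 0.2) +
  geom_text(aes(label = cld, y = w + sd), vjust = -0.5) +
  labs(x = "Feed Type", y = "Average Weight Gain (g)") +
  # no padding below zero, 10% of the axis range added at the top
  scale_y_continuous(expand = expansion(mult = c(0, 0.1)))

print(p_expand)
```

The 10% figure is a starting point; larger fonts or smaller figures may need more headroom.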

 

Method 2

Alternatively, instead of using the vjust argument, we can add some units to the y coordinate in geom_text() using y = w + sd + 20; this sets the y coordinate for the text labels as the mean plus the standard deviation plus 20.

The actual value, 20 in this example, depends on the y-axis for each particular figure.

ggplot(dt, aes(feed, w)) + 
  geom_bar(stat = "identity", aes(fill = w), show.legend = FALSE) +
  geom_errorbar(aes(ymin = w-sd, ymax=w+sd), width = 0.2) +
  labs(x = "Feed Type", y = "Average Weight Gain (g)") +
  geom_text(aes(label = cld, y = w + sd + 20)) +
  ylim(0,410) +
  theme_few()
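geom_text() also accepts a nudge_y argument, which shifts the labels by a fixed amount in data units. The sketch below should place the letters in the same position as y = w + sd + 20 while keeping the offset separate from the aesthetic mapping. The summary table is rebuilt so the block runs on its own; dt_demo, feed_letters and p_nudge are illustrative names, and the default theme is used instead of theme_few().

```r
library(ggplot2)
library(multcompView)

# rebuild a summary table like the tutorial's dt
fit <- aov(weight ~ feed, data = chickwts)
feed_letters <- multcompLetters4(fit, TukeyHSD(fit))$feed$Letters
dt_demo <- data.frame(
  feed = levels(chickwts$feed),
  w    = as.numeric(tapply(chickwts$weight, chickwts$feed, mean)),
  sd   = as.numeric(tapply(chickwts$weight, chickwts$feed, sd))
)
dt_demo$cld <- unname(feed_letters[dt_demo$feed])   # match letters by feed name

p_nudge <- ggplot(dt_demo, aes(feed, w)) +
  geom_bar(stat = "identity", aes(fill = w), show.legend = FALSE) +
  geom_errorbar(aes(ymin = w - sd, ymax = w + sd), width = 0.2) +
  # anchor at the top of the error bar, then shift up by 20 units of the y-axis
  geom_text(aes(label = cld, y = w + sd), nudge_y = 20) +
  labs(x = "Feed Type", y = "Average Weight Gain (g)") +
  ylim(0, 410)

print(p_nudge)
```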

 

Saving the final figure

The final look of a specific ggplot object depends on the size and aspect ratio used. The plots shown in this tutorial were built for a figure size of 4.2 × 3 inches (width × height). I suggest saving the final plot as a PNG file with 1000 dpi resolution, as shown in the code below.

# saving the final figure
ggsave("barplot.png", width = 4.2, height = 3, dpi = 1000)
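By default, ggsave() saves the last plot that was displayed. When a script builds several plots, it can be safer to store the plot in an object and pass it explicitly through the plot argument. The sketch below does this with a simplified version of the barplot; the object names and the temporary output path are assumptions made for the example.

```r
library(ggplot2)

# simplified summary table and plot, stored in an object instead of printed
dt_demo <- data.frame(
  feed = levels(chickwts$feed),
  w    = as.numeric(tapply(chickwts$weight, chickwts$feed, mean)),
  sd   = as.numeric(tapply(chickwts$weight, chickwts$feed, sd))
)
p_final <- ggplot(dt_demo, aes(feed, w)) +
  geom_bar(stat = "identity", aes(fill = w), show.legend = FALSE) +
  geom_errorbar(aes(ymin = w - sd, ymax = w + sd), width = 0.2) +
  labs(x = "Feed Type", y = "Average Weight Gain (g)")

# hypothetical output location; replace with the path you actually want
out_file <- file.path(tempdir(), "barplot.png")

# pass the plot explicitly and state the units of width and height
ggsave(out_file, plot = p_final, width = 4.2, height = 3, units = "in", dpi = 1000)
```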

 


  1. Source: Anonymous (1948) Biometrika, 35, 214. Reference: McNeil, D. R. (1977) Interactive Data Analysis. New York: Wiley.

 
