1 Categorical vs continuous

Before you map color to your data, you need to know whether the variable you are coloring by is categorical or continuous, because that determines which kind of palette makes sense.

  1. Do an HLO!
  2. Use forcats!

2 Colour vs fill aesthetic

Fill and colour scales in ggplot2 can use the same palettes. Some shapes such as lines only accept the colour aesthetic, while others, such as polygons, accept both colour and fill aesthetics. In the latter case, the colour refers to the border of the shape, and the fill to the interior.

## A look at all 25 symbols
library(ggplot2)  # plotting
library(dplyr)    # glimpse(), used below
df2 <- data.frame(x = 1:5, 
                  y = rep(seq(0, 24, by = 5), each = 5), 
                  z = 1:25)
s <- ggplot(df2, aes(x = x, y = y)) +
  scale_shape_identity() +
  geom_text(aes(label = z, y = y - 1)) +
  theme_void()
s + geom_point(aes(shape = z), size = 4)


Every symbol has a foreground colour, but only symbols 21-25 also take a background colour (fill). Compare these two plots:

s + geom_point(aes(shape = z), size = 4, colour = "navy") 


s + geom_point(aes(shape = z), size = 4, colour = "navy", fill = "orchid") 

For the rest of today, we'll play with the built-in iris dataset. Let's take a peek...

glimpse(iris)
Observations: 150
Variables: 5
$ Sepal.Length (dbl) 5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9,...
$ Sepal.Width  (dbl) 3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, 3.1,...
$ Petal.Length (dbl) 1.4, 1.4, 1.3, 1.5, 1.4, 1.7, 1.4, 1.5, 1.4, 1.5,...
$ Petal.Width  (dbl) 0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.3, 0.2, 0.2, 0.1,...
$ Species      (fctr) setosa, setosa, setosa, setosa, setosa, setosa, ...
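Before picking a palette, it can help to confirm which columns are categorical and which are continuous. The following is a minimal sketch in base R; the helper name var_type is made up for illustration and is not part of any package.

```r
# Classify each iris column as categorical or continuous.
# var_type is a hypothetical helper, not a function from any package.
var_type <- function(x) {
  if (is.factor(x) || is.character(x)) {
    "categorical"
  } else if (is.numeric(x)) {
    "continuous"
  } else {
    "other"
  }
}

# One label per column, returned as a named character vector
types <- vapply(iris, var_type, character(1))
types
```

Here Species should come back as categorical and the four measurement columns as continuous, which is why Species is used for the discrete palettes and Sepal.Width for the continuous ones.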

3 Discrete colors

3.1 Default discrete palette

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species)) + 
  geom_point(size = 3) 

This is a qualitative palette, and it works well for these data because Species is an unordered categorical variable. Adding scale_color_hue() changes nothing, because it is already the default scale for discrete colour.

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species)) + 
  geom_point(size = 3) + 
  scale_color_hue()
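To see which colours the default palette actually contains, one option is to ask the scales package (installed alongside ggplot2) for them directly. This is a sketch that assumes your scales version exports hue_pal(); the object name default_cols is just for illustration.

```r
library(scales)

# hue_pal() returns a palette function; calling it with n = 3
# gives three evenly spaced hues as hex strings
default_cols <- hue_pal()(3)
default_cols
```

These hex strings can be reused anywhere a vector of colours is accepted, for example in scale_color_manual(values = default_cols).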

3.2 Set luminance and chroma (saturation)

We can change these settings within the default color palette.

# All defaults
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species)) + 
  geom_point(size = 3) +
  scale_color_hue(l = 65, c = 100)

# Use luminance=45, instead of default 65
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species)) + 
  geom_point(size = 3) +
  scale_color_hue(l = 45)

# Reduce chroma (saturation) from 100 to 50, and increase luminance from 65 to 75
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species)) + 
  geom_point(size = 3) +
  scale_color_hue(l = 75, c = 50)

3.3 Manually select discrete colors (scale_color_manual() and scale_fill_manual())

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species)) + 
  geom_point(size = 3) +
  scale_color_manual(values = c("dodgerblue", "firebrick1", "palegreen"))

Challenge: Why doesn't this work?

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species)) + 
  geom_point(size = 3) +
  scale_fill_manual(values = c("dodgerblue", "firebrick1", "palegreen"))

Or this?...

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, fill=Species)) + 
  geom_point(size = 3) +
  scale_fill_manual(values = c("dodgerblue", "firebrick1", "palegreen"))

Hint: think about which shape is the default for geom_point...

# shape 19 (a solid circle) is the default for geom_point; it has no fill
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species)) + 
  geom_point(size = 3, shape = 19)

Challenge: Try adding a black outline to the points and color by Species. If that was easy, change the color values for Species from the defaults.

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, fill=Species)) + 
  geom_point(size = 3, shape = 21) + 
  scale_fill_manual(values = c("dodgerblue", "firebrick1", "palegreen"))

Challenge: Using this link, find 3 new named colors, and save them as an object called my_colors outside of ggplot2; then call that object within the scale_colour_manual function.

my_colors <- c("cadetblue", "salmon", "steelblue")
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, colour=Species)) + 
  geom_point(size = 3) + 
  scale_colour_manual(values = my_colors)

Challenge: Try using colors that have no R name, such as hexadecimal colors, like these 3: "#4C72B0", "#55A868", "#C44E52". Read each string as #rrggbb, where rr, gg, and bb give the color intensity in the red, green, and blue channels, respectively.
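As a quick illustration of the #rrggbb layout, base R can split a hex color into its channels and rebuild it. This sketch uses only col2rgb() and rgb() from grDevices, which is loaded by default; the object name channels is just for illustration.

```r
# Split a hex colour into red, green, and blue channels (each 0-255)
channels <- col2rgb("#4C72B0")
channels

# Rebuild the hex string from the three channel values
rgb(76, 114, 176, maxColorValue = 255)
```

The pairs 4C, 72, and B0 are hexadecimal for 76, 114, and 176, so the two calls are inverses of each other.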

# these are all 6 seaborn color palettes
# from https://github.com/mwaskom/seaborn/blob/master/seaborn/palettes.py
sb_deep <- c("#4C72B0", "#55A868", "#C44E52",
             "#8172B2", "#CCB974", "#64B5CD")
sb_muted <- c("#4878CF", "#6ACC65", "#D65F5F",
              "#B47CC7", "#C4AD66", "#77BEDB")
sb_pastel <- c("#92C6FF", "#97F0AA", "#FF9F9A",
               "#D0BBFF", "#FFFEA3", "#B0E0E6")
sb_bright <- c("#003FFF", "#03ED3A", "#E8000B",
               "#8A2BE2", "#FFC400", "#00D7FF")
sb_dark <- c("#001C7F", "#017517", "#8C0900",
             "#7600A1", "#B8860B", "#006374")
sb_colorblind <- c("#0072B2", "#009E73", "#D55E00",
                   "#CC79A7", "#F0E442", "#56B4E9")
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, colour=Species)) + 
  geom_point(size = 3) + 
  scale_colour_manual(values = sb_deep)

3.4 Built-in discrete palettes

3.4.1 Colorbrewer

library(RColorBrewer)
brewer.pal(5, "Dark2")
display.brewer.pal(5, "Dark2")
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, colour=Species)) + 
  geom_point(size = 3) +
  scale_color_brewer(palette = "Dark2")

Challenge: Again! Try adding a black outline to the points and color by Species using the Dark2 palette.

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, fill=Species)) + 
  geom_point(size = 3, shape = 21) + 
  scale_fill_brewer(palette = "Dark2")

3.4.2 Wes Anderson palettes

My favorite!

library(wesanderson)
names(wes_palettes)
 [1] "GrandBudapest"  "Moonrise1"      "Royal1"         "Moonrise2"     
 [5] "Cavalcanti"     "Royal2"         "GrandBudapest2" "Moonrise3"     
 [9] "Chevalier"      "Zissou"         "FantasticFox"   "Darjeeling"    
[13] "Rushmore"       "BottleRocket"   "Darjeeling2"   
wes_palette("GrandBudapest2")

str(wes_palette("GrandBudapest2"))
Class 'palette'  atomic [1:4] #E6A0C4 #C6CDF7 #D8A499 #7294D4
  ..- attr(*, "name")= chr "GrandBudapest2"
wes_palette("GrandBudapest2")[1:4]
[1] "#E6A0C4" "#C6CDF7" "#D8A499" "#7294D4"
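# If a palette name is not found, check names(wes_palettes):
# newer wesanderson releases rename some palettes (e.g. "Darjeeling1")
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, colour=Species)) + 
  geom_point(size = 3) +
  scale_color_manual(values = wes_palette("Darjeeling"))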

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, colour=Species)) + 
  geom_point(size = 3) +
  scale_color_manual(values = wes_palette("FantasticFox"))

Challenge!: What if you don't want the first three colors of a palette? Pick the ones you want by subsetting the palette by position.

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, colour=Species)) + 
  geom_point(size = 3) +
  scale_color_manual(values = wes_palette("Darjeeling")[3:5])

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, colour=Species)) + 
  geom_point(size = 3) +
  scale_color_manual(values = wes_palette("FantasticFox")[c(2, 3, 5)])
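Another way to control which color goes with which group is to pass scale_colour_manual() a named vector, so each color is matched to a level of Species by name rather than by position. The sketch below reuses three hex codes from the GrandBudapest2 output shown earlier; the object names species_cols and p are just for illustration.

```r
library(ggplot2)

# Names must match the levels of Species exactly;
# the order of the vector no longer matters
species_cols <- c(virginica  = "#E6A0C4",
                  setosa     = "#7294D4",
                  versicolor = "#D8A499")

p <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, colour = Species)) +
  geom_point(size = 3) +
  scale_colour_manual(values = species_cols)
p
```

Matching by name keeps the assignment stable even if the factor levels are later reordered, for example with forcats.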

3.4.3 ggthemes palettes

library(ggthemes)
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, colour=Species)) + 
  geom_point(size = 3) +
  scale_color_fivethirtyeight()

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, colour=Species)) + 
  geom_point(size = 3) +
  scale_color_economist()

3.4.4 Palettes from the Queen Bee

install.packages("devtools")
devtools::install_github("dill/beyonce")
library(beyonce)
beyonce_palette(18)

ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point(size = 3) +
  scale_color_manual(values = beyonce_palette(18)[3:5])

Challenge!: Use beyonce_palette 18, but only the first, fourth, and fifth colors in the palette.

ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point(size = 3) +
  scale_color_manual(values = beyonce_palette(18)[c(1, 4, 5)])

3.4.5 Viridis palettes

Read more here in the viridis vignette

library(viridis)  # provides scale_color_viridis() and scale_fill_viridis()
ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point(size = 3) +
  scale_color_viridis(discrete = TRUE)

Challenge: Use the viridis package to color the points by Species; make the outline of the points "midnightblue".

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, fill=Species)) + 
  geom_point(size = 3, shape = 21, colour = "midnightblue") + 
  scale_fill_viridis(discrete = TRUE)

3.5 Greyscale for discrete

ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point(size = 3) +
  scale_color_grey() +
  theme_bw()

Set the start and end of the grey range (0 is black, 1 is white):

ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point(size = 3) +
  scale_color_grey(start = 0.1, end = 1) +
  theme_bw()

ggplot(iris, aes(Sepal.Length, Sepal.Width, fill = Species)) +
  geom_point(size = 3, shape = 21) +
  scale_fill_grey(start = 0, end = 1) +
  theme_bw()

3.6 Adding layers

Challenge!: Create 3 new plots using the iris data.

ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point(size = 3)

First, add a linear regression line for each Species to your plot and color both the points and the line by Species (you may want to remove the standard error ribbon to make this plot more readable).

ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point(size = 3) +
  stat_smooth(method = "lm", se = FALSE)

Second, add a single linear regression line across all Species to your plot and color only the points by Species (again, you may want to remove the standard error ribbon to make this plot more readable). Make the regression line some color other than the default.

ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color = Species), size = 3) +
  stat_smooth(method = "lm", se = FALSE, color = "darkgray")

Third, add a linear regression line for each Species to your plot and color only the points by Species (again, you may want to remove the standard error ribbon to make this plot more readable). Make the regression lines some color other than the default.

ggplot(iris, aes(Sepal.Length, Sepal.Width, group = Species)) +
  geom_point(aes(color = Species), size = 3) +
  stat_smooth(method = "lm", se = FALSE, color = "darkgray")

If this was all easy, try one more plot:

  • color the points by Species
  • change the default colors to new ones
  • make the regression lines all one color
  • add a facet_wrap by Species
  • remove the legend (it is redundant with the facet labels)
  • pick a theme (I used theme_minimal)

ggplot(iris, aes(Sepal.Length, Sepal.Width, group = Species)) +
  geom_point(aes(color = Species), size = 3) +
  stat_smooth(method = "lm", se = FALSE, color = "darkgray") +
  facet_wrap(~Species) +
  # guide = "none" drops the legend (guide = FALSE is deprecated in recent ggplot2)
  scale_color_manual(values = sb_colorblind, guide = "none") +
  theme_minimal()

4 Continuous colors

4.1 Default continuous palette

Let’s map color to a continuous variable, Sepal.Width:

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Sepal.Width)) + 
  geom_point(size = 3)

4.2 Color choice with continuous variables (scale_color_gradient(), scale_color_gradient2())

You can reverse the gradient scale...

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Sepal.Width)) + 
    geom_point(size = 3) +
    scale_color_gradient(trans = "reverse")

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Sepal.Width)) + 
    geom_point(size = 3) +
    scale_color_gradient(low = "white", high = "red")

Challenge: Make this same plot using a grayscale gradient.

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Sepal.Width)) + 
    geom_point(size = 3) +
    scale_color_gradient(low = "gray90", high = "black")

So scale_color_gradient gives you a sequential gradient, but you may want a diverging color scheme instead.

# Diverging color scheme
mid <- mean(iris$Sepal.Width)
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Sepal.Width)) + 
    geom_point(size = 3) +
    scale_color_gradient2(midpoint=mid,
                      low="blue", mid="white", high="red" )

4.3 Built-in continuous palettes

4.3.1 Use RColorBrewer with scale_color_gradientn

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Sepal.Width)) + 
    geom_point(size = 3) +
    scale_color_gradientn(colours = brewer.pal(n=5, name="PuBuGn"))

Reverse the colors...

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Sepal.Width)) + 
    geom_point(size = 3) +
    scale_color_gradientn(colours = rev(brewer.pal(n=5, name="PuBuGn")))

4.3.2 Viridis

Read more here in the viridis vignette

library(viridis)

The default is the viridis palette within the viridis package!

Note! When discrete = FALSE (the default), all other arguments are passed on to scale_fill_gradientn() or scale_color_gradientn().

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Sepal.Width)) + 
    geom_point(size = 3) +
    scale_color_viridis()

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Sepal.Width)) + 
    geom_point(size = 3) +
    scale_color_viridis(option = "magma")

Challenge: Read the help page (?scale_color_viridis) and use the "inferno" palette in reverse.

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Sepal.Width)) + 
    geom_point(size = 3) +
    # direction = -1 reverses the palette (same effect as begin = 1, end = 0)
    scale_color_viridis(option = "inferno", direction = -1)

5 Exporting graphics

SB!