Data Visualization


Learning Objectives

After completing this lab, you should be able to:

  1. Install and load R packages.
  2. Understand why data visualization is a useful tool for science communication.
  3. Know and apply the APA guidelines for figures in APA manuscripts.
  4. Understand the basic "grammar of graphics" used by `ggplot2`.
  5. Apply the basic "grammar of graphics" to visualize data in R.

Prepare for the Lab

Ensure that you are comfortable with the content in labs 1 and 2.

Lab Activity

Use the anscombe_long.csv data for this lab activity.


  1. If you haven’t already, install `ggplot2`.
  2. Load `ggplot2`.
  3. Import `anscombe_long.csv` into R as a `data.frame` named `anscombe_long`.
  4. Convert `anscombe_long$dataset` to a factor whose levels 1, 2, 3, and 4 are labelled “I”, “II”, “III”, and “IV”.
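
The four setup steps above could be sketched as follows. This is only a sketch under stated assumptions: it assumes the CSV file sits in the working directory, that its columns are named `dataset`, `x`, and `y`, and that `dataset` holds the integers 1 to 4. None of this is confirmed by the lab text, so adjust the path and names to match your file. If the file is not found, the sketch falls back to a stand-in built from R's built-in wide-format `anscombe` data.

```r
# Step 1: install ggplot2 (run once, if it is not installed yet)
# install.packages("ggplot2")

# Step 2: load ggplot2
library(ggplot2)

# Step 3: import the data as a data.frame
csv_path <- "anscombe_long.csv"  # assumed location: the working directory
if (file.exists(csv_path)) {
  anscombe_long <- read.csv(csv_path)
} else {
  # Stand-in (assumption): reshape the built-in `anscombe` data by hand,
  # using the assumed column names dataset, x, and y
  anscombe_long <- data.frame(
    dataset = rep(1:4, each = nrow(anscombe)),
    x = with(anscombe, c(x1, x2, x3, x4)),
    y = with(anscombe, c(y1, y2, y3, y4))
  )
}

# Step 4: convert dataset to a factor, labelling levels 1-4 as I-IV
anscombe_long$dataset <- factor(
  anscombe_long$dataset,
  levels = 1:4,
  labels = c("I", "II", "III", "IV")
)

# Inspect the result
str(anscombe_long)
```

Passing `levels` and `labels` together keeps the original order (1, 2, 3, 4) while displaying the Roman numerals in plot legends and facet titles.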


Write a script to produce each of the following plots. All of them use the anscombe_long.csv data.

Plot 1: Histograms

Plot 2: Boxplot

Plot 3: Scatter Plot

Plot 4: Jitter Plot

Tips for Recreating Plot 4

Identify the aesthetic mappings before you start.

Use `geom_jitter()` to plot the points. `geom_jitter()` is a variant of `geom_point()` that adds a small amount of random noise to each point’s position, which is useful when points would otherwise overlap. Because the jitter is random, your plot will not match this one exactly. The jitter can also change the range of the axes.

Set the `width` argument of `geom_jitter()` to 0.25.

This plot uses theme_minimal(). You are free to use whatever theme you prefer.
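
To illustrate how these tips fit together, here is a minimal sketch of a jitter plot. The aesthetic mappings (`dataset` on the x-axis, `y` on the y-axis, colour by `dataset`) are a guess, because the target figure is not reproduced in this text. The data frame is a stand-in built from R's built-in `anscombe` data with assumed column names, so that the block runs on its own. Identify the real mappings from the plot you are recreating.

```r
library(ggplot2)

# Stand-in data (assumption): long-format version of the built-in
# `anscombe` data with assumed column names dataset, x, and y
anscombe_long <- data.frame(
  dataset = factor(
    rep(1:4, each = nrow(anscombe)),
    levels = 1:4,
    labels = c("I", "II", "III", "IV")
  ),
  x = with(anscombe, c(x1, x2, x3, x4)),
  y = with(anscombe, c(y1, y2, y3, y4))
)

# Optional: fix the random seed so the jitter repeats between runs
set.seed(42)

# Hypothetical mappings; replace them with the ones you identified
plot_4 <- ggplot(anscombe_long, aes(x = dataset, y = y, colour = dataset)) +
  geom_jitter(width = 0.25) +  # horizontal jitter width of 0.25
  theme_minimal()              # any other theme works as well

plot_4
```

With only `width` set, `geom_jitter()` also applies its default vertical jitter, which is one reason the axis range can shift slightly. Adding `height = 0` would restrict the jitter to the horizontal direction.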


Data Visualization Slides