Researcher vs. Messy Data Source: @ViviFabrien

Prepare for the Lab

The lab covers data import, which is discussed in R4DS, 11 Data import. The chapter goes into more detail than the lab requires, so read it only if you’re interested. It is also a good reference for any questions you have about data import in the future.

Lab Activity

Create a new RStudio project called “02-tidy-data”. In the project folder create folders “data” and “src”.1
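
The folders can be created by hand in the RStudio Files pane or in your file browser. As an alternative, here is a minimal base R sketch. It assumes you run it from the console while the project is open, so that the working directory is the project folder.

```r
# Create the two project folders if they do not exist yet.
# ASSUMPTION: the working directory is the "02-tidy-data" project folder.
for (folder in c("data", "src")) {
  if (!dir.exists(folder)) {
    dir.create(folder)
  }
}

# Confirm that both folders are now present.
dir.exists(c("data", "src"))
```

Checking with dir.exists() first means the code can be re-run without warnings about folders that already exist.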

Today’s activity involves importing 3 non-tidy data sets into R and wrangling them into tidy tibbles. Complete the following steps for each of the three data sets.

  1. Import the file into R and save it as an object with a short, descriptive name.
  2. Inspect the object by printing it to the console, and with View() and glimpse().
  3. Use tidyr functions to tidy the data.
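
The three steps above can be sketched on a tiny made-up file. Everything about the file here is hypothetical (its contents, the column names, and the object name scores). It only stands in for a real data set so that the example runs on its own. In the lab you would read a file from your “data” folder instead.

```r
library(readr)   # read_csv()
library(dplyr)   # glimpse()
library(tidyr)   # pivot_longer()

# Stand-in for a real data file: write a tiny made-up CSV to a temporary
# location. These values are invented for illustration only.
example_path <- tempfile(fileext = ".csv")
writeLines(
  c("id,score_t1,score_t2",
    "1,10,12",
    "2,8,9"),
  example_path
)

# Step 1: import the file and save it under a short, descriptive name.
scores <- read_csv(example_path, show_col_types = FALSE)

# Step 2: inspect the object.
print(scores)
glimpse(scores)
if (interactive()) View(scores)  # View() only makes sense in an interactive session

# Step 3: tidy the data so that each row holds one measurement.
scores_tidy <- pivot_longer(
  scores,
  cols = c(score_t1, score_t2),
  names_to = "time",
  names_prefix = "score_",   # drop the shared prefix, leaving "t1" and "t2"
  values_to = "score"
)
scores_tidy
```

The real data sets may need different tidyr functions. The point of the sketch is the import, inspect, tidy sequence, not the specific call.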

Descriptions of the (imaginary) studies and codebooks for each data set are provided below.

Study 1: Does resilience increase voluntary exposure to pain?

Description

These data come from a study designed to examine the relationship between resilience and willingness to be exposed to pain. Participants responded to the Brief Resilience Scale (BRS). They were then instructed to submerge their hand in a bucket of ice water while counting either 20, 40, or 60 seconds in their head. They were told to remove their hand as close to the specified time as they could (or to keep it submerged for as long as they could). A research assistant timed each trial and instructed participants to remove their hand if more than 90 seconds had passed.

Codebook

variable     description
pid          randomly generated unique identifier of each participant
age_and_sex  participant’s self-reported age and sex
brs_mean     participant’s mean score on the Brief Resilience Scale (BRS)
tc1          20-second condition; time with hand in ice bath
tc2          40-second condition; time with hand in ice bath
tc3          60-second condition; time with hand in ice bath
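
Two things in this codebook suggest untidy data: age_and_sex stores two variables in one column, and the three conditions are spread across tc1, tc2, and tc3. The sketch below shows one possible way to handle both with tidyr. The rows are made up, and the format of age_and_sex (age, underscore, sex) is an assumption. Check how the values are actually stored in the real file before choosing a separator.

```r
library(tibble)  # tribble()
library(tidyr)   # separate(), pivot_longer(), %>%

# Made-up rows that follow the Study 1 codebook; NOT the real data.
# ASSUMPTION: age_and_sex looks like "21_f" (age, underscore, sex).
pain <- tribble(
  ~pid,  ~age_and_sex, ~brs_mean, ~tc1, ~tc2, ~tc3,
  "a01", "21_f",       3.5,       19.2, 41.0, 58.7,
  "a02", "34_m",       2.8,       22.4, 37.9, 61.3
)

pain_tidy <- pain %>%
  # Split the combined column into two variables; convert = TRUE turns
  # the age strings into integers.
  separate(age_and_sex, into = c("age", "sex"), sep = "_", convert = TRUE) %>%
  # Stack the three condition columns so each row is one trial.
  pivot_longer(cols = tc1:tc3, names_to = "condition", values_to = "duration")

pain_tidy
```

The column names condition and duration are illustrative choices, not names required by the lab.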

Study 2: How does life satisfaction change in the first months of marriage?

Description of Data

This data set comes from a study investigating changes in life satisfaction following marriage. Life satisfaction was measured using the Satisfaction with Life Scale (SWLS) on the day a participant was married and then again every 30 days for the first 120 days after that.

Codebook

variable        description
participant_id  unique participant ID
age             age in years when participant was married
swls_t1         Satisfaction with Life Scale (SWLS) total on day of marriage
swls_t2         SWLS total after 30 days of marriage
swls_t3         SWLS total after 60 days of marriage
swls_t4         SWLS total after 90 days of marriage
swls_t5         SWLS total after 120 days of marriage
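
Here the same measure is repeated across five columns, one per time point. A possible approach, sketched on made-up rows, is to stack the swls_t columns with pivot_longer() and then derive the number of days since marriage from the time index, using the 30-day spacing given in the codebook. The object names and the days_married column are hypothetical.

```r
library(tibble)  # tribble()
library(dplyr)   # mutate(), %>%
library(tidyr)   # pivot_longer()

# Made-up rows that follow the Study 2 codebook; NOT the real data.
marriage <- tribble(
  ~participant_id, ~age, ~swls_t1, ~swls_t2, ~swls_t3, ~swls_t4, ~swls_t5,
  1,               27,   30,       29,       27,       26,       26,
  2,               31,   25,       26,       24,       24,       23
)

marriage_tidy <- marriage %>%
  pivot_longer(
    cols = swls_t1:swls_t5,
    names_to = "time",
    names_prefix = "swls_t",                    # keep only the digit
    names_transform = list(time = as.integer),  # "1" becomes 1L, etc.
    values_to = "swls"
  ) %>%
  # t1 is the day of marriage, and each later time point is 30 days on.
  mutate(days_married = (time - 1L) * 30L)

marriage_tidy
```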

Study 3: Reward expectations and stress

Description of Data

These data come from an experimental study in which participants completed a stressful task. After receiving instructions, participants in the two reward conditions were told that they would receive a bonus of either $5.00 or $10.00 for completing the study. Participant stress was measured on a scale of 1 (“not stressed at all”) to 5 (“extremely stressed”) at the study outset, immediately following instructions/notification of rewards, and after completing the stress task and receiving their reward.

Codebook

variable  description
rid       Participant research ID
reward    Size of reward: no reward = $0; small reward = $1; large reward = $10
time      Measurement time: t1 = baseline; t2 = after notification of condition; t3 = after receiving reward
stress    Self-report stress (scale of 1 to 5)
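
Once the data have one row per participant per measurement time, as this codebook describes, the coded values can be made easier to work with. The sketch below uses made-up rows and assumes that reward is stored as text such as "$0" and time as "t1", "t2", "t3", following the codebook. The raw file may be shaped or coded differently, so treat this as an illustration only.

```r
library(tibble)  # tribble()
library(dplyr)   # mutate(), %>%
library(readr)   # parse_number()

# Made-up rows that follow the Study 3 codebook; NOT the real data.
# ASSUMPTION: reward is text like "$0" and time is "t1", "t2", or "t3".
stress <- tribble(
  ~rid, ~reward, ~time, ~stress,
  1,    "$0",    "t1",  2,
  1,    "$0",    "t2",  3,
  1,    "$0",    "t3",  4,
  2,    "$10",   "t1",  2,
  2,    "$10",   "t2",  2,
  2,    "$10",   "t3",  1
)

stress_clean <- stress %>%
  mutate(
    # Strip the dollar sign so the reward can be used as a number.
    reward_dollars = parse_number(reward),
    # Replace the time codes with the labels from the codebook.
    time = factor(
      time,
      levels = c("t1", "t2", "t3"),
      labels = c("baseline", "after notification", "after reward")
    )
  )

stress_clean
```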

Homework

Practice Questions


  1. “src” is short for “source code”. It’s a reminder to think of your code as the “source” of the objects you create in R.