# The Chi-square Test

## Learning Outcomes

Upon completing this module, students will be able to:

• Conduct $$\chi^2$$ tests in R.
• Create appropriate data visualizations for comparisons of categorical data.
• Report results of $$\chi^2$$ tests in APA style.
• Accurately interpret the results of $$\chi^2$$ tests.
• Evaluate the strength of evidence that $$\chi^2$$ tests of independence provide in studies of categorical data.

# The Study: Reminders Through Association

## Summary

Someone briefly remind us of the design of the study (focus on study 5).

## What We’re Reproducing

Results from two $$\chi^2$$ tests from study 5.

### We’ll Do Together

“…participants are more likely to follow through when they are assigned a cue-based reminder (in the forced-reminder through-association condition, 87%) than when no cue-based reminder is available (none condition, 59%), $$\chi^2$$(1, N = 305) = 30.22, p < .001.”

“…those in the costly-reminder-through-association condition were not only more likely to earn the bonus (74%) than those in the none condition (59%), $$\chi^2$$(1, N = 297) = 7.23, p = .007,…”
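Both quoted results follow the same reporting pattern: $$\chi^2$$(df, N = sample size) = statistic, then the p value. As a minimal sketch of where each piece comes from in R, the block below runs chisq.test() on a small made-up 2 × 2 table and pulls the pieces out of the result. The counts and the names toy and res are hypothetical, not the study's data. Setting correct = FALSE turns off the continuity correction; the quoted text does not say whether the original authors did so, so treat that choice as an assumption.

```r
# Hypothetical counts for illustration only; these are NOT the study 5 data.
toy <- matrix(c(30, 20, 15, 35), nrow = 2,
              dimnames = list(outcome = c("correct", "incorrect"),
                              group = c("A", "B")))

# Pearson chi-square test of independence, without the continuity correction
res <- chisq.test(toy, correct = FALSE)

chi_sq <- unname(res$statistic)  # the chi-square statistic
df     <- unname(res$parameter)  # degrees of freedom
n      <- sum(toy)               # total sample size
p      <- res$p.value            # p value

# Assemble the pieces in the order used by the quoted results
sprintf("chi-square(%d, N = %d) = %.2f, p = %.3f", df, n, chi_sq, p)
```

The sprintf() line only shows the order of the pieces; final APA formatting (for example, how the p value is rounded and written) still needs to be applied by hand.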

# Let’s Get Started

## tRy it! Setup

Complete the steps in the “Setup” portion of the lab activity.

2. Import “RTA_study5.csv” to R.
3. Convert the following variables to factors: condition, choice, and correct. Read from “codebook database.xlsx” to identify appropriate factor labels.

## Import the Data

Import “RTA_study5.csv” to R.

dta <- read.csv("data/RTA_study5.csv")

## Convert Variables to Factors

Convert the following variables to factors: condition, choice, and correct. Read from “codebook database.xlsx” to identify appropriate factor labels.

## Convert condition to Factor

The codebook tells us the levels/labels for condition.

dta$condition <- factor(dta$condition,
  levels = 1:4,
  labels = c("Free", "None", "Costly", "All")
)
##   Free   Costly Costly None   Free   Free   All    All    None   All
## Levels: Free None Costly All
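It can help to see how factor() pairs each code with its label, and what happens to a code that is not listed in levels. The sketch below uses a short made-up vector (codes is hypothetical, and the value 5 is included on purpose because it is not one of the four levels used above).

```r
# Hypothetical codes for illustration; 5 is deliberately not a listed level.
codes <- c(1, 3, 3, 2, 4, 5)

cond <- factor(codes,
               levels = 1:4,
               labels = c("Free", "None", "Costly", "All"))

# Each code is replaced by the label in the same position; the 5 becomes NA
cond

# Cross-tabulate the original codes against the new labels to check the mapping
table(codes, cond, useNA = "ifany")

# Count the values that did not match any level
n_unmatched <- sum(is.na(cond))
n_unmatched
```

A nonzero count of unmatched values after conversion is a sign that the levels argument does not cover every code in the data.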

## Convert choice to Factor

What does choice tell us?

dta$choice <- factor(dta$choice,
  levels = c(0, 1),
  labels = c("did not take reminder", "took reminder")
)
##   took reminder         took reminder         did not take reminder
##   did not take reminder did not take reminder did not take reminder
##   did not take reminder did not take reminder did not take reminder
##  took reminder
## Levels: did not take reminder took reminder

## Convert correct to Factor

What does correct tell us?

dta$correct <- factor(dta$correct,
  levels = c(0, 1),
  labels = c("incorrect", "correct")
)
##   correct   incorrect incorrect correct   correct   correct   incorrect
##   correct   correct   correct
## Levels: incorrect correct

# Reproduce Results

## Result 1

“…participants are more likely to follow through when they are assigned a cue-based reminder (in the forced-reminder through-association condition, 87%) than when no cue-based reminder is available (none condition, 59%), $$\chi^2$$(1, N = 305) = 30.22, p < .001.”

## tRy it! Subset & Drop Levels

1. Create a subset of your data.frame that includes only the relevant levels of condition.
2. Use droplevels() to drop the extra levels of condition.

Hint: You can use either | or %in% to subset with one line of code. Otherwise, you could do it in two steps.

## Subset

### Option 1: Multiple Steps

dta1 <- dta
dta1 <- subset(dta1, condition != "Free")
dta1 <- subset(dta1, condition != "Costly")

## Subset (2)

### Option 2: Using |

Remember that | means “or”.

dta1 <- subset(dta, condition == "All" | condition == "None")

Can you imagine a situation where this approach might be unwieldy?

## Subset (3)

### Option 3: Using %in%

dta1 <- subset(dta, condition %in% c("All", "None"))
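The three options should select exactly the same rows. A minimal sketch on a made-up data frame (toy, opt1, opt2, opt3, and keep are hypothetical names, and the rows are not the study data) confirms this with identical():

```r
# Hypothetical data frame for illustration only.
toy <- data.frame(
  id = 1:8,
  condition = factor(c("Free", "None", "Costly", "All",
                       "All", "None", "Free", "Costly"),
                     levels = c("Free", "None", "Costly", "All"))
)

# Option 1: remove the unwanted conditions one step at a time
opt1 <- subset(toy, condition != "Free")
opt1 <- subset(opt1, condition != "Costly")

# Option 2: keep rows that match either condition
opt2 <- subset(toy, condition == "All" | condition == "None")

# Option 3: keep rows whose condition appears in a set of values
keep <- c("All", "None")
opt3 <- subset(toy, condition %in% keep)

# All three approaches return the same rows
identical(opt1, opt2)
identical(opt2, opt3)
```

With %in%, keeping one more condition means adding one entry to keep, whereas the | approach needs another full comparison for every extra value.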

## Check Your Work (an aside)

Why should you check your work as you go?

• Just because R didn’t return an error doesn’t mean your code did what you wanted.
• Sometimes errors later in the code are the result of an unrecognized mistake earlier.
• Code that does the wrong thing but doesn’t return an error is harder to catch.
• It saves time “debugging” down the line.

## Did Our Subsetting Work?

We don’t need to see this in your code.

We can check with the function all(), which returns:

• TRUE if all values in the vector are TRUE.
• FALSE if any values in the vector are FALSE.
• NA otherwise.
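The three cases above can be sketched with short logical vectors (the names r_all_true, r_any_false, r_unknown, and r_na_removed are made up for illustration):

```r
# Every value is TRUE, so the result is TRUE
r_all_true <- all(c(TRUE, TRUE, TRUE))

# A single FALSE decides the answer, even when an NA is present
r_any_false <- all(c(TRUE, FALSE, NA))

# No FALSE values, but the NA leaves the answer unknown
r_unknown <- all(c(TRUE, NA))

# na.rm = TRUE drops missing values before checking
r_na_removed <- all(c(TRUE, NA), na.rm = TRUE)

c(r_all_true, r_any_false, r_unknown, r_na_removed)
```

Note that %in% returns FALSE rather than NA when the value being looked up is missing, so a check built on %in% does not produce the NA case.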

all(dta1$condition %in% c("All", "None"))
##  TRUE

all() can be a useful tool for testing your code.

## Did Our Subsetting Work? (3)

Alternatively, we can use summary(), which will count the number of times each factor level occurs.

summary(dta1$condition)
##   Free   None Costly    All
##      0    153      0    152

## Drop Extra Levels

Why do we need to do this?

levels(dta1$condition)
##  "Free" "None" "Costly" "All"

## Drop Extra Levels (2)

dta1$condition <- droplevels(dta1$condition)
levels(dta1$condition)
##  "None" "All"
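One likely reason for dropping the extra levels is that a factor remembers every level it was created with, so a contingency table built from the subset would still carry rows of zeros for the removed conditions. The sketch below illustrates this with made-up data (toy, tab_before, tab_after, df_before, and df_after are hypothetical names) and compares the degrees of freedom implied by each table.

```r
# Hypothetical data for illustration only; not the study 5 data.
toy <- data.frame(
  condition = factor(c("None", "All", "All", "None", "All", "None"),
                     levels = c("Free", "None", "Costly", "All")),
  correct = factor(c("correct", "correct", "incorrect",
                     "incorrect", "correct", "incorrect"),
                   levels = c("incorrect", "correct"))
)

# Before dropping: unused levels still appear as rows of zeros
tab_before <- table(toy$condition, toy$correct)
dim(tab_before)

# After dropping: only the levels that occur in the data remain
toy$condition <- droplevels(toy$condition)
tab_after <- table(toy$condition, toy$correct)
dim(tab_after)

# Degrees of freedom implied by each table: (rows - 1) * (columns - 1)
df_before <- (nrow(tab_before) - 1) * (ncol(tab_before) - 1)
df_after <- (nrow(tab_after) - 1) * (ncol(tab_after) - 1)
c(before = df_before, after = df_after)
```

A comparison of two conditions on a two-level outcome should have 1 degree of freedom, which matches the table built after droplevels().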