06 Paired and Independent Samples t Tests

Oct. 15, 2020

Today

Reproducing results from “The Sound of Intellect” Experiment 1 and Experiment 3a.

Learning Outcomes (Paired Samples t)

Do the following for paired samples t tests:

Conduct power/sensitivity analyses.
Conduct paired samples t tests in R.
Evaluate assumption of normality.
Produce appropriate visualizations of paired data.
Report results in APA style.
Evaluate the strength of evidence they provide.

Paired Samples t

What is a Paired Samples t?

A paired samples t test is used to compare two means that were sampled from the same set of participants.

Seen in:

Pre-to-post differences.
Crossed designs.

Paired Samples t = One-Sample t

Statistically, a paired samples t test is just a one-sample t test on the difference scores.

Difference scores (AKA change scores):

\(\Delta{X} = X_2 - X_1\)

Was the average change greater than/less than/different from zero?
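
Written out, the test statistic is the usual one-sample formula applied to the difference scores. The symbols \(M_{\Delta X}\), \(SD_{\Delta X}\), and \(n\) (the number of pairs) are notation introduced here for illustration:

\(t = \frac{M_{\Delta X} - 0}{SD_{\Delta X} / \sqrt{n}}, \quad df = n - 1\)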

Example: The Sound of Intellect

Study Aims

What cues do people use to infer intellect? Differences between reading, hearing, and watching (and hearing) a job candidate’s pitch.

Research Questions

We’re looking at two research questions from Experiment 1.

RQ1

Do job candidates think their written pitch will be perceived more or less positively than their spoken pitch?

RQ2

Do job candidates expect their chances of being hired to be different for their written and spoken pitches?

Why does this Matter?

“Theoretically, such expectations matter because they indicate whether the cues that convey mental capacities in social interaction are obvious to those in the midst of the interaction. Practically, such expectations matter because they could guide how candidates approach potential employers. Candidates who believe their spoken pitch will be judged exactly the same as their written pitch may see no reason to seek voice time with a potential employer.”

Hypotheses

The (implied) hypotheses are…

Hypothesis 1

Candidates will predict written and spoken pitches will be perceived differently.

Hypothesis 2

Candidates will predict that employers’ interest will vary based on whether they observed the written or spoken pitches.

Basic Study Design

Record Pitch

Photo by CoWomen on Unsplash

Write Pitch

Photo by Hannah Olinger on Unsplash

Survey

Survey Questions

Experiment 1 survey

Analytic Plan

Conduct a paired samples t test comparing participants’ predicted positivity ratings for their spoken and written pitches.

Open RStudio

Install/Load Packages

These are the packages we’ll be using in the lab today.

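library(readxl)   # import Excel files
library(psych)    # descriptive statistics and effect size conversions
library(pwr)      # power and sensitivity analyses
library(tidyr)    # reshaping data
library(ggplot2)  # plotting
library(foreign)  # reading data files from other statistical software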

Power/Sensitivity Analysis

Review: Power vs. Sensitivity Analysis

When/why would you conduct a power analysis?
When/why would you conduct a sensitivity analysis?

Underpowered

“…these predictions were underpowered given the sample size of only 18 candidates…” (p. 880)

How underpowered? How are we defining underpowered?

Sensitivity Analysis 50% Power

What is the smallest population effect 50% of samples of N = 18 would detect?

pwr::pwr.t.test(n = 18,
  d = NULL,
  sig.level = 0.05,
  power = .50,
  type = "paired",
  alternative = "two.sided"
)

## 
##      Paired t test power calculation 
## 
##               n = 18
##               d = 0.4897239
##       sig.level = 0.05
##           power = 0.5
##     alternative = two.sided
## 
## NOTE: n is number of *pairs*

Sensitivity Analysis 80% Power

What is the smallest population effect 80% of samples of N = 18 would detect?

pwr::pwr.t.test(n = 18,
  d = NULL,
  sig.level = 0.05,
  power = .80,
  type = "paired",
  alternative = "two.sided"
)

## 
##      Paired t test power calculation 
## 
##               n = 18
##               d = 0.7007201
##       sig.level = 0.05
##           power = 0.8
##     alternative = two.sided
## 
## NOTE: n is number of *pairs*

Sensitivity Analysis 95% Power

What is the smallest population effect 95% of samples of N = 18 would detect?

pwr::pwr.t.test(n = 18,
  d = NULL,
  sig.level = 0.05,
  power = .95,
  type = "paired",
  alternative = "two.sided"
)

## 
##      Paired t test power calculation 
## 
##               n = 18
##               d = 0.902463
##       sig.level = 0.05
##           power = 0.95
##     alternative = two.sided
## 
## NOTE: n is number of *pairs*
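
As a cross-check, the same three sensitivity analyses can be sketched with base R's stats::power.t.test() instead of pwr. This is a swapped-in function, not the one used above. It assumes sd = 1, so that the solved-for delta is in standard deviation units of the difference scores, and strict = TRUE, which is intended to mirror pwr's two-tailed calculation. The names power_levels and min_effect are made up for this example.

```r
# Cross-check of the sensitivity analyses using base R (no extra packages).
# With sd = 1, delta is expressed in SD units of the difference scores,
# so it is comparable to the d reported by pwr::pwr.t.test().
power_levels <- c(0.50, 0.80, 0.95)

min_effect <- sapply(power_levels, function(p) {
  stats::power.t.test(
    n = 18,                     # number of pairs
    delta = NULL,               # solve for the smallest detectable effect
    sd = 1,
    sig.level = 0.05,
    power = p,
    type = "paired",
    alternative = "two.sided",
    strict = TRUE               # count rejections in both tails
  )$delta
})

names(min_effect) <- paste0(power_levels * 100, "%")
round(min_effect, 2)
```

If the two functions agree, the rounded values should land close to the d values printed above for 50%, 80%, and 95% power.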

Explore Data

tRy it! Data Import

Use readxl::read_excel() to import the Excel file.

dta <- readxl::read_excel(
  path = "../data/Job candidate predictions.xlsx",
  sheet = 1
)

Inspect the Data Frame

dta

## # A tibble: 18 x 10
##     `P#` Company PosWrit HireWrit PosSpoke HireSpoke `Times given`   Age Gender
##    <dbl> <chr>     <dbl>    <dbl>    <dbl>     <dbl> <chr>         <dbl> <chr> 
##  1     1 Google        3        3        4         4 3 to 5           26 M     
##  2     2 BCG           4        4        3         3 0                27 M     
##  3     3 Sprint        4        4        4         4 2                31 F     
##  4     4 Micros…       4        4        3         2 3                29 F     
##  5     5 Kleine…       3        3        4         3 3                26 M     
##  6     6 Raymon…       5        5        4         4 2-3 times, n…    29 M     
##  7     7 McKins…       3        3        3         3 2 for job in…    28 F     
##  8     8 Wilson…       4        3        4         4 0                28 M     
##  9     9 Samsun…       4        4        3         3 0                32 M     
## 10    10 Kraft …       1        1        5         4 1                28 M     
## 11    11 Gates …       3        1        4         4 0                24 F     
## 12    12 Spotify       2        2        4         3 0                28 M     
## 13    13 Mattel        3        3        4         4 0                28 F     
## 14    14 Coca C…       3        3        5         5 0                28 F     
## 15    15 Accent…       3        3        2         2 1                30 M     
## 16    16 MetLife       2        2        3         3 0                27 M     
## 17    17 McKins…       3        2        3         2 2                32 F     
## 18    18 Kaiser…       4        4        3         2 2 to 3           27 M     
## # … with 1 more variable: Ethnicity <chr>

Structure of the Data Frame

str(dta)

## tibble [18 × 10] (S3: tbl_df/tbl/data.frame)
##  $ P#         : num [1:18] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Company    : chr [1:18] "Google" "BCG" "Sprint" "Microsoft" ...
##  $ PosWrit    : num [1:18] 3 4 4 4 3 5 3 4 4 1 ...
##  $ HireWrit   : num [1:18] 3 4 4 4 3 5 3 3 4 1 ...
##  $ PosSpoke   : num [1:18] 4 3 4 3 4 4 3 4 3 5 ...
##  $ HireSpoke  : num [1:18] 4 3 4 2 3 4 3 4 3 4 ...
##  $ Times given: chr [1:18] "3 to 5" "0" "2" "3" ...
##  $ Age        : num [1:18] 26 27 31 29 26 29 28 28 32 28 ...
##  $ Gender     : chr [1:18] "M" "M" "F" "F" ...
##  $ Ethnicity  : chr [1:18] "Asian American" "White European" "Indian-American (Sub-continent)" "Indian" ...

Summary of the Data Frame

summary(dta)

##        P#          Company             PosWrit         HireWrit   
##  Min.   : 1.00   Length:18          Min.   :1.000   Min.   :1.00  
##  1st Qu.: 5.25   Class :character   1st Qu.:3.000   1st Qu.:2.25  
##  Median : 9.50   Mode  :character   Median :3.000   Median :3.00  
##  Mean   : 9.50                      Mean   :3.222   Mean   :3.00  
##  3rd Qu.:13.75                      3rd Qu.:4.000   3rd Qu.:4.00  
##  Max.   :18.00                      Max.   :5.000   Max.   :5.00  
##     PosSpoke       HireSpoke     Times given             Age       
##  Min.   :2.000   Min.   :2.000   Length:18          Min.   :24.00  
##  1st Qu.:3.000   1st Qu.:3.000   Class :character   1st Qu.:27.00  
##  Median :4.000   Median :3.000   Mode  :character   Median :28.00  
##  Mean   :3.611   Mean   :3.278                      Mean   :28.22  
##  3rd Qu.:4.000   3rd Qu.:4.000                      3rd Qu.:29.00  
##  Max.   :5.000   Max.   :5.000                      Max.   :32.00  
##     Gender           Ethnicity        
##  Length:18          Length:18         
##  Class :character   Class :character  
##  Mode  :character   Mode  :character  
##                                       
##                                       
##

tRy it! M and SD

Compute means and standard deviations of how positively participants expected to be evaluated.

mean(dta$PosSpoke)
sd(dta$PosSpoke)

mean(dta$PosWrit)
sd(dta$PosWrit)

Using `psych::describe()`

This is a handy function that will describe columns in your data frame.

psych::describe(dta[, c("PosSpoke", "PosWrit")])

##          vars  n mean   sd median trimmed  mad min max range  skew kurtosis
## PosSpoke    1 18 3.61 0.78      4    3.62 1.48   2   5     3  0.01    -0.67
## PosWrit     2 18 3.22 0.94      3    3.25 1.48   1   5     4 -0.42    -0.11
##            se
## PosSpoke 0.18
## PosWrit  0.22

Conduct Tests

Job Candidates’ Predictions (Results)

Together

These participants did not predict that they would be evaluated differently when employers listened to their spoken pitches (M = 3.61, SD = 0.78) than when employers read their written pitches (M = 3.22, SD = 0.94), paired t(17) = 1.20, p = .25, d = 0.45.

For Practice

“They also did not expect any difference in their likelihood of getting hired depending on whether employers listened to their spoken pitches (M = 3.28, SD = 0.89) or read their written pitches (M = 3.00, SD = 1.08), paired t(17) = 0.80, p = .44, d = 0.29.”

Reproducing Findings

tRy it! Difference Scores

Add a column to your data frame that contains the difference between expected evaluation of spoken and written pitches.

dta$PosDiff <- dta$PosSpoke - dta$PosWrit

tRy it! Paired t Test (Option 1)

There are two ways to do a paired samples t test. The first is to do a one-sample t test of the difference scores.

Conduct a one-sample t test against the null hypothesis that the mean of the difference scores is 0.

Paired t Test (Option 1)

result1 <- t.test(x = dta$PosDiff)

result1

## 
##  One Sample t-test
## 
## data:  dta$PosDiff
## t = 1.1974, df = 17, p-value = 0.2476
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -0.2963399  1.0741177
## sample estimates:
## mean of x 
## 0.3888889

tRy it! Paired t Test (Option 2)

Try the second way of getting the same result, which is using t.test() with the argument paired = TRUE.

Paired t Test (Option 2)

result2 <- t.test(x = dta$PosSpoke, y = dta$PosWrit, paired = TRUE)

result2

## 
##  Paired t-test
## 
## data:  dta$PosSpoke and dta$PosWrit
## t = 1.1974, df = 17, p-value = 0.2476
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.2963399  1.0741177
## sample estimates:
## mean of the differences 
##               0.3888889

Compare the Results

## 
##  One Sample t-test
## 
## data:  dta$PosDiff
## t = 1.1974, df = 17, p-value = 0.2476
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -0.2963399  1.0741177
## sample estimates:
## mean of x 
## 0.3888889

## 
##  Paired t-test
## 
## data:  dta$PosSpoke and dta$PosWrit
## t = 1.1974, df = 17, p-value = 0.2476
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.2963399  1.0741177
## sample estimates:
## mean of the differences 
##               0.3888889
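
To see where these numbers come from, here is a minimal by-hand sketch of the same test. It is self-contained: the two rating vectors are copied from the data frame printed earlier, and the object names (pos_spoke, pos_writ, t_by_hand, and so on) are made up for this example.

```r
# Positivity ratings copied from the printed data frame (18 participants).
pos_spoke <- c(4, 3, 4, 3, 4, 4, 3, 4, 3, 5, 4, 4, 4, 5, 2, 3, 3, 3)
pos_writ  <- c(3, 4, 4, 4, 3, 5, 3, 4, 4, 1, 3, 2, 3, 3, 3, 2, 3, 4)

# Difference scores (spoken minus written), as in dta$PosDiff.
pos_diff <- pos_spoke - pos_writ
n <- length(pos_diff)

# One-sample t against a null mean of 0.
se_diff   <- sd(pos_diff) / sqrt(n)
t_by_hand <- mean(pos_diff) / se_diff
df        <- n - 1

# Two-sided p value from the t distribution.
p_two_sided <- 2 * pt(abs(t_by_hand), df = df, lower.tail = FALSE)

c(t = t_by_hand, df = df, p = p_two_sided)
```

The t, df, and p values should agree with both t.test() outputs above.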

Assumptions of Paired t

The assumptions are the same as for a one-sample t, but they are assumptions about the difference scores.

Difference scores come from a normally distributed population.
Observations (i.e., difference scores) are independent.

Evaluating Assumptions

We can assess how tenable the normality assumption is in the same ways we did for the one-sample t test:

Shapiro–Wilk Test.
Visual inspection of histogram and q–q plot.
Skew and kurtosis values.

Shapiro–Wilk Test

shapiro.test(dta$PosDiff)

## 
##  Shapiro-Wilk normality test
## 
## data:  dta$PosDiff
## W = 0.86361, p-value = 0.01398

Histogram

hist(dta$PosDiff)

Histogram

ggplot2 Histogram

ggplot(dta, aes(PosDiff)) +
  geom_histogram(binwidth = 1)

ggplot2 Histogram

Q–Q Plot

qqnorm(dta$PosDiff, ylim = c(-2, 4))
qqline(dta$PosDiff)

Q–Q Plot

Skewness

Kurtosis

Kurtosis has to do with the tails of the distribution.

Leptokurtic

Has positive excess kurtosis.
Fat tailed (AKA heavy tailed).
More outliers than expected.

Platykurtic

Has negative excess kurtosis.
Thin tailed.
Fewer outliers than expected.

Skew and Kurtosis Values

psych::describe(dta$PosDiff)

##    vars  n mean   sd median trimmed  mad min max range skew kurtosis   se
## X1    1 18 0.39 1.38      0    0.25 1.48  -1   4     5 0.86     0.18 0.32
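
The skew and kurtosis values can also be computed by hand. The sketch below assumes the definitions that psych::describe() appears to use by default (its type = 3 option): the third and fourth central moments divided by powers of the sample SD, with 3 subtracted to give excess kurtosis. The difference scores are copied from the data (spoken minus written), and the object names are made up for this example.

```r
# Difference scores (PosSpoke - PosWrit) for the 18 participants.
pos_diff <- c(1, -1, 0, -1, 1, -1, 0, 0, -1, 4, 1, 2, 1, 2, -1, 1, 0, -1)

n   <- length(pos_diff)
dev <- pos_diff - mean(pos_diff)  # deviations from the mean
s   <- sd(pos_diff)               # sample SD (n - 1 in the denominator)

# Assumed definitions: central moments (divided by n) over powers of s.
skew            <- (sum(dev^3) / n) / s^3
excess_kurtosis <- (sum(dev^4) / n) / s^4 - 3

round(c(skew = skew, kurtosis = excess_kurtosis), 2)
```

A positive skew reflects the long right tail created by the single large difference score of 4.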

Does this Seem Normal?

Do the difference scores appear to be drawn from a normally distributed population?
How does this affect our interpretation of the results?

Effect Sizes for Paired t Test

Average Change

The mean of the difference scores (which is the same as the mean difference).

Cohen’s d_z

Remember Cohen’s d_z from last week?

\(d_z =\frac{M - \mu}{SD}\)

Since a paired samples t is just a one-sample t of the difference scores, Cohen recommended d_z as a standardized ES.

Computing d_z for Paired Data

mean(dta$PosDiff) / sd(dta$PosDiff)

## [1] 0.2822268

Computing d_z from t

psych::t2d(t = result1$statistic, n1 = 18)

##         t 
## 0.2822268
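
For the one-sample case, this conversion appears to reduce to dividing t by the square root of the number of pairs, which reproduces the value above:

\(d_z = \frac{t}{\sqrt{n}} = \frac{1.1974}{\sqrt{18}} \approx 0.28\)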

Cohen’s d

With paired data, it is also common to report Cohen’s d instead of d_z.

Con: Ignores the paired nature of the design.
Pro: A more familiar statistic; kind of comparable with independent designs.

Computing Cohen’s d

\(d = \frac{M_1 - M_2}{\sqrt{\frac{SD_1^2 + SD_2^2}{2}}}\)

m1 <- mean(dta$PosSpoke)
sd1 <- sd(dta$PosSpoke)
m2 <- mean(dta$PosWrit)
sd2 <- sd(dta$PosWrit)

mean_diff <- m1 - m2

pooled_sd <- sqrt((sd1^2 + sd2^2) / 2)

mean_diff / pooled_sd

## [1] 0.4500317

Cohen’s d from t

You need to use the t statistic from an independent samples t test (not the paired t computed earlier).

ind_t_result <- t.test(x = dta$PosSpoke, y = dta$PosWrit, paired = FALSE)

ind_t_result$statistic

##        t 
## 1.350095

psych::t2d(t = ind_t_result$statistic, n1 = 18, n2 = 18)

##         t 
## 0.4500317
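
For two groups, the conversion that reproduces this value scales t by the group sizes (here \(n_1 = n_2 = 18\)):

\(d = t\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 1.350095 \times \sqrt{\frac{2}{18}} \approx 0.45\)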

Report & Interpret Results

APA Style Reporting

Interpret the Results

What conclusion(s) can you draw from these results?
Did participants predict that they would be evaluated the same?
How does the sensitivity analysis affect our interpretation?

Visualizing Change

Plot Number 1

Mean for both pitches.
95% CI of means for each pitch.
Average change (slope of line).
All the data points.

Rearrange the Data

The plot has type of pitch on the x-axis and positivity on the y-axis. But our data are not laid out in this way.

## # A tibble: 18 x 3
##     `P#` PosSpoke PosWrit
##    <dbl>    <dbl>   <dbl>
##  1     1        4       3
##  2     2        3       4
##  3     3        4       4
##  4     4        3       4
##  5     5        4       3
##  6     6        4       5
##  7     7        3       3
##  8     8        4       4
##  9     9        3       4
## 10    10        5       1
## 11    11        4       3
## 12    12        4       2
## 13    13        4       3
## 14    14        5       3
## 15    15        2       3
## 16    16        3       2
## 17    17        3       3
## 18    18        3       4

##    pid   Pitch Positivity
## 1    1  Spoken          4
## 2    2  Spoken          3
## 3    3  Spoken          4
## 4    4  Spoken          3
## 5    5  Spoken          4
## 6    6  Spoken          4
## 7    7  Spoken          3
## 8    8  Spoken          4
## 9    9  Spoken          3
## 10  10  Spoken          5
## 11  11  Spoken          4
## 12  12  Spoken          4
## 13  13  Spoken          4
## 14  14  Spoken          5
## 15  15  Spoken          2
## 16  16  Spoken          3
## 17  17  Spoken          3
## 18  18  Spoken          3
## 19   1 Written          3
## 20   2 Written          4
## 21   3 Written          4
## 22   4 Written          4
## 23   5 Written          3
## 24   6 Written          5
## 25   7 Written          3
## 26   8 Written          4
## 27   9 Written          4
## 28  10 Written          1
## 29  11 Written          3
## 30  12 Written          2
## 31  13 Written          3
## 32  14 Written          3
## 33  15 Written          3
## 34  16 Written          2
## 35  17 Written          3
## 36  18 Written          4

Lengthen Our Data

dta_long <- data.frame(
  pid = rep(dta$`P#`, 2),
  Pitch = rep(c("Spoken", "Written"), each = 18),
  Positivity = c(dta$PosSpoke, dta$PosWrit)
)

dta_long

##    pid   Pitch Positivity
## 1    1  Spoken          4
## 2    2  Spoken          3
## 3    3  Spoken          4
## 4    4  Spoken          3
## 5    5  Spoken          4
## 6    6  Spoken          4
## 7    7  Spoken          3
## 8    8  Spoken          4
## 9    9  Spoken          3
## 10  10  Spoken          5
## 11  11  Spoken          4
## 12  12  Spoken          4
## 13  13  Spoken          4
## 14  14  Spoken          5
## 15  15  Spoken          2
## 16  16  Spoken          3
## 17  17  Spoken          3
## 18  18  Spoken          3
## 19   1 Written          3
## 20   2 Written          4
## 21   3 Written          4
## 22   4 Written          4
## 23   5 Written          3
## 24   6 Written          5
## 25   7 Written          3
## 26   8 Written          4
## 27   9 Written          4
## 28  10 Written          1
## 29  11 Written          3
## 30  12 Written          2
## 31  13 Written          3
## 32  14 Written          3
## 33  15 Written          3
## 34  16 Written          2
## 35  17 Written          3
## 36  18 Written          4
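
Since tidyr is already loaded, the same reshaping could also be sketched with tidyr::pivot_longer(). This is an alternative, not the approach used above. To keep the example self-contained, it rebuilds the wide data from the printed values, uses pid in place of the `P#` column, and stores the result under the made-up name dta_long_alt.

```r
library(tidyr)

# Wide data copied from the tibble printed earlier; pid stands in for `P#`.
dta_wide <- data.frame(
  pid      = 1:18,
  PosSpoke = c(4, 3, 4, 3, 4, 4, 3, 4, 3, 5, 4, 4, 4, 5, 2, 3, 3, 3),
  PosWrit  = c(3, 4, 4, 4, 3, 5, 3, 4, 4, 1, 3, 2, 3, 3, 3, 2, 3, 4)
)

# One row per participant per pitch type.
dta_long_alt <- tidyr::pivot_longer(
  dta_wide,
  cols      = c(PosSpoke, PosWrit),
  names_to  = "Pitch",
  values_to = "Positivity"
)

# Relabel the pitch types to match the labels used in dta_long.
dta_long_alt$Pitch <- ifelse(dta_long_alt$Pitch == "PosSpoke", "Spoken", "Written")

head(dta_long_alt)
```

The rows come out ordered by participant rather than by pitch type, which does not matter for plotting.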

Plot the Data

Plot 1 Data

ggplot(data = dta_long)

Plot 1 Data (Plot)

Plot 1 Aesthetics

ggplot(data = dta_long, aes(x = Pitch, y = Positivity))

Plot 1 Aesthetics (Plot)

Plot 1 Geoms (Jitter)

ggplot(data = dta_long, aes(x = Pitch, y = Positivity)) +
  geom_jitter(width = 0.25, height = 0.1, alpha = 0.5)

Plot 1 Jitter (Plot)

Plot 1 Statistics (Pointrange)

ggplot(data = dta_long, aes(x = Pitch, y = Positivity)) +
  geom_jitter(width = 0.25, height = 0.1, alpha = 0.5) +
  stat_summary(fun.data = mean_cl_normal, geom = "pointrange")

Plot 1 Pointrange (Plot)
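
The pointrange shows each pitch's mean with a t-based 95% confidence interval. In ggplot2, mean_cl_normal is a wrapper around a function from the Hmisc package, so Hmisc needs to be installed for it to run. Below is a minimal base R sketch of the same quantities. The helper ci_normal() is hypothetical, written only for this example, and the ratings are copied from the printed data.

```r
# Positivity ratings copied from the printed data frame.
pos_spoke <- c(4, 3, 4, 3, 4, 4, 3, 4, 3, 5, 4, 4, 4, 5, 2, 3, 3, 3)
pos_writ  <- c(3, 4, 4, 4, 3, 5, 3, 4, 4, 1, 3, 2, 3, 3, 3, 2, 3, 4)

# Hypothetical helper: mean and t-based confidence limits for one group.
ci_normal <- function(x, level = 0.95) {
  n    <- length(x)
  se   <- sd(x) / sqrt(n)
  crit <- qt(1 - (1 - level) / 2, df = n - 1)
  c(y = mean(x), ymin = mean(x) - crit * se, ymax = mean(x) + crit * se)
}

ci_table <- rbind(Spoken = ci_normal(pos_spoke), Written = ci_normal(pos_writ))
round(ci_table, 2)
```

Note that these are intervals for each mean separately. They do not use the pairing, so they are not a substitute for the confidence interval of the mean difference reported by the paired t test.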

Plot 1 Statistics (Line)

ggplot(data = dta_long, aes(x = Pitch, y = Positivity)) +
  geom_jitter(width = 0.25, height = 0.1, alpha = 0.5) +
  stat_summary(aes(group = NA), fun.data = mean_cl_normal, geom = "line") +
  stat_summary(fun.data = mean_cl_normal, geom = "pointrange")

Plot 1 Line (Plot)

Plot 1 Design Elements

ggplot(data = dta_long, aes(x = Pitch, y = Positivity)) +
  geom_jitter(width = 0.25, height = 0.1, alpha = 0.5) +
  stat_summary(aes(group = NA), fun.data = mean_cl_normal, geom = "line") +
  stat_summary(fun.data = mean_cl_normal, geom = "pointrange") +
  theme_minimal()

Plot 1 Design Elements (Plot)

Plot Number 2

Plot 2

All the data.
Each participant’s change score.

tRy it! Plot 2

Take 5 minutes to attempt to recreate plot 2 (don’t worry about the jitter—we’ll do that together).

Plot 2 (No Jitter)

Jitter Attempt 1

jitter <- position_jitter(width = .1, height = .1, seed = 1234L)

ggplot(dta_long, aes(x = Pitch, y = Positivity, group = pid)) +
  geom_line(position = jitter, colour = "#c9c9c9") +
  geom_point(position = jitter) +
  theme_minimal()

Jitter Fail!

Adjust the Data Instead

dta_long$PositivityJitter <- jitter(dta_long$Positivity, factor = .5)

dta_long

##    pid   Pitch Positivity PositivityJitter
## 1    1  Spoken          4         3.979290
## 2    2  Spoken          3         3.045460
## 3    3  Spoken          4         3.964090
## 4    4  Spoken          3         2.953722
## 5    5  Spoken          4         3.982265
## 6    6  Spoken          4         3.970553
## 7    7  Spoken          3         2.996247
## 8    8  Spoken          4         3.997742
## 9    9  Spoken          3         3.025605
## 10  10  Spoken          5         4.973715
## 11  11  Spoken          4         3.913898
## 12  12  Spoken          4         3.908666
## 13  13  Spoken          4         3.904526
## 14  14  Spoken          5         5.026285
## 15  15  Spoken          2         2.068756
## 16  16  Spoken          3         3.073923
## 17  17  Spoken          3         3.073315
## 18  18  Spoken          3         3.043185
## 19   1 Written          3         3.076402
## 20   2 Written          4         4.029323
## 21   3 Written          4         4.034346
## 22   4 Written          4         4.076246
## 23   5 Written          3         2.952741
## 24   6 Written          5         5.061069
## 25   7 Written          3         2.916593
## 26   8 Written          4         3.930198
## 27   9 Written          4         4.045028
## 28  10 Written          1         1.028545
## 29  11 Written          3         3.001938
## 30  12 Written          2         1.951113
## 31  13 Written          3         2.934935
## 32  14 Written          3         3.063972
## 33  15 Written          3         3.033724
## 34  16 Written          2         2.074818
## 35  17 Written          3         2.968841
## 36  18 Written          4         3.936012

Jitter Attempt 2

ggplot(dta_long, aes(x = Pitch, y = PositivityJitter, group = pid)) +
  geom_line(colour = "#c9c9c9") +
  geom_point() +
  labs(y = "Positivity") +
  theme_minimal()

Jitter Success!

Dodging: `position_dodge`

Dodging moves elements side-to-side. Unlike jitter, it is not random. It spaces objects evenly.

ggplot(dta_long, aes(x = Pitch, y = PositivityJitter, group = pid)) +
  geom_line(position = position_dodge(width = .1), colour = "#c9c9c9") +
  geom_point(position = position_dodge(width = .1)) +
  labs(y = "Positivity") +
  theme_minimal()

Jitter and Dodge

Design Elements

ggplot(dta_long, aes(x = Pitch, y = PositivityJitter, group = pid)) +
  scale_y_continuous(minor_breaks = NULL) +
  geom_line(position = position_dodge(width = .1), colour = "#c9c9c9") +
  geom_point(position = position_dodge(width = .1)) +
  labs(y = "Positivity") +
  theme_minimal(base_family = "Fira Sans")

Finished!

Independent Samples t Tests

What Does this Answer?

Used when we want to determine whether two independent samples were drawn from populations with the same mean.

Experiment 3a

Design

“In Experiment 3a, we recruited four trained stage actors to read all 18 pitches.”

“Evaluators were 265 visitors to the Museum of Science and Industry in Chicago (mean age = 35.03 years, SD = 14.40; 124 males), who agreed to participate in exchange for a food item.”

Design (2)

“We randomly assigned participants serving as potential employers (evaluators) to one of three conditions: Those in the writing condition read a written pitch, those in the female-speaker condition listened to one of the female actors reading a written pitch, and those in the male-speaker condition listened to one of the male actors reading a written pitch.”

RQ/Hypothesis We’re Testing

RQ: Does the gender of the speaker affect how positively a pitch is perceived?

Hypothesis: Gender of the speaker will affect how positively a pitch is perceived.

Not Reproducing this Result

“Evaluators had more negative impressions of male speakers (M = 5.79, SD = 1.78) than of female speakers, t(262) = −2.12, p = .04, 95% CI of the difference = [−1.03, −0.06], d = 0.26.”

But we’re getting at something similar.

What Are We Doing?

We’ll do a t test comparing participants who rated a female actor to those who rated a male actor. The authors did some form of planned contrast, which we will learn about when we learn ANOVA.

Power/Sensitivity

Sensitivity Analysis

pwr::pwr.t.test(
  n = 216,
  d = NULL,
  sig.level = 0.05,
  power = 0.80,
  type = "two.sample"
)

## 
##      Two-sample t test power calculation 
## 
##               n = 216
##               d = 0.2701842
##       sig.level = 0.05
##           power = 0.8
##     alternative = two.sided
## 
## NOTE: n is number in *each* group
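
As the note in the output says, n is the number in each group. The data frame used below has 216 rows in total, and the group descriptives show 106 evaluators per group with non-missing impressions, so a sensitivity analysis based on the per-group n will give a larger minimum detectable effect than 0.27. Here is a minimal sketch using base R's stats::power.t.test(), a swapped-in function rather than pwr. It assumes n = 106 per group and sd = 1, so that delta is in standard deviation units. The name sens_3a is made up for this example.

```r
# Sensitivity analysis with the per-group sample size (assumed: 106 per group).
sens_3a <- stats::power.t.test(
  n = 106,                    # number in EACH group
  delta = NULL,               # solve for the smallest detectable effect
  sd = 1,                     # so delta is in SD units, comparable to d
  sig.level = 0.05,
  power = 0.80,
  type = "two.sample",
  alternative = "two.sided",
  strict = TRUE               # count rejections in both tails
)

sens_3a$delta
```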

Explore the Data

Data Import

dta3a <- read.csv("study3a-edited.csv")

Inspect the Data

Use head(), summary(), and str() to inspect the data frame.

The Data

head(dta3a)

##   Pnum actor_gender impression
## 1    9       Female   7.333333
## 2   12         Male   2.333333
## 3   69         Male   7.000000
## 4   85       Female   7.666667
## 5  114         Male   5.666667
## 6  133       Female   5.000000

Summary

summary(dta3a)

##       Pnum        actor_gender         impression    
##  Min.   :  1.00   Length:216         Min.   :0.3333  
##  1st Qu.: 65.75   Class :character   1st Qu.:5.0000  
##  Median :135.50   Mode  :character   Median :6.3333  
##  Mean   :133.93                      Mean   :6.0613  
##  3rd Qu.:198.25                      3rd Qu.:7.3333  
##  Max.   :270.00                      Max.   :9.3333  
##                                      NA's   :4

Structure

str(dta3a)

## 'data.frame':    216 obs. of  3 variables:
##  $ Pnum        : int  9 12 69 85 114 133 137 143 213 218 ...
##  $ actor_gender: chr  "Female" "Male" "Male" "Female" ...
##  $ impression  : num  7.33 2.33 7 7.67 5.67 ...

Convert to Factor

Convert actor_gender to a factor.

dta3a$actor_gender <- factor(dta3a$actor_gender,
  levels = c("Male", "Female")
)

Group Descriptives

Use tapply() to apply psych::describe() to impression for each level of actor_gender.

tapply(X = dta3a$impression,
  INDEX = dta3a$actor_gender,
  FUN = psych::describe
)

## $Male
##    vars   n mean   sd median trimmed  mad  min  max range  skew kurtosis   se
## X1    1 106 5.79 1.78      6    5.96 1.48 0.33 8.67  8.33 -0.84     0.18 0.17
## 
## $Female
##    vars   n mean   sd median trimmed  mad  min  max range  skew kurtosis   se
## X1    1 106 6.33 1.82   6.67     6.5 1.48 0.67 9.33  8.67 -0.85     0.38 0.18

Conducting the Statistical Test

We’ll use the formula notation.

t.test(formula = impression ~ actor_gender,
  data = dta3a
)

## 
##  Welch Two Sample t-test
## 
## data:  impression by actor_gender
## t = -2.2016, df = 209.92, p-value = 0.02879
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.0311577 -0.0568926
## sample estimates:
##   mean in group Male mean in group Female 
##             5.789308             6.333333

Assumptions

Assumptions of Independent t Test

Very similar to before, but we add a new assumption (sort of!).

Each group is sampled from a normally distributed population.
Homogeneity of variance/homoscedasticity.
Observations are independent.

Normality

Evaluate this the same way as before, but on each independent sample separately.

Homogeneity of Variance

This is the assumption that the two groups have equal population variances. It can be violated if, for example, measurement is more or less precise for one of the groups.

Tests of Homogeneity of Variance

There is a statistical test of this assumption, but I’m not going to teach it to you, because there is a better approach: use Welch’s t test!

Welch’s t Test

Welch’s t test estimates the variance of each group separately and adjusts the degrees of freedom to account for heterogeneity of variance (AKA heteroscedasticity).

Look at these results:

## 
##  Welch Two Sample t-test
## 
## data:  impression by actor_gender
## t = -2.2016, df = 209.92, p-value = 0.02879
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.0311577 -0.0568926
## sample estimates:
##   mean in group Male mean in group Female 
##             5.789308             6.333333
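
To make the adjustment concrete, here is a minimal sketch that recomputes Welch's t and its degrees of freedom from summary statistics. It uses the group means from the t test output and the SDs and group sizes from the group descriptives. Because the SDs are rounded to two decimals, the results will only approximately match the output above. All object names are made up for this example.

```r
# Summary statistics taken from the output in this section.
m_male    <- 5.789308; m_female  <- 6.333333  # group means (t test output)
sd_male   <- 1.78;     sd_female <- 1.82      # group SDs (rounded)
n_male    <- 106;      n_female  <- 106       # non-missing n per group

# Squared standard error of each group mean (variances are NOT pooled).
v_male   <- sd_male^2 / n_male
v_female <- sd_female^2 / n_female

# Welch's t statistic.
welch_t <- (m_male - m_female) / sqrt(v_male + v_female)

# Welch-Satterthwaite approximation to the degrees of freedom.
welch_df <- (v_male + v_female)^2 /
  (v_male^2 / (n_male - 1) + v_female^2 / (n_female - 1))

c(t = welch_t, df = welch_df)
```

When the two sample variances are nearly equal, as they are here, the adjusted df stays close to the unadjusted value of n1 + n2 - 2 = 210.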

Effect Sizes

What to Report?

Report M and SD for each group, and Cohen’s d. You know how to do this:

psych::t2d(t = 2.2016, n = 216)

## [1] 0.2995998
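
As a cross-check, Cohen's d can also be sketched directly from the group means and SDs, using the same formula as in the paired section. That formula assumes equal group sizes, which the group descriptives suggest (106 per group). The inputs are the rounded values reported above, and the object names are made up for this example.

```r
# Rounded group descriptives from the output above.
m_male   <- 5.79; sd_male   <- 1.78
m_female <- 6.33; sd_female <- 1.82

# Average the two variances, then take the square root (equal n assumed).
pooled_sd <- sqrt((sd_male^2 + sd_female^2) / 2)

# Negative sign: male speakers were rated lower than female speakers.
d_3a <- (m_male - m_female) / pooled_sd
d_3a
```

The magnitude should agree with the t2d() result to two decimal places.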

Data Vis

Visualizing Independent Samples

Report Results

In APA Style

Participants had more negative impressions of pitches read by males (M = 5.79, SD = 1.78) than females (M = 6.33, SD = 1.82), t(209.92) = −2.20, p = .03, 95% CI [−1.03, −0.06], d = 0.30.
