Reproducing these effects from Lloyd et al., 2018.
“Similar to previous findings in the deception detection literature, sensitivity scores (M = .15, SD = .98) averaged across targets and participants were slightly better than chance (i.e., 0), t(401) = 3.053, p = .002, 95% CI [.05, .25], d = .30.”
“Consistent with past meta-analyses in the deception detection literature, accuracy scores (M = .52, SD = .13) were slightly better than chance (i.e., .5), t(401) = 3.045, p = .002, 95% CI [.01, .03], d = .30.”
“Because we are interested in the effects of target and participant gender, only those who self-disclosed their gender were included in analyses (N = 402).”
How do we know which participants did not self-disclose gender?
Check the codebook to identify which participants did not self-disclose their gender.
The gender column? Or the filter_$ column?
The codebook says filter_$ = 1 means gender was disclosed.
What value does filter_$ take if gender was not disclosed? How could you check?
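One way to check is to tabulate the filter variable. A minimal sketch, assuming the raw data frame is named `dta_raw` (a hypothetical name) and that SPSS's `filter_$` column was read into R as `filter_.`, as in the `summary()` output shown later:

```r
# Count each value of the filter variable, including missing values.
# `dta_raw` and the column name `filter_.` are assumptions here.
dta_raw <- data.frame(id = 1:4, filter_. = c(1, 1, NA, 1))  # toy stand-in
table(dta_raw$filter_., useNA = "ifany")
```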
Use subset() or [ to drop participants who did not self-disclose gender.
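A sketch of both approaches on a toy data frame (the column name `filter_.` is an assumption based on the `summary()` output later in this document):

```r
# Toy stand-in for the raw data; NA marks an undisclosed gender.
dta_raw <- data.frame(id = 1:5, filter_. = c(1, 1, NA, 1, NA))

# Option 1: subset() silently drops rows where the condition is NA.
dta <- subset(dta_raw, filter_. == 1)

# Option 2: [ with which(); which() avoids keeping NA rows.
dta <- dta_raw[which(dta_raw$filter_. == 1), ]

nrow(dta)  # 3 participants remain
```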
Open the documentation for t.test() (e.g., run ?t.test).
Performs one and two sample t-tests on vectors of data.
t.test(x, ...)
## Default S3 method:
t.test(x, y = NULL,
alternative = c("two.sided", "less", "greater"),
mu = 0, paired = FALSE, var.equal = FALSE,
conf.level = 0.95, ...)
## S3 method for class 'formula'
t.test(formula, data, subset, na.action, ...)
Why are there three usages???
Consider the following: notice how you can solve both equations by treating + differently. The + operator (a function) does something different depending on its arguments. Some functions will behave differently depending on the class(es) of their argument(s). For example:
## Ps videoset age race
## Min. : 1.0 Min. : 1.00 Min. :18.00 Min. :1.000
## 1st Qu.:101.2 1st Qu.: 5.25 1st Qu.:26.00 1st Qu.:5.000
## Median :202.5 Median :11.00 Median :31.00 Median :5.000
## Mean :203.3 Mean :10.52 Mean :34.47 Mean :4.666
## 3rd Qu.:303.8 3rd Qu.:15.00 3rd Qu.:40.00 3rd Qu.:5.000
## Max. :485.0 Max. :20.00 Max. :78.00 Max. :7.000
## NA's :1
## race_TEXT gender f_dprime f_crit
## Length:402 Mode:logical Min. :-3.0008 Min. :-2.3263
## Class :character NA's:402 1st Qu.:-0.6745 1st Qu.: 0.0000
## Mode :character Median : 0.0000 Median : 0.3372
## Mean : 0.2450 Mean : 0.2701
## 3rd Qu.: 0.6745 3rd Qu.: 0.6745
## Max. : 4.6527 Max. : 2.3263
##
## m_dprime m_crit f_accuracy m_accuracy
## Min. :-4.65270 Min. :-2.3263 Min. :0.1250 Min. :0.0000
## 1st Qu.:-0.67449 1st Qu.: 0.0000 1st Qu.:0.3750 1st Qu.:0.3750
## Median : 0.00000 Median : 0.3372 Median :0.5000 Median :0.5000
## Mean : 0.05301 Mean : 0.4740 Mean :0.5323 Mean :0.5062
## 3rd Qu.: 0.67449 3rd Qu.: 1.1632 3rd Qu.:0.6250 3rd Qu.:0.6250
## Max. : 4.65270 Max. : 2.3263 Max. :1.0000 Max. :1.0000
##
## accuracy_tot dprime_tot crit_tot filter_.
## Min. :0.2500 Min. :-2.330 Min. :-1.5000 Min. :1
## 1st Qu.:0.4400 1st Qu.:-0.340 1st Qu.: 0.0000 1st Qu.:1
## Median :0.5000 Median : 0.000 Median : 0.3400 Median :1
## Mean :0.5205 Mean : 0.149 Mean : 0.3722 Mean :1
## 3rd Qu.:0.6300 3rd Qu.: 0.830 3rd Qu.: 0.5800 3rd Qu.:1
## Max. :0.8100 Max. : 2.660 Max. : 2.3300 Max. :1
##
summary() will choose the appropriate method for you.
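The same dispatch happens with other generics too. A tiny illustration on toy data (nothing from the study):

```r
# summary() dispatches on the class of its argument.
x <- c(1, 5, 9)
summary(x)            # numeric method: Min., quartiles, Mean, Max.
summary(factor(x))    # factor method: a count for each level
```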
t.test() documentation:
t.test(x, ...)
## Default S3 method:
t.test(x, y = NULL,
alternative = c("two.sided", "less", "greater"),
mu = 0, paired = FALSE, var.equal = FALSE,
conf.level = 0.95, ...)
## S3 method for class 'formula'
t.test(formula, data, subset, na.action, ...)
We’re using the default method today.
Argument | Description |
---|---|
x | a (non-empty) numeric vector of data values. |
y | an optional (non-empty) numeric vector of data values. |
alternative | a character string specifying the alternative hypothesis, must be one of “two.sided” (default), “greater” or “less”. You can specify just the initial letter. |
mu | a number indicating the true value of the mean (or difference in means if you are performing a two sample test). |
paired | a logical indicating whether you want a paired t-test. |
var.equal | a logical variable indicating whether to treat the two variances as being equal. If TRUE then the pooled variance is used to estimate the variance otherwise the Welch (or Satterthwaite) approximation to the degrees of freedom is used. |
conf.level | confidence level of the interval. |
Use t.test() to conduct a one-sample t test comparing sensitivity to 0. Use values for the arguments x, alternative, mu, and conf.level to match the results from Lloyd et al. Assign a name to the resulting R object.
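For instance, a call of the following shape reproduces the printed results (the data frame name `dta` is taken from the `data:` line of the t.test output in this document):

```r
# One-sample t test of mean sensitivity (dprime_tot) against 0.
sens_test <- t.test(x = dta$dprime_tot,
                    alternative = "two.sided",
                    mu = 0,
                    conf.level = 0.95)
sens_test
```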
##
## One Sample t-test
##
## data: dta$dprime_tot
## t = 3.0542, df = 401, p-value = 0.002407
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## 0.05310512 0.24495458
## sample estimates:
## mean of x
## 0.1490299
From Lloyd et al.: t(401) = 3.053, p = .002, 95% CI [.05, .25], d = .30.
Scores on the outcome variable are normally distributed in the population from which our sample was drawn.
Can we answer this question conclusively?
We do have approaches that help us decide whether the assumption is tenable.
A null hypothesis significance test against the null that sample scores are drawn from a normally distributed population:
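In R this is the Shapiro–Wilk test. A sketch, assuming the data frame is named `dta`:

```r
# Shapiro-Wilk test: H0 is that dprime_tot was drawn from a normal population.
shapiro.test(dta$dprime_tot)
```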
##
## Shapiro-Wilk normality test
##
## data: dta$dprime_tot
## W = 0.99014, p-value = 0.008507
How do we interpret this?
I.e., does the sample look normal?
Create a histogram of dprime_tot.
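A base-R sketch (assuming the data frame is named `dta`; the number of breaks is a judgment call):

```r
# Quick look at the distribution of the sensitivity scores.
hist(dta$dprime_tot,
     breaks = 20,
     main = "Distribution of sensitivity scores",
     xlab = "Sensitivity (d')")
```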
Use qqnorm(). It plots theoretical quantiles on the x-axis and sample quantiles on the y-axis. If theory matches the sample, the points will fall on the line y = x (or y = x + c, where c is a constant).
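A sketch, again assuming the data frame is named `dta`; qqline() adds a reference line through the quartiles:

```r
# Normal Q-Q plot of the sensitivity scores, with a reference line.
qqnorm(dta$dprime_tot, main = "Normal Q-Q plot of sensitivity scores")
qqline(dta$dprime_tot)
```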
We won’t go over this, but here is the code if you’re interested.
null_dist <- function(x) {
dnorm(x, 0, 0.978)
}
alt_dist <- function(x) {
dnorm(x, 0.149, 0.978)
}
ggplot(dta, aes(dprime_tot)) +
scale_x_continuous(breaks = seq(-4, 4, by = 1)) +
geom_histogram(aes(y = after_stat(density)), binwidth = .3, fill = "#dee2e6",
colour = "#212529") +
geom_function(fun = alt_dist, linetype = 1) +
geom_function(fun = null_dist, linetype = 2) +
theme_minimal(base_family = "Fira Sans") +
theme(
axis.text.y = element_blank()
) +
labs(
title = "Distribution of sensitivity scores",
subtitle = paste(
"Curves show theoretical distributions under the null",
"and alternative hypotheses"
),
x = NULL,
y = NULL
)
We won’t go over this, but here is the code for those interested.
dta_long <- stack(dta, select = c("f_dprime", "m_dprime", "dprime_tot"))
levels(dta_long$ind) <- c("Female", "Male", "Both")
ggplot(dta_long, aes(x = ind, y = values)) +
geom_hline(yintercept = 0, colour = "#e9e9e9", size = 2) +
stat_summary(fun.data = mean_cl_normal,
geom = "errorbar",
width = 0.05
) +
stat_summary(fun.data = mean_cl_normal,
fun.args = list(conf.int = .90),
geom = "linerange",
size = 1.5
) +
stat_summary(fun = mean,
geom = "point",
size = 3,
shape = 21,
fill = "white"
) +
coord_flip() +
theme_minimal() +
labs(x = "Target Gender", y = "Sensitivity")
Artwork by @allison_horst.
Most commonly reported effect sizes for a one-sample t test are:
The mean effect size in the original units of measurement. For example,
“sensitivity scores (M = .15, SD = .98) averaged across targets and participants were slightly better than chance (i.e., 0), t(401) = 3.053, p = .002, 95% CI [.05, .25], d = .30.”
Yes. Basically always report:
“…per-cell sample sizes, observed cell means, […] and cell standard deviations…”
Remember to interpret the raw effect size as well.
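The two values printed next are just the sample mean and standard deviation (assuming the data frame is named `dta`):

```r
# Raw effect size: mean and SD of the sensitivity scores.
mean(dta$dprime_tot)
sd(dta$dprime_tot)
```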
## [1] 0.1490299
## [1] 0.9783241
\(d = \frac{M_1 - M_2}{\sigma}\)
Where \(\sigma\) is the pooled standard deviation. That is:
\(\sqrt{\frac{SD_1^2 + SD_2^2}{2}}\)
Effect size in the units of pooled standard deviations.
Let’s look at the formula again:
\(d = \frac{M_1 - M_2}{\sqrt{\frac{1}{2}(SD_1^2 + SD_2^2)}}\)
For a one-sample t test, what is \(M_2\)? What is \(SD_2\)?
Knowing this, we can simplify the formula for the one-sample case to…
\(d_z = \frac{M - \mu}{SD}\)
So for one sample, assuming SD1 = SD2, Cohen’s d and Cohen’s dz are the same.
Since μ = 0, dz is simply \(M/SD\).
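Computed from the mean and SD printed earlier (M = 0.1490299, SD = 0.9783241):

```r
# Cohen's dz from the sample: mean(dprime_tot) / sd(dprime_tot).
dz <- 0.1490299 / 0.9783241
dz  # roughly 0.15, not the d = .30 that Lloyd et al. reported
```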
Does this match what Lloyd et al. reported? No. We’ll show why later.
We can convert the t value to a d value using t2d() from the psych package. t2d() computes d in up to 3 different ways, depending on whether values are supplied to the arguments n, n1, and n2.
Remember to put library(psych) at the top of your script. When you supply only n, t2d() returns Cohen’s d for independent samples.
## t
## 0.3046636
library(pwr) # This goes at the top of your script!
pwr.t.test(n = NULL,
d = 0.5,
sig.level = 0.05,
power = 0.95,
type = "one.sample",
alternative = "two.sided"
)
##
## One-sample t test power calculation
##
## n = 53.94061
## d = 0.5
## sig.level = 0.05
## power = 0.95
## alternative = two.sided
pwr.t.test(n = 402,
d = NULL,
sig.level = 0.05,
power = 0.95,
type = "one.sample",
alternative = "two.sided"
)
##
## One-sample t test power calculation
##
## n = 402
## d = 0.1802227
## sig.level = 0.05
## power = 0.95
## alternative = two.sided
The authors do this very well!
“Similar to previous findings in the deception detection literature, sensitivity scores (M = .15, SD = .98) averaged across targets and participants were slightly better than chance (i.e., 0), t(401) = 3.053, p = .002, 95% CI [.05, .25], d = .30.”
“My problem is that I have been persecuted by an integer. For seven years this number has followed me around, has intruded in my most private data, and has assaulted me from the pages of our most public journals. […] There is, to quote a famous senator, a design behind it, some pattern governing its appearances. Either there really is something unusual about the number or else I am suffering from delusions of persecution.”
“[Miller] made the one and only hole-in-one of his life at the age of 77, on the seventh green. He made it with a seven iron. He loved that.”
Make a prediction.
My digit span is…