08 Regression

Oct. 29, 2020

We’re more than halfway there!

Housekeeping

  • Thanks for the positive feedback on class last week!
  • When do we answer emails?

Common Mistakes: Lab report (two-sample t test)

Common Mistakes: Introduction

  • Justify your hypothesis
  • Make your hypothesis about a population

Common Mistakes: Method

  • What should be included in your method section?
    • Who were the participants? (Just you!)
    • What exactly is a digit/letter span and how do you measure it?
    • How were the measurements taken? (RStudio)
  • What should NOT be included in your method section?
    • Anything incorrect or inaccurate
    • Trivial details about how you stored the data
  • Welch’s two-sample t test vs. independent samples t test
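A minimal sketch of the Welch vs. independent samples distinction in R. The span scores below are made-up numbers for illustration, not data from the lab. By default t.test() runs Welch's test; setting var.equal = TRUE gives the pooled-variance (Student's) version.

```r
# Hypothetical span scores (invented for illustration only)
digit_span  <- c(7, 6, 8, 7, 9, 6, 7, 8)
letter_span <- c(6, 5, 7, 6, 6, 5, 7, 6)

# Default: Welch's two-sample t test (does not assume equal variances)
welch <- t.test(digit_span, letter_span)

# Pooled-variance (Student's) independent samples t test
student <- t.test(digit_span, letter_span, var.equal = TRUE)

print(welch$method)      # the name of the test that was actually run
print(student$method)
print(welch$parameter)   # Welch df, usually not a whole number
print(student$parameter) # pooled df = n1 + n2 - 2
```

Whichever test you ran is the one to name in the method section; the method element of the result tells you which it was.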

Common Mistakes: Power Analysis

Important information for a power analysis:

  • The statistical test.
  • Alpha level.
  • Power.
  • Population effect size.
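The four ingredients above map onto the arguments of power.t.test() from base R. This is a sketch under assumed values (d = 0.5, alpha = .05, power = .80), not the values from the lab. Note that power.t.test() assumes equal variances, so for a Welch test it is only an approximation.

```r
# Assumed inputs for illustration: medium effect, conventional alpha and power
result <- power.t.test(
  delta       = 0.5,          # population effect size (with sd = 1, this is Cohen's d)
  sd          = 1,
  sig.level   = 0.05,         # alpha level
  power       = 0.80,         # desired power
  type        = "two.sample", # the statistical test
  alternative = "two.sided"
)

# Round up: you cannot recruit a fraction of a participant
n_per_group <- ceiling(result$n)
print(n_per_group)
```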

Common Mistakes: Discussion

  • Interpret effect size
  • Generalizability, external validity, and power are different concepts
    • Generalizability: how well does my sample represent the population?
    • External validity: would my effect occur outside the lab?
    • Power: how likely was my study to detect the effect if it exists?

Common Mistakes: Correlation assignment

  • SO MANY of you didn’t find variable means and SDs 😞
  • Report the r value, not the t value
  • Conclusion should be broad/big-picture
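A sketch of pulling out the numbers to report for a correlation. It uses the built-in mtcars data as a stand-in, since the assignment's own variables are not shown here.

```r
# Stand-in variables from the built-in mtcars data (not the assignment data)
x <- mtcars$wt
y <- mtcars$mpg

# Descriptives: mean and SD for each variable
descriptives <- data.frame(
  variable = c("wt", "mpg"),
  mean     = c(mean(x), mean(y)),
  sd       = c(sd(x), sd(y))
)
print(descriptives)

result <- cor.test(x, y)

r  <- unname(result$estimate)   # the r value: this is what you report
df <- unname(result$parameter)  # degrees of freedom, n - 2
p  <- result$p.value

# result$statistic is the t value used to get p; it is not the effect to report
cat(sprintf("r(%d) = %.2f, p = %.3g\n", df, r, p))
```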

Regression

The Big Reveal

Everything we have covered so far and will cover (even next semester!) falls under the framework of general linear models.

Data = Model + Error
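One way to see this equation in R, sketched with the built-in mtcars data (an arbitrary choice for illustration): the fitted values are the "Model" part, the residuals are the "Error" part, and they add back up to the observed data.

```r
# Example model on built-in data (variables chosen only for illustration)
fit <- lm(mpg ~ wt, data = mtcars)

model_part <- fitted(fit)  # what the model predicts for each car
error_part <- resid(fit)   # what is left over

# Model + Error reproduces the data
reconstructed <- unname(model_part + error_part)
print(all.equal(reconstructed, mtcars$mpg))
```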

General Linear Models

  • Can have categorical or continuous predictors
  • Can have categorical or continuous outcomes
  • Next on PSYO 372… multiple predictors and/or outcomes!

Model comparison

Model

Predictor variable(s) have non-zero association with outcome variable(s)

  • Grouping variable for two-sample t tests, chi-square
  • Predictor variable in correlation


Null Model

No relationships between variables
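A sketch of the model vs. null model comparison for a grouping variable, using mtcars as stand-in data. The null model is intercept-only (mpg ~ 1); the model adds the grouping variable am. The comparison should agree with a pooled-variance two-sample t test (F = t squared).

```r
# Null model: no predictors, just the grand mean
null_model <- lm(mpg ~ 1, data = mtcars)

# Model: grouping variable (am: 0 = automatic, 1 = manual) predicts mpg
full_model <- lm(mpg ~ am, data = mtcars)

# Does adding the predictor reduce error more than chance would?
comparison <- anova(null_model, full_model)
print(comparison)

# Same test, two other ways
t_from_lm    <- summary(full_model)$coefficients["am", "t value"]
t_from_ttest <- unname(t.test(mpg ~ am, data = mtcars, var.equal = TRUE)$statistic)

# The signs can differ (it depends on which group is subtracted from which),
# so compare absolute values
print(c(lm = abs(t_from_lm), t.test = abs(t_from_ttest), sqrt_F = sqrt(comparison$F[2])))
```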

The lm() Function

The lm() function can be used to conduct any analysis that falls under the general linear model framework.

Syntax:

  • Outcome ~ predictor
  • Outcome ~ predictor1 + predictor2…
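The two formula patterns above, sketched with the built-in mtcars data (the variable names are just stand-ins):

```r
# Outcome ~ predictor
one_predictor <- lm(mpg ~ wt, data = mtcars)

# Outcome ~ predictor1 + predictor2
two_predictors <- lm(mpg ~ wt + hp, data = mtcars)

# Each model has an intercept plus one slope per predictor
print(coef(one_predictor))
print(coef(two_predictors))
```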

tRy it!

Syntax for correlation assignment from last week:

What happens when you use the summary() function on corr_model?
What about plot()?
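A possible way to try this out. The name corr_model comes from the slide, but the variables and data here (mpg and wt from the built-in mtcars data) are stand-ins, not the ones from last week's assignment.

```r
# Stand-in data; swap in your own outcome, predictor, and data frame
corr_model <- lm(mpg ~ wt, data = mtcars)

# summary(): coefficients, standard errors, t and p values, R-squared
model_summary <- summary(corr_model)
print(model_summary)

# With a single predictor, the square root of R-squared is |r|
r_from_model <- sqrt(model_summary$r.squared)
print(r_from_model)

# plot(): diagnostic plots (residuals vs. fitted, Q-Q, and so on)
# Guarded so the sketch also runs in a non-interactive session
if (interactive()) {
  par(mfrow = c(2, 2))
  plot(corr_model)
}
```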

The anova() function

  • Compares a series of nested models
  • Only one model: Tests each term in order (with one predictor, this is the comparison to the null model)
  • Can compare a series of two or more models
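A short sketch of both uses of anova(), again with mtcars as stand-in data:

```r
# Two nested models: m1's predictors are a subset of m2's
m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- lm(mpg ~ wt + hp, data = mtcars)

# One model: F test for its predictor
single <- anova(m1)
print(single)

# Two models: does adding hp improve on m1?
nested <- anova(m1, m2)
print(nested)
```

The models must be fit to the same data and listed from simplest to most complex for the comparison to make sense.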

Heteroscedasticity

Examples?
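One simulated example, as a sketch. All numbers are invented: the error spread is made to grow with income, so the residuals fan out instead of forming an even band.

```r
set.seed(372)  # arbitrary seed so the simulation is reproducible

n <- 200
income_k <- runif(n, min = 20, max = 150)  # simulated income, in thousands

# The error SD grows with income: this is what makes it heteroscedastic
outcome <- 4 + 0.01 * income_k + rnorm(n, mean = 0, sd = income_k / 50)

fit <- lm(outcome ~ income_k)

# Residual spread in the lower vs. upper half of incomes
is_high <- income_k >= median(income_k)
sd_low  <- sd(resid(fit)[!is_high])
sd_high <- sd(resid(fit)[is_high])
print(c(low = sd_low, high = sd_high))

# Residuals vs. fitted values: look for a funnel shape
if (interactive()) {
  plot(fitted(fit), resid(fit), xlab = "Fitted values", ylab = "Residuals")
}
```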

Dealing with heteroscedasticity

## 
## t test of coefficients:
## 
##                     Estimate  Std. Error t value Pr(>|t|)    
## (Intercept)       4.1877e+00  9.7535e-02 42.9357  < 2e-16 ***
## Household_Income -5.2058e-06  1.3776e-06 -3.7791  0.00019 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
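A table headed "t test of coefficients" is what coeftest() from the lmtest package prints, commonly paired with vcovHC() from the sandwich package for heteroscedasticity-robust standard errors. The exact call behind the output above is not shown, so the sketch below is an assumption about the general approach. It computes the simplest robust estimator (HC0, White's) by hand in base R, on simulated data.

```r
set.seed(372)  # simulated data; all values are invented for illustration

n <- 200
income_k <- runif(n, min = 20, max = 150)
outcome  <- 4.2 - 0.005 * income_k + rnorm(n, sd = income_k / 50)

fit <- lm(outcome ~ income_k)

# Sandwich estimator: bread %*% meat %*% bread
X <- model.matrix(fit)       # design matrix (intercept + predictor)
e <- resid(fit)              # residuals
bread <- solve(crossprod(X)) # (X'X)^-1
meat  <- crossprod(X * e)    # sum over cases of e_i^2 * x_i x_i'
vcov_hc0 <- bread %*% meat %*% bread

robust_se <- sqrt(diag(vcov_hc0))
t_value   <- coef(fit) / robust_se
p_value   <- 2 * pt(abs(t_value), df = df.residual(fit), lower.tail = FALSE)

robust_table <- cbind(
  Estimate     = coef(fit),
  `Std. Error` = robust_se,
  `t value`    = t_value,
  `Pr(>|t|)`   = p_value
)
print(robust_table)

# Packaged route (assumes lmtest and sandwich are installed; by default
# vcovHC() uses type "HC3", so the SEs differ a little from HC0):
# lmtest::coeftest(fit, vcov. = sandwich::vcovHC(fit))
```

The estimates are the same as in summary(fit); only the standard errors, t values, and p values change.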