Statistics: The Basics of Correlations





1. What Are Correlations?

  • Definition: Correlation is a statistical measure of the association between two variables: how consistently one changes as the other changes.
  • Purpose: Useful for making predictions, especially for estimating a variable that is hard to measure from a related one that is easy to measure.
  • Caveats:
    • Correlation ≠ Causation.
    • Relationships may vary across different ranges of the data.

2. Types of Associations

  • Positive Association: High values of one variable occur with high values of the other (e.g., hours spent studying and grades).
  • Negative Association: High values of one variable occur with low values of the other (e.g., exercise frequency and body fat percentage).
  • No Association: No consistent relationship between variables.
  • Strength of Association:
    • Strong: The data points follow the overall trend closely, so knowing one variable tells you a lot about the other.
    • Weak: The data points scatter widely around the trend, so knowing one variable tells you little about the other.
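
To put numbers on direction and strength, here is a minimal sketch that computes Pearson's correlation coefficient r by hand. The sign of r gives the direction, and its distance from 0 reflects how tightly the points follow a straight line. The data sets (hours, tight, loose, falling) are invented for illustration and do not come from any real study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data, invented for this example.
hours = [1, 2, 3, 4, 5, 6]
tight = [52, 55, 61, 64, 70, 73]    # rises and stays close to a line
loose = [60, 48, 70, 55, 75, 58]    # rises overall but scatters widely
falling = [30, 27, 25, 22, 18, 16]  # falls and stays close to a line

print(f"strong positive: r = {pearson_r(hours, tight):.2f}")
print(f"weak positive:   r = {pearson_r(hours, loose):.2f}")
print(f"strong negative: r = {pearson_r(hours, falling):.2f}")
```

Values of r near +1 or -1 indicate a strong association, and values near 0 indicate a weak one or none.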

3. Correlation vs. Causation

  • Correlation: Indicates that two variables are associated, but not that one causes the other.
  • Causation: Requires evidence, ideally from a controlled experiment, that changing one variable produces changes in the other.
  • Example:
    • Correlation: People who shop online buy more ready meals.
    • Likely Explanation: Time constraints influence both behaviors, so neither one needs to cause the other.

4. Identifying Correlations with Graphs

  • Scatter Plots: Plot one variable against the other, one point per observation, to show the relationship visually.
    • Positive Linear Relationship: Points trend upwards from left to right (/).
    • Negative Linear Relationship: Points trend downwards from left to right (\).
    • No Relationship: Points scattered randomly.
  • Non-linear Patterns:
    • U-Shaped: The trend reverses direction, falling then rising (or rising then falling for an inverted U).
    • Exponential: The increase accelerates, e.g., the value doubles with each step.

5. Statistical Tests for Correlation

  • Types of Data & Tests:
    • Categorical Data: Use the chi-squared test (χ²) to determine independence of variables.
    • Continuous Data: Use Pearson correlation for linear relationships.
    • Ranked Data: Use Kendall's rank or Spearman's rank correlation.
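
As a rough illustration of two of these statistics, the sketch below computes a chi-squared statistic for a table of counts and Spearman's rank correlation for paired scores. The function names, the survey counts, and the scores are all invented for this example. The Spearman shortcut formula used here assumes there are no tied values.

```python
def chi_squared(table):
    """Chi-squared statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

def ranks(values):
    """Rank of each value, 1 = smallest. Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(xs, ys):
    """Spearman's rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical survey counts: rows = two groups, columns = yes / no answers.
survey = [[30, 10],
          [20, 40]]
print(f"chi-squared = {chi_squared(survey):.2f}")

# Hypothetical paired scores from two judges.
judge_a = [10, 20, 30, 40, 50]
judge_b = [2, 1, 4, 3, 5]
print(f"Spearman rho = {spearman_rho(judge_a, judge_b):.2f}")
```

For this 2 x 2 table the statistic is about 16.67, well above 3.841, the tabulated 5% critical value for one degree of freedom, so independence would be rejected. The two judges' rankings give rho = 0.8.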

6. Steps for Statistical Testing

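  1. Choose the Right Test: Depends on data type and distribution.
  2. Calculate the Test Statistic: Use the formula specific to the chosen test.
  3. Compare to Significance Levels: Use statistical tables or software to check whether the statistic is larger than chance alone would plausibly produce.
  4. Interpret the Significance Level:
    • 5% significance: If there were no real relationship, a result at least this strong would occur by chance less than 5% of the time.
    • 1% significance: The same idea with a stricter threshold of 1%, giving stronger evidence of a genuine relationship.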
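
The steps above can be sketched for a Pearson correlation. A common approach, assuming roughly normally distributed data, converts r into a t statistic with n - 2 degrees of freedom and compares it with tabulated critical values. The values r = 0.70 and n = 10 are hypothetical, and the critical values are the standard two-tailed table entries for 8 degrees of freedom.

```python
from math import sqrt

# Two-tailed critical values of Student's t for 8 degrees of freedom,
# as read from a standard t table.
CRITICAL_T_DF8 = {0.05: 2.306, 0.01: 3.355}

def t_statistic(r, n):
    """t statistic for testing whether a Pearson r from n pairs differs from 0."""
    return r * sqrt((n - 2) / (1 - r ** 2))

# Hypothetical result: r = 0.70 measured on n = 10 pairs (so df = 8).
r, n = 0.70, 10
t = t_statistic(r, n)

# The result is significant at a level if |t| exceeds that level's critical value.
significant = {level: abs(t) > crit for level, crit in CRITICAL_T_DF8.items()}

print(f"t = {t:.2f}")
for level, passed in significant.items():
    print(f"significant at {level:.0%}: {passed}")
```

Here t is about 2.77, which exceeds the 5% critical value but not the 1% one, so the same data can pass a lenient threshold and fail a stricter one.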

7. Tools for Testing

  • Manual: Use formulas and statistical tables.
  • Software: Modern tools (e.g., SPSS, R, Excel) calculate test statistics and significance levels automatically.

Summing it up

  • Understand Patterns: Correlation provides insight into relationships but requires careful interpretation.
  • Be Cautious: Always distinguish correlation from causation.
  • Use Visuals & Tests: Scatter plots and statistical tests complement each other to validate findings.
