Topic 9: Parametric and Non-Parametric Tests of Independence

Learning Objectives Coverage

LO1: Explain parametric and nonparametric tests of the hypothesis that the population correlation coefficient equals zero, and determine whether the hypothesis is rejected at a given level of significance

Core Concept

This topic applies the hypothesis testing framework specifically to questions about relationships between variables. Tests of independence determine whether two variables have a statistically significant relationship, using either parametric (distribution-based) or nonparametric (rank-based) methods. The correlation concepts from Topic 3 are now placed into a formal inferential framework, enabling us to distinguish genuine relationships from noise.

Formulas & Calculations

Pearson correlation: r = sXY/(sX × sY)
Test statistic: t = r√(n-2)/√(1-r²) with df = n-2
Spearman rank: rs = 1 - (6Σdi²)/(n(n²-1))

HP 12C steps:

Correlation t-test:
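[r] ENTER [n] ENTER 2 - √x ×
1 ENTER [r] x² - √x ÷
(ENTER separates consecutive number entries; if your model has no x² key, ENTER × squares the displayed value.)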

Practical Examples

Traditional Finance Example: Testing correlation between two mutual funds
- Sample: 36 monthly returns, r = 0.43
- t = 0.43√(36-2)/√(1-0.43²) = 0.43 × 5.831/0.903 = 2.78
- Critical value at 5% (two-tailed, df = 34): ±2.032
- Decision: |2.78| > 2.032, so reject H₀: ρ = 0; the correlation is statistically significant
Interpretation: Funds move together, important for diversification decisions
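The mutual fund calculation above can be reproduced in a few lines of Python. This is a minimal sketch: the function names are illustrative, and the critical value is taken from a t-table (as in the example) rather than computed.

```python
import math

def correlation_t_stat(r: float, n: int) -> float:
    """t-statistic for H0: rho = 0, distributed with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need more than two observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def reject_zero_correlation(r: float, n: int, critical_value: float) -> bool:
    """Two-tailed decision: reject H0 when |t| exceeds the table critical value."""
    return abs(correlation_t_stat(r, n)) > critical_value

# Inputs from the mutual fund example: r = 0.43, n = 36, critical value 2.032
t = correlation_t_stat(0.43, 36)
print(f"t = {t:.2f}, df = {36 - 2}")
print("reject H0:", reject_zero_correlation(0.43, 36, 2.032))
```

Passing the critical value in as an argument keeps the sketch free of external libraries; a statistics package could supply it instead.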

DeFi Application

Protocol example: Testing correlation between ETH and DeFi token returns
Implementation: Use Spearman rank due to non-normal crypto returns
Advantages/Challenges: High volatility and outliers make nonparametric tests more reliable
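To see why outliers favor a rank-based measure, the sketch below compares Pearson and Spearman on a small made-up return series, first as recorded and then with one extreme move. All numbers and names are hypothetical, chosen only for illustration, and the rank helper assumes there are no tied values.

```python
import math

def pearson(x, y):
    """Sample Pearson correlation from deviations about the means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest) upward; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out

def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical daily returns in percent (not real market data)
eth = [1.2, -0.8, 0.5, 2.1, -1.5, 0.3, -0.2, 1.0]
token_calm = [2.0, -1.1, 0.9, 3.5, -2.4, 0.1, -0.6, 1.6]
# Same series, but one day is replaced by an extreme crash
token_shock = [2.0, -1.1, 0.9, -40.0, -2.4, 0.1, -0.6, 1.6]

for label, token in (("calm", token_calm), ("shock", token_shock)):
    print(f"{label}: Pearson = {pearson(eth, token):.2f}, "
          f"Spearman = {spearman(eth, token):.2f}")
```

In this made-up sample the single crash turns the Pearson estimate negative, while the Spearman estimate falls but stays positive, because the crash can move a rank by at most n - 1 positions however large it is.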

LO2: Explain tests of independence based on contingency table data

Core Concept

Definition: Chi-square test analyzes relationships between categorical variables using contingency tables
Why it matters: Evaluates dependencies between discrete outcomes like investment decisions, risk categories, or protocol types
Key components: Observed frequencies, expected frequencies, chi-square statistic, degrees of freedom

Formulas & Calculations

Chi-square statistic: χ² = Σ[(Oij - Eij)²/Eij], summed over all cells
Expected frequency: Eij = (Row i total × Column j total)/Grand total
Degrees of freedom: df = (r-1)(c-1), where r = number of rows and c = number of columns
Standardized residual: (Oij - Eij)/√Eij
HP 12C steps: Manual calculation required for each cell

Practical Examples

Traditional Finance Example: ETF classification analysis
- 1,594 ETFs classified by size and investment type
- 3×3 contingency table (large/mid/small × growth/blend/value)
- χ² = 32.08 with df = 4
- Critical value at 5%: 9.488
- Decision: Reject independence, size and style are related
Interpretation: Investment style depends on market cap focus
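The expected-frequency and chi-square formulas can be sketched in Python as follows. The 3×3 table here is hypothetical (the cell counts behind the 1,594-ETF example are not given in these notes), and the function names are illustrative.

```python
def expected_frequencies(observed):
    """Eij = (row i total x column j total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]

def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    expected = expected_frequencies(observed)
    chi2 = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

# Hypothetical counts: rows = size (large/mid/small), columns = style (growth/blend/value)
table = [
    [50, 30, 20],
    [30, 40, 30],
    [20, 30, 50],
]
chi2, df = chi_square_statistic(table)
critical_5pct = 9.488  # table value for df = 4, as in the ETF example
print(f"chi-square = {chi2:.2f}, df = {df}")
print("reject independence:", chi2 > critical_5pct)
```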

DeFi Application

Protocol example: Testing relationship between protocol type (DEX/Lending/Yield) and risk level (Low/Medium/High)
Implementation: Create contingency table from protocol classifications
Advantages/Challenges: On-chain transparency allows complete population analysis

Core Concepts Summary (80/20 Principle)

Must-Know Concepts

Pearson Correlation: Measures linear relationship, assumes normality
Spearman Rank: Robust to outliers, works with non-normal data
Chi-Square Test: Tests independence for categorical variables
Test Selection: Parametric when its assumptions are met, nonparametric when robustness is needed

Quick Reference Table

Test Type	Data Type	Assumption	Test Statistic	DeFi Use Case
Pearson	Continuous	Normal	t = r√(n-2)/√(1-r²)	Stable token correlations
Spearman	Ranked/Ordinal	No normality assumption	rs = 1 - 6Σdi²/(n(n²-1))	Volatile token analysis
Chi-square	Categorical	Independent observations	χ² = Σ(O-E)²/E	Protocol type vs risk
Contingency	Discrete	Random sample	Same as χ²	Wallet behavior patterns

Comprehensive Formula Sheet

Essential Formulas

Pearson Correlation Coefficient:
r = Σ[(xi - x̄)(yi - ȳ)]/√[Σ(xi - x̄)²Σ(yi - ȳ)²]
Alternative: r = sXY/(sX × sY)
Where: sXY = covariance, sX, sY = standard deviations
Used for: Linear relationship between normally distributed variables

Correlation t-test:
t = r√(n-2)/√(1-r²)
df = n - 2
Where: r = sample correlation, n = sample size
Used for: Testing H₀: ρ = 0

Spearman Rank Correlation:
rs = 1 - (6Σdi²)/(n(n²-1))
Where: di = difference in ranks for observation i (the formula is exact only when there are no tied ranks)
Used for: Non-normal data or ordinal variables

Chi-Square Test of Independence:
χ² = ΣΣ[(Oij - Eij)²/Eij]
Expected: Eij = (Row i total × Column j total)/Grand total
df = (r-1)(c-1), where r = number of rows and c = number of columns
Used for: Testing independence of categorical variables

Standardized Residual:
zij = (Oij - Eij)/√Eij
Used for: Identifying which cells contribute most to χ²
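As a small illustration of how standardized residuals point to the cells that drive χ², the sketch below computes them for a hypothetical 2×3 table and checks that their squares add up to the chi-square statistic. The counts and function name are assumptions made for the example.

```python
import math

def standardized_residuals(observed):
    """zij = (Oij - Eij) / sqrt(Eij), with Eij from the row and column totals."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    residuals = []
    for row, rt in zip(observed, row_totals):
        expected_row = [rt * ct / grand_total for ct in col_totals]
        residuals.append([(o - e) / math.sqrt(e) for o, e in zip(row, expected_row)])
    return residuals

# Hypothetical counts for a 2x3 table
table = [
    [30, 50, 20],
    [20, 50, 30],
]
z = standardized_residuals(table)

# Each squared residual is that cell's contribution to chi-square
chi2 = sum(v * v for row in z for v in row)

# Locate the cell with the largest absolute residual (first one wins on ties)
largest = max(
    ((i, j) for i in range(len(z)) for j in range(len(z[0]))),
    key=lambda ij: abs(z[ij[0]][ij[1]]),
)
print("residuals:", z)
print(f"chi-square = {chi2:.2f}, largest |z| at cell {largest}")
```

A positive residual means the cell holds more observations than independence would predict; a negative one means fewer.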

HP 12C Calculator Sequences

Pearson Correlation t-statistic:
RPN Steps: [r] ENTER [n] ENTER 2 - √x × 1 ENTER [r] x² - √x ÷
Example: 0.43 ENTER 36 ENTER 2 - √x × 1 ENTER 0.43 x² - √x ÷ = 2.78
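To check an RPN keystroke sequence away from the calculator, a simplified postfix evaluator is enough. This is a sketch under stated assumptions: it is not an HP 12C emulator, ENTER is treated purely as a separator between consecutive numbers (so the sequence is written with an explicit ENTER wherever two numbers follow each other), and the key names `sqrt` and `sq` are made-up stand-ins for √x and x².

```python
import math
import operator

BINARY_KEYS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}
UNARY_KEYS = {
    "sqrt": math.sqrt,       # stands in for the √x key
    "sq": lambda v: v * v,   # stands in for x² (or ENTER × on the calculator)
}

def eval_rpn(keys: str) -> float:
    """Evaluate a whitespace-separated RPN key sequence on a simple stack."""
    stack = []
    for key in keys.split():
        if key == "ENTER":
            continue  # separator only in this simplified model
        if key in BINARY_KEYS:
            x = stack.pop()  # most recent entry
            y = stack.pop()  # entry before it
            stack.append(BINARY_KEYS[key](y, x))
        elif key in UNARY_KEYS:
            stack.append(UNARY_KEYS[key](stack.pop()))
        else:
            stack.append(float(key))
    return stack[-1]

# Correlation t-statistic for r = 0.43, n = 36
t_stat = eval_rpn("0.43 ENTER 36 ENTER 2 - sqrt * 1 ENTER 0.43 sq - sqrt /")
# One chi-square cell with O = 425, E = 400
cell = eval_rpn("425 ENTER 400 - sq 400 /")
print(f"t = {t_stat:.2f}, cell contribution = {cell:.4f}")
```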

Spearman Rank (manual):
1. Rank X values: smallest = 1
2. Rank Y values: smallest = 1
3. Calculate di = rank(Xi) - rank(Yi)
4. Square each di
5. Sum all di²
6. Apply formula: 1 - (6×sum)/(n×(n²-1))
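The six manual steps translate directly into code. The sketch below follows them in order on a hypothetical five-observation sample; it rejects tied values because the shortcut formula in step 6 assumes distinct ranks. Function names and data are illustrative.

```python
def rank_smallest_first(values):
    """Steps 1-2: assign rank 1 to the smallest value, rank n to the largest."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks

def spearman_rank(x, y):
    """Spearman rank correlation via the sum-of-squared-rank-differences formula."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(set(x)) != len(x) or len(set(y)) != len(y):
        raise ValueError("tied values are not handled by this shortcut formula")
    n = len(x)
    rank_x = rank_smallest_first(x)
    rank_y = rank_smallest_first(y)
    d = [rx - ry for rx, ry in zip(rank_x, rank_y)]   # step 3
    sum_d_squared = sum(di ** 2 for di in d)          # steps 4-5
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))  # step 6

# Hypothetical paired returns in percent
x = [0.8, -1.2, 2.5, 0.1, -0.4]
y = [1.1, -0.3, 1.9, -0.9, 0.2]
rs = spearman_rank(x, y)
print("ranks of x:", rank_smallest_first(x))
print("ranks of y:", rank_smallest_first(y))
print(f"rs = {rs:.2f}")
```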

Chi-square cell calculation:
RPN Steps: [O] ENTER [E] - x² [E] ÷
Example: 425 ENTER 400 - x² 400 ÷ = 1.5625

Practice Problems

Basic Level (Understanding)

Problem: Test correlation between two assets with r = 0.35, n = 25
- Given: r = 0.35, n = 25, α = 5% (two-tailed)
- Find: Test statistic and decision
- Solution:
  - t = 0.35√(25-2)/√(1-0.35²) = 0.35×4.796/0.937 = 1.79
  - Critical values: ±2.069 (df = 23)
  - |1.79| < 2.069
- Answer: Fail to reject H₀; correlation not significant at 5%

Intermediate Level (Application)

Problem: Compare Pearson and Spearman for crypto returns with outliers
- Given: 30 daily returns, Pearson r = 0.65, Spearman rs = 0.45
- Find: Which correlation is more appropriate and test significance
- Solution:
  - Outliers present → Spearman more appropriate
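  - t = 0.45√(30-2)/√(1-0.45²) = 0.45 × 5.292/0.893 = 2.67 (the t-approximation for Spearman is usually recommended for n > 30, so n = 30 is borderline)
  - Critical value at 5% (two-tailed, df = 28): ±2.048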
- Answer: Spearman shows significant correlation; more reliable with outliers

Advanced Level (Analysis)

Problem: Analyze DeFi protocol categorization (3×4 contingency table)
- Given:
  - Rows: Protocol type (DEX/Lending/Derivatives)
  - Columns: TVL quartiles (Q1/Q2/Q3/Q4)
  - 500 total protocols
- Find: Test independence and identify patterns
- Solution:
  - Calculate expected frequencies for each cell
  - Compute χ² statistic
  - df = (3-1)(4-1) = 6
  - Critical value at 5%: 12.592
  - Calculate standardized residuals
- Answer: If χ² > 12.592, protocol type and TVL are dependent

DeFi Applications & Real-World Examples

Traditional Finance Context

Institution Example: Portfolio managers test correlations for diversification benefits
Market Application: Risk managers use contingency tables for stress test scenarios
Historical Case: 2008 crisis revealed hidden correlations missed by normal-period analysis

DeFi Parallels

Protocol Implementation: Yield aggregators such as Yearn Finance can apply correlation analysis when selecting vault strategies
Smart Contract Logic: Risk assessment protocols categorize positions using contingency analysis
Advantages: Complete transaction history enables robust correlation estimates
Limitations: Short history and regime changes affect correlation stability

Case Studies

Case 1: LP Position Risk Classification
- Background: Categorizing Uniswap v3 positions by concentration and impermanent loss
- Analysis: Chi-square test on 2×3 table (concentrated/wide × low/medium/high IL)
- Outcomes: Strong dependence found (χ² = 45.3, p < 0.001)
- Lessons learned: Position width significantly affects IL risk profile

Common Pitfalls & Exam Tips

Frequent Mistakes

Mistake 1: Using Pearson with non-normal data - Check distribution first
Mistake 2: Wrong df for chi-square - Remember (r-1)(c-1) not r×c
Mistake 3: Interpreting correlation as causation - Correlation ≠ causation

Exam Strategy

Time management: 3-4 minutes for correlation tests, 5-6 for contingency tables
Question patterns: Often asks to choose between parametric/nonparametric
Quick checks: Spearman values always between -1 and 1

Key Takeaways

Essential Points

✓ Pearson tests linear relationships assuming normality
✓ Spearman uses ranks, robust to outliers and non-normality
✓ Chi-square tests independence of categorical variables
✓ Expected frequencies = (row total × column total)/grand total
✓ Choice of test depends on data characteristics and assumptions

Memory Aids

Mnemonic: “PRSC” - Pearson Regular, Spearman Ranks, Chi-square Categories
Visual: Contingency table with observed over expected in each cell
Analogy: Correlation like dance partners - moving together (positive) or opposite (negative)

Cross-References & Additional Resources

Prerequisite: Hypothesis Testing (Topic 8) for test framework
Related: Simple Linear Regression (Topic 10) extends correlation to prediction
Advanced: Big Data Techniques (Topic 11) for large-scale correlation analysis

Source Materials

Primary Reading: Volume 1, Chapter 9, Pages 1-24
Key Sections: Correlation tests (p.5-12), Contingency tables (p.15-20)
Practice Questions: End-of-chapter problems 1-15

External Resources

Videos: StatQuest’s “Pearson vs Spearman Correlation”
Articles: “A Guide to Appropriate Use of Correlation” - BMJ Statistics
Tools: Python pandas.DataFrame.corr(method='spearman'), R's chisq.test()

Review Checklist

Before moving on, ensure you can:

Calculate and test Pearson correlation coefficient
Apply Spearman rank correlation for non-normal data
Construct contingency tables and calculate expected frequencies
Perform chi-square test of independence
Choose appropriate test based on data characteristics
