Hypothesis Testing

Learning Objectives Coverage

LO1: Explain hypothesis testing and its components, including statistical significance, Type I and Type II errors, and the power of a test

Core Concept

Hypothesis testing is a statistical procedure for deciding whether to reject a claim about a population parameter based on sample evidence. Building on the sampling distributions and standard errors from the previous topic, hypothesis testing provides a rigorous framework for data-driven decision making in investment analysis, risk management, and performance evaluation. The key components are the null hypothesis (H₀), alternative hypothesis (Hₐ), test statistic, significance level (α), p-value, and power.

Formulas & Calculations

  • Type I Error (α): Probability of rejecting a true H₀
  • Type II Error (β): Probability of failing to reject a false H₀
  • Power: 1 - β = Probability of correctly rejecting a false H₀
  • HP 12C steps: Not directly applicable; requires statistical tables or software

Practical Examples

  • Traditional Finance Example: Testing if a fund’s Sharpe ratio exceeds benchmark
    • H₀: Sharpe ≤ 0.5
    • Hₐ: Sharpe > 0.5
    • α = 5% means a 5% chance of falsely claiming outperformance when H₀ is actually true
  • Calculation walkthrough: Power increases with larger effect size and sample size
  • Interpretation: Balance between Type I and Type II errors based on cost of mistakes
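
The link between power, effect size, and sample size can be sketched numerically. The snippet below is a minimal illustration, assuming an upper-tailed z-test with a known standard deviation (a simplification; a fund test like the one above would normally use a t-test). The function name `power_one_sided_z` and all input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(effect, sigma, n, alpha=0.05):
    """Power of an upper-tailed z-test when the true mean exceeds mu0 by `effect`."""
    z = NormalDist()                       # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha)          # rejection threshold under H0
    shift = effect / (sigma / sqrt(n))     # true effect in standard-error units
    return 1 - z.cdf(z_crit - shift)       # P(reject H0 | H0 is false)

# Hypothetical inputs: true gaps of 0.4 and 0.8 with sigma = 3.6
for n in (24, 60, 120):
    small = power_one_sided_z(0.4, 3.6, n)
    large = power_one_sided_z(0.8, 3.6, n)
    print(f"n={n}: power(effect=0.4)={small:.3f}, power(effect=0.8)={large:.3f}")
```

With a zero effect the function returns α itself, which is the Type I error rate; power rises as either the effect or the sample size grows.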

DeFi Application

  • Protocol example: Testing if Uniswap v3 concentrated liquidity improves capital efficiency
  • Implementation: Compare fee returns before/after concentration features
  • Advantages/Challenges: On-chain data can cover every transaction in the period studied, reducing the sampling uncertainty found in traditional datasets

LO2: Construct hypothesis tests and determine their statistical significance, the associated Type I and Type II errors, and power of the test given a significance level

Core Concept

The six-step hypothesis testing process provides a systematic framework that ensures rigorous, reproducible statistical analysis. Memorizing and internalizing this process is essential for the exam: State hypotheses, Choose test, Set alpha, Decision rule, Calculate, Decide.

Formulas & Calculations

  • Test Statistics:
    • Single mean: t = (X̄ - μ₀)/(s/√n)
    • Difference in means: t = (X̄₁ - X̄₂)/√(sp²/n₁ + sp²/n₂)
    • Single variance: χ² = (n-1)s²/σ₀²
    • Correlation: t = r√(n-2)/√(1-r²)
  • HP 12C steps:
    t-statistic calculation:
    [X̄] ENTER [μ₀] -
    [s] ENTER [n] √x ÷
    ÷
  • Common variations: One-tailed vs. two-tailed tests

Practical Examples

  • Traditional Finance Example: Sendar Equity Fund monthly returns
    • Sample: 24 months, mean = 1.50%, std dev = 3.60%
    • Testing H₀: μ = 1.1% vs. Hₐ: μ ≠ 1.1% (two-tailed)
    • t = (1.50 - 1.10)/(3.60/√24) = 0.544
    • Critical values at 5%: ±2.069
    • Decision: Fail to reject H₀ (|0.544| < 2.069)
  • Interpretation: No statistical evidence that returns differ from 1.1%
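
The Sendar example can be reproduced in a few lines. This is a sketch only: the critical value is copied from the example rather than computed, because Python's standard library has no t-distribution, and the variable names are illustrative.

```python
from math import sqrt

# Inputs from the Sendar Equity Fund example
n, x_bar, s, mu_0 = 24, 1.50, 3.60, 1.10
t_crit = 2.069   # two-tailed 5% critical value for df = 23, taken from a t-table

std_error = s / sqrt(n)                  # standard error of the sample mean
t_stat = (x_bar - mu_0) / std_error      # single-mean t-statistic
reject_h0 = abs(t_stat) > t_crit         # two-tailed decision rule

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
# t = 0.544, reject H0: False
```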

DeFi Application

  • Protocol example: Testing if Aave v3 efficiency mode reduces liquidation risk
  • Implementation: Compare liquidation rates before/after efficiency mode
  • Advantages/Challenges: Smart contract data provides complete transaction history

LO3: Compare and contrast parametric and nonparametric tests, and describe situations where each is the more appropriate type of test

Core Concept

  • Definition: Parametric tests assume specific distributions; nonparametric tests are distribution-free
  • Why it matters: Choosing the right test ensures valid statistical inference
  • Key components: Distribution assumptions, sample size requirements, data types

Formulas & Calculations

  • Parametric Tests: t-test, F-test, χ²-test (assume normal distribution)
  • Nonparametric Tests:
    • Wilcoxon signed-rank (alternative to one-sample t-test)
    • Mann-Whitney U (alternative to two-sample t-test)
    • Spearman rank correlation (alternative to Pearson correlation)
  • HP 12C steps: Nonparametric tests typically require rank calculations

Practical Examples

  • Traditional Finance Example: Testing median returns with outliers
    • Data: Hedge fund returns with extreme values
    • Parametric approach: t-test may be misleading due to outliers
    • Nonparametric approach: Wilcoxon test on ranks
  • Interpretation: Nonparametric test more robust when normality violated
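
To see why a rank-based test is less sensitive to extreme values, the Wilcoxon signed-rank statistic can be sketched as follows. The function name `wilcoxon_signed_rank` and the return data are hypothetical, and the sketch stops at the rank sums; the decision would still need a critical value from a Wilcoxon table or statistical software.

```python
def wilcoxon_signed_rank(values, hypothesized_median=0.0):
    """Return (W_plus, W_minus): rank sums of positive and negative differences."""
    # Differences from the hypothesized median; exact zeros are dropped
    diffs = [v - hypothesized_median for v in values if v != hypothesized_median]
    abs_sorted = sorted(abs(d) for d in diffs)
    # Average 1-based rank for each distinct absolute difference (handles ties)
    rank_of = {}
    for a in set(abs_sorted):
        first = abs_sorted.index(a) + 1
        count = abs_sorted.count(a)
        rank_of[a] = first + (count - 1) / 2
    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    w_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    return w_plus, w_minus

# Hypothetical monthly returns in percent, including one extreme value (25.0)
returns = [1.2, -0.5, 0.8, 2.1, -0.3, 25.0, 0.9, 1.5]
w_plus, w_minus = wilcoxon_signed_rank(returns)
print(w_plus, w_minus)
```

The outlier of 25.0 contributes only its rank (8) to the statistic, not its magnitude, which is the source of the robustness noted above.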

DeFi Application

  • Protocol example: Analyzing gas fee distributions (highly skewed)
  • Implementation: Use Mann-Whitney U test for weekend vs. weekday comparison
  • Advantages/Challenges: Crypto data often exhibits fat tails requiring nonparametric methods
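
A weekend vs. weekday comparison of this kind could be sketched with a hand-rolled Mann-Whitney U statistic. The fee values below are invented for illustration and are not real chain data; the helper names `average_ranks` and `mann_whitney_u` are assumptions, and the final decision would still require a U-table or software for the critical value.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b); the smaller value is compared with the critical value."""
    ranks = average_ranks(list(sample_a) + list(sample_b))
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b

# Hypothetical gas fees in gwei (illustrative only)
weekday = [42, 55, 61, 48, 250, 58]
weekend = [30, 35, 41, 33, 38, 120]
u_weekday, u_weekend = mann_whitney_u(weekday, weekend)
print(u_weekday, u_weekend)
```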

Core Concepts Summary (80/20 Principle)

Must-Know Concepts

  1. Hypothesis Testing Framework: Six-step process from hypothesis to decision
  2. Type I vs. Type II Errors: α = P(reject a true H₀), β = P(fail to reject a false H₀)
  3. P-value Interpretation: Probability of observing a test statistic at least as extreme as the one calculated, if H₀ is true
  4. Test Selection: Parametric when distributional assumptions are met; nonparametric when robustness to outliers or non-normality is needed
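
The p-value definition can be made concrete with a small sketch. It assumes a large-sample test whose statistic is approximately standard normal; the t-tests in these notes would use the t-distribution instead, which the Python standard library does not provide. The function name `two_tailed_p_value` is illustrative.

```python
from statistics import NormalDist

def two_tailed_p_value(z_stat):
    """P(|Z| >= |z_stat|) under H0, for a standard normal test statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z_stat)))

alpha = 0.05
for z_stat in (0.5, 1.96, 3.0):
    p = two_tailed_p_value(z_stat)
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"z = {z_stat}: p = {p:.4f} -> {decision}")
```

Note that the p-value is a statement about the data given H₀, not the probability that H₀ is true.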

Quick Reference Table

Test Type         | Use Case          | Test Statistic     | DeFi Application
One-sample t-test | Single mean       | t = (X̄-μ₀)/(s/√n)  | Protocol APY claims
Two-sample t-test | Compare means     | Pooled variance t  | A/B testing features
F-test            | Compare variances | F = s₁²/s₂²        | Risk comparison
Wilcoxon          | Non-normal data   | Rank-based         | Gas fee analysis

Comprehensive Formula Sheet

Essential Formulas

Single Mean t-test:
t = (X̄ - μ₀)/(s/√n)
Where: X̄ = sample mean, μ₀ = hypothesized mean, s = sample std dev, n = sample size
df = n - 1
Used for: Testing if population mean equals specific value

Difference in Means (Independent):
t = (X̄₁ - X̄₂)/√(sp²/n₁ + sp²/n₂)
Pooled variance: sp² = [(n₁-1)s₁² + (n₂-1)s₂²]/(n₁+n₂-2)
df = n₁ + n₂ - 2
Used for: Comparing two population means
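
As a rough sketch of the pooled calculation, the function below returns the t-statistic, the pooled variance, and the degrees of freedom. It assumes equal population variances, as the pooled formula requires; the function name `pooled_t` and the sample means are hypothetical.

```python
from math import sqrt

def pooled_t(mean_1, s_1, n_1, mean_2, s_2, n_2):
    """Return (t, pooled variance, df) for two independent samples."""
    df = n_1 + n_2 - 2
    sp2 = ((n_1 - 1) * s_1**2 + (n_2 - 1) * s_2**2) / df   # pooled variance
    t = (mean_1 - mean_2) / sqrt(sp2 / n_1 + sp2 / n_2)
    return t, sp2, df

# Hypothetical inputs for illustration
t, sp2, df = pooled_t(mean_1=2.0, s_1=5.0, n_1=30, mean_2=1.0, s_2=6.0, n_2=40)
print(f"sp2 = {sp2:.2f}, t = {t:.3f}, df = {df}")
```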

Paired Comparisons:
t = (d̄ - μd₀)/(sd/√n)
Where: d̄ = mean difference, sd = std dev of differences
df = n - 1
Used for: Before/after comparisons
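
A paired test works on the differences themselves, which a short sketch makes explicit. The function name `paired_t` and the before/after values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after, mu_d0=0.0):
    """Return (t, df) for paired observations, testing mean difference = mu_d0."""
    diffs = [a - b for a, b in zip(after, before)]
    d_bar = mean(diffs)          # mean of the paired differences
    s_d = stdev(diffs)           # sample standard deviation (n - 1 denominator)
    n = len(diffs)
    return (d_bar - mu_d0) / (s_d / sqrt(n)), n - 1

# Hypothetical before/after measurements on the same four units
before = [1.0, 2.0, 3.0, 4.0]
after = [2.0, 2.0, 5.0, 5.0]
t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```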

Variance Test:
χ² = (n-1)s²/σ₀²
df = n - 1
Used for: Testing if variance equals specific value

F-test for Variances:
F = s₁²/s₂²
df = (n₁-1, n₂-1)
Used for: Comparing two variances

Correlation Test:
t = r√(n-2)/√(1-r²)
df = n - 2
Used for: Testing if correlation is significant
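
The correlation test statistic is simple to compute directly. In this sketch the function name `correlation_t` and the inputs r = 0.6 and n = 27 are hypothetical, chosen so the arithmetic is easy to follow.

```python
from math import sqrt

def correlation_t(r, n):
    """Return (t, df) for testing H0: population correlation = 0."""
    return r * sqrt(n - 2) / sqrt(1 - r**2), n - 2

t, df = correlation_t(r=0.6, n=27)
print(f"t = {t:.2f}, df = {df}")
# t = 3.75, df = 25
```

For a fixed r, the statistic grows with √(n - 2), so the same sample correlation becomes more significant as the sample grows.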

HP 12C Calculator Sequences

t-statistic (single mean):
RPN Steps: [X̄] ENTER [μ₀] - [s] ENTER [n] √x ÷ ÷
Example: 1.50 ENTER 1.10 - 3.60 ENTER 24 √x ÷ ÷ = 0.544

Pooled Variance:
RPN Steps: [n₁] ENTER 1 - [s₁] x² × [n₂] ENTER 1 - [s₂] x² × + [n₁] ENTER [n₂] + 2 - ÷
Example: For n₁=30, s₁=5, n₂=40, s₂=6: 30 ENTER 1 - 5 x² × 40 ENTER 1 - 6 x² × + 30 ENTER 40 + 2 - ÷ = 31.31

Chi-square statistic:
RPN Steps: [n] ENTER 1 - [s] x² × [σ₀] x² ÷
Example: 25 ENTER 1 - 4.5 x² × 5 x² ÷ = 19.44
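
The keystroke sequences can be cross-checked with a tiny stack evaluator. This is a simplified sketch, not a model of the calculator: ENTER is treated only as a separator between numbers, and the HP 12C's four-level stack is not reproduced. The function name `eval_rpn` is hypothetical, and the sequences are written with ENTER made explicit between consecutive numbers.

```python
from math import sqrt

def eval_rpn(keystrokes):
    """Evaluate a space-separated RPN keystroke string with a simple stack."""
    binary = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "×": lambda a, b: a * b,
        "÷": lambda a, b: a / b,
    }
    unary = {"x²": lambda a: a * a, "√x": sqrt}
    stack = []
    for key in keystrokes.split():
        if key == "ENTER":               # separator only in this simplified model
            continue
        if key in binary:
            b = stack.pop()
            a = stack.pop()
            stack.append(binary[key](a, b))
        elif key in unary:
            stack.append(unary[key](stack.pop()))
        else:
            stack.append(float(key))     # a number keyed into the display
    return stack[-1]

t_stat = eval_rpn("1.50 ENTER 1.10 - 3.60 ENTER 24 √x ÷ ÷")
pooled_var = eval_rpn("30 ENTER 1 - 5 x² × 40 ENTER 1 - 6 x² × + 30 ENTER 40 + 2 - ÷")
chi_sq = eval_rpn("25 ENTER 1 - 4.5 x² × 5 x² ÷")
print(f"{t_stat:.3f} {pooled_var:.2f} {chi_sq:.2f}")
# 0.544 31.31 19.44
```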

Practice Problems

Basic Level (Understanding)

  1. Problem: Test if mean return = 2% with sample mean 2.5%, s = 3%, n = 36
    • Given: X̄ = 2.5%, μ₀ = 2%, s = 3%, n = 36
    • Find: Test statistic and decision at α = 5%
    • Solution:
      • t = (2.5 - 2)/(3/√36) = 0.5/0.5 = 1.0
      • Critical value (two-tailed, df=35): ±2.03
      • |1.0| < 2.03
    • Answer: Fail to reject H₀; insufficient evidence that mean ≠ 2%

Intermediate Level (Application)

  1. Problem: Compare volatility of DeFi vs. TradFi assets (F-test)
    • Given: DeFi: n₁ = 50, s₁ = 8%; TradFi: n₂ = 60, s₂ = 4%
    • Find: Test if DeFi variance > TradFi variance at α = 1%
    • Solution:
      • F = (8²)/(4²) = 64/16 = 4.0
      • Critical value F₀.₀₁,₄₉,₅₉ ≈ 1.9 (interpolated from an F-table)
      • 4.0 > 1.9
    • Answer: Reject H₀; DeFi significantly more volatile
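
The statistic and degrees of freedom for this problem can be checked with a short sketch. The function name `f_statistic` is hypothetical, and the critical value still has to come from an F-table or statistical software.

```python
def f_statistic(s_1, n_1, s_2, n_2):
    """Return (F, (numerator df, denominator df)) with sample 1 in the numerator."""
    return s_1**2 / s_2**2, (n_1 - 1, n_2 - 1)

# Inputs from the problem: DeFi in the numerator, TradFi in the denominator
f_stat, dfs = f_statistic(s_1=8.0, n_1=50, s_2=4.0, n_2=60)
print(f_stat, dfs)
# 4.0 (49, 59)
```

Putting the sample hypothesized to have the larger variance in the numerator keeps the one-tailed test in the upper tail of the F-distribution.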

Advanced Level (Analysis)

  1. Problem: Multi-step analysis of trading strategy performance
    • Given:
      • Strategy returns: 18 months, mean = 3.2%, s = 5.1%
      • Benchmark: μ = 2.0%, σ = 4.0%
      • Risk-free rate = 0.5%
    • Find: Test outperformance, compare Sharpe ratios, assess power
    • Solution:
      • Performance test: t = (3.2-2.0)/(5.1/√18) ≈ 1.0; the one-tailed critical value at α = 5% with df = 17 is 1.740, so fail to reject H₀
      • Sharpe comparison requires additional calculations
      • Power analysis depends on true difference
    • Answer: Strategy shows positive but not significant outperformance at 5% level
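
The first two parts of this problem can be sketched in code. The Sharpe ratios below are simple per-period point estimates, (mean - risk-free)/standard deviation, computed from the given inputs; the sketch does not test whether their difference is significant and does not attempt the power analysis, and the variable names are illustrative.

```python
from math import sqrt

# Inputs from the problem (monthly figures, in percent)
n, mean_ret, s = 18, 3.2, 5.1
bench_mean, bench_sigma, rf = 2.0, 4.0, 0.5

# Performance test against the benchmark mean
t_stat = (mean_ret - bench_mean) / (s / sqrt(n))

# Per-period Sharpe ratios (point estimates only)
sharpe_strategy = (mean_ret - rf) / s
sharpe_benchmark = (bench_mean - rf) / bench_sigma

print(f"t = {t_stat:.3f}")
print(f"Sharpe: strategy = {sharpe_strategy:.3f}, benchmark = {sharpe_benchmark:.3f}")
```

The strategy's estimated Sharpe ratio is higher than the benchmark's, but with only 18 observations the t-statistic on the mean difference remains well short of significance.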

DeFi Applications & Real-World Examples

Traditional Finance Context

  • Institution Example: Investment banks use hypothesis testing for trading strategy validation
  • Market Application: Regulators test market efficiency and manipulation claims
  • Historical Case: LTCM tested correlation assumptions that failed in crisis

DeFi Parallels

  • Protocol Implementation: Compound uses statistical tests for interest rate model calibration
  • Smart Contract Logic: Oracles implement outlier detection using hypothesis tests
  • Advantages: Complete data transparency enables exact population testing
  • Limitations: Short history limits power for long-term pattern detection

Case Studies

  1. Case 1: MEV Impact on DEX Trading
    • Background: Testing if MEV bots increase slippage
    • Analysis: Two-sample t-test comparing trades with/without MEV
    • Outcomes: Significant difference found (p < 0.001)
    • Lessons learned: Statistical evidence supports MEV protection mechanisms

Common Pitfalls & Exam Tips

Frequent Mistakes

  • Mistake 1: Confusing one-tailed and two-tailed tests - Check alternative hypothesis carefully
  • Mistake 2: Using wrong degrees of freedom - Remember n-1 for single sample, n₁+n₂-2 for pooled
  • Mistake 3: Misinterpreting p-values - p-value is NOT probability H₀ is true

Exam Strategy

  • Time management: 4-5 minutes per hypothesis testing question
  • Question patterns: Often combined with confidence intervals and regression
  • Quick checks: Ensure the critical value matches the number of tails implied by the alternative hypothesis

Key Takeaways

Essential Points

✓ Hypothesis testing provides a framework for statistical decision making
✓ Type I error (α) is set by the analyst; Type II error (β) depends on the true parameter value
✓ P-value < α means reject H₀; the p-value is the smallest significance level at which H₀ can be rejected
✓ Parametric tests are more powerful when their distribution assumptions hold
✓ Nonparametric tests are robust to outliers and non-normality

Memory Aids

  • Mnemonic: “HATSPD” - Hypotheses, Alpha, Test statistic, Significance, P-value, Decision
  • Visual: Type I/II error quadrant showing decision outcomes
  • Analogy: Court trial - H₀ is innocence, evidence must be “beyond reasonable doubt”

Cross-References & Additional Resources

Source Materials

  • Primary Reading: Volume 1, Chapter 8, Pages 1-35
  • Key Sections: Six-step process (p.5-8), Type I/II errors (p.10-14), Nonparametric tests (p.28-32)
  • Practice Questions: End-of-chapter problems 1-20

External Resources

  • Videos: StatQuest hypothesis testing series on YouTube
  • Articles: “Statistical Power Analysis” by Cohen
  • Tools: R’s stats package, Python’s scipy.stats for test implementations

Review Checklist

Before moving on, ensure you can:

  • Execute the six-step hypothesis testing process
  • Calculate t, F, and χ² test statistics
  • Interpret p-values and make appropriate decisions
  • Choose between parametric and nonparametric tests
  • Apply hypothesis testing to DeFi protocol analysis