Hypothesis Testing
Learning Objectives Coverage
LO1: Explain hypothesis testing and its components, including statistical significance, Type I and Type II errors, and the power of a test
Core Concept
Hypothesis testing is a statistical procedure for deciding whether to reject a claim about a population parameter based on sample evidence. Building on the sampling distributions and standard errors from the previous topic, hypothesis testing provides a rigorous framework for data-driven decision making in investment analysis, risk management, and performance evaluation. The key components are the null hypothesis (H₀), alternative hypothesis (Hₐ), test statistic, significance level (α), p-value, and power.
Formulas & Calculations
- Type I Error (α): Probability of rejecting a true H₀
- Type II Error (β): Probability of failing to reject a false H₀
- Power: 1 - β = Probability of correctly rejecting a false H₀
- HP 12C steps: Not directly applicable; requires statistical tables or software
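The relationship between α, β, and power can be illustrated numerically. The sketch below is a simplified illustration, not a method taken from the reading: it assumes a known population standard deviation, so it uses an upper-tailed z-test rather than a t-test. The function name `power_upper_tail_z` and all input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tail_z(mu_0, mu_true, sigma, n, alpha=0.05):
    """Power of an upper-tailed z-test of H0: mu <= mu_0 when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)        # rejection cutoff implied by alpha
    shift = (mu_true - mu_0) / (sigma / sqrt(n))  # true effect in standard-error units
    return 1 - std_normal.cdf(z_crit - shift)     # P(reject H0 | true mean = mu_true)


# Hypothetical inputs: true mean 0.5 above the hypothesized mean, sigma = 2.0
power_n25 = power_upper_tail_z(mu_0=0.0, mu_true=0.5, sigma=2.0, n=25)
power_n100 = power_upper_tail_z(mu_0=0.0, mu_true=0.5, sigma=2.0, n=100)
beta_n25 = 1 - power_n25  # Type II error probability at n = 25

print(f"n=25:  power = {power_n25:.3f}, beta = {beta_n25:.3f}")
print(f"n=100: power = {power_n100:.3f}")
```

With these assumed inputs, power rises from roughly 0.35 at n = 25 to roughly 0.80 at n = 100. When the true mean equals the hypothesized mean, the function returns α itself, the Type I error rate.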
Practical Examples
- Traditional Finance Example: Testing if a fund’s Sharpe ratio exceeds benchmark
- H₀: Sharpe ≤ 0.5
- Hₐ: Sharpe > 0.5
- α = 5% means that, if H₀ is true, there is a 5% chance of falsely claiming outperformance
- Calculation walkthrough: Power increases with larger effect size and sample size
- Interpretation: Balance between Type I and Type II errors based on cost of mistakes
DeFi Application
- Protocol example: Testing if Uniswap v3 concentrated liquidity improves capital efficiency
- Implementation: Compare fee returns before/after concentration features
- Advantages/Challenges: On-chain data records every transaction, so some quantities can be computed on the full population rather than estimated from a sample with sampling uncertainty
LO2: Construct hypothesis tests and determine their statistical significance, the associated Type I and Type II errors, and power of the test given a significance level
Core Concept
The six-step hypothesis testing process provides a systematic framework that ensures rigorous, reproducible statistical analysis. Memorizing and internalizing this process is essential for the exam: (1) state the hypotheses, (2) choose the test statistic, (3) set the significance level α, (4) state the decision rule, (5) calculate the test statistic, (6) make the decision.
Formulas & Calculations
- Test Statistics:
- Single mean: t = (X̄ - μ₀)/(s/√n)
- Difference in means: t = (X̄₁ - X̄₂)/√(sp²/n₁ + sp²/n₂)
- Single variance: χ² = (n-1)s²/σ₀²
- Correlation: t = r√(n-2)/√(1-r²)
- HP 12C steps: t-statistic calculation: [X̄] ENTER [μ₀] - [s] ENTER [n] √x ÷ ÷
- Common variations: One-tailed vs. two-tailed tests
Practical Examples
- Traditional Finance Example: Sendar Equity Fund monthly returns
- Sample: 24 months, mean = 1.50%, std dev = 3.60%
- Testing H₀: μ = 1.1% vs. Hₐ: μ ≠ 1.1% (two-tailed)
- t = (1.50 - 1.10)/(3.60/√24) = 0.544
- Critical values at 5%: ±2.069
- Decision: Fail to reject H₀ (|0.544| < 2.069)
- Interpretation: Insufficient evidence at the 5% level that mean monthly returns differ from 1.1%
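The Sendar calculation can be reproduced in a few lines. This is a minimal sketch using the figures from the example. The critical value 2.069 is entered as a constant from the t-table (df = 23) rather than computed, and the variable names are illustrative.

```python
from math import sqrt

# Inputs from the Sendar Equity Fund example (monthly returns, in percent)
n = 24
sample_mean = 1.50
sample_std = 3.60
mu_0 = 1.10
t_critical = 2.069  # two-tailed, alpha = 5%, df = n - 1 = 23 (from a t-table)

standard_error = sample_std / sqrt(n)
t_stat = (sample_mean - mu_0) / standard_error

# Two-tailed decision rule: reject H0 if |t| exceeds the critical value
reject_h0 = abs(t_stat) > t_critical

print(f"standard error = {standard_error:.4f}")
print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```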
DeFi Application
- Protocol example: Testing if Aave v3 efficiency mode reduces liquidation risk
- Implementation: Compare liquidation rates before/after efficiency mode
- Advantages/Challenges: Smart contract data provides complete transaction history
LO3: Compare and contrast parametric and nonparametric tests, and describe situations where each is the more appropriate type of test
Core Concept
- Definition: Parametric tests assume a specific population distribution (typically normal); nonparametric tests make few or no distributional assumptions
- Why it matters: Choosing the right test ensures valid statistical inference
- Key components: Distribution assumptions, sample size requirements, data types
Formulas & Calculations
- Parametric Tests: t-test, F-test, χ²-test (assume normal distribution)
- Nonparametric Tests:
- Wilcoxon signed-rank (alternative to one-sample t-test)
- Mann-Whitney U (alternative to two-sample t-test)
- Spearman rank correlation (alternative to Pearson correlation)
- HP 12C steps: Nonparametric tests typically require rank calculations
Practical Examples
- Traditional Finance Example: Testing median returns with outliers
- Data: Hedge fund returns with extreme values
- Parametric approach: t-test may be misleading due to outliers
- Nonparametric approach: Wilcoxon test on ranks
- Interpretation: Nonparametric test more robust when normality violated
DeFi Application
- Protocol example: Analyzing gas fee distributions (highly skewed)
- Implementation: Use Mann-Whitney U test for weekend vs. weekday comparison
- Advantages/Challenges: Crypto data often exhibits fat tails requiring nonparametric methods
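To show what a rank-based test actually computes, the sketch below works out the Mann-Whitney U statistic by hand for two small samples. The gas fee figures are made up for illustration, and the helper `mann_whitney_u` is a hypothetical name. In practice a library routine such as `scipy.stats.mannwhitneyu` would also supply a p-value.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) from rank sums, using average ranks for tied values."""
    combined = sorted(sample_a + sample_b)

    # Assign each distinct value the average of the rank positions it occupies
    rank_of = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        rank_of[combined[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(rank_of[x] for x in sample_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical gas fees (gwei); each sample contains one extreme value
weekday_fees = [42, 55, 61, 48, 230, 58]
weekend_fees = [30, 35, 41, 28, 95, 33]

u_weekday, u_weekend = mann_whitney_u(weekday_fees, weekend_fees)
u_stat = min(u_weekday, u_weekend)  # compared against a U table or normal approximation

print(f"U_weekday = {u_weekday}, U_weekend = {u_weekend}, U = {u_stat}")
```

Because only ranks enter the statistic, replacing the 230 gwei outlier with any value above 95 leaves U unchanged, which is the robustness property described above.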
Core Concepts Summary (80/20 Principle)
Must-Know Concepts
- Hypothesis Testing Framework: Six-step process from hypothesis to decision
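- Type I vs. Type II Errors: α = P(reject a true H₀), β = P(fail to reject a false H₀)
- P-value Interpretation: Probability, if H₀ is true, of observing a test statistic at least as extreme as the one actually observed
- Test Selection: Parametric when distribution assumptions are met; nonparametric when they are doubtful or robustness to outliers is needed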
Quick Reference Table
| Test Type | Use Case | Test Statistic | DeFi Application |
|---|---|---|---|
| One-sample t-test | Single mean | t = (X̄-μ₀)/(s/√n) | Protocol APY claims |
| Two-sample t-test | Compare means | Pooled variance t | A/B testing features |
| F-test | Compare variances | F = s₁²/s₂² | Risk comparison |
| Wilcoxon | Non-normal data | Rank-based | Gas fee analysis |
Comprehensive Formula Sheet
Essential Formulas
Single Mean t-test:
t = (X̄ - μ₀)/(s/√n)
Where: X̄ = sample mean, μ₀ = hypothesized mean, s = sample std dev, n = sample size
df = n - 1
Used for: Testing if population mean equals specific value
Difference in Means (Independent):
t = (X̄₁ - X̄₂)/√(sp²/n₁ + sp²/n₂)
Pooled variance: sp² = [(n₁-1)s₁² + (n₂-1)s₂²]/(n₁+n₂-2)
df = n₁ + n₂ - 2
Used for: Comparing two population means
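A short sketch of the pooled-variance calculation follows. The sample sizes and standard deviations (n₁ = 30, s₁ = 5, n₂ = 40, s₂ = 6) are borrowed from the calculator example in these notes. The two sample means are hypothetical, since the notes do not supply them, and `pooled_t` is an illustrative name.

```python
from math import sqrt


def pooled_t(mean_1, s_1, n_1, mean_2, s_2, n_2):
    """Return (pooled variance, t statistic, degrees of freedom), assuming equal variances."""
    df = n_1 + n_2 - 2
    sp2 = ((n_1 - 1) * s_1 ** 2 + (n_2 - 1) * s_2 ** 2) / df
    t_stat = (mean_1 - mean_2) / sqrt(sp2 / n_1 + sp2 / n_2)
    return sp2, t_stat, df


# Means of 12.0 and 9.0 are assumed for illustration only
sp2, t_stat, df = pooled_t(mean_1=12.0, s_1=5.0, n_1=30,
                           mean_2=9.0, s_2=6.0, n_2=40)

print(f"pooled variance = {sp2:.2f}, t = {t_stat:.2f}, df = {df}")
```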
Paired Comparisons:
t = (d̄ - μd₀)/(sd/√n)
Where: d̄ = mean difference, sd = std dev of differences
df = n - 1
Used for: Before/after comparisons
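The paired test reduces to a one-sample t-test on the differences, which the sketch below makes explicit. The before/after observations are invented for illustration, and μd₀ is set to zero as an assumption (testing for no change).

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical paired observations for the same six units, before and after a change
before = [1.2, 0.8, 1.5, 1.1, 0.9, 1.4]
after = [1.5, 1.0, 1.6, 1.4, 1.1, 1.5]

differences = [a - b for a, b in zip(after, before)]
n = len(differences)

d_bar = mean(differences)  # mean difference
s_d = stdev(differences)   # sample std dev of the differences (n - 1 divisor)
mu_d0 = 0.0                # hypothesized mean difference (assumed: no change)

t_paired = (d_bar - mu_d0) / (s_d / sqrt(n))
df = n - 1

print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.4f}, t = {t_paired:.3f}, df = {df}")
```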
Variance Test:
χ² = (n-1)s²/σ₀²
df = n - 1
Used for: Testing if variance equals specific value
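As a quick check on the formula, the sketch below evaluates the chi-square statistic for the inputs used in the calculator example in these notes (n = 25, s = 4.5, σ₀ = 5). The function name is illustrative.

```python
def chi_square_variance_stat(n, sample_std, sigma_0):
    """Chi-square statistic for H0: population variance equals sigma_0 squared."""
    return (n - 1) * sample_std ** 2 / sigma_0 ** 2


chi2_stat = chi_square_variance_stat(n=25, sample_std=4.5, sigma_0=5.0)
df = 25 - 1

print(f"chi-square = {chi2_stat:.2f}, df = {df}")
```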
F-test for Variances:
F = s₁²/s₂²
df = (n₁-1, n₂-1)
Used for: Comparing two variances
Correlation Test:
t = r√(n-2)/√(1-r²)
df = n - 2
Used for: Testing if correlation is significant
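The correlation test can be sketched the same way. The sample correlation r = 0.40 and sample size n = 32 are hypothetical inputs chosen for illustration. The two-tailed 5% critical value for df = 30 (2.042) is entered from a t-table rather than computed.

```python
from math import sqrt


def correlation_t(r, n):
    """t statistic for H0: population correlation equals zero."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


r, n = 0.40, 32     # hypothetical sample correlation and sample size
t_critical = 2.042  # two-tailed, alpha = 5%, df = n - 2 = 30 (from a t-table)

t_stat = correlation_t(r, n)
reject_h0 = abs(t_stat) > t_critical

print(f"t = {t_stat:.3f}, df = {n - 2}, reject H0: {reject_h0}")
```

The same r with a smaller sample can fail to reach significance, because the statistic scales with √(n - 2).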
HP 12C Calculator Sequences
t-statistic (single mean):
RPN Steps: [X̄] ENTER [μ₀] - [s] ENTER [n] √x ÷ ÷
Example: 1.50 ENTER 1.10 - 3.60 ENTER 24 √x ÷ ÷ = 0.544
Pooled Variance:
RPN Steps: [n₁] ENTER 1 - [s₁] x² × [n₂] ENTER 1 - [s₂] x² × + [n₁] ENTER [n₂] + 2 - ÷
Example: For n₁=30, s₁=5, n₂=40, s₂=6: sp² = 2,129/68 ≈ 31.31
Chi-square statistic:
RPN Steps: [n] ENTER 1 - [s] x² × [σ₀] x² ÷
Example: 25 ENTER 1 - 4.5 x² × 5 x² ÷ = 19.44
Practice Problems
Basic Level (Understanding)
- Problem: Test if mean return = 2% with sample mean 2.5%, s = 3%, n = 36
- Given: X̄ = 2.5%, μ₀ = 2%, s = 3%, n = 36
- Find: Test statistic and decision at α = 5%
- Solution:
- t = (2.5 - 2)/(3/√36) = 0.5/0.5 = 1.0
- Critical value (two-tailed, df=35): ±2.03
- |1.0| < 2.03
- Answer: Fail to reject H₀; insufficient evidence that mean ≠ 2%
Intermediate Level (Application)
- Problem: Compare volatility of DeFi vs. TradFi assets (F-test)
- Given: DeFi: n₁ = 50, s₁ = 8%; TradFi: n₂ = 60, s₂ = 4%
- Find: Test if DeFi variance > TradFi variance at α = 1%
- Solution:
- F = (8²)/(4²) = 64/16 = 4.0
- Critical value F₀.₀₁,₄₉,₅₉ ≈ 1.9 (interpolated from standard F-tables)
- 4.0 exceeds the critical value
- Answer: Reject H₀; DeFi significantly more volatile
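The F statistic and its degrees of freedom for this problem can be verified with a short sketch. The function name is illustrative. The critical value is deliberately left out of the code, since it comes from an F-table or statistical software.

```python
def f_statistic(s_1, n_1, s_2, n_2):
    """Return F = s_1^2 / s_2^2 and its (numerator, denominator) degrees of freedom."""
    f_value = s_1 ** 2 / s_2 ** 2
    return f_value, (n_1 - 1, n_2 - 1)


# Inputs from the problem: DeFi (n = 50, s = 8%) vs. TradFi (n = 60, s = 4%)
f_value, df = f_statistic(s_1=8.0, n_1=50, s_2=4.0, n_2=60)

print(f"F = {f_value:.1f} with df = {df}")
```

Putting the DeFi variance in the numerator matches the one-sided alternative being tested (DeFi variance > TradFi variance).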
Advanced Level (Analysis)
- Problem: Multi-step analysis of trading strategy performance
- Given:
- Strategy returns: 18 months, mean = 3.2%, s = 5.1%
- Benchmark: μ = 2.0%, σ = 4.0%
- Risk-free rate = 0.5%
- Find: Test outperformance, compare Sharpe ratios, assess power
- Solution:
- Performance test: t = (3.2-2.0)/(5.1/√18) = 1.2/1.202 ≈ 1.0; one-tailed critical value (α = 5%, df = 17) is 1.740, so fail to reject H₀
- Sharpe comparison requires additional calculations
- Power analysis depends on true difference
- Answer: Strategy shows positive but not significant outperformance at 5% level
DeFi Applications & Real-World Examples
Traditional Finance Context
- Institution Example: Investment banks use hypothesis testing for trading strategy validation
- Market Application: Regulators test market efficiency and manipulation claims
- Historical Case: LTCM relied on historical correlation assumptions that broke down during the 1998 crisis
DeFi Parallels
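- Protocol Implementation: Interest rate models such as Compound's can be calibrated and checked with statistical tests
- Smart Contract Logic: Oracle outlier detection can be framed as a hypothesis test on incoming price reports
- Advantages: On-chain transparency gives access to the complete transaction record, so some quantities can be computed on the full population rather than a sample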
- Limitations: Short history limits power for long-term pattern detection
Case Studies
- Case 1: MEV Impact on DEX Trading
- Background: Testing if MEV bots increase slippage
- Analysis: Two-sample t-test comparing trades with/without MEV
- Outcomes: Significant difference found (p < 0.001)
- Lessons learned: Statistical evidence supports MEV protection mechanisms
Common Pitfalls & Exam Tips
Frequent Mistakes
- Mistake 1: Confusing one-tailed and two-tailed tests - Check alternative hypothesis carefully
- Mistake 2: Using wrong degrees of freedom - Remember n-1 for single sample, n₁+n₂-2 for pooled
- Mistake 3: Misinterpreting p-values - p-value is NOT probability H₀ is true
Exam Strategy
- Time management: 4-5 minutes per hypothesis testing question
- Question patterns: Often combined with confidence intervals and regression
- Quick checks: Ensure test statistic and critical values have same number of tails
Key Takeaways
Essential Points
✓ Hypothesis testing provides a framework for statistical decision making
✓ Type I error (α) is controllable; Type II error (β) depends on the true parameter
✓ P-value < α means reject H₀; the p-value is the smallest significance level at which H₀ can be rejected
✓ Parametric tests are more powerful but require distribution assumptions
✓ Nonparametric tests are robust to outliers and non-normality
Memory Aids
- Mnemonic: “HATSPD” - Hypotheses, Alpha, Test statistic, Significance, P-value, Decision
- Visual: Type I/II error quadrant showing decision outcomes
- Analogy: Court trial - H₀ is innocence, evidence must be “beyond reasonable doubt”
Cross-References & Additional Resources
Related Topics
- Prerequisite: Estimation and Inference (Topic 7) for sampling distributions
- Related: Parametric and Non-Parametric Tests (Topic 9) extends these concepts
- Advanced: Regression Analysis (Topic 10) uses hypothesis tests for coefficients
Source Materials
- Primary Reading: Volume 1, Chapter 8, Pages 1-35
- Key Sections: Six-step process (p.5-8), Type I/II errors (p.10-14), Nonparametric tests (p.28-32)
- Practice Questions: End-of-chapter problems 1-20
External Resources
- Videos: StatQuest hypothesis testing series on YouTube
- Articles: “Statistical Power Analysis” by Cohen
- Tools: R’s stats package, Python’s scipy.stats for test implementations
Review Checklist
Before moving on, ensure you can:
- Execute the six-step hypothesis testing process
- Calculate t, F, and χ² test statistics
- Interpret p-values and make appropriate decisions
- Choose between parametric and nonparametric tests
- Apply hypothesis testing to DeFi protocol analysis