Statistical Analysis

This page demonstrates statistical visualization techniques commonly used in academic research, from distribution plots to effect-size charts. All visualizations are interactive and can be exported to PDF with professional formatting^[1].

Distribution Analysis

Understanding data distributions is fundamental to statistical analysis. Below we explore various ways to visualize and analyze distributions^[2].

Normal Distribution Comparison

Box Plots and Violin Plots

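Box plots provide a concise summary of a distribution, showing the median, the quartiles, and any outliers^[3]. Violin plots add a mirrored density estimate, so the shape of the distribution is visible as well.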
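The quantities a box plot draws can be computed directly. The following is a minimal sketch, not the code behind this page's plots: the function name `box_plot_stats`, the sample data, and the choice of the "inclusive" quartile method are all assumptions made for illustration. It uses the common Tukey convention of flagging points more than 1.5 interquartile ranges beyond the quartiles as outliers.

```python
import statistics


def box_plot_stats(values, whisker=1.5):
    """Return the quantities a Tukey-style box plot draws (illustrative helper)."""
    data = sorted(values)
    # n=4 yields the three quartile cut points; quartile conventions differ
    # between tools, and "inclusive" is just the one assumed here.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr
    # Whiskers end at the most extreme points that still lie inside the fences.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical sample with one extreme value.
sample = [2, 4, 4, 5, 6, 7, 8, 9, 30]
summary = box_plot_stats(sample)
print(summary)
```

For this sample the quartiles are 4, 6, and 8, the whiskers span 2 to 9, and the value 30 is reported as an outlier rather than stretching the upper whisker.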

Correlation and Regression

Scatter Plot Matrix

A scatter plot matrix shows the pairwise relationships among several variables at once^[4].
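Each off-diagonal panel of a scatter plot matrix corresponds to one pair of variables, and the strength of the linear association in that panel can be summarized by the Pearson correlation coefficient r. The sketch below computes r for every pair of columns; the helper names and the column data are hypothetical and are not taken from the page's own datasets.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Pairwise r for every pair of named columns (dict of name -> values)."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical columns, chosen only to show positive, negative, and partial association.
columns = {
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],
    "z": [5, 4, 3, 2, 1],
    "w": [2, 1, 4, 3, 5],
}
matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

For a simple linear regression with one predictor, squaring r gives the coefficient of determination R².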

Hypothesis Testing

t-Test Visualization

Visualizing the results of statistical tests helps communicate significance^[5].
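The number such a plot communicates is the t statistic. As a rough sketch, assuming the standard two-sample test with equal variances, it can be computed as follows. The function name and the two groups are hypothetical. Converting t to a p-value needs the t distribution's CDF, which Python's standard library does not provide, so that step is left out here.

```python
import math
import statistics


def pooled_t_statistic(a, b):
    """Student's two-sample t statistic and degrees of freedom (equal variances assumed)."""
    n1, n2 = len(a), len(b)
    mean1, mean2 = statistics.fmean(a), statistics.fmean(b)
    var1, var2 = statistics.variance(a), statistics.variance(b)  # sample variances
    df = n1 + n2 - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Hypothetical measurements for two groups.
treatment = [5.1, 4.9, 5.6, 5.8, 6.0]
control = [4.1, 4.5, 4.0, 4.8, 4.6]
t_stat, df = pooled_t_statistic(treatment, control)
print(f"t = {t_stat:.3f}, df = {df}")
```

The resulting t would then be compared against a t distribution with `df` degrees of freedom, for example with a statistics package or a printed table.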

Confidence Intervals

Parameter Estimation with Confidence Intervals
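A confidence interval for a mean is the point estimate plus or minus a critical value times the standard error. The sketch below is a simplified illustration rather than this page's own calculation: it uses a normal critical value, which is a large-sample approximation, whereas small samples normally call for a t quantile. The function name and data are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(values, level=0.95):
    """Approximate CI for the mean using a normal critical value (large-sample assumption)."""
    n = len(values)
    mean = statistics.fmean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for a 95% interval.
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * standard_error, mean + z * standard_error


# Hypothetical measurements; kept tiny for readability, although the normal
# approximation is only appropriate for larger samples.
measurements = [10, 12, 14, 16, 18]
low, high = mean_confidence_interval(measurements)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

Raising `level` widens the interval, and larger samples shrink the standard error and therefore narrow it.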

Effect Sizes and Power Analysis

Cohen's d Effect Size
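Cohen's d expresses the difference between two group means in units of their pooled standard deviation. A minimal sketch, with a hypothetical function name and made-up data:

```python
import math
import statistics


def cohens_d(a, b):
    """Cohen's d: mean difference divided by the pooled sample standard deviation."""
    n1, n2 = len(a), len(b)
    var1, var2 = statistics.variance(a), statistics.variance(b)
    # Pool the variances, weighting each by its degrees of freedom.
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.fmean(a) - statistics.fmean(b)) / pooled_sd


# Hypothetical groups whose means differ by exactly one pooled standard deviation.
group_a = [2, 3, 4]
group_b = [1, 2, 3]
d = cohens_d(group_a, group_b)
print(f"d = {d:.2f}")
```

Cohen (1988) suggested 0.2, 0.5, and 0.8 as rough benchmarks for small, medium, and large effects, to be read in the context of the field rather than as fixed thresholds.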

Multiple Comparisons

ANOVA-style Visualization

Summary Statistics Table

Statistical Methods Notes

The visualizations on this page demonstrate fundamental statistical concepts used in research^[6]. Each plot is designed to be both informative and publication-ready.

Key Takeaways

Distribution Analysis: Always examine your data's distribution before choosing statistical tests
Effect Sizes: Report effect sizes alongside p-values for practical significance
Confidence Intervals: Report them, because they convey more information than p-values alone
Multiple Comparisons: Adjust for multiple testing to avoid false positives
Visualization: Good statistical graphics can reveal patterns that summary statistics miss
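The multiple-comparisons takeaway above can be made concrete with a small sketch. The page does not say which correction it uses, so the Holm-Bonferroni step-down procedure shown here is an assumption chosen for illustration, and the function name and p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Holm-Bonferroni adjusted p-values, returned in the original order."""
    m = len(p_values)
    # Visit the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease along the ordering.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from three tests.
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
print([round(p, 4) for p in adjusted])
```

A test is then declared significant when its adjusted p-value falls below the chosen alpha. Holm's procedure controls the family-wise error rate and is never less powerful than the plain Bonferroni correction, which multiplies every p-value by the number of tests.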

[1] All statistical visualizations use D3.js and Observable Plot for rendering. The underlying calculations follow standard statistical formulas as described in Cohen (1988) and Tukey (1977).
[2] Distribution analysis forms the foundation of statistical inference. Many parametric tests assume normally distributed data, which makes checking the distribution a critical first step. See the Shapiro-Wilk test for formal normality testing.
[3] Box plots were popularized by John Tukey in his 1977 book "Exploratory Data Analysis". They efficiently display the five-number summary: minimum, Q1, median, Q3, and maximum. When outliers are plotted as separate points, the whiskers end at the most extreme values that are not outliers.
[4] Correlation does not imply causation. The correlation coefficient r measures the strength of linear association and ranges from -1 to +1. The coefficient of determination R² represents the proportion of variance explained.
[5] The t-test, developed by William Sealy Gosset under the pseudonym "Student", compares the means of two groups. The standard version assumes normal distributions and equal variances; Welch's variant relaxes the equal-variance assumption.
[6] For comprehensive statistical methodology, see: Wasserman, L. (2004). "All of Statistics: A Concise Course in Statistical Inference". Springer. ISBN: 978-0-387-40272-7.