Chapter 11: SciPy Statistical Significance Tests

Statistical significance tests in SciPy, explained the way I would if we were sitting together with a Jupyter notebook open and some real data on the screen.

What people mean by “SciPy Statistical Significance Tests” is the set of hypothesis-testing functions inside scipy.stats.

These are tools that help you answer questions like:

  • “Is the average height in this group really different from 170 cm?”
  • “Do drug A and drug B give different recovery times?”
  • “Is this coin fair, or is it biased?”
  • “Are these two samples drawn from the same distribution?”
  • “Do the variances differ between groups?”

Almost all of them follow this pattern:

  • You give data (one or more samples)
  • You get a test statistic (a number that measures how extreme the data looks under the null hypothesis)
  • You get a p-value (probability of seeing data this extreme — or more extreme — if the null hypothesis is true)
  • Often also: confidence intervals, effect sizes, etc.

Rule of thumb in 2026: If p < 0.05 (or your chosen α), the result is called statistically significant and you reject the null hypothesis. But always report the actual p-value and the effect size, and think about practical importance; never reduce a result to “significant / not significant”.

Most Popular Hypothesis Tests in scipy.stats (SciPy 1.17.0 — early 2026)

For each test below: name and purpose, scipy.stats function, null hypothesis (H₀), when to use it (real-world example), and whether it is parametric and/or paired.

  • One-sample t-test (ttest_1samp). H₀: mean = given value. Use: is the average IQ in this class 100? Parametric: yes.
  • Independent two-sample t-test (ttest_ind). H₀: the two group means are equal. Use: do men and women differ in average salary? Parametric: yes. Paired: no.
  • Paired t-test (ttest_rel). H₀: mean difference = 0 (paired/related samples). Use: before vs. after treatment on the same patients. Parametric: yes. Paired: yes.
  • Mann-Whitney U (rank-sum) test (mannwhitneyu). H₀: the two distributions are the same (stochastically equal). Use: non-normal data, two independent groups. Parametric: no. Paired: no.
  • Wilcoxon signed-rank test (wilcoxon). H₀: median difference = 0 (paired). Use: non-normal paired data. Parametric: no. Paired: yes.
  • Kolmogorov-Smirnov test, 1-sample or 2-sample (kstest). H₀: the sample follows a given distribution / the two samples share a distribution. Use: goodness-of-fit, or comparing two distributions. Parametric: no.
  • Normality test, D'Agostino-Pearson (normaltest). H₀: the sample comes from a normal distribution. Use: checking the normality assumption before a t-test.
  • Shapiro-Wilk test (shapiro). H₀: the sample comes from a normal distribution. Use: alternative normality test (good for n < 5000).
  • Chi-square test of independence (chi2_contingency). H₀: the variables in the contingency table are independent. Use: is gender independent of voting preference?
  • Fisher's exact test (fisher_exact). H₀: no association in a 2×2 table. Use: small counts in a contingency table.
  • Permutation test, very flexible (permutation_test). H₀: the statistic is unchanged under random permutations. Use: custom statistics, no parametric assumptions. Parametric: no. Paired: varies.

Let’s do real, copy-paste examples (Jupyter style)

Always start your notebook like this:

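A minimal setup cell of the kind the examples below assume (just NumPy plus scipy.stats; the seeded generator is only there so any simulated data is reproducible):

```python
import numpy as np
import scipy
from scipy import stats

# Seeded generator: reproducible fake data for the examples in this chapter
rng = np.random.default_rng(42)

print("SciPy:", scipy.__version__, "| NumPy:", np.__version__)
```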

Example 1 — One-sample t-test (classic beginner test)

Question: Is the mean reaction time in this experiment significantly different from 250 ms (industry standard)?

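A sketch with made-up reaction times (10 participants, in ms); the only SciPy call needed is ttest_1samp:

```python
import numpy as np
from scipy import stats

# Made-up reaction times (ms) for 10 participants
reaction_times = np.array([271, 262, 258, 280, 265, 254, 270, 249, 276, 263])

# H0: the population mean reaction time equals 250 ms
result = stats.ttest_1samp(reaction_times, popmean=250)
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```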

→ Typical output: a p-value well below 0.05 → significant slowing.

Also get confidence interval:

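The t-test result object exposes a confidence_interval() method (available on ttest_1samp results in recent SciPy). Same made-up reaction-time data as above:

```python
import numpy as np
from scipy import stats

reaction_times = np.array([271, 262, 258, 280, 265, 254, 270, 249, 276, 263])
result = stats.ttest_1samp(reaction_times, popmean=250)

# 95% confidence interval for the population mean
ci = result.confidence_interval(confidence_level=0.95)
print(f"95% CI for the mean: ({ci.low:.1f}, {ci.high:.1f}) ms")
```

Note that 250 ms falls outside the interval, which agrees with the significant p-value.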

Example 2 — Independent t-test (ttest_ind) — most used in papers

Question: Do two teaching methods give different test scores?

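A sketch with made-up exam scores; equal_var=False requests Welch's version, which is the safer default (see the advice section below):

```python
import numpy as np
from scipy import stats

# Made-up exam scores under two teaching methods
method_a = np.array([72, 68, 75, 70, 66, 74, 71, 69, 73, 67])
method_b = np.array([78, 82, 76, 85, 79, 81, 77, 84, 80, 83])

# Welch's t-test: does not assume equal variances
result = stats.ttest_ind(method_a, method_b, equal_var=False)
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```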

→ If p < 0.05 → evidence that method B gives higher scores.

Example 3 — Paired t-test (before-after on same subjects)

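A sketch with made-up blood-pressure readings for the same 8 patients before and after treatment; ttest_rel tests whether the mean paired difference is zero:

```python
import numpy as np
from scipy import stats

# Made-up systolic blood pressure: same 8 patients, before and after treatment
before = np.array([140, 152, 138, 147, 160, 155, 143, 149])
after  = np.array([132, 144, 135, 140, 150, 148, 139, 141])

result = stats.ttest_rel(before, after)
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```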

→ Very powerful because it removes between-subject variability.

Example 4 — Non-parametric: Mann-Whitney U (when data not normal)

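A sketch with made-up, deliberately skewed wait times (each branch has one large outlier, so a t-test would be dubious):

```python
import numpy as np
from scipy import stats

# Made-up customer wait times (minutes), skewed by one outlier per branch
branch_a = np.array([2, 3, 3, 4, 5, 6, 8, 30])
branch_b = np.array([7, 9, 10, 12, 14, 15, 18, 45])

result = stats.mannwhitneyu(branch_a, branch_b, alternative="two-sided")
print(f"U = {result.statistic}, p = {result.pvalue:.4f}")
```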

→ Good when normality assumption fails.

Example 5 — Quick normality check before parametric tests

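A sketch running both normality tests from the table on simulated normal data (seeded, so the numbers are reproducible on your machine):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
sample = rng.normal(loc=50, scale=5, size=200)  # genuinely normal fake data

# Shapiro-Wilk (good for n < 5000)
w_stat, w_p = stats.shapiro(sample)
print(f"Shapiro-Wilk: W = {w_stat:.4f}, p = {w_p:.3f}")

# D'Agostino-Pearson (skewness + kurtosis based)
k_stat, k_p = stats.normaltest(sample)
print(f"D'Agostino-Pearson: stat = {k_stat:.3f}, p = {k_p:.3f}")
```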

→ If p > 0.05 → fail to reject normality (but never “prove” normality!)
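The table above also lists chi2_contingency for categorical data; here is a minimal sketch with made-up counts (the tuple-unpacking form works across SciPy versions):

```python
import numpy as np
from scipy import stats

# Made-up 2x2 contingency table: rows = gender, columns = party preference
observed = np.array([[30, 20],    # men:   Party X, Party Y
                     [15, 35]])   # women: Party X, Party Y

chi2, p, dof, expected = stats.chi2_contingency(observed)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}, dof = {dof}")
```

If p < 0.05, you reject independence, i.e. there is evidence of an association between the two variables.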

Teacher’s Practical Advice (2026 edition)

  1. Always check assumptions (normality with normaltest/shapiro, equal variance with levene or bartlett)
  2. Prefer Welch’s t-test (equal_var=False) unless you have strong reason to assume equal variances
  3. For small samples / non-normal → go non-parametric (mannwhitneyu, wilcoxon)
  4. Report exact p-value, test statistic, degrees of freedom (when available), and effect size. For two independent groups, Cohen’s d can be recovered from the pooled (equal_var=True) t statistic: d = stats.ttest_ind(…, equal_var=True).statistic * np.sqrt(1/n1 + 1/n2)
  5. Use permutation_test when nothing else fits — very flexible
  6. Read the docstring! → stats.ttest_ind? in Jupyter — excellent examples
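Tips 1, 2, and 4 combined into one sketch (made-up data; note the t-to-d conversion requires the pooled, equal-variance statistic, while the reported p-value comes from Welch's test):

```python
import numpy as np
from scipy import stats

group1 = np.array([5.1, 4.9, 6.2, 5.5, 5.8, 4.7, 5.3, 6.0])
group2 = np.array([6.4, 6.8, 7.1, 6.0, 7.3, 6.6, 6.9, 7.0])

# Tip 1: check the equal-variance assumption
lev = stats.levene(group1, group2)
print(f"Levene: p = {lev.pvalue:.3f}")

# Tip 2: Welch's t-test as the safe default
welch = stats.ttest_ind(group1, group2, equal_var=False)

# Tip 4: Cohen's d from the pooled t statistic (equal_var=True for this conversion)
n1, n2 = len(group1), len(group2)
t_pooled = stats.ttest_ind(group1, group2, equal_var=True).statistic
d = t_pooled * np.sqrt(1 / n1 + 1 / n2)
print(f"Welch p = {welch.pvalue:.4f}, Cohen's d = {d:.2f}")
```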

Official tutorial section (still gold in 2026): https://docs.scipy.org/doc/scipy/tutorial/stats/hypothesis_tests.html

Which test are you actually trying to run right now (or planning to)?

  • Comparing two groups?
  • Before-after?
  • Normality check?
  • Chi-square for categorical?
  • Something custom?

Tell me your data/scenario and we’ll write the exact code + interpretation together. 🚀
