
Authors: Patrick Boily and Hanan Ather (University of Ottawa)
Randomness is all around us. Probability theory is the mathematical framework that allows us to analyze chance events in a logically sound manner. The probability of an event is a number indicating how likely that event is to occur. This number is always between 0 and 1, where 0 indicates impossibility and 1 indicates certainty.
A classic example of a probabilistic experiment is a fair coin toss, in which the two possible outcomes are heads or tails. In this case, the probability of flipping a head or a tail is 1/2. In an actual series of coin tosses, we may get more or less than exactly 50% heads. But as the number of flips increases, the long-run frequency of heads is bound to get closer and closer to 50%.
The Law of Large Numbers states that as the number of independent trials increases, the sample proportion \(\hat{p}\) converges to the true probability \(p\) with standard error \(SE = \sqrt{\frac{p(1-p)}{n}}\).
Drag the theoretical bars to adjust probability
🎯 Ready to demonstrate the Law of Large Numbers
This fundamental theorem shows that sample means converge to population means as \(n \to \infty\).
💡 Key Concept: The standard error decreases proportionally to \(\frac{1}{\sqrt{n}}\)
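If you'd like to replicate the demo outside the browser, here is a minimal Python sketch of the same convergence; the seed, the choice \(p = 0.5\), and the sample sizes are our own illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

p = 0.5  # true probability of heads (illustrative; any value in (0, 1) works)
for n in [10, 100, 1_000, 10_000]:
    flips = rng.random(n) < p          # n Bernoulli(p) coin flips
    p_hat = flips.mean()               # sample proportion of heads
    se = np.sqrt(p * (1 - p) / n)      # theoretical standard error
    print(f"n={n:>6}:  p_hat={p_hat:.4f}  |p_hat - p|={abs(p_hat - p):.4f}  SE={se:.4f}")
```

As \(n\) grows, both the observed error \(|\hat{p} - p|\) and the theoretical \(SE\) shrink at the \(\frac{1}{\sqrt{n}}\) rate.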
For any discrete distribution, its expectation indicates the long-run average outcome, while its variance measures how spread out the outcomes are around the expectation.
Drag the green bars above to adjust the probability of each face on the die. Then roll the die (once or multiple times) to see how the running sample mean (red) and sample variance (teal) converge to their true values.
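For a quick check away from the widget, here is a small Python sketch; the fair-die probabilities and the 10,000 rolls are illustrative assumptions (edit `probs` to mimic dragging the bars):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

faces = np.arange(1, 7)
probs = np.full(6, 1 / 6)                 # a fair die; replace with your own bar settings

mu = np.sum(faces * probs)                # expectation E[X]
var = np.sum((faces - mu) ** 2 * probs)   # variance Var(X)
print(f"theoretical: mean={mu:.4f}, variance={var:.4f}")

rolls = rng.choice(faces, size=10_000, p=probs)
print(f"sample:      mean={rolls.mean():.4f}, variance={rolls.var(ddof=1):.4f}")
```

With enough rolls, the sample mean and sample variance settle near their theoretical counterparts, exactly as in the demo.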
We begin with the formal statement of the Central Limit Theorem, which underpins the demo below:
Theorem (Central Limit Theorem). Let \(\bar{X}_n\) be the sample mean of an i.i.d. sample of size \(n\) drawn from an unknown population with finite mean \(\mu\) and variance \(\sigma^2\). Then as \(n \to \infty\), the standardized variable
\[ Z_n = \frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \]
converges in distribution to a standard normal random variable.
More precisely, for any real \(z\):
\[ \lim_{n \to \infty} P(Z_n \leq z) = \Phi(z), \]
where \(\Phi\) denotes the standard normal CDF.
Important conditions: • The \(X_i\) are independent and identically distributed (i.i.d.). • \(\mu\) is finite. • \(\sigma^2\) is finite.
Remarks: • It works even if the original \(X_i\) are not normal. • Convergence speed depends on skewness/kurtosis; \(n \geq 30\) is a common rule of thumb. • This justifies using normal distributions to approximate the distribution of sample means.
Below, you’ll see this in action: as you increase your sample size, the histogram of means “morphs” toward the familiar bell curve.
The Central Limit Theorem states that the distribution of sample means approaches a normal distribution as sample size increases, regardless of the population distribution shape.
🎯 Ready to explore the Central Limit Theorem?
Click "Drop Samples" to begin your journey to 30 samples — where the magic of CLT reveals itself! 🌟
One of the main goals of statistics is to estimate unknown parameters. To approximate these parameters, we choose an estimator, which is simply any function of randomly sampled observations.
To illustrate this idea, we will estimate the value of π by uniformly dropping samples on a square containing an inscribed circle. Notice that the value of π can be expressed as a ratio of areas: for a circle of radius \(r\) inscribed in a square of side \(2r\),
\[ \frac{\text{area of circle}}{\text{area of square}} = \frac{\pi r^2}{(2r)^2} = \frac{\pi}{4}. \]
We can estimate this ratio with our samples. Let \(m\) be the number of samples within our circle and \(n\) the total number of samples dropped. We define our estimator \(\hat{\pi}\) as
\[ \hat{\pi} = \frac{4m}{n}. \]
It can be shown that this estimator has the desirable properties of being unbiased and consistent.
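Here is a minimal Python sketch of this estimator on the unit square, with the inscribed circle of radius 1/2 centered at (0.5, 0.5); the seed and sample count are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(seed=7)

n = 100_000                              # total samples dropped on the unit square
x, y = rng.random(n), rng.random(n)      # uniform points in [0, 1] x [0, 1]
inside = (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25   # inside the inscribed circle?
m = inside.sum()                         # number of samples within the circle
pi_hat = 4 * m / n                       # the estimator defined above
print(f"pi_hat = {pi_hat:.4f} (true value: {np.pi:.4f})")
```

Repeating the experiment with larger \(n\) drives \(\hat{\pi}\) toward π, which is precisely what consistency promises.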
Suppose that on your most recent visit to the doctor's office, you decide to get tested for a rare disease. If you are unlucky enough to receive a positive result, the logical next question is, "Given the test result, what is the probability that I actually have this disease?" (Medical tests are, after all, not perfectly accurate.)
Bayes' Theorem tells us exactly how to compute this probability:
\[ P(\text{Disease} \mid +) = \frac{P(+ \mid \text{Disease})\, P(\text{Disease})}{P(+)} \]
where, by the law of total probability, \(P(+) = P(+ \mid \text{Disease})\,P(\text{Disease}) + P(+ \mid \text{Healthy})\,P(\text{Healthy})\).
As the equation indicates, the posterior probability of having the disease given that the test was positive depends on the prior probability of the disease, \(P(\text{Disease})\). Think of this as the prevalence of the disease in the general population.
Explore how prior beliefs and test accuracy combine to determine the probability of disease given a test result. Hover over the probability tables to see which patients contribute to each outcome.
Drag bars to adjust disease prevalence
Drag bars to adjust test accuracy
| | Test Negative (−) | Test Positive (+) |
|---|---|---|
| Probability | 0.7 | 0.3 |
| Health Status | Given Test (−) | Given Test (+) |
|---|---|---|
| Healthy | 0.964 | 0.75 |
| Disease | 0.036 | 0.25 |
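For reference, the tables above are consistent with a prevalence of 0.10 and a test that is 75% accurate in both directions (sensitivity = specificity = 0.75). The following Python sketch recomputes them via Bayes' theorem; the function name and arguments are ours, not part of the demo:

```python
def posterior(prevalence: float, sensitivity: float, specificity: float):
    """Return (P(Disease | +), P(Disease | -)) via Bayes' theorem."""
    # Law of total probability: P(+) over diseased and healthy patients.
    p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    p_neg = 1 - p_pos
    p_d_pos = sensitivity * prevalence / p_pos           # P(Disease | +)
    p_d_neg = (1 - sensitivity) * prevalence / p_neg     # P(Disease | -)
    return p_d_pos, p_d_neg

p_d_pos, p_d_neg = posterior(prevalence=0.10, sensitivity=0.75, specificity=0.75)
print(f"P(Disease | +) = {p_d_pos:.3f}, P(Disease | -) = {p_d_neg:.3f}")
# -> P(Disease | +) = 0.250, P(Disease | -) = 0.036, matching the tables.
```

Note how a positive result from a 75%-accurate test still leaves only a 25% chance of disease when the disease is rare; the prior matters.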
Understanding continuous probability
Unlike discrete distributions, where we can assign probability to individual outcomes, continuous distributions pose a challenge: what is the probability of getting exactly π when measuring? The answer is zero: a continuous random variable assigns probability zero to any single point, and positive probability only to intervals, obtained by integrating a probability density function.
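As a quick numerical illustration (assuming SciPy is available, and using a standard normal purely as an example density), the probability of landing within a tiny interval around π is itself tiny, and shrinks to zero as the interval does:

```python
import numpy as np
from scipy.stats import norm

# P(X = x) is 0 for any single point; probability lives on intervals.
a, b = np.pi - 0.01, np.pi + 0.01
prob = norm.cdf(b) - norm.cdf(a)   # integral of the standard normal density over [a, b]
print(f"P({a:.4f} <= X <= {b:.4f}) = {prob:.6f}")
```

Halve the interval width and the probability roughly halves too; in the limit, the probability of hitting exactly π is 0.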