
Authors: Patrick Boily and Hanan Ather (University of Ottawa)
Randomness is all around us. Probability theory is the mathematical framework that allows us to analyze chance events in a logically sound manner. The probability of an event is a number indicating how likely that event is to occur. This number is always between 0 and 1, where 0 indicates impossibility and 1 indicates certainty.
A classic example of a probabilistic experiment is a fair coin toss, in which the two possible outcomes are heads or tails. In this case, the probability of flipping a head or a tail is 1/2. In an actual series of coin tosses, we may get more or less than exactly 50% heads. But as the number of flips increases, the long-run frequency of heads is bound to get closer and closer to 50%.
The Law of Large Numbers states that as the number of independent trials increases, the sample proportion \(\hat{p}\) converges to the true probability \(p\) with standard error \(SE = \sqrt{\frac{p(1-p)}{n}}\).
Drag the theoretical bars to adjust probability
🎯 Ready to demonstrate the Law of Large Numbers
This fundamental theorem shows that sample means converge to population means as \(n \to \infty\).
💡 Key Concept: The standard error decreases proportionally to \(\frac{1}{\sqrt{n}}\)
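If you'd like to replicate the demo outside the browser, here is a minimal Python sketch of the same convergence; the seed, the choice \(p = 0.5\), and the sample sizes are our own illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

p = 0.5  # true probability of heads (illustrative; any value in (0, 1) works)
for n in [10, 100, 1_000, 10_000]:
    flips = rng.random(n) < p          # n Bernoulli(p) coin flips
    p_hat = flips.mean()               # sample proportion of heads
    se = np.sqrt(p * (1 - p) / n)      # theoretical standard error
    print(f"n={n:>6}:  p_hat={p_hat:.4f}  |p_hat - p|={abs(p_hat - p):.4f}  SE={se:.4f}")
```

As \(n\) grows, both the observed error \(|\hat{p} - p|\) and the theoretical \(SE\) shrink at the \(\frac{1}{\sqrt{n}}\) rate.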
For any discrete distribution, its expectation indicates the long-run average outcome, while its variance measures how spread out the outcomes are around the expectation.
Drag the green bars above to adjust the probability of each face on the die. Then roll the die (once or multiple times) to see how the running sample mean (red) and sample variance (teal) converge to their true values.
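For a quick check away from the widget, here is a small Python sketch; the fair-die probabilities and the 10,000 rolls are illustrative assumptions (edit `probs` to mimic dragging the bars):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

faces = np.arange(1, 7)
probs = np.full(6, 1 / 6)                 # a fair die; replace with your own bar settings

mu = np.sum(faces * probs)                # expectation E[X]
var = np.sum((faces - mu) ** 2 * probs)   # variance Var(X)
print(f"theoretical: mean={mu:.4f}, variance={var:.4f}")

rolls = rng.choice(faces, size=10_000, p=probs)
print(f"sample:      mean={rolls.mean():.4f}, variance={rolls.var(ddof=1):.4f}")
```

With enough rolls, the sample mean and sample variance settle near their theoretical counterparts, exactly as in the demo.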
We begin with the formal statement of the Central Limit Theorem, which underpins the demo below:
Theorem (Central Limit Theorem). Let \(\bar{X}_n\) be the sample mean of an i.i.d. sample of size \(n\) drawn from an unknown population with finite mean \(\mu\) and variance \(\sigma^2\). Then as \(n \to \infty\), the standardized variable
\[ Z_n = \frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \]
converges in distribution to a standard normal random variable.
More precisely, for any real \(z\):
\[ \lim_{n \to \infty} P(Z_n \leq z) = \Phi(z), \]
where \(\Phi\) denotes the standard normal CDF.
Important conditions: • The \(X_i\) are independent and identically distributed (i.i.d.). • \(\mu\) is finite. • \(\sigma^2\) is finite.
Remarks: • It works even if the original \(X_i\) are not normal. • Convergence speed depends on skewness/kurtosis; \(n \geq 30\) is a common rule of thumb. • This justifies using normal distributions to approximate the distribution of sample means.
Below, you’ll see this in action: as you increase your sample size, the histogram of means “morphs” toward the familiar bell curve.
The Central Limit Theorem states that the distribution of sample means approaches a normal distribution as sample size increases, regardless of the population distribution shape.
🎯 Ready to explore the Central Limit Theorem?
Click "Drop Samples" to begin your journey to 30 samples — where the magic of CLT reveals itself! 🌟
One of the main goals of statistics is to estimate unknown parameters. To approximate these parameters, we choose an estimator, which is simply any function of randomly sampled observations.
To illustrate this idea, we will estimate the value of π by uniformly dropping samples on a square containing an inscribed circle. Notice that the value of π can be expressed as a ratio of areas: for a circle of radius \(r\) inscribed in a square of side \(2r\),
\[ \frac{\text{area of circle}}{\text{area of square}} = \frac{\pi r^2}{(2r)^2} = \frac{\pi}{4}. \]
We can estimate this ratio with our samples. Let \(m\) be the number of samples within our circle and \(n\) the total number of samples dropped. We define our estimator \(\hat{\pi}\) as
\[ \hat{\pi} = \frac{4m}{n}. \]
It can be shown that this estimator has the desirable properties of being unbiased and consistent.
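Here is a minimal Python sketch of this estimator on the unit square, with the inscribed circle of radius 1/2 centered at (0.5, 0.5); the seed and sample count are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(seed=7)

n = 100_000                              # total samples dropped on the unit square
x, y = rng.random(n), rng.random(n)      # uniform points in [0, 1] x [0, 1]
inside = (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25   # inside the inscribed circle?
m = inside.sum()                         # number of samples within the circle
pi_hat = 4 * m / n                       # the estimator defined above
print(f"pi_hat = {pi_hat:.4f} (true value: {np.pi:.4f})")
```

Repeating the experiment with larger \(n\) drives \(\hat{\pi}\) toward π, which is precisely what consistency promises.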
Suppose that on your most recent visit to the doctor's office, you decide to get tested for a rare disease. If you are unlucky enough to receive a positive result, the logical next question is, "Given the test result, what is the probability that I actually have this disease?" (Medical tests are, after all, not perfectly accurate.)
Bayes' Theorem tells us exactly how to compute this probability:
\[ P(\text{Disease} \mid +) = \frac{P(+ \mid \text{Disease})\, P(\text{Disease})}{P(+)} \]
where, by the law of total probability, \(P(+) = P(+ \mid \text{Disease})\,P(\text{Disease}) + P(+ \mid \text{Healthy})\,P(\text{Healthy})\).
As the equation indicates, the posterior probability of having the disease given that the test was positive depends on the prior probability of the disease, \(P(\text{Disease})\). Think of this as the prevalence of the disease in the general population.
Explore how prior beliefs and test accuracy combine to determine the probability of disease given a test result. Hover over the probability tables to see which patients contribute to each outcome.
Drag bars to adjust disease prevalence
Drag bars to adjust test accuracy
| | Test Negative (−) | Test Positive (+) |
|---|---|---|
| Probability | 0.7 | 0.3 |
| Health Status | Given Test (−) | Given Test (+) |
|---|---|---|
| Healthy | 0.964 | 0.75 |
| Disease | 0.036 | 0.25 |
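For reference, the tables above are consistent with a prevalence of 0.10 and a test that is 75% accurate in both directions (sensitivity = specificity = 0.75). The following Python sketch recomputes them via Bayes' theorem; the function name and arguments are ours, not part of the demo:

```python
def posterior(prevalence: float, sensitivity: float, specificity: float):
    """Return (P(Disease | +), P(Disease | -)) via Bayes' theorem."""
    # Law of total probability: P(+) over diseased and healthy patients.
    p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    p_neg = 1 - p_pos
    p_d_pos = sensitivity * prevalence / p_pos           # P(Disease | +)
    p_d_neg = (1 - sensitivity) * prevalence / p_neg     # P(Disease | -)
    return p_d_pos, p_d_neg

p_d_pos, p_d_neg = posterior(prevalence=0.10, sensitivity=0.75, specificity=0.75)
print(f"P(Disease | +) = {p_d_pos:.3f}, P(Disease | -) = {p_d_neg:.3f}")
# -> P(Disease | +) = 0.250, P(Disease | -) = 0.036, matching the tables.
```

Note how a positive result from a 75%-accurate test still leaves only a 25% chance of disease when the disease is rare; the prior matters.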
Understanding continuous probability
Unlike discrete distributions, where we can assign probability to individual outcomes, continuous distributions pose a challenge: what is the probability of getting exactly π when measuring? The answer is zero: a continuous random variable assigns probability zero to any single point, and positive probability only to intervals, obtained by integrating a probability density function.
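As a quick numerical illustration (assuming SciPy is available, and using a standard normal purely as an example density), the probability of landing within a tiny interval around π is itself tiny, and shrinks to zero as the interval does:

```python
import numpy as np
from scipy.stats import norm

# P(X = x) is 0 for any single point; probability lives on intervals.
a, b = np.pi - 0.01, np.pi + 0.01
prob = norm.cdf(b) - norm.cdf(a)   # integral of the standard normal density over [a, b]
print(f"P({a:.4f} <= X <= {b:.4f}) = {prob:.6f}")
```

Halve the interval width and the probability roughly halves too; in the limit, the probability of hitting exactly π is 0.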