Discrete Probability Distributions

Learning Objectives

  • Understand the concepts of probability mass functions (PMF) and cumulative distribution functions (CDF).
  • Calculate the expected value (mean) and variance of a discrete random variable.
  • Apply the Binomial and Poisson distributions to model relevant civil engineering scenarios.
  • Apply the Geometric and Negative Binomial distributions for scenarios involving trials to achieve successes.
  • Use the Hypergeometric distribution to model probabilities when sampling without replacement from finite batches.

Discrete probability distributions are foundational for modeling scenarios where outcomes are countable, such as the number of structural defects, rainfall events, or equipment failures. This lesson covers key distributions including Binomial, Poisson, Geometric, Negative Binomial, and Hypergeometric, and how to apply them to solve practical engineering problems.

Introduction to Random Variables

When analyzing engineering data, we often deal with variables whose outcomes are determined by chance. A random variable is a numerical description of the outcome of an experiment. A discrete random variable takes a countable number of distinct values (e.g., the number of potholes on a 10 km stretch of road, the number of defective bricks in a pallet, or the number of extreme weather events in a year). Understanding these variables allows engineers to quantify risk and predict the frequency of specific events.

Probability Mass Functions and Mathematical Expectation

Probability Functions and Expectation

The foundational math behind discrete random variables.

Probability Mass Function (PMF), $f(x)$ or $P(X = x)$

A function that assigns a probability to each possible value of a discrete random variable.

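PMF Conditions

  • $f(x) \ge 0$ for every possible value $x$.
  • $\sum_{x} f(x) = 1$, where the sum runs over all possible values of $X$.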

Cumulative Distribution Function (CDF), $F(x)$

The probability that the random variable $X$ takes a value less than or equal to $x$.

Cumulative Distribution Function

Formula for calculating the CDF of a discrete random variable.

$$F(x) = P(X \le x) = \sum_{t \le x} f(t)$$

Variables

Symbol | Description | Unit
$F(x)$ | Cumulative distribution function | -
$X$ | Discrete random variable | -
$x$ | Specific value of the random variable | -
$f(t)$ | Probability mass function evaluated at $t$ | -
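
As a minimal sketch of the CDF sum above, the following Python snippet checks that a PMF is valid (non-negative values that sum to 1) and accumulates it into a CDF. The PMF values, the defective-brick scenario, and the helper names `is_valid_pmf` and `cdf` are illustrative assumptions, not part of the lesson.

```python
import math

# Hypothetical PMF for X = number of defective bricks in a sample.
# The probabilities are illustrative assumptions, not measured data.
pmf = {0: 0.50, 1: 0.30, 2: 0.15, 3: 0.05}


def is_valid_pmf(pmf, tol=1e-9):
    """Return True if every f(x) >= 0 and the values sum to 1."""
    non_negative = all(p >= 0 for p in pmf.values())
    sums_to_one = math.isclose(sum(pmf.values()), 1.0, abs_tol=tol)
    return non_negative and sums_to_one


def cdf(pmf, x):
    """F(x) = P(X <= x): sum f(t) over all support points t <= x."""
    return sum(p for t, p in pmf.items() if t <= x)


print(is_valid_pmf(pmf))      # validity check of the hypothetical PMF
print(round(cdf(pmf, 1), 2))  # P(X <= 1) = f(0) + f(1)
```

Storing the PMF as a dictionary keeps the support explicit, so the same `cdf` helper works for any finite discrete variable.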

Mathematical Expectation

Mathematical Expectation and Variance

The expected value represents the theoretical mean of the random variable.

Expected Value (Mean), $\mu$ or $E[X]$

The long-run average value of the random variable over infinitely many trials. It is the center of the probability distribution.

Expected Value (Mean)

Formula for calculating the expected value.

$$\mu = E[X] = \sum_{x} x \cdot f(x)$$

Variables

Symbol | Description | Unit
$\mu$ | Expected value or mean | -
$E[X]$ | Expected value of random variable $X$ | -
$x$ | Specific value of the random variable | -
$f(x)$ | Probability mass function evaluated at $x$ | -

Variance, $\sigma^2$ or $V(X)$

A measure of the dispersion or spread of the probability distribution around the mean.

Variance

Formula for calculating the variance.

$$\sigma^2 = E[(X - \mu)^2] = \sum_{x} (x - \mu)^2 \cdot f(x)$$

Variables

Symbol | Description | Unit
$\sigma^2$ | Variance | -
$E$ | Expected value operator | -
$X$ | Discrete random variable | -
$\mu$ | Expected value or mean | -
$x$ | Specific value of the random variable | -
$f(x)$ | Probability mass function evaluated at $x$ | -

Computational Formula for Variance

An alternative formula, obtained by expanding the square in the definition, that is usually easier to evaluate by hand.

$$\sigma^2 = E[X^2] - (E[X])^2$$

Variables

Symbol | Description | Unit
$\sigma^2$ | Variance | -
$E[X^2]$ | Expected value of the square of $X$ | -
$E[X]$ | Expected value of $X$ | -
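
The mean and both variance formulas can be compared numerically. The sketch below assumes a small hypothetical PMF (illustrative values only) and uses made-up helper names. It is meant to show that the definition and the computational formula agree, not to prescribe an implementation.

```python
# Hypothetical PMF (illustrative values, not measured data).
pmf = {0: 0.50, 1: 0.30, 2: 0.15, 3: 0.05}


def expected_value(pmf):
    """mu = sum of x * f(x) over the support."""
    return sum(x * p for x, p in pmf.items())


def variance_definition(pmf):
    """sigma^2 = sum of (x - mu)^2 * f(x)."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


def variance_shortcut(pmf):
    """sigma^2 = E[X^2] - (E[X])^2."""
    e_x2 = sum(x**2 * p for x, p in pmf.items())
    return e_x2 - expected_value(pmf) ** 2


print(round(expected_value(pmf), 4))       # 0.75
print(round(variance_definition(pmf), 4))  # 0.7875
print(round(variance_shortcut(pmf), 4))    # 0.7875
```

For this PMF, $E[X] = 0.75$ and $E[X^2] = 1.35$, so the computational formula gives $1.35 - 0.75^2 = 0.7875$, matching the definition.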

Common Discrete Distributions in Engineering

Common Discrete Distributions

Specific models used to describe common engineering scenarios.

The Binomial Distribution

Introduction to the Binomial Distribution

The Binomial Distribution models the number of successes in a fixed number of independent trials. In civil engineering, this could represent the number of concrete cylinders that pass a compressive strength test out of a batch of 10, or the number of days a construction project is delayed due to weather in a given month.

Binomial Distribution

A distribution that models the probability of exactly $x$ successes in $n$ independent trials.

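Binomial Distribution Conditions

  • The number of trials $n$ is fixed in advance.
  • Each trial has exactly two outcomes (success or failure).
  • The probability of success $p$ is the same on every trial.
  • The trials are independent.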

Binomial Distribution Formula

Formula to calculate the probability of exactly $x$ successes in $n$ trials.

$$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} \quad \text{for } x = 0, 1, \dots, n$$

Variables

Symbol | Description | Unit
$P(X = x)$ | Probability of exactly $x$ successes | -
$n$ | Total number of trials | -
$x$ | Number of successes | -
$p$ | Probability of success on a single trial | -
$(1-p)$ | Probability of failure on a single trial | -

Binomial Mean and Variance

Formulas for the mean and variance of a Binomial distribution.

$$\mu = np \quad \text{and} \quad \sigma^2 = np(1-p)$$

Variables

Symbol | Description | Unit
$\mu$ | Expected value or mean | -
$\sigma^2$ | Variance | -
$n$ | Total number of trials | -
$p$ | Probability of success on a single trial | -
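
To make the binomial formulas concrete, here is a short Python sketch for the concrete-cylinder scenario: 10 cylinders are tested and each is assumed to pass independently with probability 0.9. That pass probability and the function names are assumptions chosen for illustration.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * (1 - p)^(n - x) for x = 0, ..., n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_mean_var(n, p):
    """Return (mu, sigma^2) = (n*p, n*p*(1 - p))."""
    return n * p, n * p * (1 - p)


n, p = 10, 0.9  # hypothetical: 10 cylinders, 90% pass probability each

p_all_pass = binomial_pmf(10, n, p)
p_at_least_9 = binomial_pmf(9, n, p) + binomial_pmf(10, n, p)
mean, var = binomial_mean_var(n, p)

print(round(p_all_pass, 4))    # probability all 10 cylinders pass
print(round(p_at_least_9, 4))  # probability at least 9 pass
print(mean, round(var, 4))     # expected number of passes and its variance
```

"At least" and "at most" questions reduce to summing the PMF over the relevant values of $x$, as in `p_at_least_9`.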

The Poisson Distribution

Introduction to the Poisson Distribution

The Poisson Distribution models the number of events occurring in a fixed interval of time or space. This is highly applicable for modeling rare, independent events over continuous intervals, such as the number of heavy trucks passing a bridge toll per hour, or the number of micro-cracks along a 50-meter steel beam.

Poisson Distribution

Used for rare events where the exact number of trials $n$ is effectively infinite and $p$ is very small, but the average rate of occurrence ($\lambda$) is known. Examples include the number of traffic accidents per month at a given intersection, or the number of flaws in a 100 m reel of fiber optic cable.

Poisson Distribution Formula

Formula to calculate the probability of exactly $x$ events occurring in a given interval.

$$P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!} \quad \text{for } x = 0, 1, 2, \dots$$

Variables

Symbol | Description | Unit
$P(X = x)$ | Probability of exactly $x$ events | -
$\lambda$ | Average rate of occurrence in the interval | -
$e$ | Euler's number (approximately 2.71828) | -
$x$ | Number of events | -

Poisson Mean and Variance

Formulas for the mean and variance of a Poisson distribution.

$$\mu = \lambda \quad \text{and} \quad \sigma^2 = \lambda$$

Variables

Symbol | Description | Unit
$\mu$ | Expected value or mean | -
$\sigma^2$ | Variance | -
$\lambda$ | Average rate of occurrence in the interval | -

Unique Property of Poisson Distribution

A distinctive property of the Poisson distribution is that its mean equals its variance: both are equal to $\lambda$.
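
The Poisson formula and the mean-equals-variance property can be checked numerically. The sketch below assumes a hypothetical rate of $\lambda = 2$ accidents per month at an intersection. The rate, the function names, and the 60-term truncation of the infinite sums are all illustrative choices.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) = lam^x * e^(-lam) / x! for x = 0, 1, 2, ..."""
    return lam**x * exp(-lam) / factorial(x)


def poisson_cdf(x, lam):
    """P(X <= x): sum of the PMF from 0 through x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


lam = 2.0  # hypothetical average of 2 accidents per month

p_none = poisson_pmf(0, lam)               # no accidents in a month
p_more_than_two = 1 - poisson_cdf(2, lam)  # more than 2 accidents

# Approximate the infinite sums for the mean and variance with 60 terms;
# the terms beyond that are negligibly small for lam = 2.
ks = range(60)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
var = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)

print(round(p_none, 4), round(p_more_than_two, 4))
print(round(mean, 4), round(var, 4))  # both should be close to lam
```

When the interval changes, $\lambda$ must be rescaled to match it. For example, a rate of 2 per month corresponds to $\lambda = 6$ for a three-month window.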

The Geometric and Negative Binomial Distributions

Geometric and Negative Binomial Models

The Geometric and Negative Binomial distributions model the number of trials needed to achieve a specific number of successes. For instance, testing identical soil samples until the first one fails a compaction test (Geometric), or testing until exactly three samples fail (Negative Binomial).

Geometric Distribution

Models the number of independent trials $X$ needed to get the first success (e.g., how many times must we test a newly designed joint until we observe the first failure, assuming a constant failure probability $p$?). Note that "success" here means the event being counted, which in this example is a joint failure.

Geometric Distribution Formula

Formula for calculating the probability of getting the first success on the $x$-th trial.

$$P(X = x) = (1-p)^{x-1} p \quad \text{for } x = 1, 2, 3, \dots$$

Variables

Symbol | Description | Unit
$P(X = x)$ | Probability of first success on the $x$-th trial | -
$p$ | Probability of success on a single trial | -
$x$ | Trial number on which the first success occurs | -

Geometric Mean and Variance

Formulas for calculating the mean and variance of a geometric distribution.

$$\mu = \frac{1}{p} \quad \text{and} \quad \sigma^2 = \frac{1-p}{p^2}$$

Variables

Symbol | Description | Unit
$\mu$ | Expected value or mean | -
$\sigma^2$ | Variance | -
$p$ | Probability of success on a single trial | -
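
A brief Python sketch of the geometric formulas follows, assuming a hypothetical per-test failure probability of $p = 0.30$ for the joint example. The closed-form CDF $1 - (1-p)^x$ used here is a standard result that the lesson does not state, and the function names are illustrative.

```python
def geometric_pmf(x, p):
    """P(X = x) = (1 - p)^(x - 1) * p for x = 1, 2, 3, ..."""
    if x < 1:
        return 0.0
    return (1 - p) ** (x - 1) * p


def geometric_cdf(x, p):
    """P(X <= x) = 1 - (1 - p)^x: the first success occurs within x trials."""
    if x < 1:
        return 0.0
    return 1 - (1 - p) ** x


def geometric_mean_var(p):
    """Return (mu, sigma^2) = (1/p, (1 - p)/p^2)."""
    return 1 / p, (1 - p) / p**2


p = 0.30  # hypothetical probability that a single test shows a failure

print(round(geometric_pmf(3, p), 4))  # first failure exactly on test 3
print(round(geometric_cdf(3, p), 4))  # first failure within 3 tests
mean, var = geometric_mean_var(p)
print(round(mean, 3), round(var, 3))  # 3.333 7.778
```

With $p = 0.30$, about 3.3 tests are needed on average before the first failure is observed.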

Negative Binomial Distribution

A generalization of the geometric distribution. It models the number of independent trials $X$ needed to get exactly $r$ successes. Setting $r = 1$ recovers the geometric distribution.

Negative Binomial Distribution Formula

Formula for calculating the probability of getting the $r$-th success on the $x$-th trial.

$$P(X = x) = \binom{x-1}{r-1} p^r (1-p)^{x-r} \quad \text{for } x = r, r+1, \dots$$

Variables

Symbol | Description | Unit
$P(X = x)$ | Probability of $r$-th success on the $x$-th trial | -
$x$ | Trial number on which the $r$-th success occurs | -
$r$ | Total number of successes desired | -
$p$ | Probability of success on a single trial | -

Negative Binomial Mean and Variance

Formulas for calculating the mean and variance of a negative binomial distribution.

$$\mu = \frac{r}{p} \quad \text{and} \quad \sigma^2 = \frac{r(1-p)}{p^2}$$

Variables

Symbol | Description | Unit
$\mu$ | Expected value or mean | -
$\sigma^2$ | Variance | -
$r$ | Total number of successes desired | -
$p$ | Probability of success on a single trial | -
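
The negative binomial formulas can be sketched in Python for the soil-sample scenario: testing continues until the third sample fails, with an assumed per-sample failure probability of 0.2. That probability and the function names are illustrative assumptions. The parameterization counts total trials, as in the formula above.

```python
from math import comb


def neg_binomial_pmf(x, r, p):
    """P(X = x) = C(x - 1, r - 1) * p^r * (1 - p)^(x - r) for x = r, r + 1, ..."""
    if x < r:
        return 0.0
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)


def neg_binomial_mean_var(r, p):
    """Return (mu, sigma^2) = (r/p, r*(1 - p)/p^2)."""
    return r / p, r * (1 - p) / p**2


r, p = 3, 0.2  # hypothetical: stop at the 3rd failing sample, 20% failure chance

print(round(neg_binomial_pmf(5, r, p), 5))  # 3rd failure occurs on the 5th test
mean, var = neg_binomial_mean_var(r, p)
print(round(mean, 2), round(var, 2))        # expected number of tests and variance

# With r = 1 the PMF reduces to the geometric form (1 - p)^(x - 1) * p.
print(abs(neg_binomial_pmf(4, 1, p) - (1 - p) ** 3 * p) < 1e-12)
```

Some software libraries instead count the number of failures before the $r$-th success, so it is worth checking which convention a given tool uses before comparing results.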

The Hypergeometric Distribution

Sampling Without Replacement

The Hypergeometric Distribution models sampling without replacement from a finite population. This is critical in quality control where inspecting an item destroys it or removes it from the batch, changing the probability for subsequent selections.

Hypergeometric Distribution

Unlike the Binomial distribution, where $p$ is constant (as when sampling with replacement), the Hypergeometric distribution is used when sampling without replacement from a finite population of size $N$ containing exactly $K$ successes (e.g., selecting 5 concrete cylinders from a batch of 50, of which 3 are known to be defective).

Hypergeometric Distribution Formula

Formula for calculating the probability of exactly $x$ successes in a sample of size $n$.

$$P(X = x) = \frac{\binom{K}{x} \binom{N-K}{n-x}}{\binom{N}{n}} \quad \text{for } \max(0,\, n - (N - K)) \le x \le \min(n, K)$$

Variables

Symbol | Description | Unit
$P(X = x)$ | Probability of exactly $x$ successes in the sample | -
$N$ | Total population size | -
$K$ | Total number of successes in the population | -
$n$ | Sample size | -
$x$ | Number of successes in the sample | -

Hypergeometric Mean and Variance

Formulas for the mean and variance of a hypergeometric distribution.

$$\mu = n \left(\frac{K}{N}\right) \quad \text{and} \quad \sigma^2 = n \left(\frac{K}{N}\right) \left(\frac{N-K}{N}\right) \left(\frac{N-n}{N-1}\right)$$

Variables

Symbol | Description | Unit
$\mu$ | Expected value or mean | -
$\sigma^2$ | Variance | -
$N$ | Total population size | -
$K$ | Total number of successes in the population | -
$n$ | Sample size | -
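
The cylinder example ($N = 50$, $K = 3$ defective, $n = 5$ sampled) can be worked through with a short Python sketch. The function names are illustrative, and the binomial comparison at the end is an added illustration that treats $p = K/N$ as if sampling were done with replacement.

```python
from math import comb


def hypergeometric_pmf(x, N, K, n):
    """P(X = x) = C(K, x) * C(N - K, n - x) / C(N, n) on the valid support."""
    if x < max(0, n - (N - K)) or x > min(n, K):
        return 0.0
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)


def hypergeometric_mean_var(N, K, n):
    """Return (mu, sigma^2), including the finite population factor (N - n)/(N - 1)."""
    p = K / N
    return n * p, n * p * (1 - p) * (N - n) / (N - 1)


N, K, n = 50, 3, 5  # batch of 50 cylinders, 3 defective, sample of 5

p_no_defects = hypergeometric_pmf(0, N, K, n)
mean, var = hypergeometric_mean_var(N, K, n)
print(round(p_no_defects, 4))         # sample contains no defective cylinder
print(round(mean, 4), round(var, 4))  # expected defectives in the sample

# Binomial value for comparison, treating p = K/N as constant (with replacement).
p_binomial = (1 - K / N) ** n
print(round(p_binomial, 4))
```

The factor $(N - n)/(N - 1)$ is less than 1 whenever $n > 1$, so the hypergeometric variance is smaller than the corresponding binomial variance $np(1-p)$.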

Interactive Simulation

[Interactive chart not reproduced: Discrete Probability Distributions Explorer. Displayed state: Binomial with $n = 10$ and $p = 0.50$, $P(X = x) = \binom{10}{x} (0.50)^x (0.50)^{10-x}$; mean $\mu = 5.00$, variance $\sigma^2 = 2.50$, standard deviation $\sigma = 1.58$.]

[Interactive chart not reproduced: Discrete Probability Distributions Sandbox, for comparing the geometric and hypergeometric distributions under different parameters to see how sampling without replacement alters success probabilities. Displayed state: success probability $p = 0.30$; mean 3.333, variance 7.778.]

Key Takeaways
  • Random Variables: Numerical values assigned to experimental outcomes.
  • Expected Value ($E[X]$): The long-run average of a discrete distribution.
  • Binomial: Used for independent trials with exactly two outcomes (success/failure) and constant probability $p$.
  • Poisson: Used for modeling the number of rare events occurring within a continuous interval (time, area, volume).
  • Geometric/Negative Binomial: Focuses on the number of trials needed to achieve a specified number of successes.
  • Hypergeometric: Used for finite populations when sampling without replacement (probability changes trial-to-trial).