Estimation

Learning Objectives

Understand the purpose of point estimation and the properties of good estimators.
Calculate and interpret confidence intervals for means, proportions, and variances.
Differentiate between confidence intervals, prediction intervals, and tolerance intervals.

Introduction to Estimation

In statistical inference, we use sample data to draw conclusions about the entire population. Estimation is the process of determining the most likely value (or range of values) for an unknown population parameter, such as the true mean compressive strength of a concrete mix ( $\mu$ ) or the true proportion of defective rivets in a bridge ( $\pi$ ).

Point Estimation

A point estimate is a single value calculated from sample data. For example, the sample mean $\bar{x}$ is a point estimate of the population mean $\mu$ . The sample variance $s^2$ is a point estimate of the population variance $\sigma^2$ .

Properties of a Good Estimator

Not all estimators are created equal. A good estimator should be:

Unbiased: The expected value of the estimator equals the true population parameter (e.g., $E[\bar{X}] = \mu$ ). If we take many samples, the average of our estimates converges to the true value, with no systematic overshoot or undershoot.
Minimum Variance (Efficient): Among all unbiased estimators, the one with the smallest variance (tightest spread) is preferred, because its estimates cluster most closely around the true value from sample to sample.
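Unbiasedness can be checked empirically. The sketch below is a small simulation with hypothetical values (a normal population with mean 30 and variance 4, samples of size 5). It compares the sample variance $s^2$ , which divides by $n-1$ , with the same sum of squares divided by $n$ . The constant names and the seed are illustrative assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_MEAN = 30.0      # hypothetical population mean (MPa)
TRUE_VARIANCE = 4.0   # hypothetical population variance (MPa^2)
SAMPLE_SIZE = 5
REPLICATIONS = 20_000

unbiased_estimates = []  # s^2 with divisor n - 1
biased_estimates = []    # same sum of squares with divisor n

for _ in range(REPLICATIONS):
    sample = [random.gauss(TRUE_MEAN, TRUE_VARIANCE ** 0.5) for _ in range(SAMPLE_SIZE)]
    unbiased_estimates.append(statistics.variance(sample))   # divides by n - 1
    biased_estimates.append(statistics.pvariance(sample))    # divides by n

mean_unbiased = statistics.fmean(unbiased_estimates)
mean_biased = statistics.fmean(biased_estimates)

print(f"average of s^2 (n - 1 divisor): {mean_unbiased:.3f}")
print(f"average with n divisor:         {mean_biased:.3f}")
```

The first average should land near the true variance of 4, while the second should land near $(n-1)/n \times 4 = 3.2$ , which is the systematic underestimate that the $n-1$ divisor corrects.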

Methods of Point Estimation

How do statisticians derive these formulas (like $\bar{x}$ or $s^2$ ) in the first place?

Method of Moments: Equates sample moments (like the sample mean or variance) to population moments to solve for unknown parameters.
Maximum Likelihood Estimation (MLE): Finds the parameter value that makes the observed sample data the most "likely" to have occurred. It is the most widely used general method for deriving estimators, and under standard regularity conditions its estimators are consistent and asymptotically efficient.
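As an illustration of the maximum likelihood idea, the sketch below evaluates the normal log-likelihood over a grid of candidate means and confirms that the maximizer coincides with the sample mean $\bar{x}$ . For the normal mean, the method of moments gives the same answer, since it equates the first population moment $\mu$ to $\bar{x}$ . The data, the known $\sigma$ , and the function name `normal_log_likelihood` are assumptions made for the example.

```python
import math
import statistics

def normal_log_likelihood(mu, data, sigma):
    """Log-likelihood of a normal sample for a candidate mean mu (sigma treated as known)."""
    n = len(data)
    sum_sq = sum((x - mu) ** 2 for x in data)
    return -n * math.log(sigma * math.sqrt(2 * math.pi)) - sum_sq / (2 * sigma ** 2)

# Hypothetical compressive strengths (MPa)
data = [30.1, 29.4, 31.2, 30.6, 29.7]
sigma = 1.0  # assumed known for this illustration

# Candidate values 29.0, 29.1, ..., 31.0
candidates = [round(29.0 + 0.1 * k, 1) for k in range(21)]
best_mu = max(candidates, key=lambda mu: normal_log_likelihood(mu, data, sigma))

print(f"grid maximizer of the likelihood: {best_mu}")
print(f"sample mean:                      {statistics.fmean(data):.1f}")
```

Both lines report 30.2. A grid search is used only to make the idea visible; in practice the maximizer is found analytically or with a numerical optimizer.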

Interval Estimation (Confidence Intervals)

Because sampling error means a point estimate will almost never equal the true parameter exactly, we construct a Confidence Interval (CI). A CI provides a range of values together with a level of confidence (e.g., 95%). The confidence level describes the procedure: if we repeated the sampling many times, about 95% of the intervals built this way would contain the true parameter.
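The meaning of a confidence level can be seen in a simulation. The sketch below draws many samples from a hypothetical normal population, builds a 95% interval from each using the known- $\sigma$ formula $\bar{x} \pm Z_{\alpha/2} \, \sigma / \sqrt{n}$ , and counts how often the interval contains the true mean. All numerical values and the seed are assumptions chosen for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the run is repeatable

TRUE_MEAN = 30.0   # hypothetical population mean
SIGMA = 2.0        # hypothetical known population standard deviation
N = 25             # sample size
REPLICATIONS = 5_000

z = NormalDist().inv_cdf(0.975)      # about 1.96 for 95% confidence
margin = z * SIGMA / math.sqrt(N)    # margin of error, the same for every sample

hits = 0
for _ in range(REPLICATIONS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    x_bar = fmean(sample)
    if x_bar - margin <= TRUE_MEAN <= x_bar + margin:
        hits += 1

coverage = hits / REPLICATIONS
print(f"fraction of intervals containing the true mean: {coverage:.3f}")
```

The reported fraction should be close to 0.95. Any single interval either contains the true mean or does not; the 95% refers to the long-run success rate of the method.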

General Confidence Interval Structure

The basic formula used to construct confidence intervals.

\text{Point Estimate} \pm (\text{Critical Value} \times \text{Standard Error})

Margin of Error

The term $(\text{Critical Value} \times \text{Standard Error})$ is called the Margin of Error ( $E$ ).

Confidence Interval for the Mean ( $\mu$ ) - Case 1: Population Variance ( $\sigma^2$ ) Known

If we know the true standard deviation $\sigma$ (rare in practice, but possible with extensive historical data), we use the Standard Normal ( $Z$ ) distribution.

CI for Mean (Variance Known)

Confidence interval when population variance is known.

\bar{x} \pm Z_{\alpha/2} \left( \frac{\sigma}{\sqrt{n}} \right)

Variables for CI Mean (Variance Known)

$\bar{x}$ : Sample mean
$Z_{\alpha/2}$ : Critical value from Standard Normal distribution
$\sigma$ : Population standard deviation
$n$ : Sample size
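A minimal sketch of this formula, using only the Python standard library. The function name `mean_ci_known_sigma` and the input values (a hypothetical mean strength of 32.0 MPa, $\sigma = 2.5$ MPa, $n = 25$ ) are illustrative assumptions.

```python
import math
from statistics import NormalDist

def mean_ci_known_sigma(x_bar, sigma, n, confidence=0.95):
    """Two-sided CI for the mean when the population standard deviation is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value Z_{alpha/2}
    margin = z * sigma / math.sqrt(n)         # margin of error E
    return x_bar - margin, x_bar + margin

# Hypothetical inputs: mean strength 32.0 MPa, sigma 2.5 MPa, 25 specimens
lower, upper = mean_ci_known_sigma(x_bar=32.0, sigma=2.5, n=25)
print(f"95% CI for the mean: [{lower:.2f}, {upper:.2f}]")  # [31.02, 32.98]
```

Here the standard error is $2.5 / \sqrt{25} = 0.5$ , so the margin of error is about $1.96 \times 0.5 = 0.98$ .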

Confidence Interval for the Mean ( $\mu$ ) - Case 2: Population Variance ( $\sigma^2$ ) Unknown

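This is the most common scenario. We must estimate $\sigma$ using the sample standard deviation $s$ . Because of this added uncertainty, we use Student's t-distribution with $n-1$ degrees of freedom, whose heavier tails give larger critical values and therefore a wider interval. The formula assumes the population is approximately normal, which matters most when $n$ is small.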

CI for Mean (Variance Unknown)

Confidence interval when population variance is unknown.

\bar{x} \pm t_{\alpha/2, n-1} \left( \frac{s}{\sqrt{n}} \right)

Variables for CI Mean (Variance Unknown)

$\bar{x}$ : Sample mean
$t_{\alpha/2, n-1}$ : Critical value from Student's t-distribution with $n-1$ degrees of freedom
$s$ : Sample standard deviation
$n$ : Sample size
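The sketch below applies the t-based formula to a small hypothetical sample. The Python standard library has no t-distribution, so the critical value is passed in as a number read from a t-table ( $t_{0.025, 4} = 2.776$ ). The function name and the data are assumptions made for the example.

```python
import math
import statistics

def mean_ci_unknown_sigma(sample, t_critical):
    """Two-sided CI for the mean using the sample standard deviation s."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)              # divides by n - 1
    margin = t_critical * s / math.sqrt(n)    # margin of error E
    return x_bar - margin, x_bar + margin

# Hypothetical strengths (MPa); n = 5, so there are 4 degrees of freedom
strengths = [28.0, 30.0, 32.0, 30.0, 30.0]
T_025_DF4 = 2.776   # t_{0.025, 4} from a standard t-table

lower, upper = mean_ci_unknown_sigma(strengths, T_025_DF4)
print(f"95% CI for the mean: [{lower:.2f}, {upper:.2f}]")  # [28.24, 31.76]
```

For this sample $\bar{x} = 30$ and $s = \sqrt{2} \approx 1.414$ , so the margin of error is $2.776 \times 1.414 / \sqrt{5} \approx 1.76$ . Using $Z_{0.025} = 1.96$ instead would give a margin of about 1.24, which understates the uncertainty for such a small sample.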

Large Sample Approximation

As sample size $n$ gets very large, the t-distribution converges to the Z-distribution, and the two formulas yield nearly identical results.
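The convergence can be checked against tabulated critical values. In the sketch below, the t values are copied from a standard t-table for a two-sided 95% interval (an assumption of this example, since the standard library does not compute them), and the Z value comes from `statistics.NormalDist`.

```python
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)   # Z_{0.025}, about 1.960

# Two-sided 95% t critical values t_{0.025, df}, copied from a standard t-table
t_table = {5: 2.571, 10: 2.228, 30: 2.042, 60: 2.000, 120: 1.980}

# Gap between the t and Z critical values for each degrees-of-freedom entry
gaps = {df: t - z for df, t in t_table.items()}

for df, t in t_table.items():
    print(f"df = {df:>3}: t = {t:.3f}, t - z = {gaps[df]:.3f}")
```

The gap shrinks from about 0.61 at 5 degrees of freedom to about 0.02 at 120, which is why the two formulas agree closely for large samples.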

Confidence Interval for the Difference Between Two Means ( $\mu_1 - \mu_2$ )

Used to compare the averages of two distinct populations (e.g., comparing the strength of concrete from two different suppliers).

CI for Difference of Means (Variances Known)

Confidence interval comparing two means with known variances.

(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}

Variables for CI for Difference of Means

$\bar{x}_1, \bar{x}_2$ : Sample means of populations 1 and 2
$Z_{\alpha/2}$ : Critical value from Standard Normal distribution
$\sigma_1^2, \sigma_2^2$ : Population variances of populations 1 and 2
$n_1, n_2$ : Sample sizes of populations 1 and 2
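A short sketch of the two-sample formula. The function name `diff_means_ci` and the supplier figures are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist

def diff_means_ci(x_bar1, x_bar2, var1, var2, n1, n2, confidence=0.95):
    """Two-sided CI for mu_1 - mu_2 when both population variances are known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    margin = z * standard_error
    diff = x_bar1 - x_bar2
    return diff - margin, diff + margin

# Hypothetical suppliers: means 32.0 and 30.5 MPa, variances 4.0 and 6.25 MPa^2
lower, upper = diff_means_ci(32.0, 30.5, var1=4.0, var2=6.25, n1=40, n2=50)
print(f"95% CI for mu_1 - mu_2: [{lower:.2f}, {upper:.2f}]")  # [0.57, 2.43]
```

Because this interval lies entirely above zero, the data in this hypothetical case would suggest that supplier 1 has the higher mean strength. An interval that contains zero would not support a difference at that confidence level.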

Confidence Interval for a Proportion ( $\pi$ or $p$ )

Used for categorical data (e.g., the percentage of structural beams failing an inspection). Let $p$ be the sample proportion (number of successes divided by sample size). If the sample size is large enough (both $np \ge 5$ and $n(1-p) \ge 5$ ), the sampling distribution of $p$ is approximately normal.

CI for a Single Proportion

Confidence interval for a single population proportion.

p \pm Z_{\alpha/2} \sqrt{\frac{p(1-p)}{n}}

Variables for CI for a Single Proportion

$p$ : Sample proportion
$Z_{\alpha/2}$ : Critical value from Standard Normal distribution
$n$ : Sample size
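The sketch below implements the proportion interval together with the large-sample check. The function name `proportion_ci` and the inspection counts are hypothetical.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation CI for a population proportion."""
    # n*p equals the success count and n*(1-p) equals the failure count
    if successes < 5 or n - successes < 5:
        raise ValueError("sample too small for the normal approximation")
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Hypothetical inspection: 18 of 200 beams fail
lower, upper = proportion_ci(successes=18, n=200)
print(f"95% CI for the proportion: [{lower:.4f}, {upper:.4f}]")  # [0.0503, 0.1297]
```

Here $p = 0.09$ , and both $np = 18$ and $n(1-p) = 182$ exceed 5, so the approximation is considered acceptable under the rule stated above.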

Confidence Interval for the Variance ( $\sigma^2$ )

This interval estimates the variability of a process, which is crucial for quality control. For a sample from a normal population, the quantity $(n-1)s^2/\sigma^2$ follows the Chi-Square ( $\chi^2$ ) distribution with $n-1$ degrees of freedom. This distribution is right-skewed (strongly so for small $n$ ), so the interval is not symmetric around $s^2$ .

CI for a Single Variance

Confidence interval for a single population variance.

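\frac{(n-1)s^2}{\chi^2_{\alpha/2, n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2, n-1}}

Variables for CI for a Single Variance

$s^2$ : Sample variance
$n$ : Sample size
$\chi^2_{\alpha/2, n-1}$ , $\chi^2_{1-\alpha/2, n-1}$ : Critical values from the Chi-Square distribution with $n-1$ degrees of freedom, where the subscript is the area to the right of the value (so $\chi^2_{\alpha/2, n-1}$ is the larger of the two)
$\sigma^2$ : Population variance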
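A minimal sketch of the variance interval. The Chi-Square critical values are supplied as numbers read from a table, here for 9 degrees of freedom and 95% confidence (19.023 with right-tail area 0.025, and 2.700 with right-tail area 0.975). The function name and the sample figures are assumptions for illustration.

```python
def variance_ci(s_squared, n, chi2_upper, chi2_lower):
    """CI for sigma^2; chi2_upper has right-tail area alpha/2, chi2_lower has 1 - alpha/2."""
    lower = (n - 1) * s_squared / chi2_upper   # larger critical value gives the lower bound
    upper = (n - 1) * s_squared / chi2_lower   # smaller critical value gives the upper bound
    return lower, upper

# Hypothetical sample: n = 10 specimens with sample variance 4.0 MPa^2
CHI2_025_DF9 = 19.023   # right-tail area 0.025, 9 degrees of freedom (table value)
CHI2_975_DF9 = 2.700    # right-tail area 0.975, 9 degrees of freedom (table value)

lower, upper = variance_ci(4.0, 10, CHI2_025_DF9, CHI2_975_DF9)
print(f"95% CI for the variance: [{lower:.2f}, {upper:.2f}]")  # [1.89, 13.33]
```

The asymmetry is visible in the result: the point estimate 4.0 sits about 2.1 above the lower bound but about 9.3 below the upper bound. Taking square roots of both bounds gives an interval for $\sigma$ .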

Prediction and Tolerance Intervals

While a confidence interval bounds a population parameter (like the mean $\mu$ ), engineers often need to bound future individual measurements.

Prediction Interval

Provides a range that is highly likely (e.g., 95% confidence) to contain a single future observation drawn from the same population. An individual observation varies more than a sample mean, so the prediction interval is always wider than the confidence interval for the mean: its half-width uses $s\sqrt{1 + 1/n}$ in place of $s/\sqrt{n}$ , a factor of $\sqrt{n+1}$ .

Prediction Interval Formula

Interval for a single future observation from a normal population.

\bar{x} \pm t_{\alpha/2, n-1} \cdot s \sqrt{1 + \frac{1}{n}}
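The sketch below computes the prediction interval for a small hypothetical sample and compares it with the confidence interval for the mean built from the same data. The t critical value ( $t_{0.025, 4} = 2.776$ ) is read from a t-table; the function name and data are illustrative assumptions.

```python
import math
import statistics

def prediction_interval(sample, t_critical):
    """Interval for one future observation from a normal population."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)
    half_width = t_critical * s * math.sqrt(1 + 1 / n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical strengths (MPa); n = 5, so there are 4 degrees of freedom
strengths = [28.0, 30.0, 32.0, 30.0, 30.0]
T_025_DF4 = 2.776   # t_{0.025, 4} from a standard t-table

pi_lower, pi_upper = prediction_interval(strengths, T_025_DF4)

# Confidence interval for the mean from the same sample, for comparison
ci_half_width = T_025_DF4 * statistics.stdev(strengths) / math.sqrt(len(strengths))
width_ratio = (pi_upper - pi_lower) / (2 * ci_half_width)

print(f"95% prediction interval: [{pi_lower:.2f}, {pi_upper:.2f}]")  # [25.70, 34.30]
print(f"PI width / CI width:     {width_ratio:.3f}")                 # sqrt(6) = 2.449
```

The ratio of the two widths equals $\sqrt{n+1}$ , so the gap between the intervals grows with sample size: the confidence interval shrinks toward zero width, while the prediction interval cannot become narrower than the spread of individual observations.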

Tolerance Interval

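Provides a range that is highly likely to contain a specified proportion of the entire population (e.g., 99% of all concrete batches produced). It captures the natural variability of the process. A tolerance interval is therefore specified by two numbers: the proportion of the population covered and the confidence level. For example, if a tolerance interval for concrete strength with 95% coverage and 95% confidence is [28 MPa, 35 MPa], we are 95% confident that at least 95% of all individual batches will fall in this range.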

Worked Example: 95% Confidence Interval for the Mean

Sample mean $\bar{x} = 2500$ , standard deviation $s = 300$ , sample size $n = 50$ , confidence level $1 - \alpha = 0.95$ .

Standard Error: $\sigma_{\bar{x}} = s / \sqrt{n} = 42.426$

Critical Value: $z_{\alpha/2} = z_{0.025} = 1.96$

\bar{x} \pm E = 2500.0 \pm 83.16

[2416.84, 2583.16]

Interpretation: We are 95% confident that the true population mean lies between 2416.84 and 2583.16. Increasing the sample size $n$ reduces the standard error, making the interval narrower.

Sample Size Determination

The minimum sample size required to estimate a population mean within a target margin of error $E$ follows from solving the margin of error formula for $n$ :

n = \left(\frac{Z_{\alpha/2} \cdot \sigma}{E}\right)^2

The result is rounded up to the next whole number. For example, with $E = 0.050$ , $\sigma = 0.25$ , and $Z_{\alpha/2} = 1.96$ (95% confidence), the formula gives $n = 96.04$ , so the required sample size is $n = 97$ .
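The sample size formula $n = (Z_{\alpha/2} \cdot \sigma / E)^2$ can be sketched in a few lines. The function name `sample_size_for_mean` is an assumption; the inputs reproduce the values shown above ( $E = 0.050$ , $\sigma = 0.25$ ) at an assumed 95% confidence level.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin_of_error, confidence=0.95):
    """Smallest n for which Z_{alpha/2} * sigma / sqrt(n) does not exceed the target margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin_of_error) ** 2)   # always round up

n_required = sample_size_for_mean(sigma=0.25, margin_of_error=0.05)
print(n_required)  # 97
```

Rounding up rather than to the nearest integer matters: rounding 96.04 down to 96 would leave the margin of error slightly above the target. Because $n$ grows with $1/E^2$ , halving the margin of error roughly quadruples the required sample size.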

Key Takeaways

Point Estimate: A single value (e.g., $\bar{x}$ , $s^2$ ) used to estimate a population parameter ( $\mu$ , $\sigma^2$ ). Good estimators are unbiased and have minimum variance.
Confidence Interval (CI): A range of plausible values for a population parameter, incorporating a margin of error based on a chosen confidence level (e.g., 95%).
Mean CI (Unknown $\sigma$ ): The most common scenario; uses the sample standard deviation $s$ and the $t$ -distribution.
Proportion CI: Used for categorical (success/failure) data; relies on the normal approximation ( $Z$ -distribution) for large samples.
Variance CI: Asymmetric interval built using the Chi-Square ( $\chi^2$ ) distribution.
Prediction vs. Confidence: A CI bounds the population mean, while a Prediction Interval bounds a single future value and is therefore always wider.
