Statistical Hydrology
Learning Objectives
- Understand the concepts of Return Period and Exceedance Probability for hydrologic events.
- Calculate Risk and Reliability for hydraulic structures over their design life.
- Perform Frequency Analysis using general frequency equations.
- Apply probability distributions like Gumbel's, Log-Pearson Type III, and Log-Normal to historical data.
- Utilize Plotting Positions, Confidence Limits, and L-Moments for robust statistical estimation.
- Define and understand the application of the Probable Maximum Flood (PMF).
Applying probability theory to hydrologic events to predict return periods, risk, and frequencies.
Introduction
Hydrologic events (floods, droughts, storms) are stochastic (random) in nature. Statistical Hydrology uses probability theory to analyze historical data and predict the likelihood of future extreme events.
Return Period ($T$)
Return Period (Recurrence Interval)
The average time interval between events equal to or exceeding a certain magnitude ($x_T$).
Return Period vs. Probability
$$T = \frac{1}{P}$$
Exceedance Probability ($P$)
The probability that an event of magnitude $x_T$ or greater will occur in any given year. For example:
- 100-year flood: $T = 100$, so $P = 1/100 = 0.01$ (1% chance of occurring in any single year).
- 50-year flood: $T = 50$, so $P = 1/50 = 0.02$ (2% chance).
Risk ($R$)
The probability that an event with return period $T$ will occur at least once in a project life of $n$ years.
Risk Equation
$$R = 1 - (1 - P)^n = 1 - \left(1 - \frac{1}{T}\right)^n$$
Calculates the probability that an event with return period $T$ occurs at least once in $n$ years.
Variables
| Symbol | Description | Unit |
|---|---|---|
| $R$ | Risk of occurrence | dimensionless |
| $P$ | Exceedance probability | dimensionless |
| $n$ | Project life | years |
| $T$ | Return period | years |
Reliability
The probability that the event will not occur in $n$ years.
Reliability Equation
$$R_e = 1 - R = (1 - P)^n$$
Calculates the probability that an event will not occur in a project life of $n$ years.
Variables
| Symbol | Description | Unit |
|---|---|---|
| $R_e$ | Probability of non-occurrence (reliability) | dimensionless |
| $R$ | Risk | dimensionless |
| $P$ | Exceedance probability | dimensionless |
| $n$ | Project life | years |
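The risk and reliability relations can be checked numerically with a short sketch. The function names below are illustrative choices, not part of any library; the formulas are the ones stated in this section, with $P = 1/T$.

```python
def exceedance_probability(return_period: float) -> float:
    """Annual exceedance probability P = 1/T."""
    if return_period < 1:
        raise ValueError("return period must be at least 1 year")
    return 1.0 / return_period


def risk(return_period: float, years: int) -> float:
    """Probability that the T-year event occurs at least once in `years`."""
    p = exceedance_probability(return_period)
    return 1.0 - (1.0 - p) ** years


def reliability(return_period: float, years: int) -> float:
    """Probability that the T-year event does not occur in `years`."""
    return 1.0 - risk(return_period, years)


if __name__ == "__main__":
    # Risk of meeting or exceeding each design event over a 50-year life.
    for T in (10, 50, 100, 500):
        print(f"T = {T:>3} yr: risk = {risk(T, 50):.3f}, "
              f"reliability = {reliability(T, 50):.3f}")
```

For a 100-year event and a 50-year project life, the risk comes out at roughly 39.5%, which shows why a "100-year" design event is far from unlikely over the life of a structure.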
Frequency Analysis
Frequency analysis relates the magnitude of extreme events to their frequency of occurrence using probability distributions.
General Frequency Equation
$$x_T = \bar{x} + K_T\,\sigma$$
Relates the magnitude of extreme events to their frequency of occurrence.
Variables
| Symbol | Description | Unit |
|---|---|---|
| $x_T$ | Value of variate with return period $T$ (e.g., peak discharge) | varies |
| $\bar{x}$ | Mean of the data series | varies |
| $\sigma$ | Standard deviation of the data series | varies |
| $K_T$ | Frequency factor, depends on the probability distribution and $T$ | dimensionless |
- Gumbel's Extreme Value Distribution (Type I)
Commonly used for flood frequency analysis.
Gumbel's Frequency Factor
$$K_T = \frac{y_T - \bar{y}_n}{S_n}$$
Calculates the frequency factor for Gumbel's extreme value distribution.
Variables
| Symbol | Description | Unit |
|---|---|---|
| $K_T$ | Frequency factor | dimensionless |
| $y_T$ | Reduced variate for return period $T$ | dimensionless |
| $\bar{y}_n$ | Reduced mean, dependent on sample size $N$ | dimensionless |
| $S_n$ | Reduced standard deviation, dependent on sample size $N$ | dimensionless |
Reduced Variate ($y_T$)
$$y_T = -\ln\left[\ln\left(\frac{T}{T-1}\right)\right]$$
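A minimal sketch of a Gumbel estimate follows, combining the reduced variate $y_T = -\ln[\ln(T/(T-1))]$, the frequency factor, and the general frequency equation. Two assumptions are made here: the reduced mean and reduced standard deviation are computed from Weibull plotting positions rather than read from a table (this reproduces the commonly tabulated values, about 0.4952 and 0.9496 for a 10-year record), and the `annual_peaks` values are hypothetical illustration data, not a real gauge record.

```python
import math
import statistics


def reduced_variate(return_period: float) -> float:
    """Gumbel reduced variate y_T = -ln(ln(T / (T - 1)))."""
    return -math.log(math.log(return_period / (return_period - 1.0)))


def reduced_mean_and_std(n: int) -> tuple[float, float]:
    """Reduced mean and reduced standard deviation for sample size n.

    Assumption: values are built from Weibull plotting positions i/(n+1)
    instead of being looked up in a printed table.
    """
    y = [-math.log(-math.log(i / (n + 1.0))) for i in range(1, n + 1)]
    return statistics.fmean(y), statistics.pstdev(y)


def gumbel_quantile(peaks: list[float], return_period: float) -> float:
    """Estimate x_T = mean + K_T * sigma for an annual maximum series."""
    mean = statistics.fmean(peaks)
    sigma = statistics.stdev(peaks)  # sample standard deviation (n - 1)
    y_n, s_n = reduced_mean_and_std(len(peaks))
    k_t = (reduced_variate(return_period) - y_n) / s_n
    return mean + k_t * sigma


if __name__ == "__main__":
    # Hypothetical annual peak discharges in m^3/s (illustration only).
    annual_peaks = [1200, 1850, 950, 2400, 1600, 3100, 1400, 2050, 1750, 2800]
    for T in (10, 50, 100):
        print(f"{T}-year flood estimate: {gumbel_quantile(annual_peaks, T):.0f}")
```

For long records the reduced mean and standard deviation approach about 0.577 and 1.2825, so the sample-size correction matters most for short series.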
- Log-Pearson Type III Distribution
The standard method for flood frequency analysis in the United States (USGS Bulletin 17B/17C). It applies the general frequency equation to the logarithms of the discharge values ($\log_{10} x$).
Log-Pearson III Equation
$$\log_{10} x_T = \overline{\log_{10} x} + K_T\,\sigma_{\log x}$$
Applies the general frequency equation to the logarithms of the discharge values.
Variables
| Symbol | Description | Unit |
|---|---|---|
| $\log_{10} x_T$ | Logarithm of the variate value with return period $T$ | dimensionless |
| $\overline{\log_{10} x}$ | Mean of the log-transformed data | dimensionless |
| $K_T$ | Frequency factor, function of return period $T$ and skewness coefficient $C_s$ | dimensionless |
| $\sigma_{\log x}$ | Standard deviation of the log-transformed data | dimensionless |
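The Log-Pearson Type III calculation can be sketched as below. In practice the frequency factor is usually read from published tables; this sketch instead uses the Wilson-Hilferty approximation, which is an assumption made here for self-containment and is reasonable only for moderate skew. It also uses the station skew of the sample alone and does not attempt the weighted-skew or outlier procedures of Bulletin 17B/17C. The `annual_peaks` values are hypothetical.

```python
import math
import statistics


def lp3_frequency_factor(return_period: float, skew: float) -> float:
    """Approximate Pearson III frequency factor (Wilson-Hilferty)."""
    # Standard normal deviate with non-exceedance probability 1 - 1/T.
    z = statistics.NormalDist().inv_cdf(1.0 - 1.0 / return_period)
    if abs(skew) < 1e-6:
        return z  # zero skew: the factor is the standard normal deviate
    k = skew / 6.0
    return (2.0 / skew) * ((1.0 + k * z - k * k) ** 3 - 1.0)


def log_pearson3_quantile(peaks: list[float], return_period: float) -> float:
    """Estimate x_T by applying the frequency equation to log10 of the data."""
    logs = [math.log10(q) for q in peaks]
    n = len(logs)
    mean = statistics.fmean(logs)
    sigma = statistics.stdev(logs)
    # Sample (station) skewness coefficient of the log-transformed data.
    skew = n * sum((v - mean) ** 3 for v in logs) / ((n - 1) * (n - 2) * sigma ** 3)
    k_t = lp3_frequency_factor(return_period, skew)
    return 10.0 ** (mean + k_t * sigma)


if __name__ == "__main__":
    # Hypothetical annual peak discharges in m^3/s (illustration only).
    annual_peaks = [1200, 1850, 950, 2400, 1600, 3100, 1400, 2050, 1750, 2800]
    for T in (2, 10, 100):
        print(f"{T}-year flood estimate: {log_pearson3_quantile(annual_peaks, T):.0f}")
```

With a skewness coefficient of zero the frequency factor reduces to the standard normal deviate, so the same function also covers a log-normal fit.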
- Log-Normal Distribution
A special case of the Log-Pearson Type III distribution where the skewness coefficient of the logarithmic data is exactly zero ($C_s = 0$).
Log-Normal Equation
$$\ln x_T = \overline{\ln x} + z_T\,\sigma_{\ln x}$$
A special case of Log-Pearson III where skewness is zero.
Variables
| Symbol | Description | Unit |
|---|---|---|
| $\ln x_T$ | Variate value ($\ln x$) for return period $T$ | dimensionless |
| $\overline{\ln x}$ | Mean of the logarithms | dimensionless |
| $z_T$ | Standard normal deviate for return period $T$ | dimensionless |
| $\sigma_{\ln x}$ | Standard deviation of the logarithms | dimensionless |
Note: The actual variate value is then calculated as $x_T = \exp\left(\overline{\ln x} + z_T\,\sigma_{\ln x}\right)$.
Plotting Positions
To graphically plot a probability distribution from empirical data, the data points (e.g., annual peak floods) must be ranked in descending order ($m = 1$ is the largest event). An empirical exceedance probability ($P$) is then assigned to each rank using a plotting position formula.
Weibull Plotting Position
$$P = \frac{m}{N + 1}$$
Assigns an empirical exceedance probability to ranked historical data.
Variables
| Symbol | Description | Unit |
|---|---|---|
| $P$ | Empirical exceedance probability | dimensionless |
| $m$ | Rank of the event in descending order | dimensionless |
| $N$ | Total number of years of record | years |
The corresponding Return Period is calculated as $T = 1/P = (N + 1)/m$. Other formulas include Gringorten and Cunnane.
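The ranking and Weibull assignment described above can be sketched in a few lines. The function name and the sample record are hypothetical; only the Weibull formula $P = m/(N+1)$ and $T = 1/P$ come from this section.

```python
def weibull_plotting_positions(peaks: list[float]) -> list[tuple[int, float, float, float]]:
    """Return (rank, value, exceedance probability, return period) per event."""
    ranked = sorted(peaks, reverse=True)  # rank m = 1 is the largest event
    n = len(ranked)
    rows = []
    for m, value in enumerate(ranked, start=1):
        p = m / (n + 1)   # Weibull plotting position
        rows.append((m, value, p, 1.0 / p))
    return rows


if __name__ == "__main__":
    # Hypothetical nine-year record of annual peak discharges (m^3/s).
    record = [1200, 1850, 950, 2400, 1600, 3100, 1400, 2050, 1750]
    print(" m   value      P      T")
    for m, value, p, t in weibull_plotting_positions(record):
        print(f"{m:>2} {value:>7.0f} {p:>6.3f} {t:>6.2f}")
```

Note that with $N$ years of record the largest observed event is assigned a return period of only $N + 1$ years, which is why a fitted distribution is needed to extrapolate to rarer events.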
Confidence Limits
Statistical estimates have inherent uncertainty because they are based on a finite sample of historical data. Confidence limits provide a range within which the true value is expected to lie with a specified probability (e.g., 95% confidence).
Standard Error
The standard error of estimate quantifies the uncertainty in the calculated magnitude $x_T$. The confidence interval is typically $x_T \pm z_c\,S_e$, where $z_c$ is the standard normal variate for the desired confidence level, and $S_e$ is the standard error.
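As an illustration, the sketch below computes confidence limits for a Gumbel estimate. The standard error expression used, $S_e = b\,\sigma/\sqrt{N}$ with $b = \sqrt{1 + 1.3K + 1.1K^2}$, is the form commonly quoted for Gumbel's distribution; it is an assumption added here, since the text above does not give a specific formula, and other distributions need a different expression. The numbers in the example are hypothetical.

```python
import math
import statistics


def gumbel_standard_error(k_t: float, sigma: float, n: int) -> float:
    """Standard error of a Gumbel estimate: S_e = b * sigma / sqrt(N)."""
    b = math.sqrt(1.0 + 1.3 * k_t + 1.1 * k_t ** 2)
    return b * sigma / math.sqrt(n)


def confidence_limits(x_t: float, s_e: float, confidence: float = 0.95) -> tuple[float, float]:
    """Two-sided limits x_T -/+ z_c * S_e for the given confidence level."""
    z_c = statistics.NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return x_t - z_c * s_e, x_t + z_c * s_e


if __name__ == "__main__":
    # Hypothetical inputs: frequency factor 2.0, sigma 100 m^3/s, 25-year record.
    s_e = gumbel_standard_error(k_t=2.0, sigma=100.0, n=25)
    low, high = confidence_limits(x_t=1000.0, s_e=s_e)
    print(f"S_e = {s_e:.1f}, 95% limits = ({low:.0f}, {high:.0f})")
```

Because $S_e$ scales with $1/\sqrt{N}$, quadrupling the record length only halves the width of the interval.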
L-Moments in Hydrology
Traditional product moments (mean, variance, skewness) are highly sensitive to outliers, especially in the small datasets that are common in flood records. L-moments are an advanced statistical tool used to estimate distribution parameters more robustly.
Advantages of L-Moments
L-moments are linear combinations of probability weighted moments (PWMs). Because they are linear, they do not square or cube the data values, making them far less susceptible to the influence of extreme outliers compared to traditional variance or skewness. They provide more reliable parameter estimates for distributions like the Generalized Extreme Value (GEV) distribution.
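To make the "linear combinations of PWMs" idea concrete, the sketch below computes the first three sample L-moments from unbiased probability weighted moment estimators. The function name is illustrative, and the two small datasets are hypothetical; the second replaces the largest value with an outlier to show how the L-skewness responds.

```python
def sample_l_moments(data: list[float]) -> tuple[float, float, float]:
    """Return (l1, l2, t3): L-location, L-scale and L-skewness of a sample."""
    x = sorted(data)  # ascending order; index i is rank minus one
    n = len(x)
    if n < 3:
        raise ValueError("at least three values are needed")
    # Unbiased sample probability weighted moments b0, b1, b2.
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    # L-moments are linear combinations of the PWMs.
    l1 = b0
    l2 = 2.0 * b1 - b0
    l3 = 6.0 * b2 - 6.0 * b1 + b0
    return l1, l2, l3 / l2


if __name__ == "__main__":
    print(sample_l_moments([1, 2, 3, 4, 5]))   # symmetric sample
    print(sample_l_moments([1, 2, 3, 4, 50]))  # same sample with one outlier
```

Every term is a weighted sum of the ordered data, with no squaring or cubing, and the L-skewness is bounded between -1 and 1 regardless of how extreme a single value is.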
Probable Maximum Flood (PMF)
Probable Maximum Flood (PMF)
The most severe flood considered physically possible in a particular drainage basin, based on comprehensive hydrometeorological analysis of maximum precipitation and hydrologic factors favorable for maximum runoff.
Unlike a 100-year or 500-year flood derived from statistical frequency analysis, the PMF is an absolute theoretical upper bound. It is generated by routing the Probable Maximum Precipitation (PMP) through the basin's hydrologic model, assuming worst-case antecedent soil moisture conditions and peak snowmelt (if applicable).
Design Application
The PMF is primarily used for designing the spillways of high-hazard dams, where structural failure would result in unacceptable loss of human life and catastrophic downstream damage. By designing for the PMF, engineers aim to prevent overtopping under any reasonably foreseeable physical conditions, reducing the risk of hydrologic failure to a negligible level.
Risk and Reliability
When designing hydraulic structures, engineers must assess the probability that a design event will be exceeded over the lifetime of the structure.
Risk Equation
$$R = 1 - (1 - P)^n$$
Calculates the risk for a design event over the lifetime of a structure.
Variables
| Symbol | Description | Unit |
|---|---|---|
| $R$ | Risk, probability that the event will occur at least once in $n$ years | dimensionless |
| $P$ | Probability of occurrence in any single year ($P = 1/T$) | dimensionless |
| $n$ | Design life of the structure | years |
Reliability
Reliability is the probability that the structure will not fail (i.e., the design event will not be exceeded) during its design life. It is simply $R_e = 1 - R = (1 - P)^n$.
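In design practice the risk equation is often used in reverse: given an acceptable risk over the design life, solve for the return period of the design event. Rearranging $R = 1 - (1 - 1/T)^n$ gives $T = 1 / \left[1 - (1 - R)^{1/n}\right]$. The sketch below applies that rearrangement; the function name is a hypothetical choice.

```python
def required_return_period(acceptable_risk: float, design_life_years: int) -> float:
    """Return period T whose risk over the design life equals acceptable_risk."""
    if not 0.0 < acceptable_risk < 1.0:
        raise ValueError("acceptable_risk must lie strictly between 0 and 1")
    # Annual exceedance probability that yields the target risk in n years.
    p = 1.0 - (1.0 - acceptable_risk) ** (1.0 / design_life_years)
    return 1.0 / p


if __name__ == "__main__":
    for accepted in (0.50, 0.10, 0.01):
        t = required_return_period(accepted, 50)
        print(f"risk {accepted:.0%} over 50 years -> design return period ~ {t:.0f} years")
```

Accepting a 10% risk over a 50-year life, for example, calls for a design event with a return period of roughly 475 years, far longer than the design life itself.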
- Hydrologic events cannot be predicted with absolute certainty due to their inherent randomness.
- Statistical Hydrology applies probability theory to historical data to estimate the likelihood and magnitude of future extreme events (floods, droughts).
- Return Period ($T$) is the statistical average time interval between occurrences of an event of a specific magnitude.
- It is the mathematical inverse of the Annual Exceedance Probability ($P$): $T = 1/P$.
- Risk ($R$) is the probability that an event will occur at least once during a project's design life ($n$).
- Even a 100-year flood has a 1% chance of occurring in any given year, meaning it could theoretically happen in consecutive years.
- Frequency Analysis fits historical data to theoretical probability distributions to extrapolate extreme events beyond the recorded timeframe.
- The General Frequency Equation ($x_T = \bar{x} + K_T\,\sigma$) offsets the mean $\bar{x}$ by the product of a frequency factor $K_T$ and the standard deviation $\sigma$.
- Gumbel's Extreme Value Type I is traditionally used for maximum annual flood series.
- The Log-Pearson Type III distribution is the standard method mandated by US federal agencies for flood frequency analysis.
- Plotting Positions like the Weibull Formula ($P = m/(N + 1)$) assign empirical probabilities to ranked historical data for graphical comparison against theoretical distributions.
- Statistical estimates are uncertain because they rely on finite historical sample sizes.
- Confidence Limits define a range within which the true magnitude of an event is expected to lie at a stated confidence level (e.g., 95%).
- The width of the confidence interval depends on the Standard Error ($S_e$), which decreases as the length of the historical data record increases.
- The Probable Maximum Flood (PMF) is the absolute physical upper limit of flooding for a basin, derived deterministically from the PMP, rather than statistically.
- High-hazard dam spillways are designed to safely pass the PMF so that the risk of catastrophic overtopping is reduced to a negligible level.