Systems of Linear Equations
Learning Objectives
- Understand the matrix representation of linear systems and basic solutions like Cramer's rule.
- Perform direct methods such as Gauss Elimination, Gauss-Jordan, and LU Decomposition.
- Implement pivoting strategies to mitigate division by zero and round-off errors.
- Evaluate the condition of a matrix using vector and matrix norms, and understand ill-conditioning.
- Solve systems utilizing SVD and the Thomas Algorithm.
- Apply iterative refinement and iterative methods (Jacobi, Gauss-Seidel, SOR) for large sparse matrices.
System of Linear Equations
A collection of one or more linear equations involving the same set of variables. In structural engineering, this often appears as the stiffness matrix equation .
Solving systems of simultaneous linear algebraic equations is essential in many fields of engineering. A general system can be represented in matrix form as . For example, in structural analysis, determining displacements at various nodes requires solving large systems where is the global stiffness matrix, is the vector of unknown nodal displacements, and is the vector of applied forces.
Matrix Fundamentals and Cramer's Rule
Before employing numerical methods, it is crucial to understand basic matrix operations (addition, multiplication) and properties (determinant, trace). A system has a unique solution if and only if the determinant of the coefficient matrix is non-zero ().
Cramer's Rule
An explicit formula for the solution of a system of linear equations with as many equations as unknowns, valid whenever the system has a unique solution.
Cramer's Rule
A direct analytical method that expresses the solution in terms of determinants. While conceptually simple, Cramer's rule is computationally impractical for systems larger than due to the factorial growth in computing determinants ( operations).
Cramer's Rule
Calculates the unknown using determinants.
Variables
| Symbol | Description | Unit |
|---|---|---|
| The -th unknown variable | - | |
| Determinant of matrix with the -th column replaced by vector | - | |
| Determinant of the original coefficient matrix | - |
Gauss Elimination
Gauss elimination is a direct method that involves two phases: forward elimination and back substitution.
Gauss Elimination Steps
- Forward Elimination: Manipulate the equations to eliminate unknowns and create an upper triangular system.
- Back Substitution: Solve for the unknowns starting from the last equation and working backward to the first.
Computational Complexity (FLOPs)
To compare algorithms, we count the floating-point operations (FLOPs), primarily multiplications and divisions.
- Forward Elimination: Requires approximately FLOPs.
- Back Substitution: Requires approximately FLOPs.
- Total: Thus, Gauss Elimination is an operation, specifically dominated by the multiplications. This cubic scaling means doubling the size of a system multiplies the execution time by eight.
Gauss Elimination Pitfalls
Gauss elimination is susceptible to division by zero and round-off errors. Pivoting (swapping rows so that the largest available element is used as the pivot) is often necessary to avoid these issues.
Interactive Simulation
Explore the steps of Forward Elimination and Back Substitution using the simulation below.
Gauss Elimination Process
Step-by-step visualization of solving a 3x3 system of linear equations using naive Gauss elimination.
Initial Augmented Matrix [A|B]
| 3.0000 | -0.1000 | -0.2000 | 7.8500 | |||
| 0.1000 | 7.0000 | -0.3000 | -19.3000 | |||
| 0.3000 | -0.2000 | 10.0000 | 71.4000 |
iCurrent Action
Identify pivot element a_11 (3).
Solution Vector x
Pivoting Strategies
Pivoting
The process of rearranging rows (partial pivoting) or both rows and columns (complete pivoting) of a matrix to place the largest possible absolute value on the diagonal before an elimination step.
In Gauss elimination, division by zero or a very small number can lead to severe round-off errors. Pivoting is a technique to rearrange the rows or columns of the coefficient matrix to place a larger absolute value in the diagonal pivot position.
Types of Pivoting
- Partial Pivoting: The most common approach. Before eliminating variables in column , it searches the remaining rows for the largest absolute element and swaps row with that row. This ensures the multiplier is always , minimizing error growth.
- Complete Pivoting: Searches both the remaining rows and columns for the largest absolute element to use as the pivot. This involves swapping both rows and columns. While numerically more stable, the extensive search and bookkeeping make it computationally expensive and rarely used compared to partial pivoting.
Gauss-Jordan
A variation of Gauss elimination where, when an unknown is eliminated, it is eliminated from all other equations rather than just the subsequent ones. The result is an identity matrix, making back substitution unnecessary. However, the procedure requires roughly more operations ( multiplications) than standard Gauss Elimination, making it less efficient for solving standard systems.
Matrix Inversion
The inverse of a square matrix , denoted as , is a matrix such that , where is the identity matrix. If the inverse is known, the solution to the system is simply .
Gauss-Jordan for Inversion
Gauss-Jordan elimination can be used to determine the inverse of a matrix by augmenting the original matrix with the identity matrix and performing the elimination steps.
LU Decomposition
LU decomposition separates the time-consuming elimination of the matrix from the manipulations of the right-hand side . This is enormously advantageous when solving the same system for many different load vectors because the expensive elimination step only happens once.
LU Decomposition Process
The matrix is decomposed into a lower triangular matrix and an upper triangular matrix .
The system becomes . We can then solve for in using forward substitution, and then solve for in using back substitution.
There are specific algorithms for computing the LU factors:
- Doolittle's Method: Specifies that all diagonal elements of the lower triangular matrix must be (i.e., ).
- Crout's Method: Specifies that all diagonal elements of the upper triangular matrix must be (i.e., ).
LU Decomposition Equation
Represents a matrix as the product of lower and upper triangular matrices.
Variables
| Symbol | Description | Unit |
|---|---|---|
| Original coefficient matrix | - | |
| Lower triangular matrix | - | |
| Upper triangular matrix | - |
Cholesky Decomposition
If the matrix is symmetric and positive definite, it can be decomposed into a lower triangular matrix and its transpose. This method is highly efficient, requiring roughly operations, about half the execution time and storage of standard LU decomposition.
Cholesky Decomposition Equation
Decomposes a symmetric positive-definite matrix into a lower triangular matrix and its transpose.
Variables
| Symbol | Description | Unit |
|---|---|---|
| Symmetric, positive-definite coefficient matrix | - | |
| Lower triangular matrix | - | |
| Transpose of the lower triangular matrix | - |
Singular Value Decomposition (SVD)
SVD is one of the most powerful matrix factorization techniques, particularly useful for non-square or ill-conditioned matrices where LU or Cholesky decomposition fails.
Singular Value Decomposition Concepts
Any matrix can be factored into three matrices.
SVD can easily compute the pseudoinverse of a matrix to solve least-squares problems and explicitly determine the rank, null space, and condition number of (where ).
SVD Formula
Factors an matrix into singular vectors and singular values.
Variables
| Symbol | Description | Unit |
|---|---|---|
| Original matrix | - | |
| orthogonal matrix of left singular vectors | - | |
| diagonal matrix of singular values | - | |
| Transpose of an orthogonal matrix of right singular vectors | - |
Thomas Algorithm for Tridiagonal Systems
Many engineering problems, particularly those involving 1D partial differential equations (like the heat equation) or cubic splines, result in tridiagonal systems of linear equations. A tridiagonal matrix has non-zero elements only on the main diagonal, the first diagonal below this, and the first diagonal above this.
Thomas Algorithm
The Thomas algorithm is a highly efficient, simplified form of Gaussian elimination specifically designed for tridiagonal systems. It operates in two passes:
- Forward Elimination: Eliminates the lower diagonal elements.
- Backward Substitution: Solves for the unknowns starting from the last equation.
Because it avoids operations on known zero elements, the computational cost is strictly proportional to the number of equations () rather than for standard Gaussian elimination. This allows solving millions of equations in milliseconds.
Vector Norms
To quantify the "size" of a matrix, we first must define the "length" or "magnitude" of a vector in higher dimensions using a vector norm.
Vector Norm
A function that assigns a strictly positive length or size to each vector in a vector space, measuring its magnitude.
Common Vector Norms
A vector norm must satisfy basic scaling and triangle inequality rules. Common examples include:
- Euclidean Norm ( norm): The standard physical length of a vector, defined as .
- Sum Norm ( norm): The sum of the absolute values of the elements, .
- Max Norm ( norm): The single largest absolute value element in the vector, .
Matrix Norms
Building on vector norms, matrix norms quantify how much a matrix can stretch or amplify a vector. A matrix norm is a real number that provides a measure of the "size" or magnitude of the matrix.
Matrix Norm
A quantity representing the "size" or stretching power of a matrix, extending the concept of vector norms to matrices.
Common Matrix Norms
- Row-Sum Norm (): The maximum of the sums of the absolute values of the elements in each row.
- Column-Sum Norm (): The maximum of the sums of the absolute values of the elements in each column.
- Spectral Norm (): The square root of the largest eigenvalue of .
The formal Condition Number is defined as . If , the matrix is well-conditioned. If , it is ill-conditioned.
Ill-Conditioning
Condition Number
A measure of how sensitive the solution of a linear system is to small changes or errors in the input data. A large condition number indicates an ill-conditioned system.
A system is ill-conditioned if small changes in the coefficients (e.g., due to round-off errors) result in large changes in the solution. This implies the matrix is close to being singular (determinant near zero). The condition number quantifies this: a large condition number indicates an ill-conditioned system. The condition number is formally connected to matrix norms as , and in SVD terms, as the ratio of the largest to smallest singular value. Techniques like pivoting and using higher precision arithmetic are required to mitigate these effects.
Sparse Matrices
Sparse Matrix
A matrix in which most of the elements are zero. In contrast, if most of the elements are non-zero, then the matrix is considered dense.
A matrix is sparse if a vast majority of its elements are zero. Typical examples arise in the finite difference and finite element methods (like tridiagonal and banded matrices).
Sparse Matrix Techniques
Instead of storing elements, sparse matrix representations (like Compressed Sparse Row, CSR) store only the non-zero elements and their indices. This drastically reduces memory usage from to where is the number of non-zeros. Iterative methods (like Conjugate Gradient or Gauss-Seidel) are particularly well-suited for sparse systems because they only require matrix-vector multiplication, preserving the sparsity structure.
Iterative Refinement
Iterative refinement is a technique used to improve the accuracy of solutions obtained by direct methods (like LU decomposition) when dealing with ill-conditioned systems, essentially using double precision to polish a single-precision result.
Iterative Refinement Steps
- Calculate an initial approximate solution using a direct method.
- Compute the residual vector using higher precision arithmetic.
- Solve the correction system using the already computed LU factors.
- Update the solution: .
This process can be repeated until the correction becomes negligible.
Iterative Methods
Iterative Method
A mathematical procedure that uses an initial guess to generate a sequence of improving approximate solutions for a class of problems, in which the -th approximation is derived from the previous ones.
Unlike direct methods, iterative methods start with a guess and continually refine the estimate. They are particularly useful for large systems containing many zero elements (sparse matrices).
Jacobi and Gauss-Seidel Methods
- Jacobi Method: Computes all new values using only the old values from the previous iteration.
- Gauss-Seidel Method: Uses the most recently computed values as soon as they are available, generally converging faster than the Jacobi method.
Gauss-Seidel Iteration Formula
Computes the new value using the most recently computed values.
Variables
| Symbol | Description | Unit |
|---|---|---|
| New estimate of the -th unknown at iteration | - | |
| Recently computed estimates in the current iteration | - | |
| Old estimates from the previous iteration | - | |
| Coefficient of the system matrix | - | |
| Constant term of the -th equation | - |
Convergence Requirements
For both methods to converge reliably, the system must typically be diagonally dominant (the absolute value of the diagonal element is greater than the sum of the absolute values of the other elements in the row). Furthermore, the rigorous mathematical condition for the convergence of any linear stationary iterative method (like Jacobi or Gauss-Seidel) is that the spectral radius (the maximum absolute eigenvalue) of the iteration matrix must be strictly less than ().
Successive Over-Relaxation (SOR)
To accelerate the convergence of the Gauss-Seidel method, relaxation techniques apply a weighting factor . If , the method is called over-relaxation and accelerates convergence.
Successive Over-Relaxation Equation
Applies a relaxation factor to accelerate iterative convergence.
Variables
| Symbol | Description | Unit |
|---|---|---|
| Accelerated estimate of the unknown | - | |
| Estimate obtained from the standard iterative method | - | |
| Previous estimate | - | |
| Relaxation weighting factor ( for over-relaxation) | - |
- Cramer's Rule provides an exact analytical solution but is computationally inefficient for large systems ( operations).
- Gauss elimination and LU decomposition are direct methods that require operations. They yield exact solutions barring round-off errors.
- Partial Pivoting swaps rows to place the largest element on the diagonal, which is essential in direct methods to prevent division by zero and minimize round-off errors.
- LU decomposition is highly efficient for multiple right-hand sides since the elimination is only performed once. Cholesky decomposition optimizes this for symmetric positive-definite matrices ().
- Singular Value Decomposition (SVD) provides a robust factorization for ill-conditioned or non-square matrices and directly yields the pseudoinverse.
- Thomas Algorithm solves tridiagonal systems efficiently in time.
- Sparse Matrices store only non-zero elements, dramatically reducing memory usage and making iterative solvers highly attractive.
- Ill-conditioned systems are highly sensitive to round-off errors and have large condition numbers, often quantified using Vector and Matrix Norms (L1, L2, L-infinity).
- Iterative Refinement can polish solutions from direct methods by using higher precision on the residual.
- Jacobi and Gauss-Seidel are iterative methods effective for large, sparse, diagonally dominant systems. Convergence requires the spectral radius of the iteration matrix to be less than 1.
- Successive Over-Relaxation (SOR) accelerates the convergence of iterative methods.