P573: Pivoting in LU Factorization

Pivoting in LU Factorization

LU Notation and Warrnings

At an intermediate stage of LU factorization, the 2D array A that originally contained the matrix A has a mix of L, U, and partially processed parts of A. Because of that, mixing the terminology of array versus matrix is almost unavoidable. At the start, array A contains the matrix A. At the end, it is a data structure representing the matrices L and U. All LU factorization algorithms for dense matrices work in-place, so in general assume that A refers to the array.

Another common nomenclature: at an intermediate stage of LU factorization, the parts of the L and U factors that have already been computed are called the reduced part. The rest of the array, usually its trailing principle submatrix, is called the unreduced part.

Partial Pivoting in LU Factorization

When using any of the LU factorization methods (rank-1 update or matrix-vector product) of LU factorization, dividing by A(k,k) in the scaling operation will cause problems if the absolute value |A(k,k)| is small. To avoid this, algorithms search for a large value in the unreduced part of the array A, then swap rows and sometimes columns to move it to the (k,k) diagonal position.

Full pivoting means bringing the largest element in absolute value into that position by swapping rows and columns in the matrix to put the largest element in the (k,k) diagonal position. Partial pivoting means searching for the largest only among the entries in the k-th column, and so only require row swaps. Full pivoting is rarely used because it requires O(n³) searching and data movement operations¹. Also, LU factorization is "usually stable" when partial pivoting is used. Some classes of problems (e.g., two point boundary value problems in ODEs) can make it fail, but most often partial pivoting suffices. In general when the term "pivoting" is used, it refers to partial pivoting, not full pivoting. After pivoting is done to move a larger (or largest) value into the (k,k) diagonal position, it is called the "pivot" or "pivot entry" or "pivot element". In any case it is the diagonal element A(k,k), chosen in an attempt to avoid dividing by a too-small value when scaling the subdiagonal part of column k.

Including partial pivoting in the rank-1 version of LU factorization gives:

LU Factorization: Rank-1 Update with Pivoting Form

for k = 1:n
  p = index of max element in |A(k:n,k)|
  piv(k) = p
  swap rows k and p of A: A(k,:) ↔ A(p,:)
  A(k+1:n,k) = A(k+1:n,k)/A(k,k)
  A(k+1:n,k+1:n) = A(k+1:n,k+1:n) − A(k+1:n,k) * A(k,k+1:n)
end for

The first step defines p as the index of the largest element in absolute value in the diagonal and subdiagonal elements in column k. The outermost loop on k runs to n, so that piv(n) will be defined. If the LU factorization is then used to solve Ax = b, that last entry should never be referenced; correct indexing of loops should lead to it never being used. Mathematically, for a square matrix piv(n) = n must hold (why?). Also, exchange of rows k and p of the array A means the entire rows are swapped, not just the parts of the remaining trailing submatrix. The corresponding parts of the previously computed portions of L and U must also be swapped, since since they are overwriting A. It is another magic feature of LU factorization that this makes the factors come out in the right order as well.

The Matlab function LU1p.m implements this with different indexing variables. However, notice three important features:

swapping two rows does not require a temporary variable the way that C or C++ does. Instead the swap is done in one line: A([pivot(r) r], 1:n) = A([r pivot(r)], 1:n). The same syntax can be used in Fortran, but it requires a comma to separate the row numbers, viz., A([pivot(r), r], 1:n) = A([r, pivot(r)], 1:n).
when finding the pivot element using max() and abs(), it is possible that more than one entry in the column has the maximal absolute value. In that case, Matlab will return all such elements and their indices to s and I: [s, I] = max(abs(A(r:m, r))). To guard against this, the Matlab function uses indexing s(1) and I(1) to reference the the first maximal element and its index.
when A(r:m,r) is passed to the max() function, it does not know that the indexing into that vector starts at r and goes to m. Instead it assumes the indexing starts at 1 and goes to m-r. So when the index I is returned, we have to shift it by adding r-1 so that it will be in the correct range: pivot(r) = I(1)+r-1. This is not specific to Matlab, and the same shift has to be done in C, C++, or Fortran.

LU Factorization: Matrix-Vector Product with Pivoting for a Rectangular Matrix

Including pivoting with the matrix-vector product version and letting the matrix be m × n with m ≥ n gives:

for k = 1:n−1
   B(k:m,k) = B(k:m,k) − B(k:m,1:k−1) * B(1:k−1,k)
   p = index of max element in |B(k:m,k)|
   piv(k) = p
   swap rows k and p of B: B(k,:) ↔ B(p,:)
   B(k,k+1:n) = B(k,k+1:n) − B(k,1:k−1) * B(1:k−1,k+1:n)
   B(k+1:m,k) = B(k+1:m,k)/B(k,k)
end for
B(n:m,n) = B(n:m,n) − B(n:m,1:n−1) * B(1:n−1,n)
B(n+1:m,n) = B(n+1:m,n)/B(n,n)

The array name B instead of A is used to distinguish this from the usual square matrix case where m = n. However, B will actually be a subarray of the array A in most cases. The last two steps are the new ingredients - they apply the remaining updates to the rest of the matrix. Also, another pivot step can be taken if |B(n,n)| is too small. That is one case where piv(n) must be defined.

The Matlab function LU3.m implements this algorithm without pivoting. The Matlab script testLU3.m creates a 10 x 7 matrix and then calls LU3.m on it, checking the resulting LU factors.

Practical partial pivoting

Obviously, the three lines for implementing pivoting in the algorithms above are simple, and simple to implement. Here are some tips:

If k = p, don't call a function to swap a row with itself (well, duh!). Since swapping rows constitutes data movement without any computation occurring, it has a load/store ratio of ∞. This is the worst possible ratio of memory references to flops, so should be avoided when possible.
The goal of pivoting is not to get the largest possible element into the pivot position, but to avoid dividing by too-small of an entry. So the choice of the pivot element can be relaxed. Markovitz pivoting with parameter s searches for the subdiagonal element with largest absolute value, but only swaps the rows if |A(k,p)| > s*|A(k,k)|. The value s = 10 is a common choice. In other words, if the current diagonal element A(k,k) is within a factor of 10 of the largest possible pivot, use it as the pivot.
Always choosing the largest subdiagonal element as the pivot can be seen as setting the "Markovitz parameter" s = 1. Never pivoting can be seen as setting s = ∞. This is a case where actually using one of the IEEE special floating point values can make the coding simpler.
Choosing the Markovitz parameter s too large can lead to numerical instability, which is what pivoting is supposed to avoid. Part of the issue is just how much time is spent in swapping rows for a give value of s. At this point, you have the skills needed to determine that ... and it would be foolish not to insert the necessary timing calls in advance.
Pivoting does mean that LU factorization is not completely in-place, using only the original array A for everything. However, it introduces only an integer vector of length n and so is minor compared to copying or replicating the original data for A which requires n² words of storage.
The BLAS level-1 function dswap() will swap two vectors in-place, and was in large part defined to be used for LU factorization pivoting. This also explains why dswap has the signature dswap(n, x, incx, y, incy): the entries of the vectors x and y are rows of the 2D array A. When A is in column-major order, the strides incx and incy are greater than 1. What should those strides be?
The Matlab idiom for swapping rows k and p in an array A is A([k p], :) = A([p k], :) which is more straightforward than the usual loop with a temporary approach required by C/C++ or Java. Just beware when using Fortran: it requires commas separate the two row indices: A([k, p], :) = A([p, k], :) . Using brackets [k, p] was introduced in Fortran 2008; before that the notation had to use (\k, p \) which is still valid but just plain ugly.

¹ a version of full pivoting has been developed that does not require O(n³) work, but is more complicated than what we need here.

Next: BLAS-3 version of LU factorization

Last Modified: Mon 04 Nov 2019, 07:01 AM