Adaptive Stable Additive Methods for Linear Algebraic Calculations
József Smidla, Péter Tar, István Maros
University of Pannonia, Veszprém, Hungary
4th of July 2014

Outline
1. Linear algebraic kernel; Dot product
2. Hilbert matrix; Condition number; Large condition number aware logic
3. Stable dot product; Primary large condition number detector

Linear algebraic kernel
Pannon Optimizer: linear programming solver
Linear programming problem:
    $\min c^T x$
    $Ax = b$
    $x_j \ge 0, \quad j = 1, \ldots, n$
Linear algebraic kernel: provides linear algebraic algorithms and data structures:
- vector operations (e.g. dot product)
- FTRAN: $\alpha = B^{-1} a$
- BTRAN: $\pi^T = h^T B^{-1}$
where $B$ is the current basis.

Floating point numbers
$(-1)^s \cdot 1.m_1 m_2 \ldots m_n \cdot 2^e$
- $s \in \{0, 1\}$: sign
- $m_i$: $i$th bit of the mantissa
- $e$: exponent
Errors
- Rounding error: $A = A + B$, where $|A| \gg |B|$ and $B \ne 0$
- Cancellation: given $A$ and $B \ne 0$ that should satisfy $A = -B$ in exact arithmetic, compute $C = A + B$. Expectation: $C = 0$. Error: $C = \pm\varepsilon$
These errors can create many fake nonzeros, lead to wrong results, and slow down the algorithms.

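Both error types are easy to reproduce in double precision. The following is a minimal sketch; the function names `isAbsorbed` and `cancellationResidue` and the chosen constants are illustrative assumptions, not part of the Pannon Optimizer.

```cpp
#include <cassert>
#include <cmath>

// Rounding error (absorption): returns true when adding a nonzero b
// to a leaves a unchanged, i.e. A = A + B although B != 0.
inline bool isAbsorbed(double a, double b) {
    return b != 0.0 && (a + b) == a;
}

// Cancellation: 0.1 + 0.2 and 0.3 are equal in exact arithmetic,
// but their double representations differ, so the sum is a tiny
// nonzero value instead of 0 (a "fake nonzero").
inline double cancellationResidue() {
    const double a = 0.1 + 0.2;
    const double b = -0.3;
    return a + b;
}
```

With these definitions, `isAbsorbed(1e17, 1.0)` holds because the spacing between neighbouring doubles near 1e17 is larger than 1, and `cancellationResidue()` is nonzero but far below 1e-15 in magnitude.
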
Intel's SIMD architecture
Parallel operations on multiple data
SSE2: 128-bit wide XMM registers
- One register holds 4 single precision floating point numbers, or 2 doubles, or 4 32-bit integers, or 2 64-bit integers
- Single precision and double precision operations (add, multiply, etc.)
- Bitwise operations
- Integer operations
- Logical operations
- Move operations
AVX: 256-bit wide YMM registers
- 4 double precision floating point numbers per register

Naive add implementation
Given vectors $A$ and $B$, compute $C := A + B$, where $c_i := a_i + b_i$
Requirements:
- Avoid cancellation errors
- Minimize the overhead
Naive implementation
Input: $A$, $B$
Output: $C$
For each element of $A$ and $B$: $c_i := a_i + b_i$

Naive implementation: does not avoid cancellation errors
Stabilize the result using a relative tolerance $\epsilon_r$:
    $c_i := a_i + b_i$
    if $(|a_i| + |b_i|)\,\epsilon_r \ge |c_i|$ then $c_i := 0$
Operations:
- 2 additions, 1 multiplication, 2 assignments
- 3 absolute values
- 1 jump
- 1 comparison
The result is stable, but the algorithm contains conditional jumps, which slow it down.

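The tolerance test can be sketched in scalar C++ as follows. The function name `stableAdd` and the parameter name `relTol` (standing for $\epsilon_r$) are assumptions made for this illustration; the actual kernel interface is not shown in the slides.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Element-wise c = a + b. A sum that is negligible relative to its
// operands is treated as cancellation noise and replaced by exact zero.
std::vector<double> stableAdd(const std::vector<double>& a,
                              const std::vector<double>& b,
                              double relTol) {
    assert(a.size() == b.size());
    std::vector<double> c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double sum = a[i] + b[i];
        // (|a_i| + |b_i|) * eps_r >= |c_i|  ->  c_i := 0
        const bool negligible =
            (std::fabs(a[i]) + std::fabs(b[i])) * relTol >= std::fabs(sum);
        c[i] = negligible ? 0.0 : sum;
    }
    return c;
}
```

For example, with `relTol = 1e-10` (an arbitrary value chosen here), adding `0.1 + 0.2` and `-0.3` yields an exact 0, while `1.0 + 2.0` is left as 3.
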
Our accelerated stable add method
Use Intel's AVX instruction set
The results of the parallel comparisons are in a YMM register
These results can be used for bit masking:
1. mask := 000...000
2. if $(|a_i| + |b_i|)\,\epsilon_r < |a_i + b_i|$ then mask := 111...111
3. $c_i := (a_i + b_i)$ bitwise AND mask
The comparison in step 2 is an AVX instruction
There is no jumping in the implementation!
Absolute value: bit masking (bitwise AND)
[Figure: worked example on four lanes using registers YMM0 to YMM4; the rows show $(|a_i| + |b_i|)\,\epsilon_r$, $|a_i + b_i|$, the comparison mask, $a_i + b_i$, and the result.]

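The masking idea can be illustrated without AVX by emulating a single lane in portable C++. This is only a scalar sketch of the logic, not the AVX implementation from the talk; the names `toBits`, `fromBits`, `absByMask` and `maskedStableAdd` are hypothetical. Compilers usually translate the comparison into a flag-setting instruction rather than a jump, although that is not guaranteed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a double (memcpy avoids undefined behaviour).
inline std::uint64_t toBits(double x) {
    std::uint64_t u;
    std::memcpy(&u, &x, sizeof u);
    return u;
}

inline double fromBits(std::uint64_t u) {
    double x;
    std::memcpy(&x, &u, sizeof x);
    return x;
}

// Absolute value by clearing the sign bit with a bitwise AND.
inline double absByMask(double x) {
    return fromBits(toBits(x) & 0x7FFFFFFFFFFFFFFFULL);
}

// One lane of the masked add: the comparison result becomes an
// all-ones or all-zeros mask that is ANDed with the sum.
inline double maskedStableAdd(double a, double b, double relTol) {
    const double sum = a + b;
    const double bound = (absByMask(a) + absByMask(b)) * relTol;
    // keep is 1 when the sum is significant, 0 otherwise.
    const std::uint64_t keep = static_cast<std::uint64_t>(bound < absByMask(sum));
    // 0 - 1 wraps around to 111...111; 0 - 0 stays 000...000.
    const std::uint64_t mask = std::uint64_t{0} - keep;
    return fromBits(toBits(sum) & mask);
}
```

An AVX version would apply the same three steps to four doubles at once, with the comparison and the bitwise AND each done by a single vector instruction.
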
Naive dot product implementation
We have two $n$-dimensional vectors, $a$ and $b$:
$a^T b = \sum_{i=1}^{n} a_i b_i$
Problem: we have to use floating point arithmetic, which introduces rounding and cancellation errors.

Stable dot product implementation
Separate the negative and positive products
Two variables:
- $N$: sum of negative products
- $P$: sum of positive products
Algorithm:
1. Read $a_i$ and $b_i$
2. $p := a_i b_i$
3. if $p < 0$ then
4.     $N := N + p$
5. else
6.     $P := P + p$
Final result := $N + P$

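The six steps above can be written directly in C++. This is a minimal sketch with a hypothetical function name `stableDot`; it uses the conditional jump that the accelerated variant later removes. Because same-signed products are accumulated separately, cancellation can occur only once, in the final `N + P`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dot product that accumulates negative and non-negative products
// in separate sums and combines them only at the end.
double stableDot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double neg = 0.0;  // N: sum of negative products
    double pos = 0.0;  // P: sum of non-negative products
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double p = a[i] * b[i];
        if (p < 0.0) {
            neg += p;
        } else {
            pos += p;
        }
    }
    return neg + pos;
}
```

For the vectors (1, 2, 3) and (4, -5, 6), the sketch accumulates P = 22 and N = -10 and returns 12.
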
Our accelerated stable dot product implementation
Conditional jumping can be avoided using pointer arithmetic:
    union Number {
        double num;
        unsigned long long int bits;
    } number;
    double negpos[2] = {0.0, 0.0};
    [...]
    const double prod = a * b;
    number.num = prod;
    // The sign bit (bit 63) selects the accumulator: 0 = non-negative, 1 = negative.
    *(negpos + (number.bits >> 63)) += prod;
AVX can speed up the stable dot product further.

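A complete loop built around this indexing trick might look as follows. This is a sketch under assumptions: the function name `branchFreeStableDot` is hypothetical, and `std::memcpy` is swapped in for the union, because reading a union member other than the one last written is formally undefined behaviour in standard C++ (major compilers support it as an extension).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stable dot product without a data-dependent branch: the sign bit of
// each product is used as an index into a two-element accumulator.
double branchFreeStableDot(const std::vector<double>& a,
                           const std::vector<double>& b) {
    assert(a.size() == b.size());
    // negpos[0]: products with sign bit 0, negpos[1]: products with sign bit 1.
    double negpos[2] = {0.0, 0.0};
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double prod = a[i] * b[i];
        std::uint64_t bits;
        std::memcpy(&bits, &prod, sizeof bits);
        negpos[bits >> 63] += prod;
    }
    return negpos[0] + negpos[1];
}
```

A product of negative zero lands in the negative accumulator, which does not change the result.
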
Hilbert matrix
Hilbert matrix: $H_{n,n}$, where $h_{i,j} = \frac{1}{i + j - 1}$, $i, j = 1, \ldots, n$
Example:
$H_{4,4} = \begin{pmatrix} 1 & \frac{1}{2} & \frac{1}{3} & \frac{1}{4} \\ \frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \frac{1}{5} \\ \frac{1}{3} & \frac{1}{4} & \frac{1}{5} & \frac{1}{6} \\ \frac{1}{4} & \frac{1}{5} & \frac{1}{6} & \frac{1}{7} \end{pmatrix}$
We can construct the following LP problem:
    $\min 0$
    $H_{n,n} x = b$
    $x_j \ge 0, \quad j = 1, \ldots, n$, and $b_j = \sum_{i=1}^{n} \frac{1}{i + j - 1}$

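The test problem is straightforward to generate. The sketch below builds the matrix and the right-hand side in plain C++; the names `Matrix`, `hilbert` and `rowSums` are illustrative assumptions and do not come from the authors' generator programs. Indices are 0-based in the code, so the denominator is `i + j + 1`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// n x n Hilbert matrix: h(i, j) = 1 / (i + j - 1) with 1-based indices.
Matrix hilbert(std::size_t n) {
    Matrix h(n, std::vector<double>(n));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            h[i][j] = 1.0 / static_cast<double>(i + j + 1);
    return h;
}

// Right-hand side b: sums of the rows of H, so that x = (1, ..., 1)
// satisfies H x = b. (H is symmetric, so row and column sums agree.)
std::vector<double> rowSums(const Matrix& h) {
    std::vector<double> b(h.size(), 0.0);
    for (std::size_t i = 0; i < h.size(); ++i)
        for (double value : h[i])
            b[i] += value;
    return b;
}
```
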
Solvers and the Hilbert matrix
Clearly, the solution is optimal if and only if $x_j = 1$, $j = 1, \ldots, n$.
We tested CLP and GLPK:

Size       GLPK                     Exact GLPK           CLP
3×3        x_j = 1 ± 3.997 0 5      x_j = 1              x_j = 1
4×4        x_j = 1 ± 8.27 0 3       x_j = 1              x_j = 1
5×5        x_j = 1 ± .75 0          x_j = 1              x_j = 1
6×6        x_j = 1 ± 2.66 0 0       x_j = 1              INFEASIBLE
7×7        x_j = 1 ± .57            x_j = 1              INFEASIBLE
8×8        x_j = 1 ± .600           x_j = 1 ± 0.20       INFEASIBLE
20×20      x_j = 1 ± 6.298          x_j = 1 ± 4.24       INFEASIBLE
100×100    0 ≤ x_j ≤ 24.009         x_j = 1 ± 2.682      INFEASIBLE

We used CLP and GLPK as libraries; the models were generated and solved by C++ programs.

Condition number
Measures how much the output changes if the input changes:
$\kappa(B) = \|B\| \, \|B^{-1}\|$
Problems with computing $\kappa(B)$:
- The matrix changes in every iteration
- If $\kappa(B)$ is large, computing $B^{-1}$ is difficult
The condition number of the $n \times n$ Hilbert matrix is very large; it grows as $O\left( (1 + \sqrt{2})^{4n} / \sqrt{n} \right)$
- $\kappa(H_{6,6}) \approx 2.907 \cdot 10^{7}$
- $\kappa(H_{10,10}) \approx 3.536 \cdot 10^{13}$
- $\kappa(H_{100,100}) \approx 2.42 \cdot 10^{148}$

Primary large condition number detector
We propose:
- We cannot compute the condition number directly
- However, we can detect the effect of a large condition number!
- The input of the classic FTRAN is the vector $a$: $B^{-1} a = \alpha$
- Create a perturbed copy $\bar{a}$ of $a$
- Use a modified FTRAN, which computes $B^{-1} a = \alpha$ and $B^{-1} \bar{a} = \bar{\alpha}$
- The modified FTRAN perturbs every sum while computing $\bar{\alpha}$
- If $r = \frac{\max\{\|\alpha\|, \|\bar{\alpha}\|\}}{\min\{\|\alpha\|, \|\bar{\alpha}\|\}}$ is greater than a threshold, the condition number is too large: primary alarm

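The ratio $r$ can be demonstrated with a simplified stand-in for the detector. The sketch below is not the modified FTRAN from the talk: it uses dense Gaussian elimination instead of a basis factorization, and it perturbs only the input vector (alternating relative perturbations) rather than every intermediate sum. All names (`solve`, `norm2`, `detectorRatio`, `relPerturbation`) are assumptions made for this illustration, and the code assumes a nonsingular matrix and a nonzero vector $a$.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Vector = std::vector<double>;

// Solve B x = rhs by Gaussian elimination with partial pivoting
// (a dense stand-in for FTRAN; B and rhs are taken by value).
Vector solve(Matrix b, Vector rhs) {
    const std::size_t n = rhs.size();
    assert(b.size() == n);
    for (std::size_t k = 0; k < n; ++k) {
        std::size_t piv = k;
        for (std::size_t i = k + 1; i < n; ++i)
            if (std::fabs(b[i][k]) > std::fabs(b[piv][k])) piv = i;
        std::swap(b[k], b[piv]);
        std::swap(rhs[k], rhs[piv]);
        for (std::size_t i = k + 1; i < n; ++i) {
            const double f = b[i][k] / b[k][k];
            for (std::size_t j = k; j < n; ++j) b[i][j] -= f * b[k][j];
            rhs[i] -= f * rhs[k];
        }
    }
    Vector x(n);
    for (std::size_t k = n; k-- > 0;) {  // back substitution
        double s = rhs[k];
        for (std::size_t j = k + 1; j < n; ++j) s -= b[k][j] * x[j];
        x[k] = s / b[k][k];
    }
    return x;
}

// Euclidean norm of a vector.
double norm2(const Vector& v) {
    double s = 0.0;
    for (double x : v) s += x * x;
    return std::sqrt(s);
}

// r = max(||alpha||, ||alphaBar||) / min(||alpha||, ||alphaBar||),
// where alphaBar is computed from a slightly perturbed copy of a.
double detectorRatio(const Matrix& b, const Vector& a, double relPerturbation) {
    Vector aBar = a;
    for (std::size_t i = 0; i < aBar.size(); ++i)
        aBar[i] *= 1.0 + ((i % 2 == 0) ? relPerturbation : -relPerturbation);
    const double normAlpha = norm2(solve(b, a));
    const double normAlphaBar = norm2(solve(b, aBar));
    return std::max(normAlpha, normAlphaBar) / std::min(normAlpha, normAlphaBar);
}
```

On a well-conditioned matrix a relative perturbation of 1e-8 leaves $r$ very close to 1, whereas on the 8×8 Hilbert matrix the same perturbation changes the solution norm by a large factor.
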
Large condition number aware logic
The primary detector is executed when:
- An error occurs, for example a fallback to phase 1
  - If a primary alarm occurs, the algorithm runs the primary detector in the following iterations
  - If primary alarms occur in every subsequent iteration and $r$ does not decrease: secondary alarm
- The algorithm ends
  - If a primary alarm occurs, the algorithm performs a sensitivity analysis
  - If the sensitivity analysis finds that the result is extremely unstable: secondary alarm
If a secondary alarm occurs, the software restarts from the last basis with modified parameters (enabled scaling, switching to LU decomposition, etc.)
As a last resort: the software restarts from the last basis with enhanced precision arithmetic

Next steps
- We have to integrate the enhanced precision arithmetic into the Pannon Optimizer
- We have to integrate the large condition number recognizer algorithm
- The large condition number recognizer can be accelerated with low-level optimization (SIMD architecture)
Our goal: implement a solver which
- runs fast on stable problems, but recognizes excessively numerically unstable problems
- switches to more precise arithmetic, and solves these problems too

CPU: Intel Core i5-3210M, 2.50 GHz
Vector lengths: $10^5$
Dot product operations repeated $10^5$ times
[Figure: bar chart of running time in seconds for four implementations: naive, conditional jump, SSE2, AVX.]

Stable dot product
CPU: Intel Core i5-3210M, 2.50 GHz
Vector lengths: $10^6$
Dot product operations repeated $10^4$ times
[Figure: bar chart of running time in seconds for five implementations: naive, conditional jump, pointer arithmetic, SSE2, AVX.]

Primary large condition number detector
Output of the detector: $r = \frac{\max\{\|\alpha\|, \|\bar{\alpha}\|\}}{\min\{\|\alpha\|, \|\bar{\alpha}\|\}}$, $\delta = r - 1$

Problem             Value of δ after the last iteration
25FV47.MPS          3.66059e-08
STOCFOR3.MPS        3.07735e-08
PILOT.MPS           9.22276e-06
MAROS-R7.MPS        .39086e-0
Hilbert 7*7         0.06745
Hilbert 8*8         0.524724
Hilbert 20*20       2.05845
Hilbert 26*26       5.4588
Hilbert 100*100     0.32362

Thank you for your attention!
This publication/research has been supported by the European Union and Hungary and co-financed by the European Social Fund through the project TÁMOP-4.2.2.C-11/1/KONV-2012-0004 - National Research Center for Development and Market Introduction of Advanced Information and Communication Technologies.