| b29a8671 | 19-Dec-2023 |
Junchao Zhang <jczhang@anl.gov> |
Vec: add GEMV optimizations for VecMDot and friends for VecStandard
Remove KSPPIPEFGMRES from the example that skips the convergence test, since it is very sensitive to happy ending
The GEMV path appears to have a sweet spot: much better performance for smallish vectors, then it matches the unrolled code for large vectors
Sample results on Barry's Apple M2 Laptop (using Apple's BLAS)
./ex19 -da_refine 5 -pc_type none -log_view -ksp_gmres_preallocate -ksp_view
Vector length 37,636
VecMDot 1920 1.0 1.9707e-01 1.0 2.23e+09 1.0 0.0e+00 0.0e+00 0.0e+00 25 29 0 0 0 25 29 0 0 0 11291
-vec_mdot_use_gemv
VecMDot 1920 1.0 7.5098e-02 1.0 2.23e+09 1.0 0.0e+00 0.0e+00 0.0e+00 12 29 0 0 0 12 29 0 0 0 29693
VecMDot 1920 1.0 8.1523e-02 1.0 2.23e+09 1.0 0.0e+00 0.0e+00 0.0e+00 12 29 0 0 0 12 29 0 0 0 27353
VecMDot 1920 1.0 7.0889e-02 1.0 2.23e+09 1.0 0.0e+00 0.0e+00 0.0e+00 11 29 0 0 0 11 29 0 0 0 31456
-da_refine 6
Vector length 148,996
VecMDot 4340 1.0 1.7666e+00 1.0 2.00e+10 1.0 0.0e+00 0.0e+00 0.0e+00 20 29 0 0 0 20 29 0 0 0 11319
-vec_mdot_use_gemv
VecMDot 4422 1.0 1.3725e+00 1.0 2.04e+10 1.0 0.0e+00 0.0e+00 0.0e+00 15 29 0 0 0 15 29 0 0 0 14884
VecMDot 4422 1.0 1.4354e+00 1.0 2.04e+10 1.0 0.0e+00 0.0e+00 0.0e+00 16 29 0 0 0 16 29 0 0 0 14231
./ex19 -da_refine 7 -pc_type none -log_view -ksp_gmres_preallocate -ksp_view -vec_mdot_use_gemv -ksp_max_it 100 -snes_max_it 1
Vector length 592,900
VecMDot 100 1.0 1.5915e-01 1.0 1.72e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 27 0 0 0 14 27 0 0 0 10804
-vec_mdot_use_gemv
VecMDot 100 1.0 1.6854e-01 1.0 1.72e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 27 0 0 0 14 27 0 0 0 10230
VecMDot 100 1.0 1.5698e-01 1.0 1.72e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 27 0 0 0 14 27 0 0 0 10983
-da_refine 8
Vector length 2,365,444
VecMDot 100 1.0 6.2499e-01 1.0 6.86e+09 1.0 0.0e+00 0.0e+00 0.0e+00 13 27 0 0 0 13 27 0 0 0 10976
-vec_mdot_use_gemv
VecMDot 100 1.0 6.8197e-01 1.0 6.88e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 27 0 0 0 14 27 0 0 0 10087
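
For readers unfamiliar with the idea behind the GEMV path, here is a minimal sketch of why a multi-dot product can be expressed as one transposed matrix-vector product. It is an illustration, not the PETSc implementation. The function names (`mdot_loop`, `gemv_t`, `mdot_gemv`) are hypothetical, scalars are assumed real, and the m vectors are assumed to sit back to back in memory so they can be viewed as the columns of an n-by-m column-major matrix. `gemv_t` is a plain-loop stand-in for a BLAS transposed GEMV, so this sketch shows only the data layout and the equivalence of results, not the performance difference measured above.

```c
#include <stddef.h>

/* Reference: m separate dot products, z[i] = sum_k x[k] * Y[i*n + k].
   Y holds m vectors of length n stored back to back (assumed layout). */
void mdot_loop(size_t n, size_t m, const double *x, const double *Y, double *z)
{
  for (size_t i = 0; i < m; i++) {
    double sum = 0.0;
    for (size_t k = 0; k < n; k++) sum += x[k] * Y[i * n + k];
    z[i] = sum;
  }
}

/* Plain-loop stand-in for a transposed GEMV: z = A^T x, where A is an
   n-by-m column-major matrix with leading dimension lda. An optimized
   BLAS would block and vectorize this; the loop here is illustrative. */
void gemv_t(size_t n, size_t m, const double *A, size_t lda, const double *x, double *z)
{
  for (size_t j = 0; j < m; j++) {
    double sum = 0.0;
    for (size_t k = 0; k < n; k++) sum += A[j * lda + k] * x[k];
    z[j] = sum;
  }
}

/* Multi-dot via GEMV: the contiguous vectors are reinterpreted as the
   columns of A (lda = n), so a single call yields all m dot products. */
void mdot_gemv(size_t n, size_t m, const double *x, const double *Y, double *z)
{
  gemv_t(n, m, Y, n, x, z);
}
```

Calling `mdot_loop` and `mdot_gemv` with the same `x` and `Y` fills `z` with the same values. In a real build, the call inside `mdot_gemv` is where a vendor BLAS routine would be substituted.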
|