| bd72cc96 | 20-Jan-2012 |
Paul Mullowney <paulm@txcorp.com> |
Changes to the parallel GPU sparse matrix-vector multiplication. It is now far more efficient due to overlapped communication and computation. To use these features, one must configure with --with-txpetscgpu=1 --download-txpetscgpu=no. There are four choices for the matrix storage format: csr, coo, dia, and ell. A kernel for the inodes (csr format only) is also supported. The format is selected with the -cusp_storage_format option, for example -cusp_storage_format ell.
Lastly, cusparse algorithms are now used for the upper and lower triangular solves.
Hg-commit: 2a4f352daf491fda81965ac393e29a9dca6d3ca3
|