History log of /petsc/src/mat/impls/aij/mpi/mpicusparse/mpiaijcusparse.cu (Results 326 – 350 of 371)
Revision Date Author Comments
# 72cfe0ad 23-Jul-2013 Karl Rupp <rupp@iue.tuwien.ac.at>

Merge branch 'paulmullowney/txpetscgpu-package-removal'


# 2692e278 08-Jul-2013 Paul Mullowney <paulm@txcorp.com>

Adding PREPROCESSOR directives to protect ELL and HYB storage formats.

I've added preprocessor directives around all code using the cusparse
hybrid (or ellpack) format to only build when CUDA 4.2 or beyond is
being used. I've also changed the documentation in a few places to
reflect this. In a few places, protections were required for CUDA
5.0 (hyb2csr conversion and in the stream creation in veccusp.cu).

Also adding code to the init.c that 1) checks cuda error codes and
2) sets the device flags so that memory can be registered as page-locked
via cudaSetDeviceFlags(cudaDeviceMapHost). This should be
valid for all 1.3 devices and later. Moreover, these changes allow
multiple MPI threads to work on 1 GPU using cuda streams in a thread
safe manner.
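
The version guard described in this commit message can be sketched roughly as follows. This is only an illustration, not the code from the commit: `StorageFormat` and `format_supported` are hypothetical names, and the fallback definition of `CUDA_VERSION` exists solely so the sketch compiles without the CUDA headers (with CUDA installed, `cuda.h` defines it, and 4020 corresponds to CUDA 4.2).

```c
#include <assert.h>

/* CUDA_VERSION normally comes from cuda.h (4020 means CUDA 4.2).
   The fallback below is an assumption so this sketch builds without CUDA. */
#ifndef CUDA_VERSION
#define CUDA_VERSION 4020
#endif

/* Hypothetical enum for illustration; not the PETSc type. */
typedef enum { FORMAT_CSR, FORMAT_ELL, FORMAT_HYB } StorageFormat;

/* Returns 1 if the format was compiled in, 0 otherwise. */
static int format_supported(StorageFormat f)
{
  assert(f <= FORMAT_HYB);
  if (f == FORMAT_CSR) return 1;
#if CUDA_VERSION >= 4020
  /* ELL and HYB paths are only built for CUDA 4.2 or beyond. */
  if (f == FORMAT_ELL || f == FORMAT_HYB) return 1;
#endif
  return 0;
}
```

With a guard of this shape, builds against an older CUDA simply omit the ELL/HYB branches rather than failing to compile.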


# b06137fd 27-Jun-2013 Paul Mullowney <paulm@txcorp.com>

Removing TXPETSCGPU from veccusp and mpiaijcusparse

In this next step of removing TXPETSCGPU, the host-device and
device-host messaging code has been significantly simplified. In
particular, all methods VecCUSPCopyToGPU/FromGPU now use
a cudaMemcpyAsync with a stream (and a stream synchronize()).
This never hurts you. Moreover, it can help you in the case
of the multi-GPU SpMV as this data transfer will overlap
with the MatMult kernel. The more significant change comes in
VecCUSPCopyToGPUSome and VecCUSPCopyFromGPUSome. In this code,
the data transfer now moves the smallest contiguous set of
vector data containing ALL the indices in a single asynchronous data
transfer. Then, the stream containing the data transfer is
synchronized (not the entire device). While this can be wasteful
in terms of messaging too much data, it has shown the best
scalability performance across a wide range of matrices. Lastly
the simplicity of the code is a significant advantage over
the old way of doing the data transfer. Some old code
in these methods is "if 0"-ed out for reference and will be
cleaned up later. One final optimization in the vector code
involves registering the host buffer as page locked--which
is done in VecCUSPAllocateCheck. Then, the buffer must be
unregistered at VecDestroy_SeqCUSP. This shows a nice
speedup in the data transfer for a parallel MatMult.
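
The "smallest contiguous set containing all the indices" idea above can be illustrated with a host-only sketch. This is not the PETSc implementation: `index_span` and `copy_span` are hypothetical helpers, indices are assumed to be non-negative and in bounds, and `memcpy` stands in for the single `cudaMemcpyAsync` on a stream (followed by a stream synchronize) that the commit message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find the smallest contiguous index range
   [*lo, *hi] that contains every entry of idx[0..n-1]. */
static void index_span(const int *idx, size_t n, int *lo, int *hi)
{
  assert(n > 0);
  *lo = *hi = idx[0];
  for (size_t i = 1; i < n; i++) {
    if (idx[i] < *lo) *lo = idx[i];
    if (idx[i] > *hi) *hi = idx[i];
  }
}

/* Hypothetical helper: move the whole span in one transfer and return
   the number of elements moved. memcpy is a stand-in for one
   asynchronous host-device copy on a stream. */
static size_t copy_span(double *dst, const double *src,
                        const int *idx, size_t n)
{
  int lo, hi;
  index_span(idx, n, &lo, &hi);
  size_t count = (size_t)(hi - lo) + 1;
  memcpy(dst + lo, src + lo, count * sizeof(double));
  return count;
}
```

For indices {5, 2, 4} this moves elements 2 through 5: four values for three requested entries. That extra element is the wasted traffic the message mentions, traded for a single transfer.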

Also in this commit, I am removing the TXPETSCGPU dependence from
the mpiaijcusparse class--it now depends only on CUDA. In order
for the same stream to be used in the MatMult and MatMultAdd
(necessary for an optimal Multi-GPU SpMV), the stream is built
in the mpiaijcusparse and then passed in the seqaijcusparse data
structure via a new method (MatCUSPARSESetStream). A similar method
is added for the CUSPARSE library handle (context) as I think the
stream needs to be attached to a particular context to work properly.
When running in parallel on multiple GPUs, the references to the handle
in the seqaijcusparse are cleared from the mpiaijcusparse classes with
the method MatCUSPARSEClearHandle. Then, the mpiaijcusparse class
deletes the handle.

One other non-trivial change was made to the seqaijcusparse. The alpha
and beta parameters to the SpMV are now device data which is owned by
the Mat_SEQAIJCUSPARSEMultStruct structure. This enables slightly better
multi-GPU performance as this data does not need to be copied to the
GPU at each kernel launch.

Multi-GPU SpMV now works without TXPETSCGPU and the performance is recovered
as tested on up to 4 GPUs. Code is valgrind clean and cuda-memcheck clean.

Results of tests have been modified to have 1 less digit of precision. This
yields consistent results across different GPUs. Lastly, the parallel test
is set to run on a different matrix (shallow_water1) so that the iteration
actually converges.


# 3bb1ff40 28-May-2013 Barry Smith <bsmith@mcs.anl.gov>

logging memory now credits to all ancestors


# 7e7d4f0d 10-Apr-2013 Richard Mills <rtm@eecs.utk.edu>

Merged petsc/petsc into rmills/petsc master


# 8a1af44d 03-Apr-2013 Jed Brown <jed@59A2.org>

Merge branch 'barry/rm-xxxregisterdynamic'

* barry/rm-xxxregisterdynamic:
Registration: remove stale 'XXRegisterDynamic)' entries in man pages
TS examples: fix use of PetscFunctionList and add to nightlies
Changes: Note updates to XRegisterDynamic/PetscObjectComposeFunctionDynamic
PetscObjectComposeFunctionDynamic: remove stale docs and usage
developers.tex: remove complications from function composition with dlls
removed string version of function name for XXXRegister(), PetscFunctionListAdd() and PetscObjectComposeFunction()
changes: document PetscFunctionListAdd() API change
developers.tex: update documentation of PetscObjectComposeFunction
removed path and MPI_Comm arguments from PetscFunctionListFind/Add()
removed path argument to XXXInitializePackage() and XXXRegister()
removed XXXRegisterDynamic() but kept the APIs for everything else underneath the same phase I of the update to handling registering function pointers


# e1d27e54 28-Mar-2013 Jed Brown <jed@59A2.org>

Merge branch 'barry/rm-xxxregisterdynamic' into jed/ts-eimex

PetscObjectComposeFunctionDynamic() and TSRegisterDynamic() were
replaced by PetscObjectComposeFunction() and TSRegister(), both of which
drop the string name argument.

* barry/rm-xxxregisterdynamic: (82 commits)
...

Conflicts:
src/ts/interface/tsregall.c


# bdf89e91 26-Mar-2013 Barry Smith <bsmith@mcs.anl.gov>

removed string version of function name for XXXRegister(), PetscFunctionListAdd() and PetscObjectComposeFunction()


# 4042b796 17-Mar-2013 Jed Brown <jed@59A2.org>

Merge branch 'master' into jed/ts-eimex

Sync to include Git conversion, PETSC_EXTERN, and minor API changes.

Conflicts:
src/ts/interface/tsregall.c


# c19eab39 06-Mar-2013 Richard Tran Mills <rmills@ornl.gov>

Automerge.

Hg-commit: b6659d546870fb013f3da5bcd5066d1dc0dc329c


# 296840b1 06-Mar-2013 Jed Brown <jed@59A2.org>

Merge branch 'master' of gitifyhg::ssh://hg@bitbucket.org/BarryFSmith/petsc-dev-simp

Symbol visibility and namespacing.

C++ builds always set extern "C" and can be called from plain C. Most
users will only want --with-clanguage=C++ for std::complex.


Hg-commit: f848d02318cae92d7b32037c7ee88f92dbe46347


# 8cc058d9 06-Mar-2013 Jed Brown <jed@59A2.org>

Change all PETSC_EXTERN_C to PETSC_EXTERN

Hg-commit: 8d2ebbb193fb583bccc64015e35640c4e08c3426


# 39d7646b 06-Mar-2013 Jed Brown <jed@59A2.org>

Change all PETSC_EXTERN_C to PETSC_EXTERN


Hg-commit: ba0cf153561ff2dc521f42e94b7164fbe7b5d798


# b2573a8a 05-Mar-2013 Barry Smith <bsmith@mcs.anl.gov>

completed removing unneeded EXTERN_C_BEGIN/END from Mat directories and converting to PETSC_EXTERN_C for constructors
tested with and without dynamic with and without C++

Hg-commit: 7d27d7f4d9ea3bfe6616fafdfb32d046b5db53a1


# e9fbd226 05-Mar-2013 Richard Tran Mills <rmills@ornl.gov>

Automerge.

Hg-commit: 2a552fd584bf855b9dc42efec9e8ab778063a84f


# 00de8ff0 04-Mar-2013 Barry Smith <bsmith@mcs.anl.gov>

changed use of PetscObjectComposeFunctionDynamic() to PetscObjectComposeFunction() to allow use of static for functions for standard use in PETSc
PetscObjectComposeFunctionDynamic() is still available for use if needed
also fixed calls to PetscObjectComposeFunction() to not wrap lines (per PETSc coding style)

Hg-commit: 822f9ddaac95a8ff6c2a9ad77fbf07d02d2c20d9


# cbf1f8ac 13-Feb-2013 Satish Balay <balay@mcs.anl.gov>

more PetscLayoutReference -> VecSetLayout ref: 8264b0c10223

Hg-commit: 17b9145a76635c848112b30e4b82c2f23f7101de


# ce94432e 13-Feb-2013 Barry Smith <bsmith@mcs.anl.gov>

added PetscObjectComm() and used it to replace (((PetscObject)obj)->comm)

Hg-commit: 3da37c458124ad48ae939f4e9823e4430ee0b8be


# 0298fd71 08-Feb-2013 Barry Smith <bsmith@mcs.anl.gov>

removed PETSC_NULL from C and Fortran (except declaration in C for backward compatibility). Kept PETSC_NULL_xxx for Fortran
Fixed a few bugs where PETSC_NULL had been used incorrectly.

Hg-commit: 054705a517d7f4388a8a084415d7478cbe95dff4


# 7e590d5f 03-Feb-2013 Barry Smith <bsmith@mcs.anl.gov>

commit after merge

Hg-commit: db805e8197486aa7db018c01793dec447b9e9cbb


# 26fbe8dc 02-Feb-2013 Karl Rupp <rupp@mcs.anl.gov>

Uncrustified src/mat/*.

Hg-commit: 5c6f04286a6cfcd98361b2479b884c0041d95b73


# e79ce49d 02-Feb-2013 Barry Smith <bsmith@mcs.anl.gov>

commit after merge

Hg-commit: f37b1e00e84f4f2c19b94a78ed2de72bd29e5778


# 2205254e 02-Feb-2013 Karl Rupp <rupp@mcs.anl.gov>

Partially uncrustified /src/mat/*

Hg-commit: f66b7241e67ccd55e47747ce1f2433e82e4f86b9


# 31d8eec5 01-Feb-2013 Barry Smith <bsmith@mcs.anl.gov>

commit after merge

Hg-commit: 1fa981254c79c783277e37654668dfe698cddf41


# 8468deee 01-Feb-2013 Karl Rupp <rupp@mcs.anl.gov>

Applied patch by Paul Mullowney for fixing CUSPARSE and CUSP documentation.

Hg-commit: 514cc9069e33afcb15e6994adcde72dc463e716f

