History log of /petsc/src/mat/impls/sell/seq/sell.c (Results 76 – 100 of 249)
Revision Date Author Comments
# 90d2215b 12-Jan-2021 Hong Zhang <hongzhang@anl.gov>

Add the load-balancing kernel for MatMultAdd_SeqSELL and fine tune the heuristic

Kernel7 is significantly slower than kernel9x for the following two cases:
- nrows is too small. Kernel7 uses 2 threads per row (assuming sliceheight=16), it does not fully utilize the GPU if nrows < 100K.
- maxslicewidth is too big.
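
The two cases above suggest a selection rule along the following lines. This is a minimal sketch, not the code in the commit: the function name, the enum, and the slice-width cutoff are hypothetical, and only the 100K row figure comes from the message itself.

```c
#include <assert.h>

/* The 100K row count is taken from the commit message. */
#define MIN_ROWS_FOR_KERNEL7 100000
/* ASSUMPTION: placeholder cutoff for "maxslicewidth is too big";
   the message does not state the real value. */
#define MAX_WIDTH_FOR_KERNEL7 64

typedef enum { KERNEL_7, KERNEL_9X } SpmvKernel; /* hypothetical names */

/* Fall back to kernel9x in the two cases where kernel7 is reported slow. */
static SpmvKernel choose_kernel(long nrows, int max_slice_width)
{
  assert(nrows >= 0 && max_slice_width >= 0);
  if (nrows < MIN_ROWS_FOR_KERNEL7) return KERNEL_9X;            /* too few rows to occupy the GPU */
  if (max_slice_width > MAX_WIDTH_FOR_KERNEL7) return KERNEL_9X; /* slices too wide */
  return KERNEL_7;
}
```

With these assumed cutoffs, a matrix with 50,000 rows is routed to kernel9x regardless of its slice width.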

Thanks-to: Peng Wang <penwang@nvidia.com>



# 4e58db63 31-Dec-2020 Hong Zhang <hongzhang@anl.gov>

Make slice height more flexible

- The slice height no longer has to match the device memory alignment; it just needs to be divisible by DEVICE_MEM_ALIGN
- Pad each slice with extra columns to achieve coalesced memory access if needed



# 07e43b41 10-Sep-2020 Hong Zhang <hongzhang@anl.gov>

Further optimization of MatMult_SeqSELLCUDA

- Add more kernels
- Use multiple threads per row for matrices with narrow slices
- Use multiple blocks per slice for matrices with wide slices
- Add three new APIs to return the irregularity ratio, the maximum slice width and the average slice width
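
As an illustration of two of the quantities named in the last item, the sketch below computes the maximum and average slice width from per-row nonzero counts. It assumes the usual sliced-ELLPACK convention that a slice is as wide as its longest row. The function names are hypothetical and are not the PETSc APIs; the irregularity ratio is left out because the message does not define it.

```c
#include <assert.h>
#include <stddef.h>

/* Width of one slice: the longest row (in nonzeros) among its rows. */
static int slice_width(const int *row_nnz, size_t nrows, size_t slice, size_t height)
{
  int w = 0;
  for (size_t r = slice * height; r < (slice + 1) * height && r < nrows; r++)
    if (row_nnz[r] > w) w = row_nnz[r];
  return w;
}

/* Largest slice width over all slices. */
static int max_slice_width(const int *row_nnz, size_t nrows, size_t height)
{
  assert(height > 0);
  size_t nslices = (nrows + height - 1) / height; /* last slice may be partial */
  int m = 0;
  for (size_t s = 0; s < nslices; s++) {
    int w = slice_width(row_nnz, nrows, s, height);
    if (w > m) m = w;
  }
  return m;
}

/* Mean slice width over all slices (0 for an empty matrix). */
static double avg_slice_width(const int *row_nnz, size_t nrows, size_t height)
{
  assert(height > 0);
  size_t nslices = (nrows + height - 1) / height;
  if (nslices == 0) return 0.0;
  long total = 0;
  for (size_t s = 0; s < nslices; s++) total += slice_width(row_nnz, nrows, s, height);
  return (double)total / (double)nslices;
}
```

A large gap between the maximum and the average indicates a few very wide slices, which is the situation the multiple-blocks-per-slice kernels are meant to address.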

Experiments show that column blocking gives much worse performance for wide matrices, and that permutation based on slice width has almost no impact on performance.



# 2d1451d4 09-Jan-2020 Hong Zhang <hongzhang@anl.gov>

Initial commit for porting SELL to GPU

- Add tiled SpMV and basic SpMV for SeqSELL
- Tested in serial
- Offloadmask is used to determine when the matrix should be copied to GPU
- Use different slice height for CUDA version
- By checking the nonzerostate, PETSc can decide whether the whole matrix needs to be copied or just the values need to be copied
- Make the convert function public so that the very slow MatConvert_Basic can be avoided sometimes. E.g. one can use a two-step convert method: AIJ->SELL,SELL->SELLCUDA instead of the direct convert AIJ->SELLCUDA
- Make the FLOPS count for SELL same as that for AIJCUSPARSE.
- MatDisAssemble is not needed.
- Change slice height from 32 to 16 for GPU
- To overlap communication with MatMult, VecScatterBegin() should be called before MatMult() for the diagonal part.
- SLICE_HEIGHT is defined to be 32 to match the warp size of GPU. For other cases, it is still 8.
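
For readers unfamiliar with the format, the basic SpMV mentioned in the first item can be sketched on the CPU as follows. This is a generic sliced-ELLPACK product written for illustration, not the kernel from this commit. It assumes each slice is stored column by column, that padded entries hold a zero value with a valid column index, and that `sliidx` (a hypothetical name here) gives the starting offset of each slice.

```c
#include <assert.h>
#include <stddef.h>

/* y = A*x for a matrix in sliced ELLPACK layout.
   Entry (local row r, slot j) of slice s lives at sliidx[s] + j*height + r. */
static void sell_spmv(size_t nrows, size_t height, const size_t *sliidx,
                      const int *colidx, const double *val,
                      const double *x, double *y)
{
  assert(height > 0);
  size_t nslices = (nrows + height - 1) / height;
  for (size_t s = 0; s < nslices; s++) {
    /* Every slice stores height * width entries, padding included. */
    size_t width = (sliidx[s + 1] - sliidx[s]) / height;
    for (size_t r = 0; r < height; r++) {
      size_t row = s * height + r;
      if (row >= nrows) break; /* padded rows of the last slice */
      double sum = 0.0;
      for (size_t j = 0; j < width; j++) {
        size_t k = sliidx[s] + j * height + r;
        sum += val[k] * x[colidx[k]]; /* padded entries contribute 0 */
      }
      y[row] = sum;
    }
  }
}
```

On a GPU, consecutive threads would typically take consecutive values of `r`, so the column-by-column layout lets them read adjacent memory locations. That is why the slice height is tied to the warp size in the items above.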

Funded-by:
Project: PETSc for GPU
Time: 42 hours
Reported-by:
Thanks-to:



# 80f6d96d 01-Apr-2023 Satish Balay <balay@mcs.anl.gov>

Merge remote-tracking branch 'origin/release'


# 08eaad2d 01-Apr-2023 Satish Balay <balay@mcs.anl.gov>

Merge branch 'jolivet/fix-typos-portability' into 'release'

Fix typos, portability issues, segmentation fault

See merge request petsc/petsc!6267


# aaa8cc7d 31-Mar-2023 Pierre Jolivet <pierre@joliv.et>

Fix some documentation and typos


# e9f36840 18-Mar-2023 Satish Balay <balay@mcs.anl.gov>

Merge branch 'barry/2023-03-08/fix-man-pages-detected-by-lint' into 'main'

Fix many manual pages

See merge request petsc/petsc!6162


# 20f4b53c 09-Mar-2023 Barry Smith <bsmith@mcs.anl.gov>

Fix manual pages based on reports from Jacob's lint tool

Commit-type: documentation


# 6c749b74 07-Mar-2023 Satish Balay <balay@mcs.anl.gov>

Merge branch 'barry/2023-03-01/fix-mat-man-pages' into 'main'

Cleanup of mat manual pages

See merge request petsc/petsc!6134


# 2ef1f0ff 01-Mar-2023 Barry Smith <bsmith@mcs.anl.gov>

Cleanup of mat manual pages

Commit-type: documentation


# 7a3a620f 24-Feb-2023 Satish Balay <balay@mcs.anl.gov>

Merge branch 'jolivet/housekeeping' into 'main'

Double spaces, wrong backticks, or unneeded braces

See merge request petsc/petsc!6110


# aa624791 24-Feb-2023 Pierre Jolivet <pierre@joliv.et>

Double spaces, wrong backticks, or unneeded braces


# a682ec2a 23-Feb-2023 Satish Balay <balay@mcs.anl.gov>

Merge branch 'barry/2023-02-17/man-page-fixes-huge-jacob-automatic' into 'main'

A variety of manual page fixes for problems found by Jacob's lint or noted...

See merge request petsc/petsc!6088


# 27430b45 23-Feb-2023 Barry Smith <bsmith@mcs.anl.gov>

A variety of manual page fixes for problems found by Jacob's lint or noted while fixing those problems

Commit-type: docs-only


# 2975ceb4 13-Feb-2023 Satish Balay <balay@mcs.anl.gov>

Merge remote-tracking branch 'origin/release'


# 0f3c9fe5 13-Feb-2023 Satish Balay <balay@mcs.anl.gov>

Merge branch 'barry/2023-02-07/fix-man-pages/release' into 'release'

Fix a few manual pages using Jacob's make lint information

See merge request petsc/petsc!6029


# 67be906f 07-Feb-2023 Barry Smith <bsmith@mcs.anl.gov>

Fix a few manual pages using Jacob's make lint information

Commit-type: documentation


# 37d05b02 06-Feb-2023 Satish Balay <balay@mcs.anl.gov>

Merge remote-tracking branch 'origin/release'


# b877537e 05-Feb-2023 Satish Balay <balay@mcs.anl.gov>

Merge branch 'jolivet/fix-typos' into 'release'

Fix Typos

See merge request petsc/petsc!6024


# da81f932 05-Feb-2023 Pierre Jolivet <pierre@joliv.et>

Fix Typos


# 31d78bcd 02-Feb-2023 Satish Balay <balay@mcs.anl.gov>

Merge branch 'jacobf/2022-12-10/petscerrorcode-nodiscard' into 'main'

Feature: Non-discardable PetscErrorCode

See merge request petsc/petsc!5923


# 3ba16761 10-Dec-2022 Jacob Faibussowitsch <jacob.fai@gmail.com>

Make PetscErrorCode a non-discardable enum


# d441b7a2 12-Nov-2022 Satish Balay <balay@mcs.anl.gov>

Merge branch 'hongzh/improve-fd-coloring' into 'main'

Add MatEliminateZeros

See merge request petsc/petsc!5816


# dec0b466 07-Nov-2022 Hong Zhang <hongzhang@anl.gov>

Add MatEliminateZeros

