| 0fa67573 | 26-Dec-2021 |
Jed Brown <jed@jedbrown.org> |
thrust: use thrust::cuda::par_nosync in vector operations
Version 1.16 of Thrust adds the execution policy thrust::cuda::par_nosync, which accepts a stream argument and does not synchronize. This avoids a stall in which the CPU waits to learn that a kernel has completed before launching its next operation.
https://github.com/NVIDIA/thrust/pull/1568
This behavior (not blocking on kernels that don't need it) was removed as a breaking change in Thrust-1.9.4, both to simplify error handling and because a futures-based async interface had been deemed sufficient. The following issue describes the history and rationale for the new par_nosync feature.
https://github.com/NVIDIA/thrust/issues/1515
|