#include <taskflow/cuda/cuda_stream.hpp>
template<typename Creator, typename Deleter>
cudaStreamBase class
class to create a smart pointer wrapper for managing cudaStream_t
Template parameters | |
---|---|
Creator | functor to create the stream (used in constructor) |
Deleter | functor to delete the stream (used in destructor) |
The cudaStream class encapsulates a cudaStream_t using std::unique_ptr, ensuring that CUDA streams are properly created and destroyed with unique ownership.
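The ownership pattern described above can be sketched without a GPU. The following is a minimal illustration, not Taskflow's implementation: `FakeStream`, `fake_stream_t`, `FakeStreamCreator`, `FakeStreamDeleter`, `StreamBaseSketch`, and `live_streams` are hypothetical stand-ins, and holding the `std::unique_ptr` as a member (rather than some other arrangement) is an assumption made for brevity.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the opaque struct behind a CUDA stream handle.
struct FakeStream { int priority; };
using fake_stream_t = FakeStream*;  // plays the role of cudaStream_t

static int live_streams = 0;  // number of stand-in streams currently alive

// Hypothetical creator functor: plays the role of the Creator parameter.
struct FakeStreamCreator {
  fake_stream_t operator()(int priority = 0) const {
    ++live_streams;
    return new FakeStream{priority};
  }
};

// Hypothetical deleter functor: plays the role of the Deleter parameter.
struct FakeStreamDeleter {
  void operator()(fake_stream_t s) const {
    --live_streams;
    delete s;
  }
};

// Sketch of a move-only wrapper parameterized on Creator and Deleter.
template <typename Creator, typename Deleter>
class StreamBaseSketch {
 public:
  using base_type =
      std::unique_ptr<std::remove_pointer_t<fake_stream_t>, Deleter>;

  // Forwards all arguments to the creator and takes ownership of the result.
  template <typename... ArgsT>
  explicit StreamBaseSketch(ArgsT&&... args)
      : ptr_(Creator{}(std::forward<ArgsT>(args)...), Deleter{}) {}

  StreamBaseSketch(StreamBaseSketch&&) = default;
  StreamBaseSketch& operator=(StreamBaseSketch&&) = default;

  // Returns the raw handle, or nullptr after the wrapper has been moved from.
  fake_stream_t get() const { return ptr_.get(); }

 private:
  base_type ptr_;  // unique ownership; the deleter runs on destruction
};
```

Constructing `StreamBaseSketch<FakeStreamCreator, FakeStreamDeleter> s(3);` calls the creator with `3`; moving `s` transfers the handle and leaves `s.get()` null, and the deleter runs exactly once when the final owner goes out of scope.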
Public types
- using base_type = std::unique_ptr<std::remove_pointer_t<cudaStream_t>, Deleter>
- base type for the underlying unique pointer
Constructors, destructors, conversion operators
- template<typename... ArgsT> cudaStreamBase(ArgsT && ... args) explicit
- constructs a cudaStream object by passing the given arguments to the stream creator
- cudaStreamBase(cudaStreamBase&&) defaulted
- constructs a cudaStream from the given rhs using move semantics
Public functions
- auto operator=(cudaStreamBase&&) -> cudaStreamBase& defaulted
- assigns the rhs to *this using move semantics
- auto synchronize() -> cudaStreamBase&
- synchronizes the associated stream
- void begin_capture(cudaStreamCaptureMode m = cudaStreamCaptureModeGlobal) const
- begins graph capturing on the stream
- auto end_capture() const -> cudaGraph_t
- ends graph capturing on the stream
- void record(cudaEvent_t event) const
- records an event on the stream
- void wait(cudaEvent_t event) const
- waits on an event
- template<typename C, typename D> auto run(const cudaGraphExecBase<C, D>& exec) -> cudaStreamBase&
- runs the given executable CUDA graph
- auto run(cudaGraphExec_t exec) -> cudaStreamBase&
- runs the given executable CUDA graph
Typedef documentation
template<typename Creator, typename Deleter>
using tf:: cudaStreamBase<Creator, Deleter>:: base_type = std:: unique_ptr<std::remove_pointer_t<cudaStream_t>, Deleter>
base type for the underlying unique pointer
This alias provides a shorthand for the underlying std::unique_ptr type that manages CUDA stream resources with an associated deleter.
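As a rough illustration of why the alias applies `std::remove_pointer_t` to the handle type: `std::unique_ptr<T, D>` stores a `T*` by default, so a handle that is itself a pointer must be stripped to its pointee first. The names `StreamStandIn`, `stream_handle_t`, `NoopDeleter`, `base_type_sketch`, and `owns_same_handle` below are hypothetical stand-ins, not CUDA or Taskflow identifiers.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for the opaque struct behind a stream handle.
struct StreamStandIn {};
using stream_handle_t = StreamStandIn*;  // plays the role of cudaStream_t

// Hypothetical deleter that releases nothing, so the sketch can borrow
// a static object instead of allocating one.
struct NoopDeleter {
  void operator()(stream_handle_t) const noexcept {}
};

// Same shape as base_type: strip the pointer from the handle type.
using base_type_sketch =
    std::unique_ptr<std::remove_pointer_t<stream_handle_t>, NoopDeleter>;

static_assert(
    std::is_same<base_type_sketch::element_type, StreamStandIn>::value,
    "element type is the pointee of the handle");
static_assert(
    std::is_same<base_type_sketch::pointer, stream_handle_t>::value,
    "stored pointer type is the handle type itself");

// Returns true when the wrapper hands back exactly the handle it was given.
inline bool owns_same_handle() {
  static StreamStandIn storage;
  base_type_sketch p(&storage);  // NoopDeleter leaves storage untouched
  return p.get() == &storage;
}
```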
Function documentation
template<typename Creator, typename Deleter>
template<typename... ArgsT>
tf:: cudaStreamBase<Creator, Deleter>:: cudaStreamBase(ArgsT && ... args) explicit
constructs a cudaStream object by passing the given arguments to the stream creator
Parameters | |
---|---|
args | arguments to pass to the stream creator |
template<typename Creator, typename Deleter>
cudaStreamBase& tf:: cudaStreamBase<Creator, Deleter>:: synchronize()
synchronizes the associated stream
Equivalent to calling cudaStreamSynchronize to block until this stream has completed all operations.
template<typename Creator, typename Deleter>
void tf:: cudaStreamBase<Creator, Deleter>:: begin_capture(cudaStreamCaptureMode m = cudaStreamCaptureModeGlobal) const
begins graph capturing on the stream
When a stream is in capture mode, operations pushed into the stream are not executed. They are instead captured into a graph, which is returned via cudaStream::end_capture.
A thread's mode can be one of the following:
- cudaStreamCaptureModeGlobal: This is the default mode. If the local thread has an ongoing capture sequence that was not initiated with cudaStreamCaptureModeRelaxed at cuStreamBeginCapture, or if any other thread has a concurrent capture sequence initiated with cudaStreamCaptureModeGlobal, this thread is prohibited from potentially unsafe API calls.
- cudaStreamCaptureModeThreadLocal: If the local thread has an ongoing capture sequence not initiated with cudaStreamCaptureModeRelaxed, it is prohibited from potentially unsafe API calls. Concurrent capture sequences in other threads are ignored.
- cudaStreamCaptureModeRelaxed: The local thread is not prohibited from potentially unsafe API calls. Note that the thread is still prohibited from API calls which necessarily conflict with stream capture, for example, attempting cudaEventQuery on an event that was last recorded inside a capture sequence.
template<typename Creator, typename Deleter>
cudaGraph_t tf:: cudaStreamBase<Creator, Deleter>:: end_capture() const
ends graph capturing on the stream
Equivalent to calling cudaStreamEndCapture to end capture on this stream and return the captured graph. Capture must have been initiated on the stream via a call to cudaStream::begin_capture.
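The capture semantics can be modeled on the host with a toy stream. This is a conceptual sketch only and calls no CUDA or Taskflow API: `ToyOp`, `ToyGraph`, `ToyStream`, and `capture_demo` are hypothetical, the "graph" is just a list of callables, and `synchronize()` runs queued work inline instead of waiting on a device.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-ins: a unit of work and a captured "graph" of work.
using ToyOp = std::function<void()>;
using ToyGraph = std::vector<ToyOp>;  // plays the role of cudaGraph_t

class ToyStream {
 public:
  // Enqueue an operation; while capturing, record it into the graph instead.
  void push(ToyOp op) {
    if (capturing_) graph_.push_back(std::move(op));
    else            pending_.push_back(std::move(op));
  }

  // Switch the stream into capture mode.
  void begin_capture() {
    assert(!capturing_ && "capture already in progress");
    capturing_ = true;
    graph_.clear();
  }

  // Leave capture mode and hand back everything that was recorded.
  ToyGraph end_capture() {
    assert(capturing_ && "end_capture requires a prior begin_capture");
    capturing_ = false;
    ToyGraph captured = std::move(graph_);
    graph_.clear();
    return captured;
  }

  // Enqueue every operation of a previously captured graph.
  ToyStream& run(const ToyGraph& graph) {
    for (const auto& op : graph) push(op);
    return *this;
  }

  // Complete all pending work (here: execute it inline, in order).
  ToyStream& synchronize() {
    std::vector<ToyOp> work = std::move(pending_);
    pending_.clear();
    for (auto& op : work) op();
    return *this;
  }

 private:
  bool capturing_ = false;
  ToyGraph graph_;
  std::vector<ToyOp> pending_;
};

// Returns the counter value observed after each step of the demo.
inline std::vector<int> capture_demo() {
  int counter = 0;
  ToyStream stream;
  std::vector<int> seen;

  stream.begin_capture();
  stream.push([&counter] { counter += 1; });
  stream.push([&counter] { counter += 10; });
  ToyGraph graph = stream.end_capture();
  stream.synchronize();
  seen.push_back(counter);  // captured work was recorded, not executed

  stream.run(graph).synchronize();
  seen.push_back(counter);  // one replay of the graph

  stream.run(graph).run(graph).synchronize();
  seen.push_back(counter);  // two more replays
  return seen;
}
```

In this model the work pushed between `begin_capture()` and `end_capture()` leaves the counter untouched until the graph is replayed, and because `run` and `synchronize` return a reference to the stream, calls can be chained.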
template<typename Creator, typename Deleter>
void tf:: cudaStreamBase<Creator, Deleter>:: record(cudaEvent_t event) const
records an event on the stream
Equivalent to calling cudaEventRecord to record an event on this stream. The event and the stream must belong to the same CUDA context.
template<typename Creator, typename Deleter>
void tf:: cudaStreamBase<Creator, Deleter>:: wait(cudaEvent_t event) const
waits on an event
Equivalent to calling cudaStreamWaitEvent to make all future work submitted to this stream wait for all work captured in the event.
template<typename Creator, typename Deleter>
template<typename C, typename D>
cudaStreamBase& tf:: cudaStreamBase<Creator, Deleter>:: run(const cudaGraphExecBase<C, D>& exec)
runs the given executable CUDA graph
Parameters | |
---|---|
exec | the given cudaGraphExec |
template<typename Creator, typename Deleter>
cudaStreamBase& tf:: cudaStreamBase<Creator, Deleter>:: run(cudaGraphExec_t exec)
runs the given executable CUDA graph
Parameters | |
---|---|
exec | the given cudaGraphExec_t |