tf::cudaStreamBase< Creator, Deleter > Class Template Reference

class to create a CUDA stream with unique ownership

#include <taskflow/cuda/cuda_stream.hpp>

Public Types

using base_type = std::unique_ptr<std::remove_pointer_t<cudaStream_t>, Deleter>
 base type for the underlying unique pointer
 

Public Member Functions

template<typename... ArgsT>
 cudaStreamBase (ArgsT &&... args)
 constructs a cudaStream object by passing the given arguments to the stream creator
 
 cudaStreamBase (cudaStreamBase &&)=default
 constructs a cudaStream from the given rhs using move semantics
 
cudaStreamBase & operator= (cudaStreamBase &&)=default
 assign the rhs to *this using move semantics
 
cudaStreamBase & synchronize ()
 synchronizes the associated stream
 
void begin_capture (cudaStreamCaptureMode m=cudaStreamCaptureModeGlobal) const
 begins graph capturing on the stream
 
cudaGraph_t end_capture () const
 ends graph capturing on the stream
 
void record (cudaEvent_t event) const
 records an event on the stream
 
void wait (cudaEvent_t event) const
 waits on an event
 
template<typename C, typename D>
cudaStreamBase & run (const cudaGraphExecBase< C, D > &exec)
 runs the given executable CUDA graph
 
cudaStreamBase & run (cudaGraphExec_t exec)
 runs the given executable CUDA graph
 

Detailed Description

template<typename Creator, typename Deleter>
class tf::cudaStreamBase< Creator, Deleter >

class to create a CUDA stream with unique ownership

Template Parameters
Creator functor to create the stream (used in constructor)
Deleter functor to delete the stream (used in destructor)

The cudaStream class encapsulates a cudaStream_t using std::unique_ptr, ensuring that the CUDA stream is properly created and destroyed with unique ownership.

Member Typedef Documentation

◆ base_type

template<typename Creator, typename Deleter>
using tf::cudaStreamBase< Creator, Deleter >::base_type = std::unique_ptr<std::remove_pointer_t<cudaStream_t>, Deleter>

base type for the underlying unique pointer

This alias provides a shorthand for the underlying std::unique_ptr type that manages CUDA stream resources with an associated deleter.

Constructor & Destructor Documentation

◆ cudaStreamBase()

template<typename Creator, typename Deleter>
template<typename... ArgsT>
tf::cudaStreamBase< Creator, Deleter >::cudaStreamBase ( ArgsT &&... args)
inline explicit

constructs a cudaStream object by passing the given arguments to the stream creator

Parameters
args arguments to pass to the stream creator
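
The ownership scheme described above (a Creator functor invoked with the forwarded constructor arguments, and a Deleter functor used by the underlying std::unique_ptr) can be sketched without CUDA. The following is a minimal illustration only, not Taskflow's implementation: FakeStream_st, FakeStream_t, FakeCreator, FakeDeleter, HandleBase, FakeHandle, and demo_ownership are all hypothetical names that stand in for cudaStream_t and its creator and deleter.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for an opaque C handle such as cudaStream_t.
struct FakeStream_st { int priority; };
using FakeStream_t = FakeStream_st*;

// Counters that let the sketch observe creation and destruction.
static int g_created = 0;
static int g_destroyed = 0;

// Hypothetical creator: builds a handle from the forwarded arguments.
struct FakeCreator {
  FakeStream_t operator()() const { ++g_created; return new FakeStream_st{0}; }
  FakeStream_t operator()(int priority) const {
    ++g_created;
    return new FakeStream_st{priority};
  }
};

// Hypothetical deleter: releases the handle exactly once.
struct FakeDeleter {
  void operator()(FakeStream_t s) const { ++g_destroyed; delete s; }
};

// Move-only wrapper following the Creator/Deleter pattern of this page.
template <typename Creator, typename Deleter>
class HandleBase
  : public std::unique_ptr<std::remove_pointer_t<FakeStream_t>, Deleter> {
 public:
  using base_type =
    std::unique_ptr<std::remove_pointer_t<FakeStream_t>, Deleter>;

  // Forwards the given arguments to the creator.
  template <typename... ArgsT>
  explicit HandleBase(ArgsT&&... args)
    : base_type(Creator{}(std::forward<ArgsT>(args)...), Deleter{}) {}

  HandleBase(HandleBase&&) = default;
  HandleBase& operator=(HandleBase&&) = default;
};

using FakeHandle = HandleBase<FakeCreator, FakeDeleter>;

// Creates one handle, moves it, and returns how many handles were destroyed.
int demo_ownership() {
  g_created = 0;
  g_destroyed = 0;
  {
    FakeHandle a(3);             // creator is called with priority 3
    FakeHandle b(std::move(a));  // ownership moves from a to b
    assert(a.get() == nullptr);  // a no longer owns anything
    assert(b->priority == 3);
  }                              // b's deleter runs here, once
  return g_destroyed;
}
```

Because the wrapper declares only move operations, copying is disabled, so each handle has one owner and is released once.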

Member Function Documentation

◆ begin_capture()

template<typename Creator, typename Deleter>
void tf::cudaStreamBase< Creator, Deleter >::begin_capture ( cudaStreamCaptureMode m = cudaStreamCaptureModeGlobal) const
inline

begins graph capturing on the stream

While a stream is in capture mode, operations pushed into it are not executed; they are recorded into a graph, which cudaStream::end_capture returns.

The capture mode m can be one of the following:

  • cudaStreamCaptureModeGlobal: This is the default mode. If the local thread has an ongoing capture sequence that was not initiated with cudaStreamCaptureModeRelaxed at cudaStreamBeginCapture, or if any other thread has a concurrent capture sequence initiated with cudaStreamCaptureModeGlobal, this thread is prohibited from potentially unsafe API calls.
  • cudaStreamCaptureModeThreadLocal: If the local thread has an ongoing capture sequence not initiated with cudaStreamCaptureModeRelaxed, it is prohibited from potentially unsafe API calls. Concurrent capture sequences in other threads are ignored.
  • cudaStreamCaptureModeRelaxed: The local thread is not prohibited from potentially unsafe API calls. Note that the thread is still prohibited from API calls which necessarily conflict with stream capture, for example, attempting cudaEventQuery on an event that was last recorded inside a capture sequence.

◆ end_capture()

template<typename Creator, typename Deleter>
cudaGraph_t tf::cudaStreamBase< Creator, Deleter >::end_capture ( ) const
inline

ends graph capturing on the stream

Equivalent to calling cudaStreamEndCapture to end capture on the stream and return the captured graph. Capture must have been initiated on the stream via a call to cudaStream::begin_capture. If capture was invalidated because a rule of stream capture was violated, a NULL graph is returned.
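
A capture sequence using begin_capture and end_capture might look like the sketch below. It assumes that tf::cudaStream is the concrete alias of this class template with Taskflow's default creator and deleter, that the raw cudaStream_t is reachable through get() of the underlying std::unique_ptr, and that the code is built with a CUDA toolchain that has Taskflow on its include path. None of these are confirmed by this page, so treat it as an illustration rather than the library's prescribed usage.

```cpp
#include <taskflow/cuda/cuda_stream.hpp>  // header named on this page
#include <cuda_runtime.h>
#include <cstddef>
#include <cstdio>

int main() {
  constexpr std::size_t bytes = 1024 * sizeof(int);

  // Allocate before capture begins: allocation calls such as cudaMalloc are
  // commonly listed among the potentially unsafe calls in global mode.
  int* d_data = nullptr;
  if (cudaMalloc(&d_data, bytes) != cudaSuccess) {
    std::fprintf(stderr, "cudaMalloc failed\n");
    return 1;
  }

  tf::cudaStream stream;   // assumption: alias of cudaStreamBase<...>
  stream.begin_capture();  // default mode: cudaStreamCaptureModeGlobal

  // This operation is recorded into the graph; it is not executed here.
  // Assumption: get() exposes the raw cudaStream_t.
  cudaMemsetAsync(d_data, 0, bytes, stream.get());

  cudaGraph_t graph = stream.end_capture();
  if (graph == nullptr) {
    std::fprintf(stderr, "capture was invalidated\n");
  } else {
    cudaGraphDestroy(graph);  // the caller owns the returned graph
  }

  cudaFree(d_data);
  return 0;
}
```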

◆ record()

template<typename Creator, typename Deleter>
void tf::cudaStreamBase< Creator, Deleter >::record ( cudaEvent_t event) const
inline

records an event on the stream

Equivalent to calling cudaEventRecord to record the event on this stream. The event and the stream must belong to the same CUDA context.

◆ run() [1/2]

template<typename Creator, typename Deleter>
template<typename C, typename D>
cudaStreamBase & tf::cudaStreamBase< Creator, Deleter >::run ( const cudaGraphExecBase< C, D > & exec)

runs the given executable CUDA graph

Parameters
exec the given cudaGraphExec

◆ run() [2/2]

template<typename SC, typename SD>
cudaStreamBase< SC, SD > & tf::cudaStreamBase< SC, SD >::run ( cudaGraphExec_t exec)

runs the given executable CUDA graph

Parameters
exec the given cudaGraphExec_t

◆ synchronize()

template<typename Creator, typename Deleter>
cudaStreamBase & tf::cudaStreamBase< Creator, Deleter >::synchronize ( )
inline

synchronizes the associated stream

Equivalent to calling cudaStreamSynchronize to block until this stream has completed all operations.
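
Since both run and synchronize return a reference to the stream, the two calls can be chained. The sketch below builds a trivial one-node graph with the CUDA runtime API, launches it on a stream, and waits for it to finish. It assumes that tf::cudaStream is the concrete alias of this class template and that the toolkit provides cudaGraphInstantiateWithFlags (CUDA 11.4 or newer); error handling is kept minimal for brevity.

```cpp
#include <taskflow/cuda/cuda_stream.hpp>  // header named on this page
#include <cuda_runtime.h>
#include <cstdio>

int main() {
  cudaGraph_t graph = nullptr;
  cudaGraphNode_t node = nullptr;
  cudaGraphExec_t exec = nullptr;

  // Build and instantiate a graph that contains a single empty node.
  if (cudaGraphCreate(&graph, 0) != cudaSuccess ||
      cudaGraphAddEmptyNode(&node, graph, nullptr, 0) != cudaSuccess ||
      cudaGraphInstantiateWithFlags(&exec, graph, 0) != cudaSuccess) {
    std::fprintf(stderr, "failed to build the CUDA graph\n");
    return 1;
  }

  tf::cudaStream stream;  // assumption: alias of cudaStreamBase<...>

  // Launch the executable graph on the stream, then block until it is done.
  stream.run(exec).synchronize();

  cudaGraphExecDestroy(exec);
  cudaGraphDestroy(graph);
  return 0;
}
```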

◆ wait()

template<typename Creator, typename Deleter>
void tf::cudaStreamBase< Creator, Deleter >::wait ( cudaEvent_t event) const
inline

waits on an event

Equivalent to calling cudaStreamWaitEvent: all future work submitted to this stream waits for the work captured in the event.
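
Together, record and wait can order work across two streams. In the sketch below a producer stream clears a device buffer and records an event, and a consumer stream waits on that event before copying the buffer back to the host. As before, tf::cudaStream and the get() accessor are assumptions about the surrounding library, and the variable names are illustrative.

```cpp
#include <taskflow/cuda/cuda_stream.hpp>  // header named on this page
#include <cuda_runtime.h>
#include <cstddef>
#include <cstdio>
#include <vector>

int main() {
  constexpr std::size_t n = 256;
  constexpr std::size_t bytes = n * sizeof(int);

  std::vector<int> host(n, 1);  // starts non-zero so the copy is observable
  int* d_data = nullptr;
  cudaEvent_t done = nullptr;
  if (cudaMalloc(&d_data, bytes) != cudaSuccess ||
      cudaEventCreate(&done) != cudaSuccess) {
    std::fprintf(stderr, "setup failed\n");
    return 1;
  }

  tf::cudaStream producer;  // assumption: alias of cudaStreamBase<...>
  tf::cudaStream consumer;

  // Producer: clear the device buffer, then mark that point with the event.
  cudaMemsetAsync(d_data, 0, bytes, producer.get());
  producer.record(done);

  // Consumer: work submitted after wait() starts only once the event fires.
  consumer.wait(done);
  cudaMemcpyAsync(host.data(), d_data, bytes,
                  cudaMemcpyDeviceToHost, consumer.get());
  consumer.synchronize();

  std::printf("host[0] = %d\n", host[0]);

  cudaEventDestroy(done);
  cudaFree(d_data);
  return 0;
}
```

Without the wait call, the copy on the consumer stream would have no ordering guarantee relative to the memset on the producer stream.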


The documentation for this class was generated from the following files: