tf::cudaFlowSequentialOptimizer class

class to capture a CUDA graph sequentially using a single stream

A sequential capturing algorithm finds a topological order of the described task graph and captures the GPU tasks in that order using a single stream. All GPU tasks therefore run sequentially, and no inter-task dependency is violated.
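
The ordering step described above can be illustrated with a small, self-contained sketch. This is not Taskflow's implementation: the function name sequential_order, the edge-list representation, and the choice of Kahn's algorithm are all assumptions made for illustration. It only shows how a topological order lets every task be issued one after another without violating a dependency.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Taskflow): returns a topological order of
// a DAG with n tasks, where an edge {u, v} means "u must run before v".
std::vector<std::size_t> sequential_order(
    std::size_t n,
    const std::vector<std::pair<std::size_t, std::size_t>>& edges) {
  std::vector<std::vector<std::size_t>> successors(n);
  std::vector<std::size_t> in_degree(n, 0);
  for (const auto& e : edges) {
    successors[e.first].push_back(e.second);
    ++in_degree[e.second];
  }

  // Kahn's algorithm: repeatedly emit a task whose dependencies are all met.
  std::queue<std::size_t> ready;
  for (std::size_t i = 0; i < n; ++i) {
    if (in_degree[i] == 0) ready.push(i);
  }

  std::vector<std::size_t> order;
  order.reserve(n);
  while (!ready.empty()) {
    const std::size_t u = ready.front();
    ready.pop();
    // A sequential capturer would issue task u on its single stream here.
    order.push_back(u);
    for (std::size_t v : successors[u]) {
      if (--in_degree[v] == 0) ready.push(v);
    }
  }

  // Every task is emitted exactly once only if the graph has no cycle.
  assert(order.size() == n && "graph must be acyclic");
  return order;
}
```

For a diamond-shaped graph with edges 0 -> 1, 0 -> 2, 1 -> 3, and 2 -> 3, calling sequential_order(4, edges) yields the order 0, 1, 2, 3. Tasks 1 and 2 are independent of each other, yet with a single stream they still execute one after the other, which is the trade-off of the sequential approach: simplicity in exchange for no concurrency between independent tasks.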

Constructors, destructors, conversion operators

cudaFlowSequentialOptimizer() defaulted
constructs a sequential optimizer