tf Namespace Reference

taskflow namespace More...

Classes

class  AsyncTask
 class to hold a dependent asynchronous task with shared ownership More...
 
class  AtomicIntrusiveStack
 class to create a lock-free, ABA-safe intrusive stack More...
 
class  BoundedWSQ
 class to create a lock-free bounded work-stealing queue More...
 
class  CachelineAligned
 class to ensure cacheline-aligned storage for an object. More...
 
class  cudaEventBase
 class to create a CUDA event with unique ownership More...
 
class  cudaEventCreator
 class to create functors that construct CUDA events More...
 
class  cudaEventDeleter
 class to create a functor that deletes a CUDA event More...
 
class  cudaGraphBase
 class to create a CUDA graph with unique ownership More...
 
class  cudaGraphCreator
 class to create functors that construct CUDA graphs More...
 
class  cudaGraphDeleter
 class to create a functor that deletes a CUDA graph More...
 
class  cudaGraphExecBase
 class to create an executable CUDA graph with unique ownership More...
 
class  cudaGraphExecCreator
 class to create functors for constructing executable CUDA graphs More...
 
class  cudaGraphExecDeleter
 class to create a functor for deleting an executable CUDA graph More...
 
class  cudaScopedDevice
 class to create an RAII-styled context switch More...
 
class  cudaStreamBase
 class to create a CUDA stream with unique ownership More...
 
class  cudaStreamCreator
 class to create functors that construct CUDA streams More...
 
class  cudaStreamDeleter
 class to create a functor that deletes a CUDA stream More...
 
class  cudaTask
 class to create a task handle of a CUDA Graph node More...
 
class  DataPipe
 class to create a stage in a data-parallel pipeline More...
 
class  DataPipeline
 class to create a data-parallel pipeline scheduling framework More...
 
class  DefaultClosureWrapper
 class to create a default closure wrapper More...
 
class  DefaultTaskParams
 class to create an empty task parameter for compile-time optimization More...
 
class  DynamicPartitioner
 class to create a dynamic partitioner for scheduling parallel algorithms More...
 
class  Executor
 class to create an executor More...
 
class  FlowBuilder
 class to build a task dependency graph More...
 
class  Future
 class to access the result of an execution More...
 
class  Graph
 class to create a graph object More...
 
class  GuidedPartitioner
 class to create a guided partitioner for scheduling parallel algorithms More...
 
class  IndexRange
 class to create an N-dimensional index range of integral indices More...
 
class  IndexRange< T, 1 >
 class to create a 1D index range of integral indices with a step size More...
 
class  NonblockingNotifier
 class to create a non-blocking notifier More...
 
class  ObserverInterface
 class to derive an executor observer More...
 
class  PartitionerBase
 class to derive a partitioner for scheduling parallel algorithms More...
 
class  Pipe
 class to create a pipe object for a pipeline stage More...
 
class  Pipeflow
 class to create a pipeflow object used by the pipe callable More...
 
class  Pipeline
 class to create a pipeline scheduling framework More...
 
class  RandomPartitioner
 class to construct a random partitioner for scheduling parallel algorithms More...
 
class  Runtime
 class to create a runtime task More...
 
class  ScalablePipeline
 class to create a scalable pipeline object More...
 
class  Semaphore
 class to create a semaphore object for building a concurrency constraint More...
 
class  SmallVector
 class to define a vector optimized for small arrays More...
 
class  StaticPartitioner
 class to construct a static partitioner for scheduling parallel algorithms More...
 
class  Subflow
 class to construct a subflow graph from the execution of a dynamic task More...
 
class  Task
 class to create a task handle over a taskflow node More...
 
class  Taskflow
 class to create a taskflow object More...
 
class  TaskGroup
 class to create a task group from a task More...
 
class  TaskParams
 class to create a task parameter object More...
 
class  TaskView
 class to access task information from the observer interface More...
 
class  UnboundedWSQ
 class to create a lock-free unbounded work-stealing queue More...
 
class  Worker
 class to create a worker in an executor More...
 
class  WorkerInterface
 class to configure worker behavior in an executor More...
 
class  WorkerView
 class to create an immutable view of a worker More...
 
class  Xorshift
 class to create a fast xorshift-based pseudo-random number generator More...
 

Concepts

concept  IndexRangeLike
 concept to check if a type is a tf::IndexRange, regardless of dimensionality
 
concept  IndexRange1DLike
 concept to check if a type is a tf::IndexRange<T, 1>.
 
concept  IndexRangeMDLike
 concept to check if a type is a tf::IndexRange<T, N> with rank > 1
 
concept  TaskParameters
 determines if a type is a task parameter type
 
concept  HasGraph
 concept that determines if a type owns or provides access to a tf::Graph
 
concept  StaticTask
 determines if a callable is a static task
 
concept  SubflowTask
 determines if a callable is a subflow task
 
concept  RuntimeTask
 determines if a callable is a runtime task
 
concept  ConditionTask
 determines if a callable is a condition task
 
concept  MultiConditionTask
 determines if a callable is a multi-condition task
 
concept  Partitioner
 determines if a type is a partitioner
 

Typedefs

using DefaultNotifier = NonblockingNotifier
 the default notifier type used by Taskflow
 
using observer_stamp_t = std::chrono::time_point<std::chrono::steady_clock>
 default time point type of observers
 
using DefaultPartitioner = GuidedPartitioner<>
 default partitioner set to tf::GuidedPartitioner
 
using cudaEvent = cudaEventBase<cudaEventCreator, cudaEventDeleter>
 default smart pointer type to manage a cudaEvent_t object with unique ownership
 
using cudaStream = cudaStreamBase<cudaStreamCreator, cudaStreamDeleter>
 default smart pointer type to manage a cudaStream_t object with unique ownership
 
using cudaGraph = cudaGraphBase<cudaGraphCreator, cudaGraphDeleter>
 default smart pointer type to manage a cudaGraph_t object with unique ownership
 
using cudaGraphExec = cudaGraphExecBase<cudaGraphExecCreator, cudaGraphExecDeleter>
 default smart pointer type to manage a cudaGraphExec_t object with unique ownership
 

Enumerations

enum class  TaskType : int {
  PLACEHOLDER = 0 , STATIC , RUNTIME , SUBFLOW ,
  CONDITION , MODULE , ASYNC , UNDEFINED
}
 enumeration of all task types More...
 
enum class  PartitionerType : int { STATIC , DYNAMIC }
 enumeration of all partitioner types More...
 
enum class  PipeType : int { PARALLEL = 1 , SERIAL = 2 }
 enumeration of all pipe types More...
 

Functions

template<typename T>
requires (std::is_unsigned_v<std::decay_t<T>> && sizeof(T) == 8)
constexpr T next_pow2 (T x)
 rounds the given 64-bit unsigned integer to the nearest power of 2
 
template<typename T>
requires (std::is_unsigned_v<std::decay_t<T>> && sizeof(T) == 4)
constexpr T next_pow2 (T y)
 rounds the given 32-bit unsigned integer to the nearest power of 2
 
template<std::integral T>
constexpr bool is_pow2 (const T &x)
 checks if the given number is a power of 2
 
template<size_t N>
constexpr size_t static_floor_log2 ()
 returns the floor of log2(N) at compile time
 
template<typename RandItr, typename C>
RandItr median_of_three (RandItr l, RandItr m, RandItr r, C cmp)
 finds the median of three numbers pointed to by iterators using the given comparator
 
template<typename RandItr, typename C>
RandItr pseudo_median_of_nine (RandItr beg, RandItr end, C cmp)
 finds the pseudo median of a range of items using a spread of nine numbers
 
template<typename Iter, typename Compare>
void sort2 (Iter a, Iter b, Compare comp)
 sorts two elements of dereferenced iterators using the given comparison function
 
template<typename Iter, typename Compare>
void sort3 (Iter a, Iter b, Iter c, Compare comp)
 Sorts three elements of dereferenced iterators using the given comparison function.
 
template<std::integral T>
T unique_id ()
 generates a program-wide unique ID of the given type in a thread-safe manner
 
template<typename T>
void atomic_max (std::atomic< T > &v, const T &max_v) noexcept
 updates an atomic variable with the maximum value
 
template<typename T>
void atomic_min (std::atomic< T > &v, const T &min_v) noexcept
 updates an atomic variable with the minimum value
 
template<typename T>
T seed () noexcept
 generates a random seed based on the current system clock
 
constexpr size_t coprime (size_t N)
 computes a coprime of a given number
 
template<size_t N>
constexpr std::array< size_t, N > make_coprime_lut ()
 generates a compile-time array of coprimes for numbers from 0 to N-1
 
std::string get_env (const std::string &str)
 retrieves the value of an environment variable
 
bool has_env (const std::string &str)
 checks whether an environment variable is defined
 
template<std::integral T>
constexpr bool is_index_range_invalid (T beg, T end, T step)
 checks if the given index range is invalid
 
template<std::integral T>
constexpr size_t distance (T beg, T end, T step)
 calculates the number of iterations in the given index range
 
template<std::integral T>
 IndexRange (T, T, T) -> IndexRange< T, 1 >
 deduction guide for tf::IndexRange<T, 1>
 
template<HasGraph T>
Graph & retrieve_graph (T &target)
 retrieves a reference to the underlying tf::Graph from an object
 
template<typename T>
constexpr auto wsq_empty_value ()
 returns the empty sentinel for work-stealing steal operations
 
template<typename T>
auto wsq_contended_value ()
 returns the contended sentinel for work-stealing steal operations
 
template<typename T, typename... ArgsT>
std::shared_ptr< T > make_worker_interface (ArgsT &&... args)
 helper function to create an instance derived from tf::WorkerInterface
 
const char * to_string (TaskType type)
 convert a task type to a human-readable string
 
std::ostream & operator<< (std::ostream &os, const Task &task)
 overload of ostream inserter operator for Task
 
template<typename Input, typename Output, typename C>
auto make_data_pipe (PipeType d, C &&callable)
 function to construct a data pipe (tf::DataPipe)
 
template<HasGraph T>
auto make_module_task (T &target)
 creates a module task using the given graph
 
size_t cuda_get_num_devices ()
 queries the number of available devices
 
int cuda_get_device ()
 gets the current device associated with the caller thread
 
void cuda_set_device (int id)
 switches to a given device context
 
void cuda_get_device_property (int i, cudaDeviceProp &p)
 obtains the device property
 
cudaDeviceProp cuda_get_device_property (int i)
 obtains the device property
 
void cuda_dump_device_property (std::ostream &os, const cudaDeviceProp &p)
 dumps the device property
 
size_t cuda_get_device_max_threads_per_block (int d)
 queries the maximum threads per block on a device
 
size_t cuda_get_device_max_x_dim_per_block (int d)
 queries the maximum x-dimension per block on a device
 
size_t cuda_get_device_max_y_dim_per_block (int d)
 queries the maximum y-dimension per block on a device
 
size_t cuda_get_device_max_z_dim_per_block (int d)
 queries the maximum z-dimension per block on a device
 
size_t cuda_get_device_max_x_dim_per_grid (int d)
 queries the maximum x-dimension per grid on a device
 
size_t cuda_get_device_max_y_dim_per_grid (int d)
 queries the maximum y-dimension per grid on a device
 
size_t cuda_get_device_max_z_dim_per_grid (int d)
 queries the maximum z-dimension per grid on a device
 
size_t cuda_get_device_max_shm_per_block (int d)
 queries the maximum shared memory size in bytes per block on a device
 
size_t cuda_get_device_warp_size (int d)
 queries the warp size on a device
 
int cuda_get_device_compute_capability_major (int d)
 queries the major number of compute capability of a device
 
int cuda_get_device_compute_capability_minor (int d)
 queries the minor number of compute capability of a device
 
bool cuda_get_device_unified_addressing (int d)
 queries if the device supports unified addressing
 
int cuda_get_driver_version ()
 queries the latest CUDA version (1000 * major + 10 * minor) supported by the driver
 
int cuda_get_runtime_version ()
 queries the CUDA Runtime version (1000 * major + 10 * minor)
 
size_t cuda_get_free_mem (int d)
 queries the free memory (expensive call)
 
size_t cuda_get_total_mem (int d)
 queries the total available memory (expensive call)
 
template<typename T>
T * cuda_malloc_device (size_t N, int d)
 allocates memory on the given device for holding N elements of type T
 
template<typename T>
T * cuda_malloc_device (size_t N)
 allocates memory on the current device associated with the caller
 
template<typename T>
T * cuda_malloc_shared (size_t N)
 allocates shared memory for holding N elements of type T
 
template<typename T>
void cuda_free (T *ptr, int d)
 frees memory on the GPU device
 
template<typename T>
void cuda_free (T *ptr)
 frees memory on the GPU device
 
void cuda_memcpy_async (cudaStream_t stream, void *dst, const void *src, size_t count)
 copies data between host and device asynchronously through a stream
 
void cuda_memset_async (cudaStream_t stream, void *devPtr, int value, size_t count)
 initializes or sets GPU memory to the given value byte by byte
 
template<typename T, std::enable_if_t<!std::is_same_v< T, void >, void > * = nullptr>
cudaMemcpy3DParms cuda_get_copy_parms (T *tgt, const T *src, size_t num)
 gets the memcpy node parameter of a copy task
 
cudaMemcpy3DParms cuda_get_memcpy_parms (void *tgt, const void *src, size_t bytes)
 gets the memcpy node parameter of a memcpy task (untyped)
 
cudaMemsetParams cuda_get_memset_parms (void *dst, int ch, size_t count)
 gets the memset node parameter of a memset task (untyped)
 
template<typename T, std::enable_if_t< is_pod_v< T > &&(sizeof(T)==1||sizeof(T)==2||sizeof(T)==4), void > * = nullptr>
cudaMemsetParams cuda_get_fill_parms (T *dst, T value, size_t count)
 gets the memset node parameter of a fill task (typed)
 
template<typename T, std::enable_if_t< is_pod_v< T > &&(sizeof(T)==1||sizeof(T)==2||sizeof(T)==4), void > * = nullptr>
cudaMemsetParams cuda_get_zero_parms (T *dst, size_t count)
 gets the memset node parameter of a zero task (typed)
 
size_t cuda_graph_get_num_root_nodes (cudaGraph_t graph)
 queries the number of root nodes in a native CUDA graph
 
size_t cuda_graph_get_num_nodes (cudaGraph_t graph)
 queries the number of nodes in a native CUDA graph
 
size_t cuda_graph_get_num_edges (cudaGraph_t graph, cudaGraphNode_t *from, cudaGraphNode_t *to)
 Handles compatibility with CUDA <= 12.x and CUDA == 13.x.
 
size_t cuda_graph_node_get_dependencies (cudaGraphNode_t node, cudaGraphNode_t *dependencies)
 Handles compatibility with CUDA <= 12.x and CUDA 13.
 
size_t cuda_graph_node_get_dependent_nodes (cudaGraphNode_t node, cudaGraphNode_t *dependent_nodes)
 Handles compatibility with CUDA <= 12.x and CUDA 13.
 
void cuda_graph_add_dependencies (cudaGraph_t graph, const cudaGraphNode_t *from, const cudaGraphNode_t *to, size_t numDependencies)
 Handles compatibility with CUDA <= 12.x and CUDA 13.
 
size_t cuda_graph_get_num_edges (cudaGraph_t graph)
 queries the number of edges in a native CUDA graph
 
std::vector< cudaGraphNode_t > cuda_graph_get_nodes (cudaGraph_t graph)
 acquires the nodes in a native CUDA graph
 
std::vector< cudaGraphNode_t > cuda_graph_get_root_nodes (cudaGraph_t graph)
 acquires the root nodes in a native CUDA graph
 
std::vector< std::pair< cudaGraphNode_t, cudaGraphNode_t > > cuda_graph_get_edges (cudaGraph_t graph)
 acquires the edges in a native CUDA graph
 
cudaGraphNodeType cuda_get_graph_node_type (cudaGraphNode_t node)
 queries the type of a native CUDA graph node
 
constexpr const char * to_string (cudaGraphNodeType type)
 convert a cuda_task type to a human-readable string
 
std::ostream & operator<< (std::ostream &os, const cudaTask &ct)
 overload of ostream inserter operator for cudaTask
 
constexpr const char * version ()
 queries the version information in a string format major.minor.patch
 

Variables

template<typename>
constexpr bool is_index_range_v = false
 base type trait to detect if a type is an IndexRange
 
template<typename T, size_t N>
constexpr bool is_index_range_v< IndexRange< T, N > > = true
 specialization of the detector for tf::IndexRange<T, N>
 
template<typename T>
constexpr bool is_1d_index_range_v = false
 base type trait to detect if a type is a 1D IndexRange
 
template<typename T>
constexpr bool is_1d_index_range_v< IndexRange< T, 1 > > = true
 specialization of the detector for tf::IndexRange<T, 1>
 
template<typename T>
constexpr bool is_md_index_range_v = false
 base type trait to detect if a type is a multi-dimensional IndexRange (rank > 1)
 
template<typename T, size_t N>
constexpr bool is_md_index_range_v< IndexRange< T, N > > = true
 specialization of the detector for tf::IndexRange<T, N> where N > 1
 
template<typename P>
constexpr bool is_task_params_v = TaskParameters<P>
 determines if a type is a task parameter type (variable template)
 
template<typename C>
constexpr bool is_static_task_v = StaticTask<C>
 determines if a callable is a static task (variable template)
 
template<typename C>
constexpr bool is_subflow_task_v = SubflowTask<C>
 determines if a callable is a subflow task (variable template)
 
template<typename C>
constexpr bool is_runtime_task_v = RuntimeTask<C>
 determines if a callable is a runtime task (variable template)
 
template<typename C>
constexpr bool is_condition_task_v = ConditionTask<C>
 determines if a callable is a condition task (variable template)
 
template<typename C>
constexpr bool is_multi_condition_task_v = MultiConditionTask<C>
 determines if a callable is a multi-condition task (variable template)
 
template<typename P>
constexpr bool is_partitioner_v = Partitioner<P>
 determines if a type is a partitioner (variable template)
 

Detailed Description

taskflow namespace

Typedef Documentation

◆ DefaultNotifier

the default notifier type used by Taskflow

By default, Taskflow uses tf::NonblockingNotifier due to its stable performance on most platforms. We do not use tf::AtomicNotifier since on some platforms and compiler versions, the atomic notification may exhibit suboptimal performance due to buggy wake-up mechanisms. These issues have been discussed in GCC bug reports and patch threads related to atomic wait/notify implementations.


◆ DefaultPartitioner

default partitioner set to tf::GuidedPartitioner

The guided partitioning algorithm achieves stable and decent performance for most parallel algorithms.

Enumeration Type Documentation

◆ PartitionerType

enum class tf::PartitionerType : int
strong

enumeration of all partitioner types

Enumerator
STATIC 

static partitioner type

DYNAMIC 

dynamic partitioner type

◆ PipeType

enum class tf::PipeType : int
strong

enumeration of all pipe types

Enumerator
PARALLEL 

parallel type

SERIAL 

serial type

◆ TaskType

enum class tf::TaskType : int
strong

enumeration of all task types

Enumerator
PLACEHOLDER 

placeholder task type

STATIC 

static task type

RUNTIME 

runtime task type

SUBFLOW 

dynamic (subflow) task type

CONDITION 

condition task type

MODULE 

module task type

ASYNC 

asynchronous task type

UNDEFINED 

undefined task type (for internal use only)

Function Documentation

◆ atomic_max()

template<typename T>
void tf::atomic_max ( std::atomic< T > & v,
const T & max_v )
inline noexcept

updates an atomic variable with the maximum value

This function atomically updates the provided atomic variable v to hold the maximum of its current value and max_v. The update is performed using a relaxed memory ordering for efficiency in non-synchronizing contexts.

Template Parameters
T: The type of the atomic variable. Must be trivially copyable and comparable.
Parameters
v: The atomic variable to update.
max_v: The value to compare with the current value of v.
Attention
If multiple threads call this function concurrently, the value of v will be the maximum value seen across all threads.

◆ atomic_min()

template<typename T>
void tf::atomic_min ( std::atomic< T > & v,
const T & min_v )
inline noexcept

updates an atomic variable with the minimum value

This function atomically updates the provided atomic variable v to hold the minimum of its current value and min_v. The update is performed using a relaxed memory ordering for efficiency in non-synchronizing contexts.

Template Parameters
T: The type of the atomic variable. Must be trivially copyable and comparable.
Parameters
v: The atomic variable to update.
min_v: The value to compare with the current value of v.
Attention
If multiple threads call this function concurrently, the value of v will be the minimum value seen across all threads.

◆ coprime()

size_t tf::coprime ( size_t N)
constexpr

computes a coprime of a given number

This function finds the largest number less than N that is coprime (i.e., has a greatest common divisor of 1) with N. If N is less than 3, it returns 1 as a default coprime.

Parameters
N: input number for which a coprime is to be found.
Returns
the largest number < N that is coprime to N

◆ cuda_free() [1/2]

template<typename T>
void tf::cuda_free ( T * ptr)

frees memory on the GPU device

Template Parameters
T: pointer type
Parameters
ptr: device pointer to memory to free

This method calls cudaFree to free the memory space pointed to by ptr using the current device context of the caller.

◆ cuda_free() [2/2]

template<typename T>
void tf::cuda_free ( T * ptr,
int d )

frees memory on the GPU device

Template Parameters
T: pointer type
Parameters
ptr: device pointer to memory to free
d: device context identifier

This method calls cudaFree to free the memory space pointed to by ptr using the given device context.

◆ cuda_get_graph_node_type()

cudaGraphNodeType tf::cuda_get_graph_node_type ( cudaGraphNode_t node)
inline

queries the type of a native CUDA graph node

valid type values are:

  • cudaGraphNodeTypeKernel = 0x00
  • cudaGraphNodeTypeMemcpy = 0x01
  • cudaGraphNodeTypeMemset = 0x02
  • cudaGraphNodeTypeHost = 0x03
  • cudaGraphNodeTypeGraph = 0x04
  • cudaGraphNodeTypeEmpty = 0x05
  • cudaGraphNodeTypeWaitEvent = 0x06
  • cudaGraphNodeTypeEventRecord = 0x07

◆ cuda_graph_add_dependencies()

void tf::cuda_graph_add_dependencies ( cudaGraph_t graph,
const cudaGraphNode_t * from,
const cudaGraphNode_t * to,
size_t numDependencies )
inline

Handles compatibility with CUDA <= 12.x and CUDA 13.

Parameters
graph: the native CUDA graph to add the dependencies to
from: array of source nodes of the dependency edges
to: array of destination nodes of the dependency edges
numDependencies: number of dependency edges to add

◆ cuda_graph_node_get_dependencies()

size_t tf::cuda_graph_node_get_dependencies ( cudaGraphNode_t node,
cudaGraphNode_t * dependencies )
inline

Handles compatibility with CUDA <= 12.x and CUDA 13.

Parameters
node: the graph node to query
dependencies: output array to receive the dependency nodes (may be null to query only the count)
Returns
the number of dependencies of the node

◆ cuda_graph_node_get_dependent_nodes()

size_t tf::cuda_graph_node_get_dependent_nodes ( cudaGraphNode_t node,
cudaGraphNode_t * dependent_nodes )
inline

Handles compatibility with CUDA <= 12.x and CUDA 13.

Parameters
node: the graph node to query
dependent_nodes: output array to receive the dependent nodes (may be null to query only the count)
Returns
the number of dependent nodes of the node

◆ cuda_malloc_device() [1/2]

template<typename T>
T * tf::cuda_malloc_device ( size_t N)

allocates memory on the current device associated with the caller

The function calls cudaMalloc to allocate N*sizeof(T) bytes of memory on the current device associated with the caller and returns a pointer to the starting address of the device memory.

◆ cuda_malloc_device() [2/2]

template<typename T>
T * tf::cuda_malloc_device ( size_t N,
int d )

allocates memory on the given device for holding N elements of type T

The function calls cudaMalloc to allocate N*sizeof(T) bytes of memory on the given device d and returns a pointer to the starting address of the device memory.

◆ cuda_malloc_shared()

template<typename T>
T * tf::cuda_malloc_shared ( size_t N)

allocates shared memory for holding N elements of type T

The function calls cudaMallocManaged to allocate N*sizeof(T) bytes of memory and returns a pointer to the starting address of the shared memory.

◆ cuda_memcpy_async()

void tf::cuda_memcpy_async ( cudaStream_t stream,
void * dst,
const void * src,
size_t count )
inline

copies data between host and device asynchronously through a stream

Parameters
stream: stream identifier
dst: destination memory address
src: source memory address
count: size in bytes to copy

The method calls cudaMemcpyAsync with the given stream using cudaMemcpyDefault to infer the memory space of the source and the destination pointers. The memory areas may not overlap.

◆ cuda_memset_async()

void tf::cuda_memset_async ( cudaStream_t stream,
void * devPtr,
int value,
size_t count )
inline

initializes or sets GPU memory to the given value byte by byte

Parameters
stream: stream identifier
devPtr: pointer to GPU memory
value: value to set for each byte of the specified memory
count: size in bytes to set

The method calls cudaMemsetAsync with the given stream to fill the first count bytes of the memory area pointed to by devPtr with the constant byte value value.

◆ distance()

template<std::integral T>
size_t tf::distance ( T beg,
T end,
T step )
constexpr

calculates the number of iterations in the given index range

Template Parameters
T: integral type of the indices and step
Parameters
beg: starting index of the range
end: ending index of the range
step: step size to traverse the range
Returns
returns the number of required iterations to traverse the range

The distance of a range represents the number of required iterations to traverse the range from the beginning index to the ending index (exclusive) with the given step size.

Example 1:

// Range: 0 to 10 with step size 2
size_t dist = distance(0, 10, 2); // Returns 5, the sequence is [0, 2, 4, 6, 8]

Example 2:

// Range: 10 to 0 with step size -2
size_t dist = distance(10, 0, -2); // Returns 5, the sequence is [10, 8, 6, 4, 2]

Example 3:

// Range: 5 to 20 with step size 5
size_t dist = distance(5, 20, 5); // Returns 3, the sequence is [5, 10, 15]
Attention
It is the user's responsibility to ensure that the given index range is valid. For instance, a range from 0 to 10 with a step size of -2 is invalid.

◆ get_env()

std::string tf::get_env ( const std::string & str)
inline

retrieves the value of an environment variable

This function fetches the value of an environment variable by name. If the variable is not found, it returns an empty string.

Parameters
str: The name of the environment variable to retrieve.
Returns
The value of the environment variable as a string, or an empty string if not found.
Attention
The implementation differs between Windows and POSIX platforms:
  • On Windows, it uses _dupenv_s to fetch the value.
  • On POSIX, it uses std::getenv.

◆ has_env()

bool tf::has_env ( const std::string & str)
inline

checks whether an environment variable is defined

This function determines if a specific environment variable exists in the current environment.

Parameters
str: The name of the environment variable to check.
Returns
true if the environment variable exists, false otherwise.
Attention
The implementation differs between Windows and POSIX platforms:
  • On Windows, it uses _dupenv_s to check for the variable's presence.
  • On POSIX, it uses std::getenv to check for the variable's presence.

◆ IndexRange()

template<std::integral T>
tf::IndexRange (T, T, T) -> IndexRange< T, 1 >

deduction guide for tf::IndexRange<T, 1>

Template Parameters
T: integral type, deduced from the three constructor arguments

Allows class template argument deduction (CTAD) from a three-argument constructor call, mapping the common case IndexRange(beg, end, step) to the 1D specialization tf::IndexRange<T, 1> without requiring an explicit template argument.

tf::IndexRange r(0, 10, 2); // deduced as IndexRange<int, 1>
tf::IndexRange s(0ul, 100ul, 5ul); // deduced as IndexRange<size_t, 1>

Without this guide, CTAD cannot resolve N from a three-argument call because IndexRange is a two-parameter template (T and N). The guide explicitly pins N = 1 for the (beg, end, step) form.

◆ is_index_range_invalid()

template<std::integral T>
bool tf::is_index_range_invalid ( T beg,
T end,
T step )
constexpr

checks if the given index range is invalid

Template Parameters
T: integral type of the indices and step
Parameters
beg: starting index of the range
end: ending index of the range
step: step size to traverse the range
Returns
returns true if the range is invalid; false otherwise.

A range is considered invalid under the following conditions:

  • The step is zero and the begin and end values are not equal.
  • A positive range (begin < end) with a non-positive step.
  • A negative range (begin > end) with a non-negative step.

◆ is_pow2()

template<std::integral T>
bool tf::is_pow2 ( const T & x)
constexpr

checks if the given number is a power of 2

This function determines if the given integer is a power of 2.

Template Parameters
T: integral type of the input
Parameters
x: The integer to check.
Returns
true if x is a power of 2, otherwise false.
Attention
This function is constexpr and can be evaluated at compile time.

◆ make_coprime_lut()

template<size_t N>
std::array< size_t, N > tf::make_coprime_lut ( )
constexpr

generates a compile-time array of coprimes for numbers from 0 to N-1

This function constructs a constexpr array where each element at index i contains a coprime of i (the largest number less than i that is coprime to it).

Template Parameters
N: the size of the array to generate (should be greater than 0).
Returns
a constexpr array of size N where each index holds a coprime of its value.

◆ make_data_pipe()

template<typename Input, typename Output, typename C>
auto tf::make_data_pipe ( PipeType d,
C && callable )

function to construct a data pipe (tf::DataPipe)

Template Parameters
Input: input data type
Output: output data type
C: callable type

tf::make_data_pipe is a helper function to create a data pipe (tf::DataPipe) in a data-parallel pipeline (tf::DataPipeline). The first argument specifies the direction of the data pipe, either tf::PipeType::SERIAL or tf::PipeType::PARALLEL, and the second argument is a callable invoked by the pipeline scheduler. The input and output data types are specified via template parameters; the library always decays them to their original forms for storage purposes. The callable must take the input data type as its first argument and return a value of the output data type.

tf::make_data_pipe<int, std::string>(
  tf::PipeType::SERIAL,
  [](int& input) {
    return std::to_string(input + 100);
  }
);

The callable can additionally take a reference to tf::Pipeflow, which allows you to query the runtime information of a stage task, such as its line number and token number.

tf::make_data_pipe<int, std::string>(
  tf::PipeType::SERIAL,
  [](int& input, tf::Pipeflow& pf) {
    printf("token=%lu, line=%lu\n", pf.token(), pf.line());
    return std::to_string(input + 100);
  }
);

◆ make_module_task()

template<HasGraph T>
auto tf::make_module_task ( T & target)

creates a module task using the given graph

Template Parameters
T: type satisfying tf::HasGraph
Parameters
target: the target object used to create the module task
Returns
a module task that can be used by Taskflow or asynchronous tasking

This example demonstrates how to create and launch multiple taskflows in parallel using modules with asynchronous tasking:

tf::Executor executor;
tf::Taskflow A, B, C, D;
A.emplace([](){ printf("Taskflow A\n"); });
B.emplace([](){ printf("Taskflow B\n"); });
C.emplace([](){ printf("Taskflow C\n"); });
D.emplace([](){ printf("Taskflow D\n"); });
// launch the four taskflows using asynchronous tasking
executor.async(tf::make_module_task(A));
executor.async(tf::make_module_task(B));
executor.async(tf::make_module_task(C));
executor.async(tf::make_module_task(D));
executor.wait_for_all();

The module task maker, tf::make_module_task, serves the same purpose as tf::Taskflow::composed_of but provides a more generic interface that can be used beyond Taskflow. For instance, the following two approaches achieve the same functionality.

// approach 1: composition using composed_of
tf::Task m1 = taskflow1.composed_of(taskflow2);
// approach 2: composition using make_module_task
tf::Task m2 = taskflow1.emplace(tf::make_module_task(taskflow2));
Attention
Users are responsible for ensuring that the given target remains valid throughout its execution. The executor does not assume ownership of the target object.

◆ make_worker_interface()

template<typename T, typename... ArgsT>
std::shared_ptr< T > tf::make_worker_interface ( ArgsT &&... args)

helper function to create an instance derived from tf::WorkerInterface

Template Parameters
T: type derived from tf::WorkerInterface
ArgsT: argument types used to construct T
Parameters
args: arguments to forward to the constructor of T
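The helper is essentially a perfect-forwarding factory over std::make_shared. Below is a minimal standalone sketch of the pattern, using a hypothetical stand-in base class named WorkerInterfaceLike (tf::WorkerInterface itself is not reproduced here):

```cpp
#include <memory>
#include <type_traits>
#include <utility>

// Stand-in base class (hypothetical, for illustration only).
struct WorkerInterfaceLike {
  virtual ~WorkerInterfaceLike() = default;
};

// Sketch of the helper: perfect-forward the arguments to T's constructor
// and return the instance with shared ownership.
template <typename T, typename... ArgsT>
std::shared_ptr<T> make_worker_interface_sketch(ArgsT&&... args) {
  static_assert(std::is_base_of_v<WorkerInterfaceLike, T>,
                "T must derive from the worker-interface base");
  return std::make_shared<T>(std::forward<ArgsT>(args)...);
}

// Example derived type taking a constructor argument.
struct MyObserver : WorkerInterfaceLike {
  int id;
  explicit MyObserver(int i) : id(i) {}
};
```

Returning std::shared_ptr lets the executor and the caller share the interface object's lifetime safely.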

◆ median_of_three()

template<typename RandItr, typename C>
RandItr tf::median_of_three ( RandItr l,
RandItr m,
RandItr r,
C cmp )

finds the median of three numbers pointed to by iterators using the given comparator

This function determines the median value of the elements pointed to by three random-access iterators using the provided comparator.

Template Parameters
RandItr: The type of the random-access iterator.
C: The type of the comparator.
Parameters
l: Iterator to the first element.
m: Iterator to the second element.
r: Iterator to the third element.
cmp: The comparator used to compare the dereferenced iterator values.
Returns
The iterator pointing to the median value among the three elements.
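The technique can be sketched in standalone form under the signature documented above (an illustrative reimplementation, not the library source):

```cpp
#include <functional>

// Sketch: return the iterator whose value is the median of *l, *m, and *r
// under the comparator cmp, using at most three comparisons.
template <typename RandItr, typename C>
RandItr median_of_three(RandItr l, RandItr m, RandItr r, C cmp) {
  return cmp(*l, *m) ? (cmp(*m, *r) ? m : (cmp(*l, *r) ? r : l))
                     : (cmp(*l, *r) ? l : (cmp(*m, *r) ? r : m));
}
```

For example, given the values 3, 1, 2 and std::less, the median is 2 and the third iterator is returned.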

◆ pseudo_median_of_nine()

template<typename RandItr, typename C>
RandItr tf::pseudo_median_of_nine ( RandItr beg,
RandItr end,
C cmp )

finds the pseudo median of a range of items using a spread of nine numbers

This function computes an approximate median of a range of items by sampling nine values spread across the range and finding their median. It applies the median_of_three function to three groups of three samples and then once more to the three resulting medians.

Template Parameters
RandItr: The type of the random-access iterator.
C: The type of the comparator.
Parameters
beg: Iterator to the beginning of the range.
end: Iterator to the end of the range.
cmp: The comparator used to compare the dereferenced iterator values.
Returns
The iterator pointing to the pseudo median of the range.
Attention
The pseudo median is an approximation of the true median and may not be the exact middle value of the range.
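The "median of medians of three" idea (sometimes called the ninther) can be sketched as follows; the exact sample positions are an assumption for illustration, not copied from the library:

```cpp
#include <functional>

// Helper: median of three dereferenced iterators (same technique as
// tf::median_of_three, reimplemented so this sketch is self-contained).
template <typename RandItr, typename C>
RandItr med3(RandItr l, RandItr m, RandItr r, C cmp) {
  return cmp(*l, *m) ? (cmp(*m, *r) ? m : (cmp(*l, *r) ? r : l))
                     : (cmp(*l, *r) ? l : (cmp(*m, *r) ? r : m));
}

// Sketch of the ninther: take the median of three medians-of-three
// computed over nine samples spread across [beg, end).
template <typename RandItr, typename C>
RandItr pseudo_median_of_nine(RandItr beg, RandItr end, C cmp) {
  auto n = (end - beg) / 8;  // spacing between the nine samples
  return med3(
    med3(beg,         beg + n,     beg + 2 * n, cmp),
    med3(beg + 3 * n, beg + 4 * n, beg + 5 * n, cmp),
    med3(beg + 6 * n, beg + 7 * n, end - 1,     cmp),
    cmp
  );
}
```

Quicksort implementations commonly use this estimate to pick a pivot close to the true median at constant cost.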

◆ retrieve_graph()

template<HasGraph T>
Graph & tf::retrieve_graph ( T & target)

retrieves a reference to the underlying tf::Graph from an object

This helper function abstracts the retrieval of a graph reference. It uses compile-time introspection to determine if the object provides a graph() member function or if it should be treated as a tf::Graph directly.

Template Parameters
T: type satisfying the tf::HasGraph concept
Parameters
target: object from which to retrieve the graph
Returns
a reference to the underlying tf::Graph
// Case 1: T has a .graph() member (composition)
struct CustomGraph1 {
tf::Graph& graph() { return _graph; }
tf::Graph _graph;
};
// Case 2: T is derived from tf::Graph (inheritance)
struct CustomGraph2 : public tf::Graph {
// ...
};
CustomGraph1 custom_graph1;
CustomGraph2 custom_graph2;
tf::Graph& g1 = tf::retrieve_graph(custom_graph1);
tf::Graph& g2 = tf::retrieve_graph(custom_graph2);
Note
This function is evaluated at compile time via if constexpr, resulting in zero runtime overhead.

◆ seed()

template<typename T>
T tf::seed ( )
inline noexcept

generates a random seed based on the current system clock

This function returns a seed value derived from the number of clock ticks since the epoch as measured by the system clock. The seed can be used to initialize random number generators.

Template Parameters
T: The type of the returned seed. Must be an integral type.
Returns
A seed value based on the system clock.
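The described behavior can be sketched in standalone form (an illustrative reimplementation, not the library source):

```cpp
#include <chrono>
#include <type_traits>

// Sketch: derive a seed from the system clock's tick count since the epoch.
template <typename T>
T seed() noexcept {
  static_assert(std::is_integral_v<T>, "the seed type must be integral");
  return static_cast<T>(
      std::chrono::system_clock::now().time_since_epoch().count());
}
```

A typical use is initializing a generator, e.g. std::mt19937 rng(seed<unsigned>());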

◆ sort2()

template<typename Iter, typename Compare>
void tf::sort2 ( Iter a,
Iter b,
Compare comp )

sorts two elements of dereferenced iterators using the given comparison function

This function compares two elements pointed to by iterators and swaps them if they are out of order according to the provided comparator.

Template Parameters
Iter: The type of the iterator.
Compare: The type of the comparator.
Parameters
a: Iterator to the first element.
b: Iterator to the second element.
comp: The comparator used to compare the dereferenced iterator values.

◆ sort3()

template<typename Iter, typename Compare>
void tf::sort3 ( Iter a,
Iter b,
Iter c,
Compare comp )

Sorts three elements of dereferenced iterators using the given comparison function.

This function sorts three elements pointed to by iterators in ascending order according to the provided comparator. The sorting is performed using a sequence of calls to the sort2 function to ensure the correct order of elements.

Template Parameters
Iter: The type of the iterator.
Compare: The type of the comparator.
Parameters
a: Iterator to the first element.
b: Iterator to the second element.
c: Iterator to the third element.
comp: The comparator used to compare the dereferenced iterator values.
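Both helpers can be sketched in standalone form (illustrative reimplementations, not the library source):

```cpp
#include <functional>
#include <utility>

// Sketch: swap the two pointed-to elements if they compare out of order.
template <typename Iter, typename Compare>
void sort2(Iter a, Iter b, Compare comp) {
  if (comp(*b, *a)) {
    std::swap(*a, *b);
  }
}

// Sketch: three pairwise exchanges leave *a, *b, *c in ascending order
// with respect to comp.
template <typename Iter, typename Compare>
void sort3(Iter a, Iter b, Iter c, Compare comp) {
  sort2(a, b, comp);
  sort2(b, c, comp);
  sort2(a, b, comp);
}
```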

◆ to_string()

const char * tf::to_string ( TaskType type)
inline

convert a task type to a human-readable string

The name of each task type is returned as a lowercase string.

◆ unique_id()

template<std::integral T>
T tf::unique_id ( )

generates a program-wide unique ID of the given type in a thread-safe manner

This function provides a globally unique identifier of the specified integral type. It uses a static std::atomic counter to ensure thread safety and increments the counter in a relaxed memory ordering for efficiency.

Template Parameters
T: integral type of the ID to generate
Returns
A unique ID of type T.
Attention
The uniqueness of the ID is guaranteed only within the program's lifetime.
The function does not throw exceptions.
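The mechanism described above can be sketched in standalone form (an illustrative reimplementation, not the library source):

```cpp
#include <atomic>
#include <type_traits>

// Sketch: a function-local atomic counter incremented with relaxed memory
// ordering. Note that in this sketch each instantiated type T gets its own
// counter, so IDs are unique per type, not across types.
template <typename T>
T unique_id() {
  static_assert(std::is_integral_v<T>, "the ID type must be integral");
  static std::atomic<T> counter{0};
  return counter.fetch_add(1, std::memory_order_relaxed);
}
```

Relaxed ordering suffices here because only the atomicity of the increment matters, not any ordering with respect to other memory operations.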

◆ version()

const char * tf::version ( )
constexpr

queries the version information in a string format major.minor.patch

Release notes are available here: https://taskflow.github.io/taskflow/Releases.html

◆ wsq_contended_value()

template<typename T>
auto tf::wsq_contended_value ( )

returns the contended sentinel for work-stealing steal operations

For pointer types T, returns reinterpret_cast<T>(uintptr_t{1}), i.e., the pointer address 0x1. A steal operation returning this value means the queue was non-empty but the CAS was lost to another concurrent thief; the caller should retry the same victim since work is known to exist.

For non-pointer types, returns std::nullopt (same as wsq_empty_value) since sentinel encoding is not possible without a dedicated out-of-band value.

◆ wsq_empty_value()

template<typename T>
auto tf::wsq_empty_value ( )
constexpr

returns the empty sentinel for work-stealing steal operations

For pointer types T, returns nullptr. For non-pointer types, returns std::nullopt. A steal operation returning this value means the queue was genuinely empty at the time of the attempt.
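The two sentinels described above can be sketched in standalone form (an illustrative reimplementation, not the library source):

```cpp
#include <cstdint>
#include <optional>
#include <type_traits>

// Sketch: the "empty" sentinel returned when a steal finds no work.
template <typename T>
auto wsq_empty_value() {
  if constexpr (std::is_pointer_v<T>) {
    return T{nullptr};                      // empty: the null pointer
  } else {
    return std::optional<T>{std::nullopt};  // empty: no value
  }
}

// Sketch: the "contended" sentinel returned when a steal loses the CAS.
template <typename T>
auto wsq_contended_value() {
  if constexpr (std::is_pointer_v<T>) {
    return reinterpret_cast<T>(uintptr_t{1});  // contended: address 0x1
  } else {
    return std::optional<T>{std::nullopt};     // no distinct sentinel
  }
}
```

Address 0x1 works as a sentinel because no valid object of pointer type can live at that misaligned address.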

Variable Documentation

◆ is_1d_index_range_v

template<typename T>
bool tf::is_1d_index_range_v = false
constexpr

base type trait to detect if a type is a 1D IndexRange

Template Parameters
T: the type to inspect

◆ is_1d_index_range_v< IndexRange< T, 1 > >

template<typename T>
bool tf::is_1d_index_range_v< IndexRange< T, 1 > > = true
constexpr

specialization of the detector for tf::IndexRange<T, 1>

Template Parameters
T: the underlying coordinate type (e.g., size_t, int)
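The base-trait-plus-partial-specialization pattern behind these detectors can be sketched in standalone form (IndexRange is reduced to an empty stand-in here, and the trait is renamed to mark it as a sketch):

```cpp
#include <cstddef>

// Empty stand-in for tf::IndexRange, shape only (the real class carries
// the range data).
template <typename T, std::size_t N>
struct IndexRange {};

// Base trait: false for every type ...
template <typename T>
constexpr bool is_1d_index_range_v_sketch = false;

// ... except the partial specialization for IndexRange<T, 1>.
template <typename T>
constexpr bool is_1d_index_range_v_sketch<IndexRange<T, 1>> = true;

static_assert(is_1d_index_range_v_sketch<IndexRange<int, 1>>);
static_assert(!is_1d_index_range_v_sketch<IndexRange<int, 2>>);
static_assert(!is_1d_index_range_v_sketch<int>);
```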

◆ is_condition_task_v

template<typename C>
bool tf::is_condition_task_v = ConditionTask<C>
constexpr

determines if a callable is a condition task (variable template)

Template Parameters
C: callable type to check

Equivalent to tf::ConditionTask<C>. Provided for backward compatibility.

◆ is_index_range_v

template<typename>
bool tf::is_index_range_v = false
constexpr

base type trait to detect if a type is an IndexRange

Template Parameters
T: The type to inspect.

◆ is_index_range_v< IndexRange< T, N > >

template<typename T, size_t N>
bool tf::is_index_range_v< IndexRange< T, N > > = true
constexpr

specialization of the detector for tf::IndexRange<T, N>

Matches an IndexRange of ANY dimensionality (1D, 2D, 3D, etc.).

Template Parameters
T: the underlying coordinate type (e.g., size_t, int)
N: the number of dimensions

◆ is_md_index_range_v

template<typename T>
bool tf::is_md_index_range_v = false
constexpr

base type trait to detect if a type is a multi-dimensional IndexRange (rank > 1)

Template Parameters
T: the type to inspect

◆ is_md_index_range_v< IndexRange< T, N > >

template<typename T, size_t N>
bool tf::is_md_index_range_v< IndexRange< T, N > > = true
constexpr

specialization of the detector for tf::IndexRange<T, N> where N > 1

Template Parameters
T: the underlying coordinate type (e.g., size_t, int)
N: the number of dimensions (must be > 1)

◆ is_multi_condition_task_v

template<typename C>
bool tf::is_multi_condition_task_v = MultiConditionTask<C>
constexpr

determines if a callable is a multi-condition task (variable template)

Template Parameters
C: callable type to check

Equivalent to tf::MultiConditionTask<C>. Provided for backward compatibility.

◆ is_partitioner_v

template<typename P>
bool tf::is_partitioner_v = Partitioner<P>
inline constexpr

determines if a type is a partitioner (variable template)

Template Parameters
P: type to check

Equivalent to tf::Partitioner<P>. Provided for backward compatibility.

◆ is_runtime_task_v

template<typename C>
bool tf::is_runtime_task_v = RuntimeTask<C>
constexpr

determines if a callable is a runtime task (variable template)

Template Parameters
C: callable type to check

Equivalent to tf::RuntimeTask<C>. Provided for backward compatibility.

◆ is_static_task_v

template<typename C>
bool tf::is_static_task_v = StaticTask<C>
constexpr

determines if a callable is a static task (variable template)

Template Parameters
C: callable type to check

Equivalent to tf::StaticTask<C>. Provided for backward compatibility.

◆ is_subflow_task_v

template<typename C>
bool tf::is_subflow_task_v = SubflowTask<C>
constexpr

determines if a callable is a subflow task (variable template)

Template Parameters
C: callable type to check

Equivalent to tf::SubflowTask<C>. Provided for backward compatibility.

◆ is_task_params_v

template<typename P>
bool tf::is_task_params_v = TaskParameters<P>
constexpr

determines if a type is a task parameter type (variable template)

Template Parameters
P: type to check

Equivalent to tf::TaskParameters<P>. Provided for backward compatibility.