Taskflow provides template functions for constructing tasks to perform parallel find over a range of items.
Include the Header
You need to include the header file, taskflow/algorithm/find.hpp, for using parallel-find algorithms.
#include <taskflow/algorithm/find.hpp>
What is a Find Algorithm?
A find algorithm searches a range [first, last) for an element that satisfies a given criterion and returns an iterator to the first such element, or last if no such element exists.
What distinguishes a parallel find from a parallel for-each is its short-circuit semantics: as soon as any worker finds a qualifying element, the remaining workers abandon their portions of the range. This means the algorithm does not always process every element — it stops as early as possible once a result is found.
Taskflow provides the following parallel-find algorithms:
All four algorithms store the result iterator in a variable passed by reference, so the result is available after the task completes.
Create a Parallel Find-If Task
tf::Taskflow::find_if(B first, E last, T& result, UOP predicate, P part) performs a parallel search over the range [first, last) and stores in result an iterator to the first element for which predicate returns true, or last if no such element exists. It resembles a parallel implementation of the following loop:
template <typename InputIt, typename UnaryPredicate>
InputIt find_if(InputIt first, InputIt last, UnaryPredicate predicate) {
for (; first != last; ++first) {
if (predicate(*first)) {
return first;
}
}
return last;
}
The example below creates a task to find the element equal to 22 in a range of 10 integers:
std::vector<int> input = {1, 9, 22, 3, -6, 13, 12, 0, 9, 11};
std::vector<int>::iterator result;
taskflow.find_if(
input.begin(), input.end(), result, [](int i) { return i == 22; }
);
executor.run(taskflow).wait();
assert(*result == 22);
Capture Iterators by Reference
You can pass iterators by reference using std::ref to marshal parameter updates between dependent tasks. This is useful when the range is not known at task-graph construction time but is initialized by an upstream task.
std::vector<int> input;
std::vector<int>::iterator result, first, last;
tf::Task init = taskflow.emplace([&]() {
input = {1, 9, 22, 3, -6, 13, 12, 0, 9, 11};
first = input.begin();
last = input.end();
});
std::ref(first), std::ref(last), result, [](int i) { return i == 22; }
);
executor.run(taskflow).wait();
assert(*result == 22);
class to create a task handle over a taskflow node
Definition task.hpp:263
Task & precede(Ts &&... tasks)
adds precedence links from this to other tasks
Definition task.hpp:952
When init finishes, first and last point to the initialized data range of input, and the find-if task searches over all 10 elements.
Create a Parallel Find-If-Not Task
tf::Taskflow::find_if_not(B first, E last, T& result, UOP predicate, P part) performs a parallel search over the range [first, last) and stores in result an iterator to the first element for which predicate returns false, or last if no such element exists. It resembles a parallel implementation of the following loop:
template <typename InputIt, typename UnaryPredicate>
InputIt find_if_not(InputIt first, InputIt last, UnaryPredicate predicate) {
for (; first != last; ++first) {
if (!predicate(*first)) {
return first;
}
}
return last;
}
The example below creates a task to find the first element that is not equal to 1 in a range where only one element differs:
std::vector<int> input = {1, 1, 22, 1, 1, 1, 1, 1, 1, 1};
std::vector<int>::iterator result;
taskflow.find_if_not(
input.begin(), input.end(), result, [](int i) { return i == 1; }
);
executor.run(taskflow).wait();
assert(*result == 22);
Capture Iterators by Reference
As with tf::Taskflow::find_if, iterators can be passed by reference using std::ref so that an upstream task can set up the range before the parallel find-if-not runs.
std::vector<int> input;
std::vector<int>::iterator result, first, last;
tf::Task init = taskflow.emplace([&]() {
input = {1, 1, 22, 1, 1, 1, 1, 1, 1, 1};
first = input.begin();
last = input.end();
});
std::ref(first), std::ref(last), result, [](int i) { return i == 1; }
);
executor.run(taskflow).wait();
assert(*result == 22);
Find the Minimum Element
tf::Taskflow::min_element(B first, E last, T& result, C comp, P part) performs a parallel search over the range [first, last) and stores in result an iterator to the element with the smallest value under the comparator comp. It resembles a parallel implementation of the following loop:
template <typename InputIt, typename Compare>
InputIt min_element(InputIt first, InputIt last, Compare comp) {
if (first == last) return last;
InputIt smallest = first;
for (++first; first != last; ++first) {
if (comp(*first, *smallest)) {
smallest = first;
}
}
return smallest;
}
The example below finds the smallest element in a range of 10 integers:
std::vector<int> input = {1, 1, 1, 1, 1, -1, 1, 1, 1, 1};
std::vector<int>::iterator result;
taskflow.min_element(
input.begin(), input.end(), std::less<int>(), result
);
executor.run(taskflow).wait();
assert(*result == -1);
Capture Iterators by Reference
You can pass iterators by reference using std::ref so that an upstream task can set up the range before the parallel min-element search runs.
std::vector<int> input;
std::vector<int>::iterator result, first, last;
tf::Task init = taskflow.emplace([&]() {
input = {1, 1, 1, 1, 1, -1, 1, 1, 1, 1};
first = input.begin();
last = input.end();
});
std::ref(first), std::ref(last), std::less<int>(), result
);
executor.run(taskflow).wait();
assert(*result == -1);
Find the Maximum Element
tf::Taskflow::max_element(B first, E last, T& result, C comp, P part) performs a parallel search over the range [first, last) and stores in result an iterator to the element with the largest value under the comparator comp. It resembles a parallel implementation of the following loop:
template <typename InputIt, typename Compare>
InputIt max_element(InputIt first, InputIt last, Compare comp) {
if (first == last) return last;
InputIt largest = first;
for (++first; first != last; ++first) {
if (comp(*largest, *first)) {
largest = first;
}
}
return largest;
}
The example below finds the largest element in a range of 10 integers:
std::vector<int> input = {1, 1, 1, 1, 1, 2, 1, 1, 1, 1};
std::vector<int>::iterator result;
taskflow.max_element(
input.begin(), input.end(), std::less<int>(), result
);
executor.run(taskflow).wait();
assert(*result == 2);
- Note
- tf::Taskflow::max_element uses
std::less as its comparator, which may seem counterintuitive. The comparator comp defines an ordering such that comp(a, b) returns true if a comes before b. For finding the maximum, we want the element that nothing else comes after, so std::less naturally puts the largest element last — and max_element returns it. Refer to std::max_element for further details.
Capture Iterators by Reference
You can pass iterators by reference using std::ref so that an upstream task can set up the range before the parallel max-element search runs.
std::vector<int> input;
std::vector<int>::iterator result, first, last;
tf::Task init = taskflow.emplace([&]() {
input = {1, 1, 1, 1, 1, 2, 1, 1, 1, 1};
first = input.begin();
last = input.end();
});
std::ref(first), std::ref(last), std::less<int>(), result
);
executor.run(taskflow).wait();
assert(*result == 2);
Configure a Partitioner
A partitioner controls how the iteration space is divided among workers. Taskflow provides four partitioners, each suited to different workload characteristics:
- tf::StaticPartitioner divides the range into equal-sized chunks ahead of execution and assigns them to workers in order. It has the lowest scheduling overhead and works best when the predicate or comparator costs the same for every element.
- tf::DynamicPartitioner distributes fixed-sized chunks to workers on demand as they become available. It adapts well to workloads where element evaluation cost varies, at the expense of slightly higher coordination overhead.
- tf::GuidedPartitioner distributes chunks whose size decreases adaptively as work is consumed — large chunks early to reduce overhead, smaller chunks late to balance the tail. This is the default partitioner and delivers stable, near-optimal performance across a wide range of workloads.
- tf::RandomPartitioner distributes chunks of randomly sampled sizes, which can help avoid systematic load imbalances caused by data-dependent cost patterns.
The following example creates two parallel find-if tasks using different partitioners:
std::vector<int> input(1024, -1);
std::vector<int>::iterator result;
taskflow.find_if(
input.begin(), input.end(), result,
[](int i) { return i == -1; },
static_partitioner
);
taskflow.find_if(
input.begin(), input.end(), result,
[](int i) { return i == -1; },
guided_partitioner
);
class to create a guided partitioner for scheduling parallel algorithms
Definition partitioner.hpp:402
class to construct a static partitioner for scheduling parallel algorithms
Definition partitioner.hpp:262
As a rule of thumb, prefer tf::StaticPartitioner when the predicate or comparator costs the same for every element and tf::GuidedPartitioner for irregular workloads where evaluation cost varies across elements. tf::DynamicPartitioner is a good choice when chunks must be kept small and strictly equal in size.
- Note
- By default, parallel-find tasks use tf::DefaultPartitioner (currently tf::GuidedPartitioner) if no partitioner is specified.