[#concurrent] = Concurrent Containers :idprefix: concurrent_ Boost.Unordered provides `boost::concurrent_node_set`, `boost::concurrent_node_map`, `boost::concurrent_flat_set` and `boost::concurrent_flat_map`, hash tables that allow concurrent write/read access from different threads without having to implement any synchronzation mechanism on the user\'s side. In the example above, threads access `m` without synchronization, just as we\'d do in a single-threaded scenario. In an ideal setting, if a given workload is distributed among _N_ threads, execution is _N_ times faster than with one thread —this limit is never attained in practice due to synchronization overheads and _contention_ (one thread waiting for another to leave a locked portion of the map), but Boost.Unordered concurrent containers are designed to perform with very little overhead and typically achieve _linear scaling_ (that is, performance is proportional to the number of threads up to the number of logical cores in the CPU). Visitation-based API The first thing a new user of Boost.Unordered concurrent containers will notice is that these classes _do not provide iterators_ (which makes them technically not https://en.cppreference.com/w/cpp/named_req/Container[Containers^] in the C++ standard sense). The reason for this is that iterators are inherently thread-unsafe. Consider this hypothetical code: In a multithreaded scenario, the iterator `it` may be invalid at point B if some other thread issues an `m.erase(k)` operation between A and B. There are designs that can remedy this by making iterators lock the element they point to, but this approach lends itself to high contention and can easily produce deadlocks in a program. `operator[]` has similar concurrency issues, and is not provided by `boost::concurrent_flat_map`/`boost::concurrent_node_map` either. Instead, element access is done through so-called _visitation functions_: The visitation function passed by the user (in this case, a lambda function) is executed internally by Boost.Unordered in a thread-safe manner, so it can access the element without worrying about other threads interfering in the process. On the other hand, a visitation function can _not_ access the container itself: Access to a different container is allowed, though: But, in general, visitation functions should be as lightweight as possible to reduce contention and increase parallelization. In some cases, moving heavy work outside of visitation may be beneficial: Visitation is prominent in the API provided by concurrent containers, and many classical operations have visitation-enabled variations: "Note that in this last example the visitation function could actually _modify_ the element: as a general rule, operations on a concurrent map `m` will grant visitation functions const/non-const access to the element depending on whether `m` is const/non-const. Const access can be always be explicitly requested by using `cvisit` overloads (for instance, `insert_or_cvisit`) and may result in higher parallelization. For concurrent sets, on the other hand, visitation is always const access." Although expected to be used much less frequently, concurrent containers also provide insertion operations where an element can be visited right after element creation (in addition to the usual visitation when an equivalent element already exists): Consult the references of `xref:reference/concurrent_node_set.adoc#concurrent_node_set[boost::concurrent_node_set]`, `xref:reference/concurrent_node_map.adoc#concurrent_node_map[boost::concurrent_node_map]`, `xref:reference/concurrent_flat_set.adoc#concurrent_flat_set[boost::concurrent_flat_set]` and `xref:reference/concurrent_flat_map.adoc#concurrent_flat_map[boost::concurrent_flat_map]` for the complete list of visitation-enabled operations. Whole-Table Visitation In the absence of iterators, `visit_all` is provided as an alternative way to process all the elements in the container: In C++17 compilers implementing standard parallel algorithms, whole-table visitation can be parallelized: Traversal can be interrupted midway: There is one last whole-table visitation operation, `erase_if`: `visit_while` and `erase_if` can also be parallelized. Note that, in order to increase efficiency, whole-table visitation operations do not block the table during execution: this implies that elements may be inserted, modified or erased by other threads during visitation. It is advisable not to assume too much about the exact global state of a concurrent container at any point in your program. Bulk visitation Suppose you have an `std::array` of keys you want to look up for in a concurrent map: _Bulk visitation_ allows us to pass all the keys in one operation: This functionality is not provided for mere syntactic convenience, though: by processing all the keys at once, some internal optimizations can be applied that increase performance over the regular, one-at-a-time case (consult the xref:benchmarks.adoc#benchmarks_boostconcurrent_flatnode_map[benchmarks]). In fact, it may be beneficial to buffer incoming keys so that they can be bulk visited in chunks: There\'s a latency/throughput tradeoff here: it will take longer for incoming keys to be processed (since they are buffered), but the number of processed keys per second is higher. `bulk_visit_size` is the recommended chunk size —smaller buffers may yield worse performance. Blocking Operations Concurrent containers can be copied, assigned, cleared and merged just like any other Boost.Unordered container. Unlike most other operations, these are _blocking_, that is, all other threads are prevented from accesing the tables involved while a copy, assignment, clear or merge operation is in progress. Blocking is taken care of automatically by the library and the user need not take any special precaution, but overall performance may be affected. Another blocking operation is _rehashing_, which happens explicitly via `rehash`/`reserve` or during insertion when the table\'s load hits `max_load()`. As with non-concurrent containers, reserving space in advance of bulk insertions will generally speed up the process. Interoperability with non-concurrent containers As open-addressing and concurrent containers are based on the same internal data structure, they can be efficiently move-constructed from their non-concurrent counterpart, and vice versa. ^|`boost::concurrent_node_set` ^|`boost::unordered_node_set` ^|`boost::concurrent_node_map` ^|`boost::unordered_node_map` ^|`boost::concurrent_flat_set` ^|`boost::unordered_flat_set` ^|`boost::concurrent_flat_map` ^|`boost::unordered_flat_map` This interoperability comes handy in multistage scenarios where parts of the data processing happen in parallel whereas other steps are non-concurrent (or non-modifying). In the following example, we want to construct a histogram from a huge input vector of words: the population phase can be done in parallel with `boost::concurrent_flat_map` and results then transferred to the final container.