fastmm::FixedSizeMultiMap is a fixed-capacity, low-latency object store where secondary index membership is optional, mutation must preserve cross-index invariants, and failure/rollback must work even when an object is only present in some views.
The target workload is not "storing a lot of stuff somehow and maximizing throughput." (although throughput may not be too bad for a broad range of workloads) It is the tighter case where the same object is hit over and over through different access patterns:
- hash lookup by a primary key
- ordered traversal by one or more fields
- non-unique grouping/range queries
- insertion-order iteration
- mixed-field access once the object is found
- modify and reindex in selected views
In other words: one object, many (optional) views, no duplicated payload, no need to keep several containers manually in sync, and everything designed in a real-time friendly manner.
If you already know the object count upper bound and you care about latency, the usual approach of "one container per query pattern" gets expensive fast:
- the payload is duplicated or indirectly owned in several places
- updates have to be mirrored across containers
- allocator traffic and fragmentation show up at the wrong time
- locality gets worse when hot objects are spread out
While boost::multi_index has existed for years and is a powerful general-purpose solution, it is not specifically designed for fixed-capacity, low-latency workloads. In particular, its model assumes that each object participates in all views, whereas modern C++ makes it possible to explore a more tailored design: one centered on stable storage, optional secondary indexing, and minimal overhead for mixed lookup and traversal patterns.
Hence, this library takes the following choices:
- store each object once
- attach multiple indices to that one object (via
boost::intrusivecontainers) - choose the index set at compile time
- allow for indexing/unindexing objects within individual containers at runtime.
- use a fixed-size LIFO pool for storage
That gives you a unified interface for repeated mixed queries with minimal runtime machinery.
The design is intentionally simple.
1. Fixed capacity.
Capacity is part of the type. If the pool is full, insertion fails cleanly.
2. Policy-based indices.
The index set is declared in the type. No virtual dispatch, no runtime registry, no dynamic query planner. You pay for the indices you ask for.
3. One object, many indices.
The payload exists once. The indices are just alternate access paths to the same object.
4. LIFO memory pool.
Storage comes from a fixed pool with LIFO reuse. The point is to avoid general-purpose allocator noise, reduce fragmentation, and keep recently-touched slots hot when the workload is bursty.
5. Explicit indexing control when needed.
You can insert into the primary index only, then opt into selected secondary indices later. That matters when object construction and indexing have to be staged. This is the most significant difference in design, compared with similar libraries such as boost::multi_index, as we allow for partial indexing in general.
This container is a good fit when:
- objects are relatively large
- the working set has a known maximum size
- the same live objects are queried repeatedly in different ways
- latency matters more than generality
- you want predictable failure on exhaustion instead of unbounded growth
Less ideal when:
- capacity is not known ahead of time
- you need arbitrary ad hoc query types
- you mainly want a drop-in STL replacement
The examples below use the same Particle type and index setup as the benchmarks and tests.
#include <cstddef>
#include <cstdint>
#include "multimap.h"
struct Particle {
uint64_t id;
double x;
double y;
double m;
Particle(uint64_t id, double x, double y, double m)
: id(id), x(x), y(y), m(m) {}
};
struct Idgetter {
using type = uint64_t;
const uint64_t& operator()(const Particle& p) const { return p.id; }
};
struct Xgetter {
using type = double;
const double& operator()(const Particle& p) const { return p.x; }
};
struct Ygetter {
using type = double;
const double& operator()(const Particle& p) const { return p.y; }
};
struct Mgetter {
using type = double;
const double& operator()(const Particle& p) const { return p.m; }
};
struct IdHash {
std::size_t operator()(uint64_t id) const {
return std::hash<uint64_t>{}(id);
}
};
struct IdEqual {
bool operator()(uint64_t a, uint64_t b) const { return a == b; }
};
struct ById {};
struct ByX {};
struct ByY {};
struct ByM {};
struct BySeq {};
static constexpr std::size_t kMaxParticles = 64;
using ParticleMap = fastmm::FixedSizeMultiMap<
Particle, kMaxParticles,
fastmm::Unordered<Idgetter, IdHash, IdEqual, 128>,
fastmm::Named<fastmm::Ordered<Xgetter, std::less<double>>, ByX>,
fastmm::Named<fastmm::Ordered<Ygetter, std::less<double>>, ByY>,
fastmm::Named<fastmm::OrderedNonUnique<Mgetter, std::less<double>>, ByM>,
fastmm::Named<fastmm::List>, BySeq>;This map has five views over the same Particle objects:
- index
0: unordered unique primary index byid - index
1(ByX): ordered unique index byx - index
2(ByY): ordered unique index byy - index
3(ByM): ordered non-unique index bym(particle "mass") - index
4(BySeq): insertion-order list
We can imagine a simple game where particles can move in the two-dimensional space. Each of x and y coordinates are can not be identical for any two particles, just for simplicity in benchmarks interpretation (indeed, we can easily define composite key (x, y)). We will want to add/remove particles, iterate over them in different orders (inserting order, id, or ascending order in x/y), and find all particles with the same mass (an equal_range search for boost::multiset).
Insert and index into all configured indices:
ParticleMap map;
auto it = map.insert<true>(1, 0.0, 0.0, 1.0);
if (it == map.cend()) {
// insert failed: capacity exhausted or a unique index rejected it
}If insert<true> hits a conflict in a secondary unique index, the insert is rolled back completely. The object does not survive in the primary index as a half-inserted entry.
Insert into the primary index only:
auto it = map.insert<false>(99, 7.0, 8.0, 3.0);
if (it != map.cend()) {
map.index<ByX>(*it); // add to x-index
map.index<BySeq>(*it); // add to list index
}That is useful when secondary indexing is conditional or staged.
auto it = map.find_primary(42u);
if (it != map.cend()) {
// full object is available directly
double x = it->x;
double y = it->y;
}Iterate by x in sorted order:
for (const auto& p : map.get<ByX>()) {
// p.x is nondecreasing
}Iterate in insertion order:
for (const auto& p : map.get<BySeq>()) {
// insertion order
}Get all particles with the same mass:
auto& mi = map.get<ByM>();
auto [beg, end] = mi.equal_range(5.0);
for (auto it = beg; it != end; ++it) {
// every particle here has m == 5.0
}Walk distinct mass levels:
for (auto it = mi.begin(); it != mi.cend(); it = mi.upper_bound(it->m)) {
// one representative per distinct mass value
}Remove by object handle:
auto it = map.find_primary(1u);
if (it != map.cend()) {
map.remove(*it);
}Remove by key through a unique secondary index:
bool removed = map.remove<ByX>(3.14); // remove by xauto it = map.insert<true>(1, 1.0, 1.0, 1.0);
map.unindex<ByX>(*it); // remove from x-index only
// object is still alive in the primary index
auto again = map.find_primary(1u);If you found an object through one index and want the iterator for another index, use project<N>:
auto& mi = map.get<ByM>();
auto mit = mi.find(7.0);
if (mit != mi.cend()) {
auto lit = map.project<BySeq>(*mit);
if (lit != map.get<BySeq>().cend()) {
// same object, now as a list-index iterator
}
}If the object is not currently indexed in the target index, project<N> returns that index's end().
If you mutate a field that participates in an index, reindex that index explicitly:
auto it = map.insert<true>(1, 1.0, 3.0, 2.0);
bool ok =
map.modify<fastmm::ReindexOnly<ByX>>(*it,
[](Particle& p) { p.x = 9.0; });
if (!ok) {
// unique-index conflict; original state was restored
}For unique indices, conflicting reindex operations fail and roll back cleanly.
For non-unique indices, reindex succeeds as expected:
map.modify<fastmm::ReindexOnly<ByM>>(*it,
[](Particle& p) { p.m = 7.0; });In general, one can specify one of three reindex policies (ReindexAll, ReindexNone, ReindexOnly). Note that the method skips any unlinked index even if it is included in the policy. That is to say, the policy is only for optimization purpose: When you know exactly what you want to change, then specify; for explicit "reindexing", use insert instead.
If you do not want separate getter structs, member pointers work too, one can also skip the tag and use the oridinal position to refer to an index:
using ParticleMapKF = fastmm::FixedSizeMultiMap<
Particle, kMaxParticles,
fastmm::Unordered<fastmm::KeyFrom<&Particle::id>, IdHash, IdEqual, 128>,
fastmm::Ordered<fastmm::KeyFrom<&Particle::x>, std::less<double>>,
fastmm::Ordered<fastmm::KeyFrom<&Particle::y>, std::less<double>>,
fastmm::OrderedNonUnique<fastmm::KeyFrom<&Particle::m>, std::less<double>>,
fastmm::List>;That gives the same behavior with less boilerplate when the key is just a data member.
-
insert<true>(...)
Create object and index it into all configured indices. -
insert<false>(...)
Create object in the primary index only. -
index<N>(obj)
Add an existing object to indexN. -
unindex<N>(obj)
Remove an object from indexNonly. -
get<N>()
Access indexN. -
find_primary(key)
Find through the primary unordered index. -
remove(obj)
Remove the object from the container entirely. -
remove<N>(key)
Remove by key through indexNwhen that operation is supported. -
project<N>(obj_or_iterator_value)
Get the iterator for the same object in indexN, orend()if the object is not present there. -
modify<fastmm::ReindexOnly<N>>(obj, fn)
Mutate the object and reindex only indexN.
This library aims to fail in boring ways.
- Pool exhaustion returns
end(). - Duplicate primary keys are rejected.
- Secondary unique-index conflicts during
insert<true>roll back the whole insert. - Unique-index conflicts during
modify<ReindexOnly<N>>roll back the object state. project<N>returnsend()when the object is not in indexN.
That is the intended contract: no ghost entries, no half-updated indices, no dangling hooks after a normal failed operation.
This is a fixed-size, policy-driven container for a specific low-latency workload. It is not trying to be a general-purpose database or a universal associative container. If your object count is bounded and the same objects must be hit through several access paths repeatedly, that is the use case it is built for.
Benchmarks compare three implementations against Google Benchmark on an Intel i7 (Linux):
- MM —
fastmm::MultiMap(this library) - PoolBMI —
boost::multi_indexwithboost::fast_pool_allocator. The container is reserved at the beginning to minize allocation difference. - BMI —
boost::multi_indexwith the default allocator
The benchmark fixture uses a the above-mentioned "Particle" payload, exposing indices {id, x, y, m}: a primary hash index on id, ordered indices on x and y (for range queries), and a secondary hash index on m. It is enlarged by a few other fields to a full payload of 112 Bytes.
Two layout variants are tested: aligned (id, x, y, m in declaration order) and reversed (id, m, y, x), which affects cache-line layout and iterator access patterns.
Following results are time-per-operation in nanoseconds at kN = 100,000 elements.
| Operation | MM (ns) | PoolBMI (ns) | BMI (ns) | Notes |
|---|---|---|---|---|
| Create | 478 | 508 | 522 | Insert + index all keys |
| FindPrimary | 4.1 | 2.3 | 2.3 | Hash lookup by id |
| Modify | 68 | 142 | 147 | Rekey across all indices |
| Remove | 137 | 134 | 158 | Erase + unlink all hooks |
| BulkIterate | 337k | 355k | 405k | Iterate all elements |
| OrderedIterate | 3,769k | 2,826k | 3,175k | In-order traversal |
| LevelWalk | 236 | 250 | 264 | Per-node price-level walk |
| MassRange | 206k | 186k | 170k | Large ordered range scan |
| Mixed | 21.7M | 29.3M | 31.2M | Realistic workload mix |
| Operation | MM (ns) | PoolBMI (ns) | BMI (ns) | Notes |
|---|---|---|---|---|
| Create | 465 | 507 | 519 | |
| FindPrimary | 4.1 | 2.3 | 2.3 | |
| Modify | 65 | 134 | 142 | |
| Remove | 141 | 134 | 162 | |
| OrderedIterate | 2,604k | 2,823k | 3,207k | MM wins here (see below) |
| Mixed | 25.8M | 27.4M | 28.2M |
Mixed workload is the primary story. Under a realistic interleaving of inserts, lookups, modifications, and removes, MM is consistently 30–45% faster than BMI and 25–35% faster than PoolBMI across all container sizes. This reflects the core design advantage: a single fixed-capacity pool means no heap allocation on the hot path, no per-element allocator overhead, and cache-warm LIFO reuse of freed nodes.

- Mixed workload performance
- reversed layout
Modify shows the starkest single-operation advantage: ~2× faster than BMI. This is expected — a Modify in MM is an in-place rekey with pointer surgery on intrusive hooks. Although both MM and BMI have to erase and reinsert with allocator involvement, the target index can be specified by the user in the MM case, saving the overhead of dealing with others.

- Modify and reindex (for a single key)
- reversed layout
FindPrimary is the one consistent loss: MM runs at ~4 ns vs ~2 ns for both BMI variants, regardless of container size or layout. This appears to be an i7-specific issue (reversed on Apple M1, data not included). The likely cause is the slightly larger node size (payload + hooks) than MultiIndex, as the iteration seems basic. Further investigation pending.

- Search the primary key (hash)
- reversed layout
OrderedIterate (Tree Traversal) this is an interesting one. Here the percentage in difference is shown, as the absolute magnitude rises very quickly, making it difficult to see. From first look, it is surprising to see that PoolBMI quickly outperforms MM for aligned layout with increasing kN, while with reversed layout it is the opposite: MM is generally faster than PoolBMI. This turned out to be a memory effect issue: In the reversed layout, the header fields (ByX) accessed during ordered traversal are farther from the indexed field x in the struct, causing the traversal to skip two cache lines (64 Byte) per node (vs one for aligned) and improves L2/L3 utilization. This is a memory layout effect, which dominates the constant factor paricularly for large. BMI becomes very unstable in both cases, which is expected from its "allocate when needed" policy.

- Iterate the ordered index x (tree traversal)
- reversed layout
PoolBMI vs BMI: From the above results we also see that adding a pool allocator to BMI consistently improves it, sometimes matching MM on individual operations (Remove at reversed layout). However, the Mixed benchmark reliably separates them: pool allocation alone does not replicate MM's structural locality benefits. This is probably due to the clean policy based design of MM, which leads to minimal overhead at the cost of being more tailored to specific applications.
- Build:
-O3, single-threaded - Warmup: Google Benchmark default (1 second)
- Repetitions: 200 per configuration, median reported
- Error bars: p5/p95 across repetitions
- Sizes: N ∈ {50, 100, 500, 1k, 5k, 10k, 50k, 100k}
- Machine: Intel Core i7 (Linux); Apple M1 results omitted for platform consistency
Raw JSON results are in benchmarks/data/.
Unit tests and benchmark code in this repository were initially generated with AI assistance.
All such code has been:
- reviewed,
- revised,
- and explicitly approved by a human before inclusion.
Correctness, design decisions, and performance claims are the responsibility of the human author(s).
AI is used as a drafting tool, not as a source of truth.
- AI-generated code may be incomplete or suboptimal without review.
- Benchmarks should be interpreted with standard caution (environment, methodology, workload).



