I was chatting with a friend a while ago, and they stumbled upon an interesting conundrum in C++.

std::unordered_map<int, int> x;
while (true) {
    for (int i = 0; i < 20000; i++)
        x[rand()] = 1;
    /* Some work with x here */
    x.clear();
}

The runtime for that little snippet, even if you fill in the comment with some code that performs lookups & calculations on values in the map, is dominated by x.clear().

While std::unordered_map does optimize for trivially destructible elements (like int), clear() doesn’t shrink the bucket array, so you end up with lots of empty buckets (see the sketch after the list below). This means:

  1. Lots of time is spent in calls to memset(), iterating over buckets that may already have been zeroed.
  2. Your cache locality sucks if the number of elements changes between runs.¹
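
You can see the lingering bucket array directly. Here’s a minimal sketch; the exact bucket counts are implementation-defined, so treat the numbers as illustrative:

#include <cstdio>
#include <cstdlib>
#include <unordered_map>

int main() {
    std::unordered_map<int, int> x;
    for (int i = 0; i < 20000; i++)
        x[rand()] = 1;
    std::printf("before clear: size=%zu buckets=%zu\n", x.size(), x.bucket_count());
    x.clear();
    // The elements are gone, but the bucket array is still there.
    std::printf("after clear:  size=%zu buckets=%zu\n", x.size(), x.bucket_count());
}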

With regard to 2, imagine if I had written:

std::unordered_map<int, int> x;
while (true) {
    int rand_factor = rand() % 50;
    for (int i = 0; i < 1000*rand_factor; i++)
        x[rand()] = 1;
    /* Some work with x here */
    x.clear();
}

With enough runs, the bucket array will grow to fit the largest run and stay that size. On small runs there will be less chance of collisions (given the large number of buckets), so less chaining, but the entries will be spread across a bunch of sparse buckets.
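
A small sketch of that effect, using the same rand()-driven fill (again, bucket counts are implementation-defined):

#include <cstdio>
#include <cstdlib>
#include <unordered_map>

int main() {
    std::unordered_map<int, int> x;
    // A big run grows the bucket array once...
    for (int i = 0; i < 49000; i++)
        x[rand()] = 1;
    x.clear();
    // ...and a later small run spreads its entries thinly across
    // that same large, mostly empty bucket array.
    for (int i = 0; i < 1000; i++)
        x[rand()] = 1;
    std::printf("size=%zu buckets=%zu load_factor=%.3f\n",
                x.size(), x.bucket_count(), x.load_factor());
}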

The solution? Swap the map with a fresh one instead of clearing it:

std::unordered_map<int, int> x;
while (true) {
    int rand_factor = rand() % 50;
    for (int i = 0; i < 1000*rand_factor; i++)
        x[rand()] = 1;
    /* Some work with x here */
    std::unordered_map<int, int> tmp;
    x.swap(tmp);
} 

The runtime is now dominated by the work done with the map, rather than by memsetting already empty buckets: swap() is O(1), and tmp’s destructor just frees the old bucket array instead of zeroing it out. Another piece of C++ trivia I’ll likely never use ¯\_(ツ)_/¯.
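
If you want to convince yourself, here’s a rough microbenchmark sketch of the idea; absolute numbers will vary by implementation and compiler flags:

#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <unordered_map>

int main() {
    using clock = std::chrono::steady_clock;
    auto us = [](auto d) {
        return std::chrono::duration_cast<std::chrono::microseconds>(d).count();
    };

    std::unordered_map<int, int> x;
    // Grow the bucket array once with a big run, then leave only a few
    // entries in the map so the emptying cost is mostly bucket overhead.
    for (int i = 0; i < 1000000; i++)
        x[rand()] = 1;
    x.clear();

    for (int i = 0; i < 100; i++)
        x[rand()] = 1;
    auto t0 = clock::now();
    x.clear();                          // walks/zeroes the huge bucket array
    auto t1 = clock::now();

    for (int i = 0; i < 100; i++)
        x[rand()] = 1;
    auto t2 = clock::now();
    {
        std::unordered_map<int, int> tmp;
        x.swap(tmp);                    // O(1); x gets a fresh, small table
    }                                   // tmp's destructor frees the old buckets
    auto t3 = clock::now();

    std::printf("clear: %lld us, swap+destroy: %lld us\n",
                (long long)us(t1 - t0), (long long)us(t3 - t2));
}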


  1. C++ aficionados will point out that std::unordered_map’s cache locality sucks anyway, as each bucket is implemented as a (gasp) linked list. ↩︎