The C++ logo, by Jeremy Kratz, licensed under CC0 1.0 Universal

Insert an element in a map efficiently

Notes published the
10 - 13 minutes to read, 2574 words
Categories: c++
Keywords: c++ container cppcheck move semantic performance quick-bench

find and insert

I came to criticize the following piece of code

void bar0(std::map<std::string, std::string>& m){
    // if key is *not* there, insert new data
    if (m.find("<some id>") == m.end()) {
        m["<some id>"] = "some data";

To be clear, it works as intended.

If the key "<some id>" is in the map, then nothing happens, otherwise {"<some id>", "some data"} is inserted into the map.

What is "wrong" with this function, is how it accomplishes its job.

The first, and most obvious issue, is that if the key is missing, the map is traversed twice. The first time by calling the member function find, the second time by calling operator[].

Note 📝
cppcheck reports CWE 398, and warns "Searching before insertion is not necessary." It also suggests a solution, but I do not want to spoil it.

The second issue is that while searching for the relevant element, a temporary string is created, and afterward discarded.

The third issue is that with operator[] a temporary string is constructed, and then replaced with "some data" (at least in most implementations std::string() does not allocate memory).

What happens next is a detour of different methods for inserting elements into a map when the key is absent, and trying to avoid possible performance issues.

To avoid implicit accidental implicit conversion, and to make the code more obvious, I’m going to use the class my_string instead of std::string

struct my_string{
    my_string() noexcept = default;
    explicit my_string(const char* s):str(s){}; // NOTE: contrary to string this is explicit
    auto operator<=>(const my_string& lhs) const noexcept = default;
    std::string str;

which means our function should be changed to

void bar1(std::map<my_string, string>& m){
    if (m.find(my_string("<some id>")) == m.end()) {
        m[my_string("<some id>")] = my_string("some data");

At this point, it is obvious that for using std::map::find a temporary string is created.

Remove the temporary string

Thanks to move semantic, it is trivial to get rid of the unnecessary temporary string

void bar2(std::map<my_string, string>& m){
    auto id = my_string("<some id>");
    if (m.find(id) == m.end()) {
        m[std::move(id)] = my_string("some data");

This solution does not apply if the class does not have an efficient move constructor. More on that later.

Remove the need of default-constructor

void bar3(std::map<my_string, string>& m){
    auto id = my_string("<some id>");
    if (m.find(id) == m.end()) {
        m.emplace(std::move(id), "some data");

The default-constructed string is not necessarily an issue, as it is implemented efficiently/does not require additional resources.

For other types, a default constructed value could be still expensive, or the class might simply be missing a default constructor.

By using the member function emplace (or insert) instead of operator[], this is not an issue.

Note that even if my_string(const char*) is explicit, emplace can take as a second parameter const char*, instead of my_string.

It is possible to write m.emplace(std::move(id), my_string("some data"));, it would create a temporary my_string("some data"), and then move it in the map.

In the case of std::string, moving is cheap, so this makes little difference.

But given the context of those notes, and the fact that for some types a move is not cheap, avoiding calling the move constructor is generally better.

Note 📝
thanks to std::map::emplace it is even possible to insert values that cannot be moved or copied (for example a class that contains a std::mutex)

Avoid traversing the map more than once

The biggest issue is now the traversal of the map.

But emplace already solves this issue

Inserts a new element into the container constructed in-place with the given args if there is no element with the key in the container.

Thus the code can be simplified to

void bar4(std::map<my_string, string>& m){
    m.emplace("<some id>", "some data");

At this point, it is possible to see that adding new elements to the map will generally be faster, depending on the position where the new element is inserted, and the size of the map.

This test demonstrates that the double traversal and unnecessary copies have some overhead. This test (with int instead of strings) shows that the difference is not only to be attributed to the copy constructor of std::string.

Difference in performance with find and insertions versus `emplace`
Figure 1. Difference in performance with find and insertions versus emplace

As finding and inserting takes logarithmic time, it should not be a surprise that the difference is not gigantic.

Note 📝
for some reason, std::map::find with operator[] seems to be faster than std::map::find with emplace. I have no good explanation for this observation.

Avoid side effects during insertions

Unfortunately emplace can have side-effects even if the insertion fails.

void foo(int);

void bar5(std::map<my_string, std::unique_ptr<int>>& m, std::unique_ptr<int> ptr){
    auto [it, result] = m.emplace("<some id>", std::move(ptr));
    if (!result) {
        foo(*ptr); // might be UB
Note 📝
std::map::insert, just like std::map::emplace, has the same limitation

Thus, try_emplace is in those cases a better solution

Unlike insert or emplace, these functions do not move from rvalue arguments if the insertion does not happen, which makes it easy to manipulate maps whose values are move-only types

The necessary changes are minimal

void bar6(std::map<my_string, std::unique_ptr<int>>& m, std::unique_ptr<int> ptr){
    auto [it, result] = m.try_emplace(my_string("<some id>"), std::move(ptr));
    if (!result) {
        foo(*ptr); // no UB, guaranteed

It’s irritating that some emplace functions take a template parameter and others a key type, but in both cases, a key is created before insertion (thus possibly unnecessarily), otherwise std::map could not verify if the key already exists.

Lazy construction

There is also another issue. If an appropriate constructor does not exist, we are forced to create the value (or other temporary values) before knowing if it will be inserted or not

void bar7(std::map<my_string, std::unique_ptr<int>>& m){
    m.try_emplace(my_string("<some id>"), std::make_unique<int>(42));

If the insertion fails because the key already exists, not only we have created an unnecessary key, but we have also created an unnecessary value.

In some cases, it is possible to resort to a two-phase construction:

void bar8(std::map<my_string, std::unique_ptr<int>>& m){
    auto [it, result] = m.try_emplace(my_string("<some id>"), nullptr);
		it->second = std::make_unique<int>(42);

But then this is a similar requirement we had operator[]: we need a cheap constructor for creating a dummy value, and overwriting it afterward. This is not always possible.

Lazy factory for delaying the creation of value

template<class Factory>
struct lazy_converter
    constexpr lazy_converter(Factory factory)
        : factory_(std::move(factory))
    constexpr operator auto() const {
        return factory_();
    Factory factory_;

void bar9(std::map<my_string, std::unique_ptr<int>>& m){
    m.try_emplace(my_string("<some id>"),
        lazy_converter([]{ return std::make_unique(42); })

This pattern can also be used with emplace

void bar10(std::map<my_string, std::unique_ptr<int>>& m){
    m.emplace("<some id>",
        lazy_converter([]{ return std::make_unique(42); })

insertion with a hint

There are member functions for inserting/emplacing elements that take an additional iterator as a parameter. It denotes a hint where the element needs to be inserted, and if it points to the correct location, it effectively means that there is no traversal during insertion.

My naive approach for determining the correct hint value without creating a key type was to use std::lower_bound

void bar11(std::map<my_string, std::unique_ptr<int>>& m, std::unique_ptr<int>& ptr, const char* key){
	auto it = std::lower_bound(m.begin(), m.end(), key_like, [](const auto& key_value, const auto& key_like){
		// WARNING: uses operator< instead of m.key_comp()
		return key_value.first < key_like;
	if(it == m.end() || it->first != key ){
		m.emplace_hint(lower_bound, key, std::move(ptr)); // thanks to hint, no 2nd iteration

While there is more code, it has the following features:

  • the map is traversed only once

  • the key is created only if necessary

  • the value is moved from only if the insertion actually happens

All listed features are more or less explicit by looking at the function bar11, in the other functions, it was less clear if and when a value was moved from or if it was created unnecessarily

It is also trivial to create the value lazily if there is an appropriate constructor or with lazy_converter, but since insertion will not fail (unless construction fails) as we’ve already verified that the key does not exist, there is no such use-case.

What is more "difficult" is to generalize this pattern.

I can imagine that it is possible to create an insertion_hint function

template <class iter>
class hint{
	bool match;
	iter it;
	hint(bool match, iter it) : match(match), it(it){}
	//! true if it points to the position where to insert the desired key
	//! false if the key already exists
	explicit operator bool() const noexcept{
		return match;
	operator iter() const noexcept{
		return it;

template <class M, class K>
auto insertion_hint(const M& m, const K& key_like){
	auto it = std::lower_bound(m.begin(), m.end(), key_like, [](const auto& key_value, const auto& key_like){
		return key_value.first < key_like;
	return hint( (it == m.end()) or (it->first != key_like), it);

void bar12(std::map<my_string, std::unique_ptr<int>>& m, std::unique_ptr<int>& ptr, const char* key){
	if(auto h = insertion_hint(m, key); h){
		m.emplace_hint(h, key, std::move(ptr));

But is this approach more efficient?

As mentioned, it is a naive approach; the map is not traversed twice, but two factors are playing a major role in this context:

  • std::lower_bound is slower than std::map::lower_bound/the member functions of the map for searching and inserting elements

  • std::map::lower_bound needs a parameter of the "correct" type, thus we might create a key type unnecessarily

It is not obvious which operation will be the most efficient, it will depend on the size of the map, where the value is going to be inserted, and how many resources the creation of a key are required.

Both when all insertions succeed and when all insertion fails, creating a string is faster than traversing the map once with std::lower_bound. Ie using std::lower_bound should be avoided (but with other types, and sizes, the results might differ).

But not all hope it lost!

If instead of using std::map<my_string,my_string> one can use a map with a transparent comparator (std::map<my_string,my_string,std::less<>>), then an overload of std::map::lower_bound can be used, that takes a template parameter without converting it, as long as it can be compared with the keys.

In that case, insertion_hint can be changed to

template <class M, class K>
auto insertion_hint(const M& m, const K& key_like){
    auto it = m.lower_bound(key_like);
    return hint( (it == m.end()) or (it->first != key_like), it);

The only disadvantage is that it can be an API change, and it can change some existing code (which is the main reason the C++ committee decided not to change the behavior of std::map::lower_bound without a transparent comparator)

Difference in performance when using `std::lower_bound` transparent comparator with `std::map::lower_bound` and direct insertion
Figure 2. Difference in performance when using std::lower_bound transparent comparator with std::map::lower_bound and direct insertion

The function insertion_hint with the class hint is a small piece of code as it permits greater flexibility for inserting optimally a value in a map.

Providing a catch-all function that is difficult to use incorrectly is difficult, otherwise std::map would probably already provide it.

I like the pattern

if(auto h = insertion_hint(m, "<some id>"); h){
	m.emplace_hint(h, "<some id>", "some data");

but it provides the desired speedup only for maps with a transparent comparator, otherwise emplace and/or try_emplace (eventually with the factory wrapper) will be generally more efficient.

It is a pity that the overload of lower_bound is there only if one changes the type of map class. It would have been possible to add a new function instead of overloading std::map::lower_bound, or ignore the fact that some code would have been broken/changed.

Note that the test with insert_hint is slightly slower than inserting directly. Probably because when searching for the optimal position and then inserting std::map needs at least one additional check compared to inserting directly; the hint might point to the wrong position.


The differences in (my) tests were minimal.

Considering that search and insertion are logarithmic in time either

  • the map is extremely big

  • the comparison operator is extremely slow

  • the application is only inserting elements in a map

or iterating more than once will generally not make a measurable difference.

On the other hand, constraints and requirements like

  • default constructor

  • 2 phase constructions

  • inexpensive construction

  • moved-from elements

can have bigger side effects, like being more resource-intensive or creating an incorrect program. For this reason, one should look at functions like std::map::insert, std::map::emplace, std::map::try_emplace, and so on.

Querying where to insert the value is error-prone, as using the wrong value is difficult to notice; one needs to measure the performance.

Lazy insertion with lazy_converter is less error-prone than getting a correct insertion hint.

The most speedup is gained when inserting multiple values ordered and calculating (instead of querying) where the next insertion position is.

For example, if elements are inserted in ascending order

for (const auto& v : source){
    m.emplace_hint(m.end(), v.key, v.value);

and if the elements are inserted in descending order

for (const auto& v : source) {
    m.emplace_hint(m.begin(), v.key, v.value);

Do you want to share your opinion? Or is there an error, some parts that are not clear enough?

You can contact me anytime.