fmove

What are move semantics

C++11 introduced move semantics.

In short, move semantics enable further optimizations that avoid copying data.

Not only can the compiler perform such optimizations on its own, but the developer can also tell the compiler that a variable is no longer needed, so that a copy of it can be avoided.

The canonical example is

std::vector<std::string> create() {
	std::vector<std::string> collection; // since empty, suppose no allocations
	collection.reserve(3); // -> 1 allocation
	std::string data = "a very long string"; // -> 1 allocation
	collection.push_back(data); // creates a copy -> 1 allocation
	collection.push_back(data+data); // creates a temporary, then creates a copy in collection, then destroys the temporary -> 2 allocations, 1 deallocation
	collection.push_back(data); // creates a copy -> 1 allocation
	return collection; // data goes out of scope, collection is copied to the caller -> 4 allocations, 5 deallocations
}


void foo(){
	std::vector<std::string> v = create();
}

We have 10 allocations and 6 deallocations.

With C++11 and move semantics, we can achieve the following result:

std::vector<std::string> create() {
	std::vector<std::string> collection;
	collection.reserve(3); // -> 1 allocation
	std::string data = "something"; // -> 1 allocation
	collection.push_back(data); // creates a copy -> 1 allocation
	collection.push_back(data+data); // creates a temporary, which is moved into collection, then destroys the temporary -> 1 allocation
	collection.push_back(std::move(data)); // steals content, no allocations
	return collection; // do not use std::move here, otherwise optimizations like RVO are not possible; no deallocations
}


void foo(){
	std::vector<std::string> v = create();
}

We have 4 allocations and 0 deallocations, which is optimal since we are creating an std::vector containing 3 separate std::string instances.
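The exact numbers depend on the standard library (small-string optimization, how operator+ is implemented), so it can be worth measuring instead of counting by hand. The following is a minimal sketch, assuming a stateless counting allocator is acceptable for the experiment; counting_allocator, g_allocations, counted_string, counted_vector and count_allocations are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>
#include <vector>

// Hypothetical global counter, incremented on every allocation made below.
static std::size_t g_allocations = 0;

// Minimal C++11 allocator that only counts calls to allocate().
template <class T>
struct counting_allocator {
    using value_type = T;
    counting_allocator() = default;
    template <class U>
    counting_allocator(const counting_allocator<U>&) noexcept {}
    T* allocate(std::size_t n) {
        ++g_allocations;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};
template <class T, class U>
bool operator==(const counting_allocator<T>&, const counting_allocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const counting_allocator<T>&, const counting_allocator<U>&) noexcept { return false; }

using counted_string = std::basic_string<char, std::char_traits<char>, counting_allocator<char>>;
using counted_vector = std::vector<counted_string, counting_allocator<counted_string>>;

// Same shape as create() above; use_move selects how the last element is added.
counted_vector create(bool use_move) {
    counted_vector collection;
    collection.reserve(3);
    // long enough to defeat any small-string optimization
    counted_string data = "a string long enough to defeat any small-string optimization";
    collection.push_back(data);
    collection.push_back(data + data);
    if (use_move) {
        collection.push_back(std::move(data)); // steals the buffer
    } else {
        collection.push_back(data); // allocates a new buffer
    }
    return collection;
}

// Number of allocations performed while building one collection.
std::size_t count_allocations(bool use_move) {
    const std::size_t before = g_allocations;
    counted_vector v = create(use_move);
    (void)v;
    return g_allocations - before;
}
```

Comparing count_allocations(false) with count_allocations(true) should show exactly one allocation saved by the single std::move; the absolute numbers may differ between standard library implementations.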

The best thing is that we needed to change only one line of code to avoid one copy (and one deallocation associated with it), everything else came for free.

There is no need to resort to a "low-level", "unsafe", or otherwise different interface.

This has been achieved by extending the C++ type system with rvalue references, by adding move constructors to nearly all types of the standard library, and by generating them automatically for many other types (as is done for copy constructors).

The move constructor (together with an updated signature of std::vector::push_back) is what makes it possible to avoid so many unnecessary allocations and deallocations. A naive implementation of the move constructor of std::vector and std::string would simply copy the internal pointers that point to the allocated data, instead of making a new allocation and copying the data itself. To avoid something like a double-free, the internal state of the moved-from object would be set to empty, or to some other valid but unspecified state that does not cause issues when the destructor is called.
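The naive move constructor described above can be sketched with a toy type. This is only an illustration under the assumption of a single owned heap buffer; buffer is a hypothetical class and is far simpler than what std::string or std::vector actually do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical toy type that owns one heap-allocated, null-terminated buffer.
class buffer {
public:
    explicit buffer(const char* text)
        : size_(std::strlen(text)), data_(new char[size_ + 1]) {
        std::memcpy(data_, text, size_ + 1);
    }

    // Copy constructor: one new allocation plus a copy of the content.
    buffer(const buffer& other)
        : size_(other.size_), data_(new char[other.size_ + 1]) {
        std::memcpy(data_, other.data_, size_ + 1);
    }

    // Move constructor: copies only the pointer, then empties the source,
    // so that the two destructors do not free the same memory twice.
    buffer(buffer&& other) noexcept
        : size_(other.size_), data_(other.data_) {
        other.size_ = 0;
        other.data_ = nullptr;
    }

    buffer& operator=(const buffer&) = delete;

    ~buffer() { delete[] data_; } // delete[] on nullptr is a no-op

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_; // declared first, so it is initialized before data_
    char* data_;
};
```

After buffer b(std::move(a)); the object b owns the very same memory that a owned before, and a is left in an empty state that its destructor handles without issues.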

Notice that the standard does not guarantee that there are no allocations, or that the moved-from objects will be empty. It’s an implementation detail, but an allocating move-constructor for a container is very probably considered a bug in all major standard library implementations, as it would defeat the main reason for defining a move constructor.

To sum it up in simple terms, std::move is a cast to an "rvalue expression". This permits collection.push_back("hello"); and collection.push_back(std::move(data)); to be optimally efficient in terms of allocations and deallocations.
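That std::move is only a cast, and does not move anything by itself, can be checked directly. A small sketch follows; the function names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// std::move on a std::string lvalue yields std::string&&: only the type of
// the expression changes.
static_assert(
    std::is_same<decltype(std::move(std::declval<std::string&>())), std::string&&>::value,
    "std::move is a cast to an rvalue reference");

// The cast alone does not touch the object.
bool cast_alone_changes_nothing() {
    std::string data = "a very long string that needs a heap allocation";
    std::string&& ref = std::move(data); // nothing is moved here
    (void)ref;
    return data == "a very long string that needs a heap allocation";
}

// The content is transferred only when a move constructor (or move
// assignment) consumes the rvalue.
bool move_construction_transfers_content() {
    std::string data = "a very long string that needs a heap allocation";
    std::string taken(std::move(data)); // data is now valid but unspecified
    return taken == "a very long string that needs a heap allocation";
}
```

Note that the sketch deliberately says nothing about the content of data after the move construction, as the standard only guarantees a valid but unspecified state.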

So far, so good.

Move constructors and the new value categories make the type system more complex, and we have new rules and best practices for designing interfaces and classes.

Issues when using std::move

The main issue I have is that, because of backward compatibility, even when writing collection.push_back(std::move(data)); the content of data might still get copied, depending on how data has been declared.

Suppose someone looks at the create function, and notices that data is never changed (which is not true, as std::move modifies it, but suppose we overlooked it):

std::vector<std::string> create() {
	std::vector<std::string> collection;
	collection.reserve(1); // -> 1 allocation
	std::string data = "something"; // -> 1 allocation
	collection.push_back(std::move(data)); // moves content, no allocation/deallocations
	return collection; // no copies/deallocations
}

As we want to express this (false) property in the code, we just declare it const.

Normally, when we try to modify constant data, we get a compiler error (unless we cast const away, which is a very explicit operation), but in this case the code still compiles and we silently change the behavior of the program:

std::vector<std::string> create() {
	std::vector<std::string> collection;
	collection.reserve(1); // -> 1 allocation
	const std::string data = "something"; // -> 1 allocation
	collection.push_back(std::move(data)); // creates a copy, as we cannot modify data -> 1 allocation
	return collection; // no copies/deallocations for collection, data goes out of scope -> 1 deallocation
}

If it seems a contrived example, it is not.

Even if the behavior of the program changed, its output probably did not, only the resources it uses. As performance analysis is unfortunately not that easy, these types of errors, or missed optimizations, often go unnoticed.
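The silent fallback can be made visible with a type that counts its own copies and moves. A minimal sketch follows; tracer, push_mutable and push_const are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical helper type that records how it gets duplicated.
struct tracer {
    static int copies;
    static int moves;
    tracer() {}
    tracer(const tracer&) { ++copies; }
    tracer(tracer&&) noexcept { ++moves; }
};
int tracer::copies = 0;
int tracer::moves = 0;

// std::move on a const object yields a const rvalue reference. It cannot
// bind to the tracer&& parameter of the move constructor, so overload
// resolution picks the copy constructor instead.
static_assert(
    std::is_same<decltype(std::move(std::declval<const std::string&>())), const std::string&&>::value,
    "moving from const yields const&&");

void push_mutable(std::vector<tracer>& v) {
    tracer t;
    v.push_back(std::move(t)); // move constructor
}

void push_const(std::vector<tracer>& v) {
    const tracer t;
    v.push_back(std::move(t)); // compiles, but calls the copy constructor
}
```

Calling both functions on a vector with enough reserved capacity should report one move for push_mutable and one copy for push_const, without a single compiler diagnostic.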

For example, in the Chrome browser it was determined that most allocations were caused by unnecessary string copies. As another example, the test framework Catch2 suffered from similar problems.

Unfortunately, not all static checkers follow the same guidelines. As always, it is important not to follow advice blindly. Consider the following snippet:

struct foo{
	explicit foo(std::string str_) : str{std::move(str_)} {}
	private:
	std::string str;
};

Cppcheck (Cppcheck 1.76.1, with C++11 enabled) complains about it with the following warning:

(performance) Parameter 'str_' is passed by value. It could be passed as a const reference which is usually faster and recommended in C++.

While the statement is true as a general guideline, it is not true in this case. As the std::string is used internally and unconditionally to initialize a new variable, passing it by const reference would, in the best case, only add an indirection. In the worst case, it would create an unnecessary copy, for example when passing a literal or a const char* to foo.

Newer versions of Cppcheck (at least Cppcheck 1.86) do not trigger this specific warning anymore, but AFAIK it still does not warn that

struct foo{
	explicit foo(const std::string& str_) : str{str_} {}
	private:
	std::string str;
};

has a suboptimal interface.
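The difference between the two constructors can be measured with a counting type in place of std::string. This is a sketch under that substitution; payload, by_value, by_const_ref and the two helper functions are hypothetical names.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for std::string that counts copies and moves.
struct payload {
    static int copies;
    static int moves;
    payload() {}
    payload(const payload&) { ++copies; }
    payload(payload&&) noexcept { ++moves; }
};
int payload::copies = 0;
int payload::moves = 0;

// Same shape as the first foo: take by value, then move into the member.
struct by_value {
    explicit by_value(payload p) : p_{std::move(p)} {}
private:
    payload p_;
};

// Same shape as the second foo: take by const reference, then copy.
struct by_const_ref {
    explicit by_const_ref(const payload& p) : p_{p} {}
private:
    payload p_;
};

// Copies needed when the argument is a temporary.
template <class Sink>
int copies_from_temporary() {
    payload::copies = 0;
    Sink sink{payload{}};
    (void)sink;
    return payload::copies;
}

// Copies needed when the argument is a named object that stays in use.
template <class Sink>
int copies_from_lvalue() {
    payload::copies = 0;
    payload p;
    Sink sink{p};
    (void)sink;
    return payload::copies;
}
```

With a temporary argument, the by-value constructor needs no copy at all, while the const reference version always pays for one. With an lvalue argument both need exactly one copy, and the by-value version adds a (cheap) move.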

The biggest overhead of an application is probably not the copy of one specific string instance. But when there are many small unnecessary copies, the time they need all together might be significant; as each one taken alone is insignificant, they won’t show up in a profile report.

And when using memory checkers like Valgrind or AddressSanitizer, the application will suffer a death by a thousand cuts: those scattered memory allocations tend to blow up the runtime, even if they still might not appear in a performance report.

Since we cannot change std::move (although renaming it to move_if_possible would make the intent clearer), we can provide a new function that calls std::move, but also ensures that the move operation is not going to fall back to a copy. And while we are at it, we can also ensure that the call to std::move makes sense.

While it is possible to write

int i = 42;
int j = std::move(i);

it makes little sense to do so: there is nothing to optimize, as there are no resources to steal.
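A quick check confirms that "moving" an int is just a copy. The function name below is hypothetical.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// A scalar owns no external resource that could be stolen.
static_assert(std::is_scalar<int>::value, "int is a scalar type");

bool int_is_unchanged_after_move() {
    int i = 42;
    int j = std::move(i); // initializes j with a copy of the value
    return i == 42 && j == 42;
}
```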

A simple implementation for fmove (the f would stand for "force" and "fail" as in failing fast) would be

#include <type_traits>
#include <utility>

// like std::move, but without fallback behavior (triggers an error if move has no effect)
// returns an rvalue reference, like std::move; a plain `auto` return type would return by value
template <class T>
typename std::remove_reference<T>::type&& fmove(T& t) noexcept {
	using T2 = typename std::remove_reference<T>::type;
	static_assert(!std::is_const<T2>::value, "const is not moveable");
	static_assert(!std::is_enum<T2>::value, "std::move on enum does not have any effect");
	static_assert(!std::is_pointer<T2>::value, "std::move on pointer does not have any effect");
	static_assert(!std::is_array<T2>::value, "std::move on array does not have any effect");
	static_assert(!std::is_scalar<T2>::value, "std::move on scalar does not have any effect");
	// note: std::is_pod is deprecated since C++20
	static_assert(!std::is_pod<T2>::value, "std::move on pod does not have any effect");
	return std::move(t);
}

This gives us a tool that does not depend on the compiler or on a static analyzer (some compilers can point out some useless moves). It expresses our intention to move the content more clearly, and it notifies the user through a compilation error if this is not possible.

For the careful reader: the function signature is fmove(T&) and not fmove(T&&); otherwise we would not trigger an error on an rvalue, i.e. it would accept code like fmove(std::move(a)), which is no different from std::move(a).
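The effect of the T& parameter can be verified at compile time with a small detection helper. The sketch below uses a reduced variant of fmove (only the const check, and an explicitly spelled return type) so that the block is self-contained; accepts is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Reduced variant of fmove, kept short for this illustration.
template <class T>
typename std::remove_reference<T>::type&& fmove(T& t) noexcept {
    static_assert(!std::is_const<T>::value, "const is not moveable");
    return std::move(t);
}

// Hypothetical detection helper: true if fmove(std::declval<Arg>()) passes
// overload resolution. The static_asserts inside fmove are not checked here.
template <class Arg, class = void>
struct accepts : std::false_type {};
template <class Arg>
struct accepts<Arg, decltype(void(fmove(std::declval<Arg>())))> : std::true_type {};

// A non-const lvalue binds to T&.
static_assert(accepts<std::string&>::value, "lvalues are accepted");
// A non-const rvalue, like the result of std::move(a), cannot bind to T&.
static_assert(!accepts<std::string>::value, "rvalues are rejected");
```

With a fmove(T&&) signature, the second assertion would fail, as a forwarding reference binds to everything.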

For example

int i = 42;
int j = fmove(i);

would not compile, and neither would

const std::string str1 = "...";
const std::string str2 = fmove(str1);

but

std::string str1 = "...";
const std::string str2 = fmove(str1);

does, as desired.

Unfortunately, there is one situation where the user experience can be worse than with std::move:

std::string foo() {
	std::string str1 = "...";
	return fmove(str1);
}

Calling std::move inside a return statement will probably inhibit RVO. Clang (and in the meantime GCC too) is able to emit a warning, -Wpessimizing-move, when this construct is found. As in this case the call to std::move is "hidden" behind a function, neither compiler triggers a warning. Adding inline or __attribute__((always_inline)) to fmove did not help them catch this case.
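The cost of the pessimization can be observed with a counting type. The sketch below again uses a reduced variant of fmove; noisy, return_plain, return_fmove and moves_for are hypothetical names. As copy elision of a named local is permitted but not guaranteed, the comments state only bounds.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical helper type that counts copies and moves.
struct noisy {
    static int copies;
    static int moves;
    noisy() {}
    noisy(const noisy&) { ++copies; }
    noisy(noisy&&) noexcept { ++moves; }
};
int noisy::copies = 0;
int noisy::moves = 0;

// Reduced variant of fmove, kept short for this illustration.
template <class T>
typename std::remove_reference<T>::type&& fmove(T& t) noexcept {
    static_assert(!std::is_const<T>::value, "const is not moveable");
    return std::move(t);
}

// The compiler may construct n directly in the caller (NRVO); if it does
// not, n is moved, never copied.
noisy return_plain() {
    noisy n;
    return n;
}

// The returned expression is no longer the name of a local object, so
// NRVO does not apply and the move constructor always runs.
noisy return_fmove() {
    noisy n;
    return fmove(n);
}

// Number of move constructions caused by calling the given factory once.
int moves_for(noisy (*factory)()) {
    noisy::moves = 0;
    noisy result = factory();
    (void)result;
    return noisy::moves;
}
```

moves_for(return_fmove) is at least 1, while moves_for(return_plain) is never larger and is typically 0 with optimizing compilers.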

Fortunately detecting this issue with a grep is straightforward, albeit not optimal.

While fmove solves the problem I had with unnecessary copies, it is not a replacement for std::move, as it is not equally generic.

It is for example not possible, or desirable, to use it in an overly generic context.

For example, a possible implementation for std::accumulate is

template<class I, class T>
T accumulate(I first, I last, T init) {
    for (; first != last; ++first) {
        init = std::move(init) + *first;
    }
    return init;
}

replacing it with

template<class I, class T>
T accumulate(I first, I last, T init) {
    for (; first != last; ++first) {
        init = fmove(init) + *first;
    }
    return init;
}

would break existing and valid use cases (for example, accumulating values of type int). This is exactly one of those scenarios where the fallback behavior is desired.
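To illustrate why the fallback matters here, the following sketch instantiates the std::move based version for both a scalar and a resource-owning type. my_accumulate is a hypothetical name, chosen to avoid a clash with std::accumulate; the two sum_* functions exist only for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Same algorithm as the possible std::accumulate implementation above.
template <class I, class T>
T my_accumulate(I first, I last, T init) {
    for (; first != last; ++first) {
        // For std::string this reuses the buffer of init; for int it is a
        // plain copy, which is exactly the desired fallback.
        init = std::move(init) + *first;
    }
    return init;
}

int sum_ints() {
    const std::vector<int> values{1, 2, 3};
    return my_accumulate(values.begin(), values.end(), 0);
}

std::string sum_strings() {
    const std::vector<std::string> values{"a", "b", "c"};
    return my_accumulate(values.begin(), values.end(), std::string{});
}
```

With fmove in place of std::move, sum_strings would still compile, but sum_ints would be rejected by the static_assert on scalar types.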

So my guideline is to use fmove where I want to assert that I’m stealing some resource. This is mostly the case when the type is known. In generic code, where it is not possible or desirable to make such an assertion, use std::move.