String parameters
This topic comes up from time to time, so I wanted to write a couple of considerations down.
What is the most efficient way to pass a string around? Is there a guideline?
The answer is that there is a guideline, but no, there is no method that works best in all situations.
The std::string class has the following properties:
- it has reusable storage
- it converts implicitly from string literals
- it can have efficient move operations that do not allocate
- there exists a view type, std::string_view (which also converts implicitly from literals)
Some additional considerations:
- I’m ignoring SSO (Small String Optimization) and assume that all strings are long enough to require an allocation
- most, if not all, considerations made for std::string also hold for std::wstring, std::pmr::string, and strings with custom allocators
- I’m ignoring the fact that a compiler is permitted to elide copies and perform other optimizations
To show which method is "better", I’ll use the following class, a wrapper wr around std::string:
struct wr{
std::string s;
wr( /* ??? */ p) : s( /* ??? */ ) {}
void set( /* ??? */ p) { /* ??? */ }
};
There are mainly three relevant test scenarios:
- calling a function with a reference
- calling a function with a temporary
- calling a function with something else that is convertible to a string
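The three scenarios can be made concrete with a small sketch. The helpers getValue() and getRef() are used throughout this text but never defined, so the definitions below are assumptions made only for illustration, as is the placeholder function consume.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers: assumed definitions, chosen only for illustration.
// The strings are long enough that typical implementations cannot use SSO.
std::string getValue() {
    return std::string(64, 'v');   // returns a temporary
}

const std::string& getRef() {
    static const std::string stored(64, 'r');
    return stored;                 // refers to an existing string
}

// Placeholder consumer so that the three call shapes can be shown.
std::size_t consume(const std::string& p) { return p.size(); }

std::size_t scenarios() {
    std::size_t total = 0;
    total += consume(getRef());    // 1. a reference to an existing string
    total += consume(getValue());  // 2. a temporary
    total += consume("123");       // 3. something convertible to a string
    return total;
}
```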
Constructor
What is the "best" function signature for the constructor?
If we eliminate some candidates that are not particularly useful for this context, there are mainly four sensible choices for the parameter:
- const std::string&
- std::string_view
- std::string&&
- std::string
I’m not considering const char* as a parameter, as in the absence of overloads it provides no benefit over std::string_view.
Instead of counting which operators are invoked, let’s count the allocations in the different scenarios.
Since allocating is typically far slower than copying the content owned by a std::string or assigning pointers, this should be sufficient for determining the most efficient implementation.
Parameter | wr("123") | wr(getValue()) | wr(getRef())
---|---|---|---
const std::string& | 2 alloc | 2 alloc | 1 alloc
std::string_view | 1 alloc | 2 alloc | 1 alloc
std::string&& | 1 alloc | 1 alloc | compiler error
std::string | 1 alloc | 1 alloc | 1 alloc
In the case of wr::wr("123"), "123" is converted to a std::string (except for the constructor wr::wr(std::string_view p)). This causes one allocation, and when the constructor of wr takes a string as a constant reference, it is not possible to "steal" the resources. Thus using wr::wr(const std::string& p) does not work well with temporaries.
std::string_view has a similar issue. Although it avoids the unnecessary allocation in the case of wr::wr("123"), it does not own the data. Thus if the caller already has a std::string, the constructor cannot steal the resources.
The winner is clearly "pass by value", or "copy and move": wr::wr(std::string p) :s(std::move(p)) {}.
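To make "stealing the resources" tangible, here is a minimal sketch that compares the buffer address before and after construction. The function buffer_was_reused is a hypothetical helper added for illustration. The standard does not strictly promise that the pointer is preserved by a move, but the common implementations behave this way for strings that are too long for SSO.

```cpp
#include <string>
#include <utility>

struct wr {
    std::string s;
    // "copy and move": the parameter is either a copy or a moved-in
    // temporary, and its buffer is then moved into the member.
    wr(std::string p) : s(std::move(p)) {}
};

// Hypothetical helper: reports whether the member ended up owning
// the very buffer that the caller's string owned.
bool buffer_was_reused() {
    std::string src(64, 'x');          // long enough to avoid SSO
    const char* before = src.data();
    wr w(std::move(src));              // move into p, then move into s
    return w.s.data() == before;
}
```

With a const std::string& or a std::string_view parameter, the same call would have to allocate a second buffer and copy the content.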
But what if the content of the parameter p is not saved unconditionally in the class wr? What if the string is copied only under certain conditions, for example:
wr( /* ??? */ p) :s() {
if( not p.empty() and p[0] == 'a' ){ /* ??? */ }
}
In this case, "pass by value" might unnecessarily allocate memory, copy the content, inspect the first byte, and then throw everything away.
Thus "pass by value" creates unnecessary copies when the content is not assigned, and std::string_view does not make it possible to "steal" the resources under the right conditions.
If the condition under which the string is not assigned happens very rarely, one should probably still prefer "pass by value".
What does "very rarely" mean? In my case, it means that when it happens, performance is not a concern anymore.
For example, consider the following class:
struct twostrings{
std::string s0;
std::string s1;
twostrings( std::string p1 ): s0(p1 + ".txt"), s1(std::move(p1)) {}
};
In the case of twostrings, the initialization of s1 happens only if the initialization of s0 does not throw. This means that the code is suboptimal if there is not enough memory: in some circumstances, the caller creates p1 and then everything is thrown away.
If there is not enough memory for copying the string, performance is generally not a concern for most programs in many environments, because:
- it almost never happens that an allocation fails
- the system is already extremely slow because of swapping
Similar considerations hold for constructors with multiple parameters and member variables. While it looks like all member variables and parameters are used unconditionally, in practice they might not be.
If exceptions are used rarely, then this is not a concern. For the "happy" flow, the code is simple and performant. In the exceptional case, the code is still simple and correct but does unnecessary work.
If exceptions are used "too often", then this should not be a concern either, as there is probably lower-hanging fruit when optimizing for performance.
On the other hand, if the condition under which the string is not assigned happens very frequently, one should probably prefer std::string_view.
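For the case where the string is frequently rejected, a std::string_view constructor might look like the following sketch. It reuses the condition from the example above (the content must start with 'a'); everything else is a plain illustration, not a recommendation for every class.

```cpp
#include <string>
#include <string_view>

struct wr {
    std::string s;
    // The view is cheap to pass around; an allocation happens only if
    // the condition holds and the content is actually stored.
    wr(std::string_view p) : s() {
        if (!p.empty() && p[0] == 'a') {
            s = p;   // std::string can be assigned from a std::string_view
        }
    }
};
```

The trade-off is the one described above: when the caller passes a temporary std::string that does get stored, its buffer is copied instead of being moved.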
Setter
For a setter, things look a little bit different.
In the case of construction, an initialization always implies an allocation.
In the case of an assignment, the member variable std::string might not be empty, and thus can avoid an allocation and just copy the content over. This is why, in this case, I’m counting allocations and assignments, as an assignment might cause an allocation, depending on the state of the member variable.
Parameter | set("123") | set(getValue()) | set(getRef())
---|---|---|---
const std::string& | 1 alloc + 1 assign | 1 alloc + 1 assign | 1 assign
std::string_view | 1 assign | 1 alloc + 1 assign | 1 assign
std::string&& | 1 alloc | 1 alloc | compiler error
std::string | 1 alloc | 1 alloc | 1 alloc
As mentioned before the table, an assignment might or might not allocate, depending on the capacity of s.
In this scenario "pass by value" is not the clear winner, even if the assignment happens unconditionally.
The assignment from std::string_view is in general more efficient, as it might not require an allocation.
The only constellation where "pass by value" outperforms std::string_view is when getValue() returns a string whose .size() is greater than s.capacity(): in this case, the std::string_view version causes two allocations instead of one.
Thus, without further context, void wr::set(std::string_view p) should be preferred over the alternatives.
As with the constructors, if the assignment is conditional, then using std::string_view makes even more sense.
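The storage reuse that makes the std::string_view setter attractive can be observed with a short sketch. The helper set_reuses_storage is hypothetical and only compares buffer addresses. That an assignment fitting into the existing capacity does not reallocate is how the common implementations behave, and that is the assumption made here.

```cpp
#include <string>
#include <string_view>

struct wr {
    std::string s;
    // Copies the characters into s; if s.capacity() is large enough,
    // the existing buffer is reused and nothing is allocated.
    void set(std::string_view p) { s = p; }
};

// Hypothetical helper: true if the setter kept the existing buffer.
bool set_reuses_storage() {
    wr w;
    w.s.assign(128, 'x');                    // member already owns a large buffer
    const char* before = w.s.data();
    w.set("shorter than the current capacity");
    return w.s.data() == before
        && w.s == "shorter than the current capacity";
}
```

A void set(std::string p) setter would instead allocate a fresh buffer for p on every call with a literal or a reference, even when s already has room.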
When to use const std::string&
According to the tables, std::string_view is always at least as good as const std::string& when copying the content. And since std::string_view is generally also better than const std::string& for just reading the content out, should one always avoid const std::string&?
Using const std::string& might be more ergonomic, since some overloads between std::string and other types might not exist. In the following example, the function add1 compiles, while add2 does not. In this case, it is possible to make it work efficiently (just use append), but that might not always be possible, for example if the API of a third-party library uses a const std::string& parameter.
// compiles
std::string add1(std::string lhs, const std::string& rhs){
return lhs + rhs;
}
// does not compile
std::string add2(std::string lhs, std::string_view rhs){
return lhs + rhs;
}
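The append workaround mentioned before the example could look like this. It is a sketch of one possible fix for add2, relying on the std::string::append overload that accepts anything convertible to std::string_view (available since C++17).

```cpp
#include <string>
#include <string_view>

// Compiles: append accepts a std::string_view, and lhs is already
// a local copy, so its storage can grow in place.
std::string add2(std::string lhs, std::string_view rhs) {
    lhs.append(rhs);
    return lhs;   // a by-value parameter is moved on return, not copied
}
```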
If most of the code is already working with std::string, then in practice using const std::string& might never lead to additional copies. Thus for internal functions, especially if it is possible to inspect all call sites, using const std::string& might not be an issue and can lead to simpler code.
But there are also two other important differences between std::string and std::string_view: \0-termination and the allocator.
Granted, when dealing with constant strings the allocator is in general not interesting, but \0-termination is, unfortunately, an important property.
If the content is passed to an API that requires a \0-terminated string, then using const std::string& can avoid unnecessary copies, since std::string_view does not have an API for verifying whether the content is \0-terminated, and reading past the end of the content of a std::string_view might invoke undefined behavior.
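The following sketch illustrates the \0-termination issue. The function c_api is a stand-in for any C interface that expects a terminated string (it simply forwards to std::strlen); both wrapper functions are hypothetical names used only for this illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Stand-in for a C API that requires a \0-terminated string.
std::size_t c_api(const char* s) { return std::strlen(s); }

// const std::string& guarantees termination: no copy is needed.
std::size_t call_with_string(const std::string& p) {
    return c_api(p.c_str());
}

// A view may point into the middle of a larger buffer, so the only
// safe option is to copy it into a terminated temporary first.
std::size_t call_with_view(std::string_view p) {
    return c_api(std::string(p).c_str());
}
```

Passing p.data() directly in call_with_view would read the whole underlying buffer for a view such as std::string_view(s).substr(0, 5), or run past the end of it if no terminator follows.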
When to use std::string&&
I think never, unless it is part of an overload set. Normally one does not want to force a user to move a std::string.
An example can be found in std::string::substr() &&. Granted, in the case of substr it’s an overload over this and not the other parameters, but the paper also adds an overload for the constructor that provides the same functionality as substr.
Why not use overloads?
For the setter, one could write the following overloads:
struct wr{
std::string s;
wr(std::string p) :s(std::move(p)) {}
void set(std::string_view p) {s = p;}
void set(std::string&& p) {s = std::move(p);}
void set(const char* p) {s = p;}
};
When overloading a function with std::string_view and std::string, one needs, unfortunately, to add an overload for const char* too. Since both std::string_view and std::string can be implicitly constructed from const char*, the compiler does not know which overload should be preferred. Thus the solution is to add a third overload.
This is a single setter, but it already leads to a massive amount of code duplication.
What if the setter function accepts two strings? That would lead to nine overloads! At least they are one-liners, but it is generally not acceptable.
Thus, one should prefer std::string_view if it is possible to reuse existing storage.
No constructors and no setters
There is an even better approach: don’t write any code!
struct wr{
std::string s;
};
Less code, and as efficient as it gets.
This works if wr is just a bag of data, or provides nearly no additional invariants over std::string. Classes that simply assign the content and have getters and setters for every member variable are probably better defined as a struct with just the member variables.
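As a small sketch of why the plain struct is "as efficient as it gets": aggregate initialization and direct member access let the caller pick the cheapest operation at each call site. The function make_wr is a hypothetical example of such a caller.

```cpp
#include <string>
#include <utility>

struct wr {
    std::string s;
};

// Hypothetical caller: every operation is chosen at the call site.
wr make_wr() {
    std::string tmp(64, 'x');
    wr w{std::move(tmp)};   // aggregate initialization moves the buffer in
    w.s = "123";            // plain assignment can reuse the existing storage
    return w;
}
```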
What about other classes?
The topic comes up often with strings as they are so common, but similar considerations hold for other classes too.
The same analysis can be applied to other containers from the standard library, especially for the combination std::vector/std::span.
For other classes, it depends on how cheap or expensive it is to invoke the copy and move operators.
If a class is cheap to copy, like an enum, a pointer, a std::pair of int, std::string_view, … then just do everything by value. Using references and/or pointers introduces aliasing, and makes it harder for both the programmer and the compiler to reason about the code.
For classes that are not cheap to copy and for which moving is not cheaper than copying, a copy followed by a move is worse than taking a constant reference and making a single copy, at least when not dealing with temporaries.
Construction
Parameter | wr(getValue()) | wr(getRef())
---|---|---
const T& | 2 copy | 1 copy
T&& | 1 copy + 1 move | compiler error
T | 1 copy + 1 move | 1 copy + 1 move
Thus if a move is not cheaper than a copy, and a copy is not cheap, using wr::wr(const T& s) should be preferred. If move is cheaper and "copy+move" is in the same order of magnitude as a copy, then wr::wr(T s) should be preferred. For classes in between, I do not know.
Try to design your classes so that moving is efficient.
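An example of a type where moving is not cheaper than copying is a struct that stores its data inline. The type big below is hypothetical and exists only to illustrate the case; for such a type, the const-reference constructor performs exactly one copy.

```cpp
#include <array>
#include <type_traits>

// Hypothetical type: there is no heap resource to steal, so "moving"
// it copies every element, exactly like a copy does.
struct big {
    std::array<int, 1024> data{};
};

static_assert(std::is_trivially_copyable_v<big>,
              "moving big is the same operation as copying it");

struct wr {
    big b;
    // One copy, regardless of whether the argument is a temporary.
    wr(const big& p) : b(p) {}
};
```

A wr(big p) : b(std::move(p)) {} constructor would copy the 4 KiB payload twice when called with an existing object.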
Setter
Parameter | set(getValue()) | set(getRef())
---|---|---
const T& | 1 copy + 1 assign | 1 assign
T&& | 1 copy + 1 move-assign | compiler error
T | 1 copy + 1 move-assign | 1 copy + 1 move-assign
Note: move-assignment should always be cheap and noexcept; a naive implementation would swap all member variables.
Even if "move-assign" is cheaper than assignment, void wr::set(const T& p) is a better alternative than void wr::set(T p).
Classes without copy constructors
For classes like std::unique_ptr, where it is not possible to make a copy, one could argue that wr::wr(T&& s) and void wr::set(T&& p) are better than wr::wr(T s) and void wr::set(T p).
They are functionally equivalent, but if moving is cheap (which is normally the case for move-only types), then I would definitely prefer wr::wr(T p) over wr::wr(T&& p). The syntax looks more "natural" (pass-by-value happens relatively often in C++, while pass-by-rvalue-reference does not), and it leads to an easier rule of thumb: use pass-by-value for transferring ownership.
Parameter | wr(getValue()) | wr(getRef())
---|---|---
const T& | compiler error | compiler error
T& | compiler error | 1 move
T&& | 1 move | compiler error
T | 1 or 2 move | compiler error
Parameter | set(getValue()) | set(getRef())
---|---|---
const T& | compiler error | compiler error
T& | compiler error | 1 move
T&& | 1 move | compiler error
T | 1 or 2 move | compiler error
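The rule of thumb "use pass-by-value for transferring ownership" can be sketched with std::unique_ptr<int> standing in for T; the choice of int as the payload is arbitrary.

```cpp
#include <memory>
#include <utility>

struct wr {
    std::unique_ptr<int> p;
    // Taking the parameter by value makes the transfer explicit:
    // the caller has to write std::move or pass a temporary.
    wr(std::unique_ptr<int> q) : p(std::move(q)) {}
    void set(std::unique_ptr<int> q) { p = std::move(q); }
};
```

At the call site this reads as wr w(std::move(u)); or w.set(std::make_unique<int>(1));, and after the call a moved-from std::unique_ptr is guaranteed to be null.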
If values are not moved unconditionally, it makes sense to add a void wr::set(T& p) overload. If it is difficult to determine efficiently from the outside whether a value will be moved from or not, then the user can try to transfer ownership, and if that fails, they can still do something with the original (hopefully unmodified) object, instead of having it discarded.
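A conditional transfer through a non-const reference might look like the following sketch. The acceptance condition (a non-null pointer to a positive value) is invented purely for illustration.

```cpp
#include <memory>
#include <utility>

struct wr {
    std::unique_ptr<int> p;
    // Takes ownership only if the (hypothetical) condition holds;
    // otherwise the caller's object is left untouched.
    void set(std::unique_ptr<int>& q) {
        if (q && *q > 0) {
            p = std::move(q);
        }
    }
};
```

After the call, the caller can test whether its pointer is null to find out if ownership was transferred, and keep using the object if it was not.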
Conclusion
If possible, avoid writing any code.
If some code is necessary, then try to avoid overloads.
Assuming that move is cheap:
- for "cheap" and "tiny" types, pass by value
- for unconditional construction, pass by value and move
- for unconditional assignment for copyable types, pass by const-reference, or use view types
- for unconditional assignment for move-only types (which are normally cheap to move), pass by value
- for conditional construction and assignment, pass by const-reference, or use view types
With overloads, it is possible to optimize many possible scenarios, at the cost of maintaining more code.
The code is not necessarily complicated, but
- there is additional overhead to ensure consistency between all overloads
- additional build time (imagine every function providing at least three overloads if a string appears in a parameter!)
- skewed metrics like code coverage (again, imagine every function providing at least three overloads if a string appears in a parameter!)
- some overloads might be dead because the class/function is only used internally
For me (and most projects I’ve encountered), those disadvantages are not worth it.
With more advanced techniques like templates and lazy construction one can probably optimize all scenarios (see how difficult it is to insert elements in a map efficiently) without writing overloads, but just like it is not worth adding all possible overloads, using more complex techniques does not make much sense.
That changes if you are profiling your code and can measure a significant difference: at that point, you should set aside the rule of thumb and other general guidelines.