Constructors and related functions
Since the topic of when and how to define constructors and conversion operators pops up from time to time, I’ve decided to write down everything I’ve considered so far.
Constructors before C++11
Before C++11, there were no move semantics. A class was either copyable or it was not.
This is an example of a copyable class with a specific invariant: the internal pointer owns some data and is never nullptr.
#include <utility>
#include <cassert>
struct data{
// ...
};
struct s{
data* d; // is always != nullptr
s() : d(new data()){
assert(d != nullptr);
}
s(const s& other) : d(new data(*other.d)){
assert(d != nullptr);
}
s& operator=(const s& other){
if(this != &other){
s tmp(other);
std::swap(this->d, tmp.d);
}
assert(d != nullptr);
return *this;
}
~s(){
assert(d != nullptr);
delete d;
}
};
Value semantics since C++11
Since C++11, authors of classes should also take move semantics into account. Blindly adding a move constructor can break invariants, in this case the invariant d != nullptr.
#include <utility>
#include <cassert>
#include <memory>
struct data{
// ...
};
struct s{
std::unique_ptr<data> d; // is always != nullptr
// other constructors similarly as before
// efficient move constructor breaks invariant of other
s(s&& other) noexcept : d(std::move(other.d)) {
assert(d != nullptr);
}
s& operator=(s&& other) noexcept {
std::swap(this->d, other.d);
assert(d != nullptr);
return *this;
}
~s() = default;
};
One could do a copy in the move constructor, thus instead of
// breaks invariant of other
s(s&& other) noexcept : d(std::move(other.d)) {
assert(d != nullptr);
}
one would write
// does *not* break invariant of other
s(const s&& other) : d(std::make_unique<data>(*other.d)) {
assert(d != nullptr);
}
Note that it is not necessary to change the definition of s& operator=(s&& other); the implementation with swap does not break the invariant.
The disadvantage of this approach is that the move constructor is not cheap anymore. Since the new implementation allocates memory, I’ve not marked the function s(const s&& other) as noexcept. This means that in some scenarios, like a std::vector reallocating memory, the container copies every value into the new storage before destroying the old ones instead of moving them, because only then can it provide the strong exception guarantee.
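The effect on std::vector can be observed with a small experiment. This is only a sketch: cheap_move, throwing_move and copies_during_reallocation are hypothetical names that do not appear elsewhere in these notes, and the two types do nothing except count how often they are copied.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical type: the move constructor promises not to throw.
struct cheap_move {
    static inline int copies = 0;
    cheap_move() = default;
    cheap_move(const cheap_move&) { ++copies; }
    cheap_move(cheap_move&&) noexcept {}
};

// Hypothetical type: same shape, but the move constructor may throw.
struct throwing_move {
    static inline int copies = 0;
    throwing_move() = default;
    throwing_move(const throwing_move&) { ++copies; }
    throwing_move(throwing_move&&) {}
};

static_assert(std::is_nothrow_move_constructible_v<cheap_move>);
static_assert(!std::is_nothrow_move_constructible_v<throwing_move>);

// Counts the copies made while a vector holding four elements reallocates.
template <class T>
int copies_during_reallocation() {
    std::vector<T> v(4);          // four default-constructed elements
    T::copies = 0;
    v.reserve(v.capacity() + 1);  // forces a reallocation
    assert(v.size() == 4);
    return T::copies;
}
```

With the noexcept move constructor the elements are moved, so copies_during_reallocation<cheap_move>() returns 0. With the potentially throwing one, the vector falls back to copying, and the function returns 4.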
Which constructors are automatically generated
This table summarizes how declaring one constructor, assignment operator, or destructor (rows) affects what the compiler generates for the others (columns):
User declares | Default constructor | Copy constructor | Copy operator= | Move constructor | Move operator= | Destructor |
---|---|---|---|---|---|---|
Nothing | comp-defined | comp-defined | comp-defined | comp-defined | comp-defined | comp-defined |
Conversion constructor | not defined | comp-defined | comp-defined | comp-defined | comp-defined | comp-defined |
Default constructor | user-defined | comp-defined | comp-defined | comp-defined | comp-defined | comp-defined |
Copy constructor | not defined | user-defined | comp-defined (deprecated) | not defined | not defined | comp-defined |
Copy operator= | comp-defined | comp-defined (deprecated) | user-defined | not defined | not defined | comp-defined |
Move constructor | not defined | defined as delete | defined as delete | user-defined | not defined | comp-defined |
Move operator= | comp-defined | defined as delete | defined as delete | not defined | user-defined | comp-defined |
Destructor | comp-defined | comp-defined (deprecated) | comp-defined (deprecated) | not defined | not defined | user-defined |
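Some rows of the table can be double-checked with type traits. The following sketch uses hypothetical types (only_move_ctor, only_copy_ctor, only_dtor) that each declare exactly one special member function. It also shows that "not defined" is different from "defined as delete": when the move constructor is not defined, moving silently falls back to copying.

```cpp
#include <type_traits>

// Row "Move constructor": no default constructor, copy operations are deleted.
struct only_move_ctor {
    only_move_ctor(only_move_ctor&&) {}
};
static_assert(!std::is_default_constructible_v<only_move_ctor>);
static_assert(!std::is_copy_constructible_v<only_move_ctor>);
static_assert(!std::is_copy_assignable_v<only_move_ctor>);

// Row "Copy constructor": the move constructor is not defined,
// so an rvalue simply binds to the copy constructor.
struct only_copy_ctor {
    only_copy_ctor(const only_copy_ctor&) {}
};
static_assert(!std::is_default_constructible_v<only_copy_ctor>);
static_assert(std::is_move_constructible_v<only_copy_ctor>);
static_assert(std::is_copy_assignable_v<only_copy_ctor>);

// Row "Destructor": copying still works (deprecated), moving falls back to copying.
struct only_dtor {
    ~only_dtor() {}
};
static_assert(std::is_default_constructible_v<only_dtor>);
static_assert(std::is_copy_constructible_v<only_dtor>);
static_assert(std::is_move_assignable_v<only_dtor>);
```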
Since it is error-prone to remember the rules (I got some cells wrong a couple of times), the safest way to handle constructors and assignment operators is to follow the "Rule of Zero" and the "Rule of Five".
Rule of Zero
Try to avoid defining any copy/move constructor/assignment operators or destructor. A memberwise destruction/copy/move is often the right thing to do.
Try to use tools like existing containers and unique_ptr to avoid having to define any of the mentioned operations.
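As a minimal sketch of the Rule of Zero, the two hypothetical classes below (document and session are illustrative names) declare no special member functions at all. The members decide what is possible: a class made of containers is copyable and movable, a class holding a unique_ptr is move-only.

```cpp
#include <memory>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical value type: every special member function is compiler-generated.
struct document {
    std::string title;
    std::vector<int> page_lengths;
};

// Hypothetical move-only type: the unique_ptr member removes the copy operations.
struct session {
    std::unique_ptr<document> current;
};

static_assert(std::is_copy_constructible_v<document>);
static_assert(std::is_nothrow_move_constructible_v<document>);
static_assert(!std::is_copy_constructible_v<session>);
static_assert(std::is_nothrow_move_constructible_v<session>);
```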
Rule of Five
If you must define and/or implement at least the destructor, an assignment operator, or a copy/move constructor, then you should implement all assignment operators and move/copy constructors. The compiler-generated functions might not have the right semantics.
Since a destructor is always defined, I would actually claim that you should not declare a destructor if you are just deleting move and copy operations.
explicit constructors
Implicit conversions can help to remove some of the bloat and make code easier to read:
// unnecessarily verbose
void foo(std::string_view);
void bar(){
foo(std::string_view("hello"));
std::string data = ...;
foo(std::string_view(data));
}
// less boilerplate, easier to read
void foo(std::string_view);
void bar(){
foo("ababab");
std::string data = ...;
foo(data);
}
But implicit conversions have drawbacks too, in particular:
- they can perform the "wrong" conversion, so the program behaves incorrectly
- they can create costly temporaries, so the program uses more resources
struct S{
S(int);
};
void foo(S);
void foo(bool);
// separate name on purpose: foo("hello") would pick foo(bool), not a std::string overload
void print(const std::string&);
void bar(){
// converts 12 to bool instead of S, even though S(int) is implicit
foo(12);
// creates two temporary strings, possibly allocating memory for them, and then discards them immediately
std::string s;
s = "a" + s + "a";
for(int i = 1; i != 100; ++i){
// creates a temporary std::string at every iteration
print("hello");
}
}
Implicit constructors can be invoked with different syntaxes, some of which can be hard to read
struct Implicit {
Implicit();
Implicit(int);
Implicit(int, int);
};
void foo(const Implicit&);
void bar() {
foo(Implicit());
foo({});
foo(Implicit(1));
foo(1);
foo({1});
foo(Implicit(1,2));
foo({1, 2});
}
while explicit constructors allow only a subset
struct Explicit {
explicit Explicit();
explicit Explicit(int);
explicit Explicit(int, int);
};
void foo(const Explicit&);
void bar() {
foo(Explicit());
foo(Explicit(1));
foo(Explicit(1, 2));
}
Since constructors are implicit by default, it is in general not possible to tell from the code whether the author of a class really wanted an implicit conversion. Since C++20, it is possible to write explicit(false) to state explicitly that the constructor is implicit.
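A short sketch of how this can look, assuming a C++20 compiler. The types meters and seconds are hypothetical and only serve to compare a documented implicit constructor with an explicit one.

```cpp
#include <type_traits>

// Hypothetical type: the author documents that the conversion is intended.
struct meters {
    explicit(false) meters(double v) : value(v) {}
    double value;
};

// Hypothetical type: the conversion must be spelled out by the caller.
struct seconds {
    explicit seconds(double v) : value(v) {}
    double value;
};

static_assert(std::is_convertible_v<double, meters>);    // implicit conversion allowed
static_assert(!std::is_convertible_v<double, seconds>);  // implicit conversion rejected
static_assert(std::is_constructible_v<seconds, double>); // seconds(1.5) still works
```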
For some types (fundamental types, enum, types from external libraries, …) it is not possible to make the conversion explicit.
In this case, it is possible to use = delete; to avoid unwanted conversions when calling a function.
struct S{
S(int);
};
void foo(S);
void foo(bool);
void foo(int) = delete;
void bar(){
// compiler error, otherwise would convert 12 to a bool, instead of S
foo(12);
foo(S(12)); // works as expected
}
Note that some constructors should always be implicit.
Copy constructors, move constructors, and constructors taking an std::initializer_list should not be marked explicit, otherwise they do not work correctly.
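To illustrate what "do not work correctly" means for a copy constructor, here is a minimal sketch with a hypothetical type explicit_copy. Direct-initialization keeps working, but every copy-initialization, which includes passing and returning by value, is rejected.

```cpp
#include <type_traits>

// Hypothetical type with an explicit copy constructor.
struct explicit_copy {
    explicit_copy() = default;
    explicit explicit_copy(const explicit_copy&) = default;
};

// explicit_copy b(a); still compiles (direct-initialization)...
static_assert(std::is_copy_constructible_v<explicit_copy>);
// ...but explicit_copy b = a; and pass-by-value do not (copy-initialization).
static_assert(!std::is_convertible_v<const explicit_copy&, explicit_copy>);
```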
If a class has a unary constructor taking an std::initializer_list and a default constructor, then the default constructor should be implicit for consistency.
"view" types that are mainly used as function parameters and are cheap to create (like string_view, span, …) should have implicit constructors.
Classes that would otherwise have no user-provided constructors do not need to add explicit constructors. Adding user-provided constructors breaks the integration with pair, tuple and destructuring. Even if those features are not used, adding explicit constructors leads to code duplication (one parameter for every member variable), and aggregate initialization can be even more explicit (and efficient) than a constructor call, and thus less error-prone.
struct bag_of_data{
bool online = false;
const char* message = "hello world";
};
void bar(bag_of_data);
void foo(){
bar(bag_of_data{.online = true, .message = "welcome"});
}
noexcept functions
In some projects, the use of noexcept is highly encouraged.
noexcept can mean different things. For the reader of the code, it often means that the function never throws. The standard library uses it as documentation; noexcept is a contract. The function does not throw by design, not just because the current implementation happens not to throw.
For the compiler, noexcept means that the function never lets an exception escape: if something inside the function throws and the exception is not caught there, std::terminate() is invoked. This means that the compiler might generate additional code for the call to std::terminate(), even if it might never get invoked.
For inline functions, the compiler can see by itself whether a function throws, and for the reader noexcept often adds little benefit if the function is short.
Personally, I think there is only a subset of non-inline functions where it makes sense to mark them as noexcept.
The first group is operators (+, <, ==, conversion operators, …) as they are "hidden" function calls. If the operation can fail, I would strongly argue for using a named function.
Then there are move operations, because some classes (for example std::vector) inspect whether the move constructor can throw, and if it cannot, they use a more efficient algorithm. Similarly to move constructors, some algorithms check whether swap can throw; but more on swapping values later.
Note: destructors are automatically declared as noexcept, even if you do not write it. In practice, the only way to have a non-noexcept destructor is to mark it explicitly as noexcept(false). So better leave them alone.
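Both the contract and the note about destructors can be queried at compile time, which is also what generic code like std::vector does. A small sketch with hypothetical names (may_throw, never_throws, plain_dtor, throwing_dtor):

```cpp
#include <type_traits>

// Declarations only; the noexcept operator does not evaluate its operand.
void may_throw();
void never_throws() noexcept;

static_assert(!noexcept(may_throw()));
static_assert(noexcept(never_throws()));

// A user-provided destructor is noexcept even without writing it.
struct plain_dtor {
    ~plain_dtor() {}
};
// Opting out has to be spelled explicitly.
struct throwing_dtor {
    ~throwing_dtor() noexcept(false) {}
};

static_assert(std::is_nothrow_destructible_v<plain_dtor>);
static_assert(!std::is_nothrow_destructible_v<throwing_dtor>);
```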
I also mark some constructors as noexcept, especially if they are used for initializing global data, as it can help static analyzers to determine that there is no unrecoverable error.
Conversion operators
operator T(), with T different from bool, probably needs to be implicit to work correctly. Also note that a named function is often a better alternative to operator T().
operator bool
Make it always explicit. There are simply too many places in the language where things can convert implicitly to bool, and such conversions will bite you.
struct convertible{
explicit operator bool();
};
int main(){
auto s = convertible();
if(s){} // always compiles
bool b1 = bool(s); // explicit conversion, always compiles
bool b2 = s; // fails to compile, unless implicit
}
Others
There are some situations where an implicit conversion is preferable to a named function.
My first case was when working with a codebase that did not have std::string_view.
So we defined our own string_view class with the same API as std::string_view, but we did not want to modify the std::string of the toolchain to add a constructor for creating a std::string from our custom string_view. Thus we added an implicit and potentially throwing operator std::string().
The advantage of this approach was that once we were able to upgrade our toolchain, we could replace our custom string_view with std::string_view and recompile the code without further changes.
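A stripped-down sketch of the idea follows. It is not the class we actually used: my_string_view and shout are hypothetical names, and only the parts relevant to the conversion are shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical, minimal stand-in for a pre-C++17 string_view replacement.
class my_string_view {
    const char* ptr;
    std::size_t len;
public:
    my_string_view(const char* s) : ptr(s), len(std::strlen(s)) {
        assert(s != nullptr);
    }
    my_string_view(const std::string& s) noexcept : ptr(s.data()), len(s.size()) {}
    const char* data() const noexcept { return ptr; }
    std::size_t size() const noexcept { return len; }
    // Implicit and potentially throwing: allocating the std::string can fail.
    operator std::string() const { return std::string(ptr, len); }
};

static_assert(std::is_convertible_v<my_string_view, std::string>);

// The conversion happens without touching std::string itself.
inline std::string shout(my_string_view sv) {
    std::string result = sv;
    result += '!';
    return result;
}
```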
Another use case is lazy_factory, from my notes on inserting values efficiently in a map.
The same factory can also be used in other contexts, and also takes advantage of an implicit (and depending on the callback, potentially throwing) conversion operator.
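For readers who do not know those notes, here is a rough sketch of what such a factory could look like. It is an assumption about the general shape, not a copy of the original code; demo_lazy_insert is a hypothetical helper for showing the effect.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <type_traits>

// Sketch of a lazy factory: the callback runs only when a value is really needed.
template <class F>
struct lazy_factory {
    F f;
    using result_type = std::invoke_result_t<F&>;
    // Implicit conversion; it throws if the callback throws.
    operator result_type() { return f(); }
};
template <class F>
lazy_factory(F) -> lazy_factory<F>;

// Returns how often the (supposedly expensive) callback has been invoked.
inline int demo_lazy_insert() {
    std::map<int, std::string> m;
    int calls = 0;
    auto make = [&calls] { ++calls; return std::string("expensive"); };
    m.try_emplace(1, lazy_factory{make});  // key is absent: the callback runs
    m.try_emplace(1, lazy_factory{make});  // key is present: the callback is skipped
    assert(m.at(1) == "expensive");
    return calls;
}
```

In this sketch demo_lazy_insert() returns 1, because try_emplace does not construct the mapped value when the key already exists.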
Apart from those two use cases, I almost always replaced conversion operators with named functions.
Destructors
There is rarely a need to define a destructor. I actually have only the following "common" use cases in mind:
- support for incomplete types
- virtual destructors
- defining a class with side effects in the destructor, for example an action to roll back in case of errors
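The first use case usually shows up together with the pimpl idiom. In the following sketch (widget is a hypothetical class, and everything is in one file for brevity), the destructor is only declared in the class and defined after impl is complete, because std::unique_ptr needs the complete type to delete it.

```cpp
#include <cassert>
#include <memory>

// What would normally live in the header: impl is an incomplete type here.
class widget {
public:
    widget();
    ~widget();  // declared only; defining it here would require a complete impl
    int value() const;
private:
    struct impl;
    std::unique_ptr<impl> p;  // is always != nullptr
};

// What would normally live in the source file.
struct widget::impl {
    int value = 42;
};
widget::widget() : p(std::make_unique<impl>()) {}
widget::~widget() = default;  // impl is complete at this point
int widget::value() const {
    assert(p != nullptr);
    return p->value;
}
```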
The Rule of Five seems to say you should also define the destructor, and tools like clang-tidy follow it to the letter. The destructor is different from constructors and assignment operators: it is always generated. Thus there is no need to provide it more often than necessary.
swapping values
A fundamental operation that is often forgotten: swapping values.
If the class does not implement an efficient move constructor, then swapping two values (with std::swap or std::ranges::swap) means making another (costly) copy.
At least it is possible to provide a custom swap, making swapping values cheaper even for types that have a costly copy and move constructor.
#include <utility>
#include <cassert>
#include <memory>
struct data{
// ...
};
struct s{
std::unique_ptr<data> d; // is always != nullptr
s() : d(std::make_unique<data>()){
assert(d != nullptr);
}
s(const s& other) : d(std::make_unique<data>(*other.d)) {
assert(d != nullptr);
}
s& operator=(const s& other){
if(this != &other){
s tmp = other;
swap(*this, tmp);
}
assert(d != nullptr);
return *this;
}
s(const s&& other) : d(std::make_unique<data>(*other.d)) {
assert(d != nullptr);
}
s& operator=(s&& other) noexcept {
swap(*this, other);
assert(d != nullptr);
return *this;
}
friend void swap(s& l, s& r) noexcept {
std::swap(l.d, r.d);
}
~s(){
assert(d != nullptr);
}
};
Once a class has a specialized swap operation that does not rely on the copy or move constructor, the implementation of the copy-assignment and move-assignment operators can be simplified a little bit: copy-assignment swaps this with a temporary copy, move-assignment swaps this with the other instance.
While customizing swap will not make a vector reallocating memory more efficient, it will speed up different algorithms that move values around; for example sorting.
Note that if your class already has an efficient copy or move constructor, specializing swap will not bring as many benefits.
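The benefit for algorithms can be made visible by counting copies. The sketch below uses a hypothetical type tracked without an efficient move constructor and std::reverse, which is specified in terms of std::iter_swap and therefore picks up the custom swap.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical type: copying is "expensive" (counted), there is no cheap move.
struct tracked {
    static inline int copies = 0;
    int value;
    explicit tracked(int v) : value(v) {}
    tracked(const tracked& o) : value(o.value) { ++copies; }
    tracked& operator=(const tracked& o) { value = o.value; ++copies; return *this; }
    // Custom swap, found through argument-dependent lookup.
    friend void swap(tracked& l, tracked& r) noexcept { std::swap(l.value, r.value); }
};

static_assert(std::is_nothrow_swappable_v<tracked>);

// Returns the number of copies made while reversing three elements.
inline int copies_while_reversing() {
    std::vector<tracked> v;
    v.reserve(3);
    v.emplace_back(1);
    v.emplace_back(2);
    v.emplace_back(3);
    tracked::copies = 0;
    std::reverse(v.begin(), v.end());
    assert(v.front().value == 3 && v.back().value == 1);
    return tracked::copies;
}
```

Thanks to the custom swap, copies_while_reversing() returns 0. Without the friend function, std::swap would copy the elements instead.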
You should generally not declare the move constructor as deleted
If an inefficient move constructor seems unacceptable, one could realize that there is another alternative: delete it.
#include <utility>
#include <cassert>
#include <memory>
struct data{
// ...
};
struct s{
std::unique_ptr<data> d; // is always != nullptr
s();
s(const s& other);
s& operator=(const s& other);
s(s&& other) = delete;
s& operator=(s&& other) noexcept;
~s();
};
Having a copy constructor and a deleted move constructor is generally not a good idea. Most code assumes that if a class is copyable, then a move operation is either more efficient, or it will behave like a copy.
Without this assumption, one would have to check manually whether a class has a move constructor, and if not, use the copy constructor.
This complicates a lot of code for no gain, because that is what already happens with a move constructor that is as efficient as a copy constructor.
Also, such a class will not be swappable (std::swap will fail to compile), unless it provides its own swap.
Similar considerations hold for a class with a copy-assignment operator and a deleted move-assignment operator.
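The difference between deleting the move constructor and simply not declaring it can be checked with type traits. In this sketch, deleted_move and copy_only are hypothetical types that differ only in the deleted move constructor.

```cpp
#include <type_traits>

// Hypothetical type: copyable, but the move constructor is deleted.
struct deleted_move {
    deleted_move() = default;
    deleted_move(const deleted_move&) = default;
    deleted_move& operator=(const deleted_move&) = default;
    deleted_move(deleted_move&&) = delete;
};

// Hypothetical type: copyable, move operations are not declared at all.
struct copy_only {
    copy_only() = default;
    copy_only(const copy_only&) = default;
    copy_only& operator=(const copy_only&) = default;
};

static_assert(std::is_copy_constructible_v<deleted_move>);
static_assert(!std::is_move_constructible_v<deleted_move>);  // rvalues are rejected
static_assert(!std::is_swappable_v<deleted_move>);           // std::swap is unusable

static_assert(std::is_move_constructible_v<copy_only>);      // falls back to copying
static_assert(std::is_swappable_v<copy_only>);
```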
Categorization of classes
With so many guidelines and things to consider, one would think that writing a class is incredibly complex.
In practice there is only a subset of combinations that really make sense.
I’ve tried to categorize most use-cases (a class might appear in more than one category).
Value types
Those types can be copied and assigned, and the copies act independently of one another.
Most classes should follow the Rule of Zero; from time to time it might be necessary to follow the Rule of Five.
Examples: std::string, int, std::string_view, std::vector<T> where T is a value type, …
Move-only types
Classes that cannot be copied, but can be moved.
You can follow the Rule of Zero in this case too, just like for value types.
Sometimes I still decide to follow the Rule of Five:
struct s{
std::unique_ptr<int> ptr;
s();
s(const s& other) = delete;
s& operator=(const s& other) = delete;
s(s&& other) noexcept = default;
s& operator=(s&& other) noexcept = default;
};
The main reason is that the MSVC compiler generates inscrutable error messages in some situations (mainly when used with std::vector). Hopefully this will improve in the future, and I might be able to reduce most if not all cases to
struct s{
std::unique_ptr<int> ptr;
s();
};
No-move and no-copy types
A class that cannot be copied or moved.
This seems to happen often with class hierarchies, either by design or by accident.
MutexedObj is an example of a class without move and copy constructors.
Just mark all copy and move operations as deleted, or follow the Rule of Zero if a member variable is non-movable and non-copyable (like a mutex).
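The Rule of Zero variant can be sketched as follows. guarded_counter is a hypothetical class, unrelated to MutexedObj; the mutex member alone makes it non-copyable and non-movable.

```cpp
#include <mutex>
#include <type_traits>

// Hypothetical class: no special member function is declared.
struct guarded_counter {
    std::mutex m;
    int value = 0;
};

static_assert(std::is_default_constructible_v<guarded_counter>);
static_assert(!std::is_copy_constructible_v<guarded_counter>);
static_assert(!std::is_move_constructible_v<guarded_counter>);
static_assert(!std::is_copy_assignable_v<guarded_counter>);
static_assert(!std::is_move_assignable_v<guarded_counter>);
```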
Functional types
They generally have little state, and an operator().
It could also be a named function instead of operator(); I expect such types to get more common thanks to std::function_ref.
The main examples are lambdas, but there are also class hierarchies that define a small API.
Those types might be copyable and movable or maybe not; for most use cases it is not relevant. If copyable, such classes might have shared state, thus they might not have value semantics.
Guard types
A class whose destructor has side effects.
You should not need to write such a class often; a templated one like the following
#include <cstdio> // std::puts
template <class T>
struct scope_guard
{
explicit scope_guard( T t_ ): t( t_ ){}
~scope_guard(){
t();
}
scope_guard(const scope_guard& other) = delete;
scope_guard& operator=(const scope_guard& other) = delete;
scope_guard(scope_guard&& other) noexcept = delete;
scope_guard& operator=(scope_guard&& other) noexcept = delete;
private:
T t;
};
template <typename T>
scope_guard( T ) -> scope_guard<T>;
int main(){
auto g = scope_guard([]{std::puts("bye!");});
}
should cover most use-cases.
You might add other features like
- support for constinit
- executing the side effects in the destructor only if an exception has been thrown
- a discard/reset function for manually disabling the side effect in the destructor
- if possible, automatically converting T to void(*)() to reduce the amount of template instantiations
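As an example of one of these features, a guard that acts only when an exception has been thrown can be sketched with std::uncaught_exceptions(). scope_fail and rollbacks_after are hypothetical names, and this is only one possible way to implement it.

```cpp
#include <exception>
#include <stdexcept>

// Hypothetical guard: runs the callback only if the scope is left by an exception.
template <class F>
struct scope_fail {
    explicit scope_fail(F f_) : f(f_), pending(std::uncaught_exceptions()) {}
    ~scope_fail() {
        // More exceptions in flight than at construction: we are unwinding.
        if (std::uncaught_exceptions() > pending) f();
    }
    scope_fail(const scope_fail&) = delete;
    scope_fail& operator=(const scope_fail&) = delete;
    scope_fail(scope_fail&&) = delete;
    scope_fail& operator=(scope_fail&&) = delete;
private:
    F f;
    int pending;
};

// Returns how many times the rollback action has been executed.
inline int rollbacks_after(bool fail) {
    int rollbacks = 0;
    try {
        scope_fail guard([&rollbacks] { ++rollbacks; });
        if (fail) throw std::runtime_error("boom");
    } catch (const std::runtime_error&) {
    }
    return rollbacks;
}
```

Here rollbacks_after(true) returns 1 and rollbacks_after(false) returns 0.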
The presented scope_guard is also no-move and no-copy, but if it has a discard/reset functionality, then there is one "natural" state for moved-from objects. The main issue is that the semantics of the move constructor and move assignment operator are not obvious:
#include <cstdio> // std::puts
template <class T>
struct unique_scope_guard
{
explicit unique_scope_guard( T t_ ): t( t_ ), enabled(true){}
~unique_scope_guard(){
if(enabled) t();
}
unique_scope_guard(const unique_scope_guard& other) = delete;
unique_scope_guard& operator=(const unique_scope_guard& other) = delete;
unique_scope_guard(unique_scope_guard&& other) noexcept: t(other.t), enabled(other.enabled) {
if (this != &other){
other.enabled = false;
}
}
unique_scope_guard& operator=(unique_scope_guard&& other) noexcept {
if (this != &other){
this->t = other.t;
this->enabled = other.enabled;
other.enabled = false;
}
return *this;
}
void reset() noexcept {
this->enabled = false;
}
private:
T t; // declared in the same order as in the constructor initializer lists
bool enabled;
};
template <typename T>
unique_scope_guard( T ) -> unique_scope_guard<T>;
int main(){
auto g = unique_scope_guard(+[]{std::puts("hello");});
g = unique_scope_guard(+[]{std::puts("bye!");});
}
Should the output be just bye!, or should it be hello followed by bye!? Should the assignment operator swap the contained values, or set other.enabled to false? If in doubt, it is probably better not to have a move constructor and leave the class no-move and no-copy even in the presence of a discard/reset function.
Views
Those types are cheap to create, copy and move; the implementation often consists of some non-owning pointers.
Such classes should either have an immutable API (like std::string_view), or not have comparison operators (like std::span).
I personally use them often as function parameters, and they have broad implicit constructors (many types can convert to them).
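A deliberately small sketch of such a type: int_view and sum are hypothetical names, and a real view would offer a richer API. It shows the ingredients mentioned above: non-owning pointers, an immutable API, and implicit constructors from several sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical read-only view over a contiguous sequence of int.
class int_view {
    const int* ptr;
    std::size_t len;
public:
    // Implicit on purpose: the view is meant to be used as a function parameter.
    int_view(const std::vector<int>& v) noexcept : ptr(v.data()), len(v.size()) {}
    template <std::size_t N>
    int_view(const int (&a)[N]) noexcept : ptr(a), len(N) {}
    const int* begin() const noexcept { return ptr; }
    const int* end() const noexcept { return ptr + len; }
    std::size_t size() const noexcept { return len; }
};

// Accepts vectors and arrays without overloads or templates.
inline int sum(int_view values) {
    int total = 0;
    for (int x : values) total += x;
    assert(values.size() > 0 || total == 0);
    return total;
}
```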
Class hierarchies
The base class is either non-copyable and non-movable (in which case the leaves of the class hierarchy are also non-copyable and non-movable), or the copy and move operations are defined in the base class as protected.
If the operations are protected, then the leaf classes can decide if they are copyable or movable.
Base classes should in general not have public copy and move operations, because they’ll cause slicing. This is not necessarily an issue, but it is unexpected in most situations.
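A minimal sketch of the protected variant, with the hypothetical classes shape and square. For brevity only the copy operations are declared in the base class, so moving a square falls back to copying.

```cpp
#include <type_traits>

// Hypothetical base class: copy operations exist, but only for derived classes.
class shape {
public:
    virtual ~shape() = default;
    virtual double area() const = 0;
protected:
    shape() = default;
    shape(const shape&) = default;
    shape& operator=(const shape&) = default;
};

// The leaf class decides to be copyable by not declaring anything.
class square final : public shape {
public:
    explicit square(double s) : side(s) {}
    double area() const override { return side * side; }
private:
    double side;
};

static_assert(std::is_copy_constructible_v<square>);
static_assert(std::is_copy_assignable_v<square>);
// Assigning through the base class, which would slice, does not compile.
static_assert(!std::is_copy_assignable_v<shape>);
static_assert(!std::is_assignable_v<shape&, const square&>);
```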
Bag of data
A struct where the member variables are only loosely related, if at all. Everything is public, and there are few to no invariants between the member variables.
They should normally not have any special member functions, which also has the advantage that the user can aggregate-initialize the class.
The standard library offers std::pair and std::tuple as (bad) examples. They are bad because they have user-defined constructors with complex rules, which cannot be removed without breaking backwards compatibility. A much simpler (and more efficient) implementation of std::pair would be
template <class A, class B>
struct pair{A first;B second;};
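Such an aggregate keeps working with the language features one would expect from a pair. The sketch below renames the type to simple_pair (a hypothetical name, to avoid any confusion with std::pair) and checks a few properties; sum_members is only a usage example.

```cpp
#include <cassert>
#include <type_traits>

template <class A, class B>
struct simple_pair {
    A first;
    B second;
};

static_assert(std::is_aggregate_v<simple_pair<int, double>>);
static_assert(std::is_trivially_copyable_v<simple_pair<int, double>>);

// Aggregate initialization and structured bindings work out of the box.
inline double sum_members() {
    simple_pair<int, double> p{1, 2.5};
    auto [a, b] = p;
    assert(a == p.first);
    return a + b;
}
```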
Conclusion
The "Rule of Zero" is a good rule of thumb. It should cover most classes in a codebase. The "Rule of Five" should cover the others.
Another rule of thumb is that copyable classes are always movable; in the worst-case scenario the move constructor does a copy.
If the move constructor is not noexcept or not cheap, then one should consider specializing swap.
If possible, move and swap operations should be noexcept, but not because the compiler might generate different code. Some algorithms will query such properties and do different operations, especially if they provide a strong exception guarantee.
Swapping is an important (as it is part of the definition of "value semantics") but often overlooked functionality, and by providing a custom implementation, it is possible to avoid some copies without introducing a moved-from state.
Operators are better defined and implemented as noexcept; otherwise consider a named function. operator bool should always be explicit.