The C++ logo, by Jeremy Kratz, licensed under CC0 1.0 Universal

Name your conversions

8 - 10 minutes read, 1906 words
Categories: c++
Keywords: c++ cast constructor conversion explicit guideline

In C++ there are multiple ways to convert from one type to another, one of those is by casting.

The CppCoreGuidelines have multiple rules against casting, more in detail:

The common suggestion is to avoid all types of casts. If possible, it is sound to follow those guidelines, even if see there is a fundamental logical flaw.

A cast performs a type conversion, just like a constructor.

If we want to disallow all casts because of some bad property, for the same reason we should disallow constructors.

But then, shouldn’t we disallow all types?

Obviously, there is some middle ground, because having a statically typed language has many advantages.

Casts have two main issues. The cast itself does not tell very well why the type conversion is necessary, and a cast can do a lot of different logical operations. The same holds for constructors.

The C-style cast is the "worst" cast, as it can do many possible conversion, and it is also difficult to search.

The static_cast, normally sold as a safe cast, is also problematic in my view.

It can do a lot of conversions, call functions (constructors) behind the scenes, but at least, like other C++ casts, it’s easy to search for.

const_cast is indeed a problematic cast, it normally denotes a code smell, as it makes easy to trigger undefined behaviors. Contrary to static_cast the cast itself is more self-explanatory and does not execute any code behind the scenes.

dynamic_cast casts between class hierarchies, it is also self-explanatory and might denote a code-smell too, or a leaky abstraction.

reinterpret_cast is also a self-explanatory cast, and does not execute code.

The main problem with casts is when one argument changes.

Suppose you have

void bar(int); // external library

void wrap_bar(long l){ // as our code uses long in too many places
	// cast is safe because of XXX

The cast is safe because the parameter l is never going to be out-of-range for an int type. This is encapsulated in the wrap_bar function, which is better than casting at every call of bar or relying on implicit conversions. The main reason is that when searching for problematic casts or analyzing compiler warnings, there is no reason to analyze the same potential issue more than once.

In a more generic context

auto ul = static_cast<long>(i);

as static_cast can do many conversion, if i changes from int to float, bool or some user-defined type that can convert to long, the code might be wrong. But as there is an explicit cast, the compiler will not emit any warning.

Otherwise, consider

std::string s = zs;

According to the guideline should be preferred, as it is "more explicit"

auto s = static_cast<std::string>(zs);

Which makes the code only harder to read, with no real benefit.

reinterpret_cast suffers from the same problem,

void* p = nullptr;

auto c = reinterpret_cast<char*>(p)

is according to the guideline, worse than

auto c = static_cast<char*>(p)

static_cast fails if p is changed to int p = 0;, while reinterpret_cast not, on the other hand

struct s {
	operator char*() const {
		return nullptr;

s p {} ;
char* c1 = reinterpret_cast<char*>(p);
char* c2 = static_cast<char*>(p);

static_cast compiles, while reinterpret_cast does not. Depending on the context, one is better than the other. But if the type of p is changed afterward, it is difficult to foresee how to write the less problematic explicit cast.

Both types of casts can cover some types of error, but there is a non-empty set that both do not cover.

And then there are of course implicit conversions. Those are caused by the integral types (narrowing and not), and by implicit constructors.

My experience so far has been that in the long run, most implicit constructors, even if they had some benefit, should have really been made explicit. The reason is that they hid some error, like temporarily creating an object which allocated in the constructor and thus caused performance issues, or causing complex code constructs while an explicit cast would have made the code more readable, and made the author realize what really happened behind the scenes.

My personal recipe for dealing with casts is:

  • Avoid c-style casts. Both GCC and Clang support the same compiler flag: -Werror=old-style-cast

  • Do not cast qualifiers (volatile and const) with anything except const_cast. Again GCC and Clang support the same compiler flag: -Werror=cast-qual

  • Avoid implicit narrowing conversion. GCC and Clang: -Werror=narrowing

  • Flag all useless casts: GCC and Clang -Werror=useless-cast

  • Use named operations (aka functions) for converting between unrelated types.

  • Prefer explicit constructor and conversions to implicit, unless the benefits outweigh the drawbacks (and those design decisions should really get documented).

The first guideline is there to help to inspect all casts. C-style casts are difficult to search for and provide way too much flexibility for most contexts. They are still used very much, partly because it permits to create headers files that are both compatible with C and C++, and partly because typing any other type of cast is much longer, and does not necessarily increase readability. Thus, in an existing code-base, it might not be always possible or easy to enable -Werror=old-style-cast.

A C-style cast permits to cast cost and volatile away (by accident) without using const_cast. As it might not be possible to enable -Werror=old-style-cast, -Werror=cast-qual permits to avoid this type of behavior, making it easier to inspect those places where const (or volatile) is ignored, which is normally a hint for a logical flaw or design issue.

Implicit narrowing conversion are mostly hidden bugs or correct but hardly documented code. Making those conversions explicit helps to inspect those places. If there are many conversion of this type, maybe some design considerations should be made, but at least there is some metric to use to make such a decision: the number of casts. In the case of implicit conversion, it is hard to take such decisions as it is not seen to recognize them in the code.

Useless casts are just…​ useless. They normally clutter the code and hide intent.

Most useless cast I’ve encountered were because the surrounding code changed, and the code with the cast was not updated, or the author did misinterpret a function signature. In both cases, a warning, or better, a compiler error, would have helped to spot those issues.

To sum it up, common drawbacks of wrong casts are that

  • they are hard to search

  • indicate a misunderstanding or make easier to misunderstand, what the code does

  • code that should not compile actually compiles

The GSL provides a nice example of how casts should be added in a codebase: narrow and narrow_cast.

From me of those functions makes it clear why and what type of cast we are performing. A correct function signature can also prevent many wrong casts, it permits to avoid some errors without fiddling with compiler flags, thus enhancing the quality of the code without relying on external tools.

int i = 42;
int j = narrow<int>(i); // does not compile as it is not narrowing, while a static-cast would have worked.
bool b = narrow<bool>(i); // does not compile as `int` and `bool` do not represent both numeric/integral types

Depending on the interface of external libraries, a common conversion I’ve noticed in some products is between char* and unsigned char*. The "correct" cast would be a reinterpret_cast, but the scope of such conversion much smaller, and the cast between those two types is not as dangerous as a typical reinterpret_cast would be.

const char* as_char(const unsigned char* uc){return reinterpret_cast<const char*>(uc);}
      char* as_char(      unsigned char)    {return reinterpret_cast<      char*>(uc);}

const unsigned char* as_uchar(const char*){return reinterpret_cast<const unsigned char*>(uc);}
      unsigned char* as_uchar(      char*){return reinterpret_cast<      unsinged char*>(uc);}

Those conversion functions makes the conversion even easier (no accidental cast from char* to char* when a types changes), safe (const-correct and cannot convert other unrelated types), and makes it possible to distinguish this particular type of conversion from other reinterpret_cast s to char* and unsigned char*.

Depending on external library or legacy design decisions, a "common" conversion might be from void*, used in some interface, to something else.

A static_cast to another pointer type would suffice, but casting void* to some other pointer type means reinterpreting data. So static_cast does not express the intent very well. Both static_cast and reinterpret_cast cover some error types, but again, the use-case of such conversion is much smaller compared to the capability of those functions.

template <class T, class V>
T reinterpret_void(V* p) {
	static_assert(std::is_void<V>::value, "is not void*");
	return static_cast<T>(p); // or reinterpret_cast, it's the same

As converting to void*, it needs to be a template with a static_assert. A function signature of template <class T> T reinterpret_void(void* p) would be no safer than a normal reinterpret_cast.

std::move and std::forward are two examples of the standard library on how it is possible to name a cast, even if unfortunately std::move is not a helpful name, and generally causes confusion as it does not move anything.

As already stated, use std::move in generic code, and something like fmove to avoid possible errors. fmove could avoid some errors in generic code too, but it might have some false positives, depending on how much the code is generic.

There are still some edge-cases to consider before applying my guidelines, as enabling those compiler warnings might make some code more complex.

Sometimes some conversions are conditionally necessary, depending on the platform

int j = ...;
auto i = static_cast<int32_t>(j);

The cast is unnecessary when j and i have the same precision, but this depends on the platform architecture.

Such conversion, if possible, should get encapsulated in specific functions for abstracting these platform-specific differences away.

But even on the same platform, different types might be aliased one to another

Does enforcing a conditional cast make the code clear? Not on all platforms that do not require the cast.

Proper type encapsulation, thus driving away from the fundamental types, can help to avoid most conversion issues.

Last but not least, also const_cast should if possible, get encapsulated in a function.

While enabling -Werror=cast-qual, I noticed that some of my code did not compile, because of some casts hidden behind a macro of an external library.

Untile the external library gets "fixed", the solution was to create following function

// use this cast to suppress compiler about const_cast warning which are not our fault
template <class T>
T* openssl_const_cast(const T* p) {
	return const_cast<T*>(p);

It also makes easier, on a library upgrade, to search for all those casts and see which are no longer necessary.

Do you want to share your opinion? Or is there an error, same parts that are not clear enough?

You can contact me here.