Deprecate and delete

This article was published as a guest post on fluentcpp.com.

Function poisoning is an interesting option to prevent the usage of a function in a codebase, but it is not always available. In some environments, your code is immune to poison. The pragma is also compiler-specific: as of now, it only works with GCC and Clang.

That’s why I would like to present alternative approaches: deprecate and delete.

Use = delete; to remove function overloads

The closest thing to #pragma GCC poison that we have in the C++ standard is, since C++11, = delete;.

= delete; is a language feature that was introduced to inhibit the creation of compiler-generated constructors (default and copy), but it can be used in other scenarios too. It came together with = default;, which is out of scope for this article.

Consider the following function signature: void foo(int). Because of implicit conversions between numeric types, it is easy to call foo with an unsigned int, a long, a bool, a char and so on.

= delete; can help us by marking specific overloads as deleted: if one of those is selected, the program fails to compile.

For example:

void foo(int);
void foo(unsigned int) = delete;

// ...

foo(1u); // fails to compile
foo(1); // compiles fine

Thus = delete; helps to cover some use cases that #pragma GCC poison couldn’t: banning a subset of the overloads of a function. Indeed, function poisoning prevents all the usages of a symbol, and does not distinguish between several prototypes.
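When the goal is to reject every implicit conversion rather than a single one, a deleted function template can stand in for the whole list of unwanted overloads. The following is only a minimal sketch of that idea: can_call_foo is a hypothetical helper that I added to make the effect observable at compile time. It is not needed for the ban itself.

```cpp
#include <type_traits>
#include <utility>

// The only overload we actually want to be callable.
void foo(int) {}

// Catch-all: for any argument type other than int, this template is an
// exact match, so overload resolution selects it and the call is rejected.
template <typename T>
void foo(T) = delete;

// Hypothetical detection helper: true if foo can be called with a T.
template <typename T, typename = void>
struct can_call_foo : std::false_type {};

template <typename T>
struct can_call_foo<T, decltype(foo(std::declval<T>()))> : std::true_type {};

static_assert(can_call_foo<int>::value, "int is accepted");
static_assert(!can_call_foo<unsigned int>::value, "unsigned int is rejected");
static_assert(!can_call_foo<bool>::value, "bool is rejected");
```

For an int argument both the template and the non-template overload are exact matches, and the non-template one is preferred, which is why foo(1) keeps compiling.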

Consider the function std::memset: void* std::memset(void*,int,std::size_t);

Its function signature is not type safe at all: it takes a void* as parameter, whereas many types must not be passed to it since they are not POD.

Apart from that, the second and third parameters are two numeric values of different signedness, but because of implicit conversions it’s easy to overlook the right sequence and swap them.

It is possible to provide replacement functions that compile only with POD types, and that, through default parameters, are less error-prone to use.
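As a rough sketch of what such a replacement could look like: the name fillmem is the one used in the deprecation message further down in this article, while the signature, the default value and the can_fillmem helper are my assumptions for illustration.

```cpp
#include <cstring>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical replacement for std::memset (signature is an assumption).
// The object is taken by reference, so its size is deduced and cannot be
// swapped with the fill value. The function only takes part in overload
// resolution for trivially copyable, non-const types.
template <typename T,
          typename = typename std::enable_if<
              std::is_trivially_copyable<T>::value && !std::is_const<T>::value>::type>
void fillmem(T& object, unsigned char value = 0) {
    std::memset(&object, value, sizeof(T));
}

// Hypothetical helper, only here to show which calls are accepted.
template <typename T, typename = void>
struct can_fillmem : std::false_type {};

template <typename T>
struct can_fillmem<T, decltype(fillmem(std::declval<T&>()))> : std::true_type {};

static_assert(can_fillmem<int>::value, "scalars are accepted");
static_assert(can_fillmem<int[10]>::value, "arrays of trivial types are accepted");
static_assert(!can_fillmem<std::string>::value, "std::string is rejected");
```

With this shape, fillmem(b) zeroes a whole array b, and there is no size argument left to get wrong.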

In a big codebase, it could be a lot of work replacing all those function calls with something else. As long as we are calling it on a trivially copyable type, and with the arguments in the right order, the code is fine.

Even if easy to automate, changing all usages from one function to another may irritate some of your fellow developers, especially if there was no real issue. If there was no need to change the called function (yet), all those changes may be perceived as polluting the history of your repository, and your name now appears in a lot of places where you have no idea how the code works.

Wouldn’t it be even better if we could trigger a compiler error when using std::memset incorrectly?

Consider the following snippet, which deletes the subset of std::memset overloads whose usage we would like to prevent:

#include <type_traits>
#include <cstring>

namespace std{
	template <typename T, class = typename std::enable_if<!std::is_trivially_copyable<T>::value>::type>
	void* memset(T*, int ch, std::size_t count) = delete;
	void* memset(void*, size_t count, int ch) = delete;
}

template <typename T, class = typename std::enable_if<!std::is_trivially_copyable<T>::value>::type>
void* memset(T*, int ch, std::size_t count) = delete;

void* memset(void*, std::size_t count, int ch) = delete;

The following function still compiles:

struct foo{
    // trivially copyable data
};

void bar() {
    foo b[10];
    std::memset(&b, 0, sizeof b);
    std::memset(&b, 0u, sizeof b);
}

But this one does not (which is a good thing):

struct foo {
    // trivially copyable data
};

void bar() {
    std::string a;
    std::memset(&a, 0, sizeof a); // does not compile

    foo b[10];
    std::memset(&b, sizeof b, 0); // does not compile
}

Even though I’ve tested it and it works as intended with GCC, Clang, MSVC and icc, this code is, strictly speaking, not valid.

I thought it would work on every configuration, but there are actually some versions of GCC where this hack does not work as intended.

GCC 6.3.0 on GNU/Linux (ARM64) seems to complain that we are deleting an overload of an intrinsic function. I fear that there is nothing that we can do except commenting out void* memset(void*, size_t count, int ch) = delete; for this compiler.

Fortunately the "bug" has been fixed in GCC 7, so we can also use this hack with most GCC versions on the arm platform.

GCC 6.3.0 on GNU/Linux (arm) issues an error because it is unsure which overload to choose from. An explicit cast will fix the issue in this case:

std::memset(&b, static_cast<int>(value), sizeof(b));

It arguably also makes the code more readable when hunting for bugs caused by uninitialized data, since

std::memset(&b, static_cast<int>(sizeof(b)), value);

looks fishy.

We are not allowed to add functions in namespace std (there are a couple of exceptions, but this is not one of those). Even if we added functions only to delete them, we have still added them, and GCC 6.3.0 on arm complained (unfortunately rightfully).

Nevertheless this trick works reliably on all major compilers with every version I tested, granted with some hiccups on arm and arm64.

Modules and the increasing static analysis capabilities of compilers might prevent us from doing something like that in the future.

In any case, even if it would not work with the standard library, this hack is still useful with other third party libraries.
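To make that concrete, here is a sketch of the same trick applied to a library that is not the standard library. Everything in it is hypothetical: thirdparty::fill stands in for a function from a library header, and can_fill is a helper I added only to show which calls survive.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>
#include <utility>

// Stand-in for a third-party header (hypothetical library and function).
namespace thirdparty {
inline void* fill(void* dest, int value, std::size_t count) {
    return std::memset(dest, value, count);
}
}

// Our additions, living in a header of our own.
namespace thirdparty {
// Reject pointers to types that are not trivially copyable.
template <typename T,
          typename = typename std::enable_if<!std::is_trivially_copyable<T>::value>::type>
void* fill(T*, int, std::size_t) = delete;

// Reject calls where value and count were swapped.
void* fill(void*, std::size_t, int) = delete;
}

// Hypothetical detection helper: true if the call would compile.
template <typename P, typename V, typename C, typename = void>
struct can_fill : std::false_type {};

template <typename P, typename V, typename C>
struct can_fill<P, V, C,
                decltype(void(thirdparty::fill(std::declval<P>(), std::declval<V>(),
                                               std::declval<C>())))> : std::true_type {};

struct pod { int a; int b; };

static_assert(can_fill<pod*, int, std::size_t>::value, "correct call compiles");
static_assert(!can_fill<pod*, std::size_t, int>::value, "swapped arguments are rejected");
static_assert(!can_fill<std::string*, int, std::size_t>::value, "non-trivial type is rejected");
```

Since the overloads are added to a namespace that does not belong to the standard library, the undefined behavior discussed above does not apply here, even if the other caveats do.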

Similar to the advice I wrote for function poisoning, we are "enhancing" an API we do not fully control.

This is generally a very Bad Thing™, and can lead to problems in the long term. If for some reason we are not able to build our code anymore, we can either fix it by adding explicit casts, or remove some of the checks we added. But in order to reduce possible pitfalls and additional work, we should only delete overloads that bring a meaningful benefit to our codebase and help us prevent common errors.

In the case of the standard library we could be tempted to say: "yes, it’s UB, but it has no nasty side-effects." It’s a trap: there is no such thing as benign UB!

Even if I cannot imagine how deleting an overload from the standard library could create a program that does not behave as intended, it is not a good practice to rely on UB. Undefined behavior can result in time travel, erase your disk, let your program freeze, crash, and many other things.

So how could we be absolutely sure that those overloads do not interfere with our program at runtime?

Performing a dry-run

One way is to add them only temporarily, just to check for compile errors, without pushing them to the repository. Just try to build the program with those overloads added in every file. If it does not build, fix the reported errors. If it builds, recompile the program without those overloads.

After all, what we want are the static checks.

Probably defining a separate build job would be the easiest solution. GCC has a handy compiler flag -include, that Clang supports too. Icc seems to support the same flag, even if I was not able to find anything in the documentation. MSVC has a similar flag too. Through those flags the compiler includes a specified file before parsing anything.

Thanks to those flags, we can include a header with all deleted functions in every file of our codebase, in order to ensure that a function is not used incorrectly or in a strange way through an implicit conversion.

Since we are also not interested in the compiled program, we could use -fsyntax-only as compile parameter. This way GCC will not generate any object file, which should speed up the generation of the possible error messages. Clang supports -fsyntax-only and --analyze; you might want to use the latter to gather other warnings. MSVC has an /analyze flag too, and it also recognizes the usage of deleted functions.

This way, we might reduce compilation times, or gather other important information, making those separate builds more significant for our work.

All the above concerned the standard library. If we are deleting some overloads from a third-party library, there is no undefined behavior. It is still a Bad Thing™ to hack their interface, and it can lead to compile-time problems if the library adds overloads, but there is no undefined behavior.

Quick comparison between deleting a function and poisoning it

Even if we can use both #pragma GCC poison and = delete; to improve our codebase by preventing some usages, they work in very different ways.

#pragma GCC poison is not part of the standard, it’s a compiler directive. Therefore:

  • It does not matter if functions (or classes, variables, keywords, anything else) are defined or not.

  • It does not understand namespaces.

  • Through aliases, it is still possible to use the poisoned tokens, which we exploited for providing more expressive alternatives.

  • It also works in C, and can therefore be used in C-compatible headers.

= delete is part of the language. However:

  • It works only on functions, but it understands namespaces, member functions, and overloads.

  • We cannot use it on macros, structures or other language keywords.

  • We cannot delete a function that already has a body.

Therefore we cannot use it for deleting functions provided or implemented by third-party libraries (or the standard library).

Indeed, once a library declares:

void foo(int);

Then we cannot delete it in our code:

void foo(int) = delete;

All we can do would be to add and delete overloads to prevent implicit conversions:

void foo(short) = delete;

This restriction applies to all client code, including the standard library and third-party library headers. So it might not always be possible to delete an overload we do not want to use in our codebase if it gets used, for example, in a template instantiation of a function in the standard library, since we cannot change the body of such a template.

In case the function is only used in our code, we can still call the function by explicitly casting the arguments, instead of relying on implicit conversions. This makes it clearer in the code that something possibly fishy is happening.
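A small sketch of how this looks at the call site, under the assumption of a hypothetical library function thirdparty::scale (made constexpr here only so that the result can be checked at compile time):

```cpp
// Stand-in for a declaration coming from a library header (hypothetical).
namespace thirdparty {
constexpr int scale(int factor) { return factor * 2; }
}

// Our additions: reject arguments that would be silently converted to int.
namespace thirdparty {
int scale(double) = delete;
int scale(bool) = delete;
}

// thirdparty::scale(2.5);   // no longer compiles: selects the deleted overload
// thirdparty::scale(true);  // no longer compiles either

// The explicit cast documents that the truncation is intended.
constexpr int scaled = thirdparty::scale(static_cast<int>(2.5));
static_assert(scaled == 4, "2.5 is truncated to 2 before scaling");
```

The cast is more verbose than the original call, but that is the point: the conversion is now visible to whoever reads the code.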

Notice that a poisoned function is poisoned, even if we are trying to delete it.

To illustrate, suppose that a third party library provides foo(int) as a function, and we would like to delete foo(unsigned int). After some time, we notice that we do not want foo to get used at all because there is a better alternative.

#pragma GCC poison foo
void foo(unsigned int) = delete;

won’t compile; we have to change it to

void foo(unsigned int) = delete;
#pragma GCC poison foo

or simply

#pragma GCC poison foo

Compiler warnings are fine too

It might be that deleting a function is not doable. There might be some false positives that we cannot fix, for example inside a template instantiation of a class that does not belong to us.

Therefore, instead of a compiler error, a warning might be sufficient. For this we can use deprecated, an attribute that was added to the language in C++14:

[[deprecated("Replaced by fillmem, which has an improved interface")]]
void* memset(void*, int, size_t);

Using the function will trigger a compiler warning instead of a build failure, which might be enough. I do not know if deprecating a function from the standard library is fine. Since annotations have no visible effects, I’m assuming that, strictly speaking, it is not even an ODR-violation.

However, the function signature from my memory header on Debian GNU/Linux with GCC 8.2 is extern void *memset (void *s, int c, size_t n) __THROW __nonnull ((1));.

On Windows it will surely be different, on Mac too, and obviously it will depend on the version of your standard library. So in my case it might be an ODR-violation, depending on how __THROW is defined, since the throw specification might differ. Other versions could use nothrow, or restrict (for example for std::memcmp), or other compiler- or library-specific details.

The following piece of code failed to compile for exactly that reason:

int foo() noexcept {
    return 1;
}

[[deprecated("Replaced by bar, which has an improved interface")]]
int foo();

int baz() {
    return foo();
}

Whereas:

int foo() {
    return 1;
}

[[deprecated("Replaced by bar, which has an improved interface")]]
int foo();

int baz() {
    return foo();
}

compiles successfully and, as expected, generates a warning if the compiler supports the [[deprecated]] attribute.

I cannot imagine how this hack will lead to bad things when deprecating something from the standard library. But to be on the safe side, as proposed for = delete;, if you’re hacking in the standard library, you can limit yourself to make a separate build and analyze the compiler warnings.

I was also happy to verify that deprecating std::memset did work with all compilers that supported attributes, even with the GCC version on arm! Of course your experience could be different if the function has, depending on the platform and version, a different exception specification or other compiler-specific details that create a different function signature.

The signature of memset should officially be void* memset(void* s, int c, size_t n);, but, as mentioned before, it varies greatly between compilers, compiler versions and language versions. Many libraries, by contrast, do not use as many compiler-specific attributes in their function signatures, which are therefore more stable. Of course a different version of the library could change the signature of a function of its interface, but it is less common for a compiler upgrade to change it, even if not impossible.

This means that deprecating a function of another library should be easier.

It does not mean we should just deprecate them because it’s easy, but because we have found use cases in our codebase where replacing those functions with something else might have some benefits, like increasing readability and reducing the chance of making common mistakes.

As when abusing = delete;, changing the interface of code we do not control is generally a bad idea, and it must be done with great care.

From warnings back to errors

Most compilers also have the possibility to turn some warnings into errors. We could therefore use the [[deprecated]] attribute for banning functions, while providing a custom error message. It might get tricky if the function is used in another context that we do not want to update yet.
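On the command line this is typically done with -Werror=deprecated-declarations for GCC and Clang, or /we4996 for MSVC. As a sketch, the same can be requested from inside a ban header through diagnostic pragmas. The thirdparty namespace and both function names below are hypothetical, and I am assuming a compiler that understands one of the two pragmas; others will most likely just ignore them.

```cpp
// Stand-ins for functions provided by a library (hypothetical names).
namespace thirdparty {
inline int old_api() { return 1; }
constexpr int new_api() { return 2; }
}

// Our ban header: redeclare the function with a custom message.
namespace thirdparty {
[[deprecated("Replaced by new_api, which has an improved interface")]]
int old_api();
}

// From here on, every use of a deprecated entity is reported as an error.
#if defined(__clang__) || defined(__GNUC__)
#  pragma GCC diagnostic error "-Wdeprecated-declarations"
#elif defined(_MSC_VER)
#  pragma warning(error : 4996)
#endif

// thirdparty::old_api();  // would now fail to build, showing our message
static_assert(thirdparty::new_api() == 2, "the replacement is unaffected");
```

Keep in mind that the pragma affects every deprecated declaration used after it, not only the ones we added, which is exactly the situation where things might get tricky.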

Maybe it would not be that bad to have something like a [[removed("message")]] attribute in the standard, or extend = delete; with a custom message, like = delete("message");:

  • a library author could use such attributes after deprecating some functions to ease the transition for its users.

  • we could abuse it (since [[deprecated]] was not meant to be used that way) in our codebase to provide more helpful messages when banning a function.

Conclusion

Poisoning, deleting and deprecating functions from a third-party API are not-so-standard techniques that permit us to try to remove common errors from our codebase.

The goal is to discourage the usage of certain functions, and when it comes to third-party APIs, there is not much we can do without those techniques.

Other approaches involve creating a facade to completely hide the offending library, but in many cases it’s a giant effort that does only partially fix the issue.

First of all, by using a facade we still have to maintain the other side of the facade and avoid the common pitfalls there. So the techniques presented here should probably be used on the other side of the facade.

Second, with a facade we now have another interface to learn. This means that if something does not work as expected, we will probably have to look at the other side of the facade in order to debug the problem.

It also introduces a lot of code that is probably unused. You’ll probably not need all of the functionality of a third-party library, and if you do, you’ll probably want to see the interface of the library you are using, because you’ll probably need to know it better.

This is actually the biggest problem I have faced until now with most facades or wrappers. It seems to me that we are trying to crack a walnut with a steamroller.

Another possibility is to use an external tool. It might provide other benefits too, like better explanations of why a function is banned. But parsing C++ files is not easy. On the other hand, injecting this information into the source code has the advantage that we do not need a separate tool to execute (and possibly maintain).

Since all mainstream compilers have some sort of include parameter, the simplest way to ensure that a function gets banned is to create header files with the poisoned identifiers, deleted and deprecated functions, and include them in every file.

An easy way to organize such files is to put them in a separate directory, and create one file per library.

In CMake, it’s as simple as adding the following line for MSVC:

target_compile_options(${PROJECT_NAME} PRIVATE /FI "${CMAKE_SOURCE_DIR}/ban/foo.hpp")

And, for GCC and Clang:

target_compile_options(${PROJECT_NAME} PRIVATE -include"${CMAKE_SOURCE_DIR}/ban/foo.hpp")