The C++ logo, by Jeremy Kratz, licensed under CC0 1.0 Universal

Macro usages in C++


26 - 33 minutes read, 6529 words
Categories: build system c c++
Keywords: build system c c++ macro

There are a lot of places where I still see #defines (both in C and C++), and I’m almost sure that in most of them those defines are not really needed. From time to time, I also learn that there is a way to express or replace a macro with templates or other constructs, so I finally decided to list the use-cases I’ve encountered: those where it makes sense to replace a macro with something else, and those where I am not aware of any alternative.

Notice: In these notes, I’m not considering external code generators or tools that can alter a compiled binary (for example for adding something in the .rodata section). Doing so would remove many more usages of macros (especially X-Macros), but it requires a more complex build system, because we either need a prebuilt code generator or have to compile it together with our source code for the host platform (which cannot always be done automatically when cross-compiling).

This does not mean that a code generator is not a viable alternative, on the contrary, but being able to express things directly in the source code has advantages too.

include guards

There is no standard replacement for include guards (both in C and C++), but there is an almost-standard practice: #pragma once which is supported by most compilers.

As the header guards add a lot of noise when searching for other macros, this is one of those rare places where I’m fine using a non-standard extension.

#ifndef <unique macro>
#define <unique macro>

// content of header file

#endif

vs

#pragma once

// content of header file

Note: if a future compiler does not support this extension, a script can programmatically replace #pragma once with a randomly generated include guard.

"issues" with #pragma once

Ignoring the fact that a compiler might not support it, for those claiming that the #pragma once is not a feasible alternative because it might fail in cases where the include guard would not…​ let’s be practical.

Say, for instance, that a header file is duplicated somewhere. This might happen because

  • you copied a header file and replaced its content, except for the include guard

  • you messed up a merge and your version control system duplicated some files

  • there are multiple mount points or symlinks involved

  • the file got duplicated in another part of the project

The header guard will work in all four cases because it does not matter where a file comes from; it only matters whether the guard is literally the same. #pragma once, depending on how the compiler implements it (for example by internally mapping file names), might fail, in the sense that you might get compiler errors.

This is a good thing.

You copied a header file, changed the content, and forgot to replace the header guard? It’s an error, the code might not compile (depending on the inclusion order of the headers), but with #pragma once this error possibility does not exist anymore.

A file has been duplicated, possibly with minor changes (failed merge, file copied to another part of the project and not kept in sync, …)? This is an error and will probably invoke UB (it’s basically an ODR violation), and the header guard might hide it. If it fails to compile because of #pragma once, that is a good thing, not a disadvantage. If the code compiles, the program might not do what the programmer intended.

If there are multiple mount points or symlinks involved, and the same file is referenced by different names… Maybe it’s time to clean up the source directory and how the code is built, independently of whether #pragma once might fail. If the compiler cannot tell whether two file names point to the same file… how should a developer do it? This will probably harm build times too.

There might be rare cases where #pragma once is not suitable, but I did not encounter one yet.

avoid function call

This holds both for C and C++.

Consider something like

#define LO(x) (x & 0xff)
#define HI(x) ((x & 0xff00) >> 8)

One might argue that using macros is better, as it ensures the code is inlined and there is no function call. Unfortunately, a smart compiler could also move repetitive code into a function, since a function call is not observable according to the C++ specification. I am not aware of any compiler having implemented such a feature, which is sad, as it would be great for refactoring.

Since both C and C++ support inline functions (which does not mean the function gets inlined) and there are compiler-specific directives for inlining, there seems to be no reason to use a macro instead of a function:

inline int lo(int x) { return x & 0xff;}
inline int hi(int x) { return (x & 0xff00) >> 8;}

One might argue that the directives for forcing inlining are compiler-specific, while the macro works for all compilers even with optimizations disabled. This is true; nevertheless, as with every optimization, it should first be proved that using a macro instead of a function has any measurable benefit.

Another argument is that both lo and hi will generate a function that takes up space in the resulting binary, but this is true only if the function is exported.

Also, code size impacts performance, thus (improbably in this case) the resulting code might be faster when using lo and hi instead of the macros if the compiler chooses not to inline the functions at every call.

TL;DR: better read the compiler documentation and test its optimization flags and how they transform the resulting code before using a macro instead of a function.
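A minimal sketch of one practical difference beyond inlining: the macros above paste their argument without parentheses, so operator precedence can silently change the result, while a function receives an already-evaluated value. The names lo and hi follow the text; marking them constexpr and the two helper functions are my additions for illustration.

```cpp
#include <cassert>

// Macros as written in the text: the argument is pasted without parentheses.
#define LO(x) (x & 0xff)
#define HI(x) ((x & 0xff00) >> 8)

// Function equivalents; constexpr (an addition to the text) allows
// compile-time use, and inlining is left to the optimizer.
constexpr int lo(int x) { return x & 0xff; }
constexpr int hi(int x) { return (x & 0xff00) >> 8; }

// Evaluated at compile time, so no function call is involved here.
static_assert(lo(0x1234) == 0x34, "low byte");
static_assert(hi(0x1234) == 0x12, "high byte");

// Hypothetical helper: the function sees the value of (a | b).
inline int low_byte_of_union(int a, int b) {
    return lo(a | b);
}

// Hypothetical helper: expands to (a | b & 0xff). Since & binds tighter
// than |, only b is masked.
inline int low_byte_of_union_macro(int a, int b) {
    return LO(a | b);
}
```

With a = 0x100 and b = 0x01, the function version yields 0x01, while the macro version yields 0x101.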

Constants

For different types of constants, there are different approaches.

integrals

Consider

#define SIZE 5

In C++, for integral values

const int size = 5;

is a compile-time constant: it can be used, for example, to define the size of an array or an enum value. This is a special rule that holds only for integral values.

Since C++11 we should prefer to use constexpr

constexpr const int size = 5;

as it makes clear to both humans and compilers that size is a compile-time value by design, and accidental changes should be an error. It also makes handling int consistent with other data types, as

const float pi = 3.14;

is not a compile-time constant.
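A small sketch of how constexpr constants can stand in for #define in the places that require a constant expression. The namespace config and the names scale, buffer, scaled and buffer_length are hypothetical and chosen only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical constants: scoped and typed, unlike a #define.
namespace config {
constexpr int size = 5;        // usable wherever a constant expression is required
constexpr double scale = 1.5;  // constexpr also works for floating-point values
}

// The integral constant defines the size of an array type.
using buffer = std::array<int, config::size>;

static_assert(config::size == 5, "checked at compile time");

// The floating-point constant can take part in compile-time computations too.
constexpr double scaled(int v) { return v * config::scale; }
static_assert(scaled(2) == 3.0, "1.5 * 2 is exactly representable");

inline std::size_t buffer_length() { return buffer{}.size(); }
```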

Integrals in C

Even in C, for integral values, it is possible to avoid macros. Before C23 there is no constexpr, and a const int cannot be used in constant expressions, but an enum can:

enum{my_size=5};


int arr[my_size];

literals

For many cases, a constexpr std::string_view or constexpr const char* const (or without constexpr in C) should be the way to go, but in some cases, the macro approach is still necessary.

It is possible to handle strings at compile-time with constexpr functions, which reduces the need for macros for concatenating literals, but some parts of the language, for example static_assert, require a string literal.

#define VERSION "1.2.3"
/// ...

static_assert( /* ... */ , "Test on version " VERSION); // cannot use a constexpr string, it must be a literal

Another place where a macro is preferred (but not necessary) to a constant is when the same format string is used for multiple calls to printf. Compilers can generate warnings if the format string and the arguments do not match, but this works only if the parameter passed to printf is a string literal.
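A minimal sketch of the shared format string. FEK_ENTRY_FMT, FEK_ENTRY_FMT_LINE and the two functions are hypothetical names; snprintf is used instead of printf only so that the result can be inspected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical format string shared by several calls. Because it is a macro,
// every call site passes a string literal, which lets compilers that
// implement format checks compare it against the arguments.
#define FEK_ENTRY_FMT "[%s] %d"

// Adjacent literals are concatenated by the compiler; a constexpr
// const char* cannot be extended this way.
#define FEK_ENTRY_FMT_LINE FEK_ENTRY_FMT "\n"

inline int format_entry(char* out, std::size_t n, const char* tag, int value) {
    return std::snprintf(out, n, FEK_ENTRY_FMT, tag, value);
}

inline int format_entry_line(char* out, std::size_t n, const char* tag, int value) {
    return std::snprintf(out, n, FEK_ENTRY_FMT_LINE, tag, value);
}
```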

floating point numbers

#define PI 3.14

should be replaced by

constexpr double pi = 3.14 /* and a lot of other digits */;

Unfortunately, the semantics might change slightly. PI might be more precise, as it’s not typed, while pi is a double, thus the assigned value is rounded.

This does not make any difference when pi or PI is assigned directly to a variable, but it might if they are part of an expression:

double local_PI = PI;
double local_pi = pi;
assert(local_PI == local_pi);

double two_PI = 2*PI;
double two_pi = 2*pi;
assert(two_PI == two_pi); // might fail

Notice that this difference is compiler-dependent, so it’s hard to argue that using the macro approach is superior from this point of view.

Stable algorithms are much more important than the representation of a single value.

Other types

While one might argue that creating a constant takes up space, a macro generally just copy-pastes the same value in multiple places. It means that the same value will appear multiple times in the binary unless the optimizer/linker removes all those copies.

Contrary to macros, adding globals might introduce problems in C++. As the initialization of globals can execute code (not in C), they can depend on each other, and as the initialization order of globals across translation units is not defined (they might even be in different libraries), they might access uninitialized data.

The simplest way to avoid this issue is to make all globals constexpr, i.e. compile-time constants, or to use singletons, which ensure that dependencies are at least initialized when their values are needed.
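The singleton variant can be sketched with function-local statics, which are initialized the first time control passes through their declaration. The accessors base_dir and config_path, the global startup_message and the path values are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical global, wrapped in an accessor function. The local static is
// initialized on first use (thread-safe since C++11).
inline const std::string& base_dir() {
    static const std::string value = "/opt/app";
    return value;
}

// Depends on base_dir(); safe even when called while another global is
// being initialized, because base_dir() initializes its static on demand.
inline const std::string& config_path() {
    static const std::string value = base_dir() + "/config.ini";
    return value;
}

// A namespace-scope global that depends on the accessors above.
const std::string startup_message = "using " + config_path();
```

The cost is a check on every call to see whether the static has already been initialized, which constexpr globals do not need.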

Working with member variables and functions

Conditionally calling a member function

Consider some code like

struct s{
    bool is_feature_enabled();
    void execute_feature();
};

where you are supposed to call execute_feature only if is_feature_enabled (there are of course better APIs, but sometimes this is what you get).

It’s tempting to write

#define COND_EXEC_FEATURE(s, check, exec)\
    if(s.check()){s.exec();}

void foo(s& instance){
  COND_EXEC_FEATURE(instance, is_feature_enabled, execute_feature);
}

as it works. It is possible to replace the COND_EXEC_FEATURE macro with a template function (it does not have to be templated if it is useful only for a given class):

template <class C, class F1, class F2>
void cond_exec(C& c, F1 f1, F2 f2){
    if( (c.*f1)()){
        (c.*f2)();
    }
}


void bar(s& instance){
    cond_exec(instance, &s::is_feature_enabled, &s::execute_feature );
}

The syntax looks strange, as operator .* is rarely used.
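If the .* syntax is a concern, std::invoke (C++17, from <functional>) performs the same call with a more uniform spelling. A minimal sketch, where feature_host is a hypothetical stand-in for the class s of the text:

```cpp
#include <cassert>
#include <functional>

// Hypothetical class with the same shape as `s` in the text.
struct feature_host {
    bool enabled = false;
    int runs = 0;
    bool is_feature_enabled() const { return enabled; }
    void execute_feature() { ++runs; }
};

// std::invoke(f, obj) calls (obj.*f)() when f is a pointer to member function.
template <class C, class Check, class Exec>
void cond_exec(C& c, Check check, Exec exec) {
    if (std::invoke(check, c)) {
        std::invoke(exec, c);
    }
}
```

As a side effect, this version also accepts lambdas or free functions taking the object as their parameter, not only pointers to members.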

Verify member variables

#include <cassert>

// FIXME: replace with true variadic/recursive function
// For now, overload till 5 substructures should be good enough
// internal macro, do not call directly, see FEK_ASSERT_DEREF_CHAIN
#if defined(FEK_GET_MACRO) || defined(FEK_C_DEREF_CHAIN1) || defined(FEK_C_DEREF_CHAIN2) || defined(FEK_C_DEREF_CHAIN3)                        \
    || defined(FEK_C_DEREF_CHAIN4) || defined(FEK_C_DEREF_CHAIN5) || defined(FEK_CHECKED_DEREF_CHAIN) || defined(FEK_ASSERT_DEREF_CHAIN)
#error "*_DEREF_CHAIN* already defined"
#endif

#define FEK_C_DEREF_CHAIN1(ptr1, check_ptr) (check_ptr((ptr1)), void(), (ptr1))
#define FEK_C_DEREF_CHAIN2(ptr1, ptr2, check_ptr) (check_ptr((ptr1)), void(), FEK_C_DEREF_CHAIN1(ptr1->ptr2, check_ptr))
#define FEK_C_DEREF_CHAIN3(ptr1, ptr2, ptr3, check_ptr) (check_ptr((ptr1)), void(), FEK_C_DEREF_CHAIN2(ptr1->ptr2, ptr3, check_ptr))
#define FEK_C_DEREF_CHAIN4(ptr1, ptr2, ptr3, ptr4, check_ptr) \
        (check_ptr((ptr1)), void(), FEK_C_DEREF_CHAIN3(ptr1->ptr2, ptr3, ptr4, check_ptr))
#define FEK_C_DEREF_CHAIN5(ptr1, ptr2, ptr3, ptr4, ptr5, check_ptr) \
        (check_ptr((ptr1)), void(), FEK_C_DEREF_CHAIN4(ptr1->ptr2, ptr3, ptr4, ptr5, check_ptr))
#define FEK_GET_MACRO(_1, _2, _3, _4, _5, check_fun, NAME, ...) NAME

/// Usage:
/// Given a structure like `a->b->c`, where every pointer is never nullptr, use
/// `FEK_ASSERT_DEREF_CHAIN(a, b, c)` to make intent clear.
/// You can also write `auto ptr = FEK_ASSERT_DEREF_CHAIN(a, b, c)` and `ptr` will point to
/// `a->b->c`
/// If a substructure could be nullptr, you might prefer using FEK_CHECKED_DEREF_CHAIN, which accepts as last parameter a
/// function name (the function can be overloaded) for checking if a structure is null, and which should throw on error.
#define FEK_CHECKED_DEREF_CHAIN(...) \
        FEK_GET_MACRO(__VA_ARGS__, FEK_C_DEREF_CHAIN5, FEK_C_DEREF_CHAIN4, FEK_C_DEREF_CHAIN3, FEK_C_DEREF_CHAIN2, FEK_C_DEREF_CHAIN1) \
        (__VA_ARGS__)
#define FEK_ASSERT_DEREF_CHAIN(...) FEK_CHECKED_DEREF_CHAIN(__VA_ARGS__, assert)

struct S1{
    int data;
};
struct S2{
    S1* ptr1;
};
struct S3{
    S2* ptr2;
};


void throw_on_null(const void* ptr){if(ptr == nullptr){throw 42;}}

int bar(S3* ptr3){
    auto v = FEK_CHECKED_DEREF_CHAIN(ptr3, ptr2, ptr1, throw_on_null)->data;
    return v;
}

Also in this case, it is possible to drop the macro:

template <class F, class C1>
auto deref_chain(F f, C1* p1){
    f(p1);
    return p1;
}

template <class F, class C1, class C2, class... CN>
auto deref_chain(F f, C1* p1, C2 p2, CN... cn){
    f(p1);
    return deref_chain(f, p1->*p2, cn...);
}

template <class C1>
auto assert_deref_chain(C1* p1){
    assert(p1 != nullptr);
    return p1;
}

template <class C1, class C2, class... CN>
auto assert_deref_chain(C1* p1, C2 p2, CN... cn){
    assert(p1 != nullptr);
    return assert_deref_chain(p1->*p2, cn...);
}

struct S1{
    int data;
};
struct S2{
    S1* ptr1;
};
struct S3{
    S2* ptr2;
};

int bar(S3* ptr3){
    auto v = deref_chain(throw_on_null, ptr3, &S3::ptr2, &S2::ptr1)->data;
    return v;
}

The only disadvantage: passing overloads is more difficult, and if one wants to use assert, one needs to write a separate function, as assert itself is not a function and thus cannot be passed as a parameter.
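Both limitations can be worked around by wrapping the call in a generic lambda, at the price of a little noise at the call site. In this sketch deref_chain is repeated from above so that the example is self-contained; check, inner, outer, read_data and read_data_asserted are hypothetical names.

```cpp
#include <cassert>

// Same recursive helper as above, repeated to keep the sketch self-contained.
template <class F, class C1>
auto deref_chain(F f, C1* p1) {
    f(p1);
    return p1;
}

template <class F, class C1, class C2, class... CN>
auto deref_chain(F f, C1* p1, C2 p2, CN... cn) {
    f(p1);
    return deref_chain(f, p1->*p2, cn...);
}

// Hypothetical overload set: the bare name `check` cannot be deduced as a
// template argument, because it does not name a single function.
inline void check(const void* p) { if (p == nullptr) { throw 42; } }
inline void check(int) {}

struct inner { int data; };
struct outer { inner* in; };

inline int read_data(outer* o) {
    // The generic lambda resolves the overload at each call.
    auto checked = [](auto* p) { check(p); };
    return deref_chain(checked, o, &outer::in)->data;
}

inline int read_data_asserted(outer* o) {
    // assert is a macro, but it can be used inside a lambda body.
    auto asserted = [](auto* p) { assert(p != nullptr); };
    return deref_chain(asserted, o, &outer::in)->data;
}
```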

Using a macro in a callback

In the previous example, I had to define a separate templated function assert_deref_chain, because assert is not a function and thus cannot be passed as a parameter:

std::vector<int*> v;
// verify all values are != nullptr
std::for_each(v.begin(), v.end(), assert); // does not compile: assert is a function-like macro

This is not an issue if the macro aliases a function and is not a function-like macro. For example:


#define FEK_WRAP_ASSERT [](auto v){assert(v);}


std::vector<int*> v;
// verify all values are != nullptr
std::for_each(v.begin(), v.end(), FEK_WRAP_ASSERT);

One might argue that FEK_WRAP_ASSERT is not necessary, and in this case, it does nothing useful. If it gets overhauled and partially reimplements assert in this way:

#define FEK_WRAP_ASSERT [func = __func__](auto v){ /* if not v, print an error message and func*/ }

Then we can print to the console where the macro has been written (not called!), which cannot be done with a function. I’m not sure if there are valid use cases.

Conditionally returning

Suppose you have an API that consists of multiple functions that take some input parameter, and if the parameter does not meet specific criteria, then the function returns with a specific error code. This checking can be wrapped in a macro and reused in every function.

#define FEK_RETURN_IF_VERIFY_FAILS(p) \
  do{ if(!verify(p)){ return errorcode::wrong_param; } }while(false)

errorcode foo(int i){
  FEK_RETURN_IF_VERIFY_FAILS(i);
  // business logic of foo
  return errorcode::success;
}

errorcode bar(const std::string& s){
  FEK_RETURN_IF_VERIFY_FAILS(s);
  // business logic of bar
  return errorcode::success;
}

Generally, calling return (or break, continue, and goto) from a macro is not good practice, as it changes the control flow without the reader noticing (even if the same is true for exceptions…). Hopefully naming the macro …RETURN_IF… makes the intent clear enough.

In this case, it is not possible to replace the macro with a function, because a function cannot force a return at the call site.

The macro can be dropped if one is ready to add another layer of indirection and restructure the code:

template <class T, class F>
errorcode exec_if_verify(T&& p, F f){
    if(!verify(p)){ return errorcode::wrong_param; }
    return f(std::forward<T>(p));
}

errorcode foo(int i){
    return exec_if_verify(i,
        [](auto i) {
            // business logic of foo
            return errorcode::success;
        }
    );
}

errorcode bar(const std::string& s){
    return exec_if_verify(s,
        [](auto s) {
            // business logic of bar
            return errorcode::success;
        }
    );
}

While this approach allows dropping the macro, the code is not necessarily easier to understand.

True, there are no hidden returns, but this approach requires duplicating parts of the function signatures, adding otherwise useless scopes and an indirection through the lambda, and replacing one line with a lot of boilerplate. It also still hides from the reader whether the business logic of foo or bar gets executed, as the parameter validation and the branches are somewhere else. Without a lambda, one needs to write even more code, as the whole function signature has to be duplicated.

At least the generated code, for simple cases, is identical.

I’m still using a macro if there are a lot of functions that require parameter validation in a similar way; otherwise removing the code duplication is not worth it, as the template approach is too "complex".

life-time extension

Inside a constructor

A use-case where I thought adding a macro was justified is dealing with constexpr data structures.

Consider

struct data{ int i; };

struct node{
    data d;
    const node& next;
};
constexpr const node Null{data{0}, Null};
#define FEK_NODE(d, next) node{d.i == 0 ? data{42} : d, next} // or throw, stop compilation, or something else

constexpr node leaf(data d){
    return FEK_NODE(d, Null);
}

constexpr const node list{
    data{-1}, FEK_NODE(data{0}, leaf(data{1}))
};

Adding a constructor to node (or a factory function) would create a dangling reference (except for leaf), as it disables lifetime extensions. The macro FEK_NODE does not have the same issue, as it does not introduce a new scope.

Inside a lambda capture, and as alternative to a cast/std::as_const

When using mutexed_obj, I had a similar issue with lifetime extensions, and again a macro helped to reduce the amount of code one needs to write.

Given a class mutexed_obj similar to the one presented in this article, and following structures:

struct data{
    std::string str = "Hello World!";
    int i = -1;
};

struct s {
    // type with no default constructor, or expensive default constructor,
    // and possibly no move or copy constructors (or expensive)
    s(bool, const std::string&);
};

Consider the following function:

bool foo(mutexed_obj<data>& d_m){
    // ...
    const s val = lock(d_m, [](data& d){
        bool res = d.str == get_value();
        if(res){d.i++;}
        return s(res, d.str);
    });
    // ...
}

get_value is a function declared somewhere else, we might even not control it. Its signature is one of the following, it should not matter which one:

const std::string& get_value();
// or
std::string get_value();
// could also be std::string& get_value(), std::string&& get_value(), ... but leaving out for simplicity

What happens inside the function foo is suboptimal at least, possibly problematic because the code is calling external/unknown code (get_value) while locking a mutex.

A first approach for fixing the problem is to call get_value in foo directly:

bool foo(mutexed_obj<data>& d_m){
    // ...
    const auto& value = get_value(); // or const std::string& value = get_value();
    const s val = lock(d_m, [&value](data& d){
        bool res = d.str == value;
        if(res){d.i++;}
        return s(res, d.str);
    });
    // ...
}

As a side effect, the scope of the result of get_value is much bigger. If get_value returns by value, the memory is freed only when foo exits. It would be nicer if the scope were as small as possible.

Writing

bool foo(mutexed_obj<data>& d_m){
    // ...
    s val;
    {
        const auto& value = get_value(); // or const std::string& value = get_value();
        val = lock(d_m, [&value](data& d){
            bool res = d.str == value;
            if(res){d.i++;}
            return s(res, d.str);
        });
    }
    // ...
}

might or might not work, depending on the constructors of s, but it generally makes it impossible to declare val const, unless using a second lambda:

bool foo(mutexed_obj<data>& d_m){
    // ...
    const s val = [&d_m](){
        const auto& value = get_value(); // or const std::string& value = get_value();
        return lock(d_m, [&value](data& d){
            bool res = d.str == value;
            if(res){d.i++;}
            return s(res, d.str);
        });
    }();
    // ...
}

This is…​ not really readable, especially considering that lambdas can (since C++14) call functions when capturing variables:

bool foo(mutexed_obj<data>& d_m){
    // ...
    const s val = lock(d_m, [value = get_value()](data& d){
        bool res = d.str == value;
        if(res){d.i++;}
        return s(res, d.str);
    });
    // ...
}

This example has a drawback too. If get_value returns a const reference, we are doing an unnecessary copy, while all previous solutions did not.

To avoid this unnecessary copy, one could write

bool foo(mutexed_obj<data>& d_m){
    // ...
    const s val = lock(d_m, [&value = get_value()](data& d){
        bool res = d.str == value;
        if(res){d.i++;}
        return s(res, d.str);
    });
    // ...
}

but this code won’t compile if get_value returns by value, as the lambda captures by reference, and not by const reference. The error message is in fact (with Clang): error: non-const lvalue reference to type 'basic_string<…>' cannot bind to a temporary of type 'basic_string<…>'.

One could use std::as_const for creating a temporary reference, but

  • it does not compile, as std::as_const does not accept rvalues as arguments

  • even if it did compile, in the case of an rvalue it would create a dangling reference.

One could do what std::as_const does by hand, i.e. adding const with a cast, and letting the lambda capture deduce the type:

bool foo(mutexed_obj<data>& d_m){
    // ...
    const s val = lock(d_m, [&value = static_cast<const std::string&>(get_value())](data& d){
        bool res = d.str == value;
        if(res){d.i++;}
        return s(res, d.str);
    });
    // ...
}

This works correctly, but it is extremely verbose and might introduce unwanted conversions by accident (for example if get_value returns a char*). Also, in this example, only the output of one function is captured in the lambda. With two or three, the code would be much less readable.

Because a function does not work (it would create a dangling reference), a macro seems the only viable approach to reduce the boilerplate:

#define FEK_C_REF(...) static_cast<const std::remove_reference_t<decltype(__VA_ARGS__)>&>(__VA_ARGS__)

bool foo(mutexed_obj<data>& d_m){
    // ...
    const s val = lock(d_m, [&value = FEK_C_REF(get_value())](data& d){
        bool res = d.str == value;
        if(res){d.i++;}
        return s(res, d.str);
    });
    // ...
}

Conditional compilation

Useful for compiling platform-specific code, or for enabling or disabling features, like runtime checks, at compile-time.

// foo.h
#ifdef FOO
void foo();
#endif

Alternatively, those situations can be handled by the build system.

For example, there could be two foo.h files, one empty, the other with the declaration of the function foo. For the targets where FOO would not be defined, the empty foo.h is included.

Similarly, the build system can help to provide implementations for a different target, by compiling one source file or another.

This approach does not scale well if only a very small part of a file differs.

It is thus possible to avoid macros in some cases, but generally, they can be more readable and easier to maintain than using the build system.
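When both variants of the code compile on every target, one possible middle ground is to consult the macro in a single place and branch with if constexpr (C++17) everywhere else. A minimal sketch, assuming a hypothetical FEK_EXTRA_CHECKS switch set by the build system and a hypothetical function clamp_index, neither of which comes from the text above:

```cpp
#include <cassert>

// The macro is consulted in exactly one place and turned into a typed constant.
// FEK_EXTRA_CHECKS is a hypothetical switch, for example passed with -D.
#ifdef FEK_EXTRA_CHECKS
inline constexpr bool extra_checks = true;
#else
inline constexpr bool extra_checks = false;
#endif

// Both branches are parsed and type-checked on every build, so the disabled
// variant cannot silently rot; the discarded branch is not executed.
inline int clamp_index(int i, int size) {
    if constexpr (extra_checks) {
        if (i < 0 || i >= size) { return -1; }  // report an invalid index
    }
    return i;
}
```

This does not help when the disabled code cannot compile at all on some target (missing headers, platform-specific types); in that case the preprocessor or the build system remains necessary.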

Handling __func__/std::source_location

Because introducing a new function (a lambda too) means changing the value of __func__, if we want to keep it unchanged there is generally no substitute for a macro. Even with the addition of std::source_location, there are use-cases that can’t be covered. For example, a variadic function cannot have default parameters, thus having std::source_location as parameter is not as useful, and other functions, like binary operators or overridden functions, cannot add other parameters.

Unify C functions in overloads

When dealing with libraries that provide a C API, it happens often that there are multiple different functions that in C++ would be normally handled with function overloading.

For example the C standard library defines abs, labs, fabs, fabsf, fabsl and other functions. Calling the "wrong" function does not yield a compile-time error, but causes an implicit conversion with unwanted effects. In C++ there is an overloaded std::abs function that makes those types of errors impossible.

Macros, in some situations, can help to "generate" some glue code, especially if the function names contain the type.

For example:

// from the C API
int foo_int(int);
long foo_long(long);


// macro to generate glue code
#define TO_OVERLOAD(type, name) inline type name(type v){return name##_##type(v);}

// generate overload set
TO_OVERLOAD(int, foo)
TO_OVERLOAD(long, foo)


// forget about foo_int and foo_long, use foo, it will call the correct overload
i = foo(i);
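The same idea as a self-contained sketch that can be compiled as is. The functions twice_int and twice_long are hypothetical stand-ins for a C API, defined here only so that the example links.

```cpp
#include <cassert>
#include <type_traits>

// Stand-ins for a C API (hypothetical names and behavior).
inline int twice_int(int v) { return 2 * v; }
inline long twice_long(long v) { return 2 * v; }

// Glue macro from the text: pastes name, '_' and type into the C function name.
#define TO_OVERLOAD(type, name) \
    inline type name(type v) { return name##_##type(v); }

TO_OVERLOAD(int, twice)
TO_OVERLOAD(long, twice)
#undef TO_OVERLOAD

// Overload resolution now picks the function matching the argument type.
static_assert(std::is_same<decltype(twice(1)), int>::value, "int overload");
static_assert(std::is_same<decltype(twice(1L)), long>::value, "long overload");
```

Note that the token pasting only works for types spelled as a single token; something like unsigned long would need a typedef or an additional macro parameter for the suffix.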

Common functions

Sometimes macros are used for implementing common functionalities. The first example that comes to mind is implementing operator!= in terms of operator==:

#define FEK_OP_UNEQUAL_INLINE(type) friend bool operator!=(const type& lhs, const type& rhs)noexcept(noexcept(rhs == lhs)){ return not (rhs == lhs);}

struct s{
    friend bool operator==(const s& a, const s& b);
    FEK_OP_UNEQUAL_INLINE(s);
};

// or

#define FEK_OP_UNEQUAL_DECL(type) friend bool operator!=(const type& lhs, const type& rhs)noexcept(noexcept(rhs == lhs))
#define FEK_OP_UNEQUAL_IMPL(type) bool operator!=(const type& lhs, const type& rhs)noexcept(noexcept(rhs == lhs)){ return not (rhs == lhs);}
struct s{
    friend bool operator==(const s& a, const s& b);
    FEK_OP_UNEQUAL_DECL(s);
};

// in the .cpp file
FEK_OP_UNEQUAL_IMPL(s)

Such a pattern can be avoided with CRTP:

template<typename T>
class gen_uneq_from_eq {
    friend bool operator!=(const T& lhs, const T& rhs) noexcept(noexcept(rhs == lhs)){ return not (rhs == lhs); }
};


struct s : private gen_uneq_from_eq<s> {
    friend bool operator==(const s& a, const s& b);
};

The main disadvantage of CRTP is that it forces the implementation to be in a header file, as the whole class is templated, while with the macro approach it is possible to split the declaration and the implementation. For operator!= this should not be a problem, but if the functionality of the base class needs to include some other headers which are not needed by the end-user, it would at least negatively affect compilation times.

X-Macro

enum class e {e1,e2,e3,e4,e5,e6,e7,e8};

const char* to_string(e v){
  switch(v){
      case e::e1: return "e::e1";
      case e::e2: return "e::e2";
      case e::e3: return "e::e3";
      case e::e4: return "e::e4";
      case e::e5: return "e::e5";
      case e::e6: return "e::e6";
      case e::e7: return "e::e7";
      case e::e8: return "e::e8";
  }
}

vs

enum class e {e1,e2,e3,e4,e5,e6,e7,e8};

const char* to_string(e v){
  switch(v){
    #define CASE_RET_LIT(v) case v: return #v
    CASE_RET_LIT(e::e1);
    CASE_RET_LIT(e::e2);
    CASE_RET_LIT(e::e3);
    CASE_RET_LIT(e::e4);
    CASE_RET_LIT(e::e5);
    CASE_RET_LIT(e::e6);
    CASE_RET_LIT(e::e7);
    CASE_RET_LIT(e::e8);
    #undef CASE_RET_LIT
  }
}

Some libraries solve this problem without resorting to macros (at least macros visible to the user), but they generally rely on compiler extensions and thus depend on both compiler vendor and version.

X-macros can be used to further reduce the amount of written code:

#define ENLIST_ENUMS(X) \
    X(e1) \
    X(e2) \
    X(e3) \
    X(e4) \
    X(e5) \
    X(e6) \
    X(e7) \
    X(e8)

enum class e {
    #define AS_VALUE(a) a ,
    ENLIST_ENUMS(AS_VALUE)
    #undef AS_VALUE
};

const char* to_string(e v){
  switch(v){
    #define CASE_RET_LIT(a) case e::a: return "e::"#a;
    ENLIST_ENUMS(CASE_RET_LIT)
    #undef CASE_RET_LIT
  }
}
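The same list can drive more than one mapping. The following sketch adds the reverse direction (string to enum) and an entry count, all generated from one X-macro; the enum color and the names ENLIST_COLORS, from_string and color_count are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical enum, declared once as an X-macro list.
#define ENLIST_COLORS(X) \
    X(red)               \
    X(green)             \
    X(blue)

enum class color {
#define AS_VALUE(a) a,
    ENLIST_COLORS(AS_VALUE)
#undef AS_VALUE
};

inline const char* to_string(color v) {
    switch (v) {
#define CASE_RET_LIT(a) case color::a: return #a;
        ENLIST_COLORS(CASE_RET_LIT)
#undef CASE_RET_LIT
    }
    return "";  // value that does not correspond to a named enumerator
}

// Reverse mapping generated from the same list; returns false for unknown names.
inline bool from_string(const char* s, color& out) {
#define IF_MATCH(a) if (std::strcmp(s, #a) == 0) { out = color::a; return true; }
    ENLIST_COLORS(IF_MATCH)
#undef IF_MATCH
    return false;
}

// Counting the entries: expands to 0 +1 +1 +1.
#define PLUS_ONE(a) +1
constexpr int color_count = 0 ENLIST_COLORS(PLUS_ONE);
#undef PLUS_ONE
static_assert(color_count == 3, "three entries in the list");
```

Adding a new enumerator then means touching a single line, and every generated function stays in sync.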

Also, X-Macros make it easy to create more complex data structures, not only mappings from enums to literals (and vice versa).

Consider for example, what would normally be a tabular data structure

item     | property1   | property2   | property3   | ...

item_a   | property1_a | property2_a | property3_a | ...
item_b   | property1_b | property2_b | property3_b | ...
item_c   | property1_c | property2_c | property3_c | ...
...

Thanks to constexpr, it is possible to create such a table/matrix at compile-time, but sometimes, just like when mapping an enum to a string, a macro makes it possible to further reduce code duplication (and generate more compact code).

// file.mpp
X(item_a, property1_a, property2_a, property3_a, ...)
X(item_b, property1_b, property2_b, property3_b, ...)
X(item_c, property1_c, property2_c, property3_c, ...)
....

A more concrete example: encoding different information of a Pokémon:

// pokemon.mpp
// Name,       if starter,    possible evolutions, eventually other data...
X(Bulbasaur,   starter::yes,  EVOLVES_TO(Ivysaur))
X(Ivysaur,     starter::no,   EVOLVES_TO(Venusaur))
X(Venusaur,    starter::no,   EVOLVES_TO())
X(Charmander,  starter::yes,  EVOLVES_TO(Charmeleon))
X(Charmeleon,  starter::no,   EVOLVES_TO(Charizard))
X(Charizard,   starter::no,   EVOLVES_TO())
// ...
X(Eevee,       starter::yes,  EVOLVES_TO(Vaporeon, Jolteon, Flareon))
X(Vaporeon,    starter::no,   EVOLVES_TO())
X(Jolteon,     starter::no,   EVOLVES_TO())
X(Flareon,     starter::no,   EVOLVES_TO())
// ...

Note: I’m using .mpp as the file extension, as the file is not valid C++ code (unless X is defined to something), and thus it’s neither a header file nor an implementation file.

// pokemon.hpp
#pragma once
#include <span>
#include <string_view>
enum class starter{yes,no};
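// ids are derived from the line number inside pokemon.mpp, assuming the first
// entry (Bulbasaur) sits on line 3 as in the listing above; id_offset maps it to 1
constexpr int id_offset = 2;
enum pokemon {
#define X(name, starter, evolves) name = __LINE__ - id_offset,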
#include "pokemon.mpp"
#undef X
};

std::string_view name(pokemon p);
int id(pokemon p);
std::span<const pokemon> evolutions(pokemon p);

// pokemon.cpp
#include "pokemon.hpp"
#include <array>
#include <cstddef>

// helper for creating local array with EVOLVES_TO
template <std::size_t N>
constexpr std::array<pokemon, N> to_evolutions(const pokemon (&a)[N]){
  return std::to_array(a);
}
constexpr std::array<pokemon, 0> to_evolutions(int){
  return {};
}

std::string_view name(pokemon p) {
  switch (p) {
#define X(name, starter, evolves) case name: return #name;
#include "pokemon.mpp"
#undef X
  }
}

int id(pokemon p) {
  static_assert(static_cast<int>(Bulbasaur)==1, "sanity check");
  return static_cast<int>(p);
}

std::span<const pokemon> evolutions(pokemon p){
  switch (p) {
#define EVOLVES_TO(...) {__VA_ARGS__}
#define X(name, starter, evolves) case name : {static constexpr auto evolutions = to_evolutions( evolves ); return evolutions;}
#include "pokemon.mpp"
#undef X
#undef EVOLVES_TO
  }
}

Note that none of these functions requires any memory allocation, and they can be used at compile-time if one wants to, by moving the implementation to a header and adding constexpr. Writing an alternative implementation without macros adds a lot of boilerplate and duplicated code, which makes it harder and more error-prone to maintain.

short-circuiting

With macros, it is generally possible to short-circuit, while function arguments are always evaluated, even if the function is completely inlined (as inlining cannot change the behavior of a well-written program).

#define LAZY_COMPARE_EQ_IF_NOTNULL_BAD(a, b) ((a) != nullptr) and ((b) != nullptr) and ((a) == (b))
#define LAZY_COMPARE_EQ_IF_NOTNULL_BETTER(a_, b_) [](){const void* a = a_; if(a==nullptr) return false; const void* b = b_; return (b != nullptr) and (a == b);}()

bool compare_eq_if_notnull(const void* a, const void* b){
  return (a != nullptr) and (b != nullptr) and (a == b);
}

int* g() noexcept;
const int* h() noexcept;

void foo(){
  // does not call h() if g() returns null, but might call the functions more than once
  if(LAZY_COMPARE_EQ_IF_NOTNULL_BAD(g(), h())){
  }

  // does not call h() if g() returns null, every function called at most once
  if(LAZY_COMPARE_EQ_IF_NOTNULL_BETTER(g(), h())){
  }

  // calls g and h once, always
  if(compare_eq_if_notnull(g(), h())){
  }
}

By adding another layer of indirection, it is possible to solve this problem without resorting to macros:

template <class F, class G>
bool compare_eq_if_notnull(F f, G g){
  const void* a = f();
  if(a==nullptr) return false;
  const void* b = g();
  return (b != nullptr) and (a == b);
}
void foo(){
  // does not call h() if g() returns null, every function called at most once
  // just like LAZY_COMPARE_EQ_IF_NOTNULL_BETTER
  if(compare_eq_if_notnull([]{return g();}, []{return h();})){
  }
}

The advantage, apart from the fact that we do not use a macro anymore, is that it looks and works like "normal" C++ code. There is no short-circuiting: the function arguments (the lambdas) are always evaluated, but g and h are called only if the corresponding lambda is invoked inside the function. On the other hand, we need to remember to add this (verbose) indirection every time we use compare_eq_if_notnull, while with the macro it was not necessary.
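
If the verbosity at the call site is a concern, both approaches can be combined: a tiny macro wraps an expression in a lambda, and everything else stays a normal function. The following is only a sketch; MYLIB_LAZY, the counters, and the g_null/g_value/h_value functions are hypothetical names introduced for this illustration.

```cpp
// Hypothetical helper: wraps an expression in a lambda, so that it is only
// evaluated when (and if) the callee invokes the lambda.
#define MYLIB_LAZY(expr) [&] { return (expr); }

// Same function template as above, repeated to keep the sketch self-contained.
template <class F, class G>
bool compare_eq_if_notnull(F f, G g) {
  const void* a = f();
  if (a == nullptr) return false;
  const void* b = g();
  return (b != nullptr) and (a == b);
}

// Counters and functions used only to observe what gets called.
int g_calls = 0;
int h_calls = 0;
int value = 42;
int* g_null() noexcept { ++g_calls; return nullptr; }
int* g_value() noexcept { ++g_calls; return &value; }
const int* h_value() noexcept { ++h_calls; return &value; }

// h_value() is not called here, because g_null() returns a null pointer.
bool null_case() {
  return compare_eq_if_notnull(MYLIB_LAZY(g_null()), MYLIB_LAZY(h_value()));
}

// Both functions are called exactly once.
bool equal_case() {
  return compare_eq_if_notnull(MYLIB_LAZY(g_value()), MYLIB_LAZY(h_value()));
}
```

The macro is reduced to the one thing a function cannot do (delaying the evaluation of an expression), while the comparison logic remains ordinary C++.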

Take advantage of compiler extensions where possible

Compiler extensions are especially useful if they neither alter the behavior of a program nor make an ill-formed program well-formed, but provide additional possibilities for finding or avoiding errors.

For example, the Microsoft compiler supports SAL annotations. As other compilers do not support them, wrapping them in macros ensures that the code still compiles with other compilers, while a build with MSVC can take advantage of the compiler-specific diagnostics.

Since C++11 there is a standardized syntax for annotations, but not all compilers support all attributes with the standard syntax, and not all annotations can be mapped to the standardized syntax.

Thus, where possible, using the standardized syntax avoids adding an indirection through a macro just to keep the code portable.

Annotations are provided by compilers or external tools; thus, except for the given tools, the macro should expand to nothing. It’s a form of conditional compilation.
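
A minimal sketch of such a wrapper could look like the following. The MYLIB_ prefix and the copy_ints function are assumptions made for this example; _In_reads_ and _Out_writes_ are the SAL annotations from <sal.h>.

```cpp
#include <cstddef>

#if defined(_MSC_VER)
#include <sal.h>
// MSVC: forward to the SAL annotations, so that the static analysis can
// check the callers.
#define MYLIB_IN_READS(count) _In_reads_(count)
#define MYLIB_OUT_WRITES(count) _Out_writes_(count)
#else
// Other compilers: the annotations disappear, the code stays valid C++.
#define MYLIB_IN_READS(count)
#define MYLIB_OUT_WRITES(count)
#endif

// Copies `count` elements from `src` to `dst`.
// Both buffers must hold at least `count` elements.
void copy_ints(MYLIB_OUT_WRITES(count) int* dst,
               MYLIB_IN_READS(count) const int* src,
               std::size_t count) {
  for (std::size_t i = 0; i != count; ++i) {
    dst[i] = src[i];
  }
}
```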

Also, some annotations can be replaced by appropriate language constructs or specifically designed classes, like std::string_view, std::span, std::unique_ptr and so on. Those library facilities are a superior solution as all tools (especially the compiler) that support C++ should understand those constructs, which can set invariants both at runtime and at compile time.
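
As an illustration, a pointer-plus-length interface, which would need an annotation (or a comment) to document how the two parameters relate, can often be expressed with std::string_view (C++17) instead. The function names below are made up for this sketch.

```cpp
#include <cstddef>
#include <string_view>

// Annotation-style interface: the relation between `text` and `length`
// only lives in a comment or in a tool-specific annotation.
std::size_t count_spaces_raw(const char* text, std::size_t length) {
  std::size_t n = 0;
  for (std::size_t i = 0; i != length; ++i) {
    if (text[i] == ' ') ++n;
  }
  return n;
}

// Library-type interface: pointer and length travel together in one object,
// so call sites do not have to pass them separately.
std::size_t count_spaces(std::string_view text) {
  std::size_t n = 0;
  for (char c : text) {
    if (c == ' ') ++n;
  }
  return n;
}
```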

Guidelines for macros

When to use a macro

Use it when there is no alternative for reducing code duplication.

As macros are a lower-level tool than functions, there will always be valid use-cases for them. But as they have clear drawbacks, it is important to avoid them when not necessary.

Stay practical: use them if they make the code easier to understand and maintain (for example a single file with one or more #ifdef versus multiple files with duplicated content), but not if the advantages are negligible, because tools (and humans) have more difficulty understanding the surrounding code, as macros can hide a lot of relevant information about the structure of the program.

To avoid errors, reduce the scope/functionality encapsulated in a macro as much as possible; forward the work to a C++ construct like a class or function as soon as possible. The macro should do only what cannot be accomplished with something else.
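
A sketch of this guideline, using a hypothetical logging facility (log_entry, last_entry, log_message and MYLIB_LOG are names invented for this example): the macro only captures the call site, and immediately forwards to a normal function.

```cpp
#include <string>

// Hypothetical record type and storage, used only for this sketch.
struct log_entry {
  std::string file;
  int line;
  std::string message;
};

inline log_entry& last_entry() {
  static log_entry entry{};
  return entry;
}

// All the real work happens in a normal function ...
inline void log_message(const char* file, int line, const std::string& message) {
  last_entry() = log_entry{file, line, message};
}

// ... the macro only provides what a function cannot: the location of the caller.
#define MYLIB_LOG(message) log_message(__FILE__, __LINE__, (message))

inline void example() {
  MYLIB_LOG("something happened");
}
```

Because the macro body is a single function call, the function can be debugged, overloaded, and tested like any other code.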

Make the macro upper-case and prefixed

Better both, at least one, never none. This rule is to ensure that there are no clashes with "normal" code, as macros take precedence over other C++ constructs.
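
A short sketch of why the naming matters, with a hypothetical MYLIB_MAX macro: since the preprocessor does not know about namespaces or scopes, a lower-case, unprefixed name would also replace identifiers that have nothing to do with the macro.

```cpp
#include <algorithm>

// Prefixed and upper-case: cannot be confused with a function,
// and will not silently rewrite unrelated code.
#define MYLIB_MAX(a, b) ((a) > (b) ? (a) : (b))

// A lower-case, unprefixed variant such as
//   #define max(a, b) ((a) > (b) ? (a) : (b))
// would also rewrite the call to std::max below, and break it.
inline int largest(int x, int y, int z) {
  return std::max(MYLIB_MAX(x, y), z);
}
```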

Multi-statement macros

Macros that consist of multiple statements can be problematic, as they can accidentally change the meaning of the surrounding code. There are multiple approaches, depending on how the macro will be used, in particular on whether we want to assign some result or change the control flow (return, break, continue, goto).

// = nothing
// * statement can be split by accident (for example when writing if(cond) FOOBAR();)
// * forces end users to write ;
// * can change control flow (return, goto, break and continue)
// * result of first statement can be assigned
// * can define variables that might clash with others in the same scope
#define FOOBAR() foo(); bar()

// = scope
// * statement is always intact
// * cannot force the user to write ; at the end, and if the user does, it can create invalid code: if(cond) FOOBAR_SCOPE(); else FOOBAR_SCOPE();
// * can change control flow (return, goto, break and continue)
// * can define local variables (no clashes)
// * cannot assign result of any statement to variable
#define FOOBAR_SCOPE() {foo(); bar();}

// = do{...}while(false)
// * statement is always intact
// * can change control flow (return, goto), but not with break or continue (they would apply to the do-while itself)
// * cannot assign result of any statement to variable
// * forces end users to write ;
#define FOOBAR_DOWHILEFALSE() do{foo(); bar();}while(false)

// comma op
// * statement is always intact
// * cannot change control flow (return, goto, break and continue)
// * comma operator can be overloaded -> use void()
// * result of last statement can be assigned
// * cannot define local variables
// * forces end users to write ;
// * usable in c++11 constexpr context
#define FOOBAR_COMMA() (foo(), void(), bar())

// lambda (since c++11)
// * statement is always intact
// * cannot change control flow (return, goto, break and continue)
// * can assign result of any statement to variable
// * can define local variables
// * changes __func__, unless used in the capture (possible since c++14),
//   it's otherwise equivalent to writing a separate function and using it in the macro directly
// * can be passed as callbacks to other functions
// * if written without `()`, and not passed as parameter, it's not a compilation error,
//   hopefully the compiler reports it as unused, as it is probably an error
#define FOOBAR_LAMBDA [](){foo(); return bar();}


// immediately invoked lambda (since c++11)
// * statement is always intact
// * cannot change control flow (return, goto, break and continue)
// * can assign result of any statement to variable
// * can define local variables
// * changes __func__, unless used in the capture (possible since c++14), or passed as param to the lambda
//   it's otherwise equivalent to writing a separate function and using it in the macro directly
#define FOOBAR_INVOKEDLAMBDA() [](){foo(); return bar();}()

| construct | statement intact | forces ; | control flow (except throw) | assign result | local variables | usable as callback |
|---|---|---|---|---|---|---|
| nothing | no | yes | return, goto, break, continue | first statement | no | no |
| scope | yes | no | return, goto, break, continue | no | yes | no |
| do-while-false | yes | yes | return, goto | no | yes | no |
| comma | yes | yes | no | last statement | no | no |
| lambda | yes | yes | no | any statement | yes | yes |
| invoked lambda | yes | yes | no | any statement | yes | no |

As a guideline, I would use an invoked lambda, as it restricts the macro to what a function could do, while staying flexible, since it can return values like a real function. The do-while-false approach is useful when one wants, for example, to conditionally return from the enclosing function.
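
The difference between the two recommended forms can be sketched as follows. All names (foo_ok, bar_value and the MYLIB_ macros) are hypothetical and only serve this illustration.

```cpp
// Hypothetical functions standing in for foo() and bar().
inline bool foo_ok(int x) { return x >= 0; }
inline int bar_value(int x) { return x * 2; }

// do-while-false: the macro can leave the *calling* function early.
#define MYLIB_RETURN_IF_NEGATIVE(x, fallback) \
  do { \
    if (!foo_ok(x)) return (fallback); \
  } while (false)

// Immediately invoked lambda: behaves like an expression with a value,
// and a `return` inside it only leaves the lambda.
#define MYLIB_DOUBLE_OR_ZERO(x) \
  [&]() -> int { \
    if (!foo_ok(x)) return 0; \
    return bar_value(x); \
  }()

inline int with_do_while(int x) {
  MYLIB_RETURN_IF_NEGATIVE(x, -1);  // may return -1 from with_do_while
  return bar_value(x);
}

inline int with_lambda(int x) {
  const int result = MYLIB_DOUBLE_OR_ZERO(x);
  return result + 1;  // always reached, the macro cannot return from here
}
```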

If break or continue are needed, consider putting the loop in a separate function. This makes it possible to reuse the do-while-false trick with a return. Otherwise, consider using a goto.
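
A sketch of the first suggestion, with hypothetical names: the loop lives in its own function, so "leaving the loop" becomes a plain return, which works inside do-while-false.

```cpp
#include <vector>

// Hypothetical macro that stops the processing at the first negative value.
// A `break` written here would only leave the do-while, not the outer loop.
#define MYLIB_STOP_IF_NEGATIVE(value, result) \
  do { \
    if ((value) < 0) return (result); \
  } while (false)

// The loop is the whole function body, so returning is equivalent to
// breaking out of the loop.
inline int sum_until_negative(const std::vector<int>& values) {
  int sum = 0;
  for (int v : values) {
    MYLIB_STOP_IF_NEGATIVE(v, sum);
    sum += v;
  }
  return sum;
}
```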

The comma operator is mostly useful in C++11 in a constexpr context.
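
The reason is that in C++11 the body of a constexpr function is essentially limited to a single return statement, so sequencing has to happen inside one expression. A minimal sketch, where MYLIB_CHECKED and checked_div are hypothetical names:

```cpp
#include <stdexcept>

// Hypothetical macro: checks a precondition, then yields a value.
// Everything is a single expression, so it fits in the single return
// statement of a C++11 constexpr function.
// The left operand of the comma has type void, so no overloaded comma
// operator can be picked up.
#define MYLIB_CHECKED(cond, value) \
  (((cond) ? void() : throw std::logic_error(#cond)), (value))

constexpr int checked_div(int a, int b) {
  return MYLIB_CHECKED(b != 0, a / b);
}

// Usable at compile time as long as the precondition holds.
static_assert(checked_div(6, 3) == 2, "usable at compile time");
```

At runtime, a call that violates the precondition throws std::logic_error instead of dividing by zero.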

For single-statement macros, none of those workarounds are necessary.

