Global variables in C++ libraries

While working on a big project, we noticed some issues when the application exited.

Most of the time those issues manifested themselves as crashes, but it was not obvious which piece of code caused them.

After investigating the issue for a long time, we learned that (global) constants have a lot of bad side effects we were not aware of, even if they are confined to a single translation unit.

Long story short, we had an issue similar to those that crypto++ had in 2010: a destructor of a global instance was executed more than once.

This is a minimal code example for reproducing the issue (depending on the environment):

// header file
#include <string>

const std::string hello = "Hello World!"; // possible double free!

or, to make the issue even more explicit:

#include <iostream>

struct my_struct{
	my_struct(){
		std::cout << this << " hello\n";
	}
	~my_struct(){
		std::cout << this << " goodbye\n";
	}
};

const my_struct instance;

Baffled that undefined behavior gets triggered by such innocent-looking code, I decided to test different environments and configurations to see how to write global variables safely. While it might be a dubious task, as globals are best avoided, there are valid use cases (a cache, for example) and, most importantly, constants are globals too. After all, I want to write code that works in all environments, be it on Windows or GNU/Linux, as an application or as a library.

Most classes are already designed in a way that they can be used in most places. An int, like any other primitive type, can be used as a local variable, member variable of another object and global variable. There is no need to use a special syntax for initializing or using those different types of variables.

The same is true for all classes that are part of the standard library, and for most custom classes. Of course, it is possible to disable such behavior, for example by hiding constructors and destructors, overloading operator new, and so on.

Therefore I find it very annoying that some code, apparently completely valid and designed to be used in all situations, causes undefined behavior in some environments.

In short, the simplest example I could create was a static library (lib0), used by two shared libraries (lib1 and lib2), which were both linked to an executable. If lib0 were shared, or if all libraries were static, the issue would not be reproducible. Of course, there are a lot of other factors: where the global instance is defined makes a difference, and so does the platform. On Windows the issue is reproducible in only one particular scenario, while on GNU/Linux it showed up in other scenarios.

The main triggering factor was symbol visibility. Other factors, like linkage, made the situation better or worse; for more details, see the tables below.

To make the situation more complex and confusing, some language features have multiple effects. For example, declaring a global variable as const does not only make it const, it might change its linkage too (in C++; in C it does not change the linkage)!
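The effect of const on linkage can be sketched in a single file. The file name and all identifiers below are made up for illustration; the comments state the linkage that C++ assigns to each declaration. The last variable needs C++17.

```cpp
// linkage_sketch.cpp (hypothetical file name)

// Non-const variable at namespace scope: external linkage.
int counter = 0;

// const without extern: internal linkage in C++ (external linkage in C).
// Every translation unit that contains this line gets its own copy.
const int limit_internal = 10;

// Declaring the constant extern first gives it external linkage.
extern const int limit_external;
const int limit_external = 20;

// Since C++17: an inline variable has external linkage even if it is const.
inline const int limit_inline = 30;

// Hypothetical helper that only exists to use the three constants.
int sum_of_limits() {
    return limit_internal + limit_external + limit_inline;
}
```

Within one file the three constants behave the same; the difference only becomes visible once several translation units or libraries contain the same lines.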

Test environment

The class used for testing looks like

struct my_struct{
	my_struct(){
		std::cout << this << " hello\n";
	}
	~my_struct(){
		std::cout << this << " goodbye\n";
	}
};

and the test environment looks as follows

add_library(lib0 STATIC lib0/lib0.hpp lib0/lib0.cpp)
target_include_directories(lib0 PUBLIC lib0)

add_library(lib1 lib1/lib1.hpp lib1/lib1.cpp)
target_include_directories(lib1 PUBLIC lib1)
target_link_libraries(lib1 lib0)

add_library(lib2 lib2/lib2.hpp lib2/lib2.cpp)
target_include_directories(lib2 PUBLIC lib2)
target_link_libraries(lib2 lib0)

add_executable(main main/main.cpp)
target_link_libraries(main lib1 lib2)

As described above, there is a static library (lib0), used by two separate shared libraries (lib1 and lib2), and both are linked to the same executable (main).

From the description, it looks very similar to the diamond problem of multiple inheritance. While writing this article I found the term "diamond dependency", which seems like an appropriate description. Unfortunately, this term is already used to describe a different issue: two libraries depend on a common library, but at different versions. The issue described in this article arises even when there is no version mismatch!

Summary

Some notes on the results:

  • No libraries have been loaded by hand.

  • Binaries have been compiled and executed under GNU/Linux (thanks to WINE for testing Windows executables).

  • A further static analysis can be made with nm, objdump and readelf for those who know how to interpret the result.

  • Everything tested is undefined behavior or implementation-specific (as the C++ standard does not define libraries in any way!), so it might break and change.

  • The column "ctor/dtor" gives the number of times the constructor (or destructor) has been called.

  • The column "inst" gives the number of different objects that have been constructed (and destructed).

  • The column "visibility" states which -fvisibility flag has been passed to the compiler.

  • The column "weak" states whether the attribute weak has been used (w) or not (nw).

  • The column "const" states whether the instance has been declared const (c) or not (nc).

  • The column "static" states whether the instance has been declared static (s) or not (ns).

  • The column "link" states what linkage the instance has: internal (int), external (ext) or weak. The result depends on qualifiers like extern, static, weak and const.

  • "n/a" means that the option was not available (or that I’m not aware of it) for the given compiler and/or configuration.

  • If a symbol has been hidden, then it does not have weak linkage, even if annotated with the weak attribute.

  • If a symbol has weak linkage, then it also has external linkage.

  • Since C++17, it is possible to declare variables as inline. Where this made a difference, there is a separate row to show the effects.
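The "ctor/dtor" and "inst" values can be derived from the lines that my_struct prints. The helper below is only a sketch of that bookkeeping, not the tooling used for the tables; the names run_summary and summarize are invented, and it assumes the output has already been captured as one string per line.

```cpp
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Counts derived from the "<address> hello" / "<address> goodbye" lines
// printed by my_struct.
struct run_summary {
    std::size_t ctor_calls = 0;  // number of "hello" lines
    std::size_t dtor_calls = 0;  // number of "goodbye" lines
    std::size_t instances = 0;   // number of distinct addresses seen
};

// Hypothetical helper: summarize the captured output of one test run.
run_summary summarize(const std::vector<std::string>& lines) {
    run_summary result;
    std::set<std::string> addresses;
    for (const std::string& line : lines) {
        std::istringstream in(line);
        std::string address, event;
        if (!(in >> address >> event)) {
            continue;  // skip lines that do not have two fields
        }
        if (event == "hello") {
            ++result.ctor_calls;
        } else if (event == "goodbye") {
            ++result.dtor_calls;
        } else {
            continue;  // unrelated output
        }
        addresses.insert(address);
    }
    result.instances = addresses.size();
    return result;
}
```

A row that reads 2 for "ctor/dtor" and 1 for "inst" therefore corresponds to a log in which the same address appears twice per event: one object that was constructed and destroyed twice.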

Global in header file

// .hpp
STATIC CONST my_struct instance = {};
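
| ctor/dtor | inst | compiler  | visibility     | weak | const | static | link | other settings       |
|-----------|------|-----------|----------------|------|-------|--------|------|----------------------|
| 4         | 4    | gcc/clang | default/hidden | n/a  | c/nc  | s/ns   | int  | if nc, then static   |
| 4         | 4    | msvc      | n/a            | n/a  | c/nc  | s/ns   | int  | if nc, then static   |
| 4         | 4    | mingw     | default/hidden | n/a  | c/nc  | s/ns   | int  | if nc, then static   |
| 2         | 2    | gcc/clang | hidden         | n/a  | c     | ns     | ext  | inline (since C++17) |
| 1         | 1    | gcc/clang | default        | n/a  | c     | ns     | weak | inline (since C++17) |
| 2         | 2    | mingw     | default/hidden | n/a  | c     | ns     | ext  | inline (since C++17) |
| 2         | 2    | msvc      | n/a            | n/a  | c     | ns     | ext  | inline (since C++17) |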

For those not acquainted with the term "translation unit", this code might not do what the author wanted when defining a global instance. Every translation unit (rule of thumb: every .cpp file that includes the given header file) will have its own instance of instance, unless inline (since C++17) is used.
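As a minimal sketch of the inline variant, a header could look as follows. The include guard and the accessor instance_address are assumptions added for illustration. Note that inline only merges the copies within one binary; as the table shows, what happens across shared libraries still depends on the compiler and the visibility settings.

```cpp
// lib0.hpp (sketch; the include guard name is made up)
#ifndef LIB0_HPP
#define LIB0_HPP

#include <iostream>

struct my_struct {
    my_struct() { std::cout << this << " hello\n"; }
    ~my_struct() { std::cout << this << " goodbye\n"; }
};

// Without inline, const gives the variable internal linkage, so every
// translation unit that includes this header would get its own copy:
// const my_struct instance = {};

// Since C++17: a single entity with external linkage, shared by all
// translation units of the same binary.
inline const my_struct instance = {};

// Hypothetical accessor, only here to make the address observable.
inline const my_struct* instance_address() { return &instance; }

#endif
```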

Global in source file, with extern declaration

// .hpp / or .cpp
extern CONST my_struct WEAK_ATTR instance;

// .cpp
CONST my_struct instance = {};
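
| ctor/dtor | inst | compiler  | visibility     | weak | const | static | link | other settings       |
|-----------|------|-----------|----------------|------|-------|--------|------|----------------------|
| 2         | 1    | gcc/clang | default        | nw   | c/nc  | n/a    | ext  |                      |
| 2         | 2    | gcc/clang | hidden         | w/nw | c/nc  | n/a    | ext  |                      |
| 1         | 1    | gcc/clang | default        | w    | c/nc  | n/a    | weak |                      |
| 2         | 2    | msvc      | n/a            | n/a  | c/nc  | n/a    | ext  |                      |
| 2         | 2    | mingw     | default/hidden | w/nw | c/nc  | n/a    | ext  |                      |
| 1         | 1    | gcc/clang | default        | nw   | c/nc  | n/a    | ext  | inline (since C++17) |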
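CONST and WEAK_ATTR are placeholders for the tested combinations. One plausible expansion, which is an assumption and not necessarily the one used for the measurements, is sketched below for the "const, weak" case; the accessor instance_address is invented for illustration.

```cpp
#include <iostream>

// Assumed expansions of the placeholders used in the snippet above.
#if defined(__GNUC__) || defined(__clang__)
#define WEAK_ATTR __attribute__((weak))  // GCC and clang spelling
#else
#define WEAK_ATTR  // no weak attribute is used for other compilers here
#endif
#define CONST const  // would be empty for the non-const rows

struct my_struct {
    my_struct() { std::cout << this << " hello\n"; }
    ~my_struct() { std::cout << this << " goodbye\n"; }
};

// .hpp: declaration, visible to every user of the library
extern CONST my_struct WEAK_ATTR instance;

// .cpp: the single definition; it inherits the weak attribute
CONST my_struct instance = {};

// Hypothetical accessor, only here to make the address observable.
const my_struct* instance_address() { return &instance; }
```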

Global in source file, in unnamed namespace

// file.cpp
namespace {
	STATIC CONST my_struct instance = {};
}
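
| ctor/dtor | inst | compiler  | visibility     | weak | const | static | link | other settings |
|-----------|------|-----------|----------------|------|-------|--------|------|----------------|
| 2         | 2    | gcc/clang | default/hidden | n/a  | c/nc  | s/ns   | int  |                |
| 2         | 2    | msvc      | n/a            | n/a  | c/nc  | s/ns   | int  |                |
| 2         | 2    | mingw     | default/hidden | n/a  | c/nc  | s/ns   | int  |                |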

This instance is not accessible from other translation units, but it could be made accessible through a function. The static modifier does not have any effect. An unnamed namespace in a translation unit defines an object with internal linkage and (apparently) hidden visibility. I could not find this fact (that anonymous namespaces have hidden visibility) stated anywhere in the documentation, but it makes sense and is a nice feature.
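Exposing such an instance through a function could look like the following sketch. The function name get_instance is an assumption; any function with external linkage defined in the same .cpp file would do.

```cpp
// file.cpp (sketch)
#include <iostream>

struct my_struct {
    my_struct() { std::cout << this << " hello\n"; }
    ~my_struct() { std::cout << this << " goodbye\n"; }
};

namespace {
// Internal linkage: other translation units cannot name this object.
const my_struct instance = {};
}  // namespace

// Hypothetical accessor with external linkage. Callers get a reference to
// the object but cannot define a second symbol that clashes with it.
const my_struct& get_instance() { return instance; }
```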

Global in source file

// .cpp
STATIC CONST my_struct instance = {};
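
| ctor/dtor | inst | compiler  | visibility     | weak | const | static | link | other settings                                      |
|-----------|------|-----------|----------------|------|-------|--------|------|-----------------------------------------------------|
| 2         | 2    | gcc/clang | default/hidden | n/a  | c/nc  | s/ns   | int  | except nc + ns with default visibility (next row)   |
| 2         | 1    | gcc/clang | default        | n/a  | nc    | ns     | ext  |                                                     |
| 2         | 2    | msvc      | n/a            | n/a  | c/nc  | s/ns   | int  |                                                     |
| 2         | 2    | mingw     | default/hidden | n/a  | c/nc  | s/ns   | int  |                                                     |
| 1         | 1    | gcc/clang | default        | n/a  | c/nc  | ns     | int  | inline (since C++17)                                |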

The anonymous namespace is a superior alternative, as it is less error-prone.
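One reason the unnamed namespace is less error-prone is that it also covers types, which static cannot do. The sketch below uses invented names (helper, next_id, make_helper, make_two_helpers) to illustrate the difference.

```cpp
// file.cpp (sketch)

// static works for variables and functions ...
static int call_count = 0;
static int next_id() { return ++call_count; }

// ... but a class definition cannot be marked static. A type with the same
// name but a different definition in another translation unit would violate
// the one-definition rule, usually without any diagnostic.

namespace {
// Inside an unnamed namespace the type and the function are both local to
// this translation unit.
struct helper {
    int id;
};
helper make_helper() { return helper{next_id()}; }
}  // namespace

// Hypothetical public entry point: creates two helpers and returns the id
// of the second one.
int make_two_helpers() {
    make_helper();
    return make_helper().id;
}
```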

Singleton instance

// .hpp, optional
LIB0_API CONST my_struct& create_my_struct();

// .cpp
CONST my_struct& create_my_struct(){
	static CONST my_struct instance;
	return instance;
}

// or:
// .hpp
inline CONST my_struct& create_my_struct(){
	static CONST my_struct instance;
	return instance;
}
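
| ctor/dtor | inst | compiler  | visibility     | weak | const | static | link | other settings |
|-----------|------|-----------|----------------|------|-------|--------|------|----------------|
| 1         | 1    | gcc/clang | default        | n/a  | c/nc  | s      | int  |                |
| 2         | 2    | gcc/clang | hidden         | n/a  | c/nc  | s      | int  |                |
| 2         | 2    | msvc      | n/a            | n/a  | c/nc  | s      | int  |                |
| 2         | 2    | mingw     | default/hidden | n/a  | c/nc  | s      | int  |                |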

Notice: there does not seem to be a configuration where the number of constructor and destructor calls differs from the number of instances. Notice also that with "gcc/clang" and default visibility, the singleton behaves as if it had weak linkage, but it does not; it has internal linkage.
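LIB0_API in the snippet above stands for an export macro. A common way to define such a macro is sketched below; LIB0_BUILDING is a hypothetical macro that the build system would define only while compiling lib0 itself, and it is defined by hand here so that the sketch compiles as a single file.

```cpp
#include <iostream>

// Normally passed by the build system (for example -DLIB0_BUILDING).
#define LIB0_BUILDING

#if defined(_WIN32)
#  if defined(LIB0_BUILDING)
#    define LIB0_API __declspec(dllexport)
#  else
#    define LIB0_API __declspec(dllimport)
#  endif
#elif defined(__GNUC__)
// Exports the symbol even when compiling with -fvisibility=hidden.
#  define LIB0_API __attribute__((visibility("default")))
#else
#  define LIB0_API
#endif

struct my_struct {
    my_struct() { std::cout << this << " hello\n"; }
    ~my_struct() { std::cout << this << " goodbye\n"; }
};

// .hpp
LIB0_API const my_struct& create_my_struct();

// .cpp
const my_struct& create_my_struct() {
    static const my_struct instance;  // constructed on the first call
    return instance;
}
```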

Static member variable

// .hpp
struct my_struct{
	// ...
	static CONST my_struct WEAK_ATTR instance;
};

// .cpp
CONST my_struct my_struct::instance = {};  // unless defined inline (since c++17)
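
| ctor/dtor | inst | compiler  | visibility     | weak | const | static | link | other settings       |
|-----------|------|-----------|----------------|------|-------|--------|------|----------------------|
| 2         | 1    | gcc/clang | default        | nw   | c/nc  | s      | ext  |                      |
| 2         | 2    | gcc/clang | hidden         | w/nw | c/nc  | s      | ext  |                      |
| 1         | 1    | gcc/clang | default        | w    | c/nc  | s      | weak |                      |
| 2         | 2    | msvc      | n/a            | n/a  | c/nc  | s      | ext  |                      |
| 2         | 2    | mingw     | default/hidden | w/nw | c/nc  | s      | ext  |                      |
| 1         | 1    | gcc/clang | default        | nw   | c     | s      | ext  | inline (since C++17) |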

It is not possible to define a member variable with internal linkage unless the whole class has internal linkage. This means that the class needs to be defined(!) in an anonymous namespace, as shown in the next table.
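For the inline row of the table, the definition can stay in the header. Because my_struct is still incomplete inside its own class body, the member is only declared there and defined right after the class with inline (C++17). This is a sketch; the accessor get_instance is invented for illustration.

```cpp
// .hpp (sketch)
#include <iostream>

struct my_struct {
    my_struct() { std::cout << this << " hello\n"; }
    ~my_struct() { std::cout << this << " goodbye\n"; }

    // Declaration only: my_struct is an incomplete type at this point.
    static const my_struct instance;
};

// Still in the header. inline (since C++17) allows this definition to
// appear in every translation unit that includes the header.
inline const my_struct my_struct::instance = {};

// Hypothetical accessor, only here to make the address observable.
inline const my_struct& get_instance() { return my_struct::instance; }
```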

Static member variable with internal linkage

// .cpp
namespace {
	struct my_struct{
		// ...
		static CONST my_struct WEAK_ATTR instance;
	};
	CONST my_struct my_struct::instance = {};
}
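
| ctor/dtor | inst | compiler  | visibility     | weak | const | static | link | other settings |
|-----------|------|-----------|----------------|------|-------|--------|------|----------------|
| 2         | 2    | gcc/clang | default/hidden | n/a  | c/nc  | s      | int  |                |
| 2         | 2    | msvc      | n/a            | n/a  | c/nc  | s      | int  |                |
| 2         | 2    | mingw     | default/hidden | n/a  | c/nc  | s      | int  |                |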

It behaves like a normal global instance in an anonymous namespace.

While this might be a potential use case for unnamed namespaces in header files, it might make more sense to try to avoid static member variables.

Conclusion

Testing different compilers with different kinds of global variables has been interesting, and some surprises came up.

Supposing that you do not want different (or the same) shared libraries to interfere with each other, the only solution is to prefer internal linkage and avoid public visibility for everything that is not part of the interface of the library.

The weak attribute seemed to help in some situations, but after a closer look at the documentation it became clear that it is not a proper fix. It might be a viable solution in some environments, but it opens the door for anyone to override symbols (even by accident). Thus it does not solve the problem for different shared libraries, only for the same library used multiple times.

A much more robust solution is to make a symbol as private as possible. If we want some independence from the compiler (and do not want to resort to macros), the anonymous namespace (in a .cpp file) and a singleton instance are the least error-prone solutions.

Nonetheless, it is advisable to set hidden visibility with GCC and clang (even the documentation recommends it!). As there is no way to trigger an error instead of relying on subtle undefined behavior, it is the safest option. It will also normally improve both compile times and the runtime performance of the executable.

And of course, avoid globals with mutable state as much as possible. While normally when speaking about state we tend to ignore constructors and destructors, for globals those are relevant too.

Some build systems, like CMake, provide tools that help with setting the visibility. The properties VISIBILITY_INLINES_HIDDEN and CXX_VISIBILITY_PRESET ensure that the correct flags are passed to the compiler, while the GENERATE_EXPORT_HEADER function defines macros like those described in the GCC documentation for controlling the interface of a shared library.