Dependency injection in CMake
Dependency injection
Dependency injection is a programming technique in which an object or function receives other objects or functions that it depends on. Dependency injection aims to separate the concerns of constructing objects and using them, leading to loosely coupled programs. The pattern ensures that an object or function that wants to use a given service should not have to know how to construct those services. Instead, the receiving 'client' (object or function) is provided with its dependencies by external code (an 'injector'), which it is not aware of. Dependency injection makes implicit dependencies explicit […]
Wikipedia
Dependency injection is a technique mainly used for organizing code, but it can also be used for libraries and applications.
The use case
It is not uncommon to be able to configure a library in different ways.
For example, in OpenSSL, it is possible to enable or disable at compile time support for different algorithms.
Some libraries (or applications) have optional dependencies. For vim, for example, there are multiple packages (tiny, small, normal, …), plus versions with different graphical interfaces.
The first use case I encountered in a private project was a library providing constants.
Depending on the final product, one needed to use one set of constant values instead of another.
A second use case was a library for allocating resources (mutexes, file handles, … but in particular memory).
Since there are different allocation strategies, the library could be configured in different ways. For example, for single-threaded programs, the lock operation of mutexes was a no-op.
A probably more common use case is libraries with additional debug facilities, like logging, tracing, additional asserts, …
Last but not least, the library could be a facade for another library. It is not uncommon to have different libraries solving more or less the same issue, and having a facade makes it possible to switch from one library to another, or even between different versions of the same library when some API changes.
So far nothing new, but if one needs to build several permutations of these configurations, it gets out of hand quickly.
Note 📝: For my use cases there are no concerns about the API and ABI of the configurable library. It does not matter how the library was configured; the code using it does not care, and will work regardless.
An example project
The structure of the project looks like the following:
cmake_minimum_required(VERSION 3.25.1)
project(example VERSION 0.0.1)
include(GenerateExportHeader)
set(CMAKE_POSITION_INDEPENDENT_CODE ON)
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
set(CMAKE_C_VISIBILITY_PRESET hidden)
set(CMAKE_VISIBILITY_INLINES_HIDDEN 1)
add_library(lib0 lib0.h lib0.c)
target_include_directories(lib0 PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
generate_export_header( lib0 )
if (PROJECT STREQUAL "standard")
target_compile_definitions(lib0 PRIVATE PROJ_IMPL_STR=".")
elseif (PROJECT STREQUAL "important")
target_compile_definitions(lib0 PRIVATE PROJ_IMPL_STR="!")
else()
message( FATAL_ERROR "define project: -DPROJECT=standard or -DPROJECT=important" )
endif ()
add_library(liba liba.h liba.c)
target_link_libraries(liba PRIVATE lib0)
generate_export_header(liba)
target_include_directories(liba PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
add_library(libb libb.h libb.c)
target_link_libraries(libb PRIVATE lib0)
generate_export_header(libb)
target_include_directories(libb PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
add_library(libc libc.h libc.c)
generate_export_header(libc)
target_include_directories(libc PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
if (PROJECT STREQUAL "standard")
add_executable(main1 main1.c)
target_link_libraries(main1 liba libb libc)
elseif (PROJECT STREQUAL "important")
add_executable(main2 main2.c)
target_link_libraries(main2 liba libb libc)
else()
message( FATAL_ERROR "project not defined" )
endif()
In this example, a different configuration means using a different macro (but only inside lib0; the ABI and API are not affected).
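The C sources of the example are not shown, so here is a minimal sketch of what lib0.c might contain. The function name mylog is borrowed from the linker error quoted later in this article; its signature, its behavior, and the fallback definition of PROJ_IMPL_STR are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* PROJ_IMPL_STR is normally supplied by CMake through
   target_compile_definitions(); this fallback is an assumption that
   only lets the sketch compile on its own. */
#ifndef PROJ_IMPL_STR
#define PROJ_IMPL_STR "."
#endif

/* Hypothetical lib0 function: writes msg followed by the configured
   suffix into buf. Returns the number of characters written, or -1 if
   formatting failed or the result did not fit. */
int mylog(char *buf, size_t size, const char *msg)
{
    assert(buf != NULL && msg != NULL);
    int n = snprintf(buf, size, "%s%s", msg, PROJ_IMPL_STR);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The signature is identical in both configurations; only the bytes produced by the function differ, which is why callers such as liba do not care how lib0 was configured.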
Since a different configuration is not necessarily about macros, here is an example project that compiles a different source file depending on how it was configured.
cmake_minimum_required(VERSION 3.25.1)
project(example VERSION 0.0.1)
include(GenerateExportHeader)
set(CMAKE_POSITION_INDEPENDENT_CODE ON)
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
set(CMAKE_C_VISIBILITY_PRESET hidden)
set(CMAKE_VISIBILITY_INLINES_HIDDEN 1)
add_library(lib0 lib0.h)
target_include_directories(lib0 PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
generate_export_header( lib0 )
if (PROJECT STREQUAL "standard")
target_sources(lib0 PRIVATE lib0_standard.c)
elseif (PROJECT STREQUAL "important")
target_sources(lib0 PRIVATE lib0_important.c)
else()
message( FATAL_ERROR "define project: -DPROJECT=standard or -DPROJECT=important" )
endif ()
add_library(liba liba.h liba.c)
target_link_libraries(liba PRIVATE lib0)
generate_export_header(liba)
target_include_directories(liba PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
add_library(libb libb.h libb.c)
target_link_libraries(libb PRIVATE lib0)
generate_export_header(libb)
target_include_directories(libb PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
add_library(libc libc.h libc.c)
generate_export_header(libc)
target_include_directories(libc PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
if (PROJECT STREQUAL "standard")
add_executable(main1 main1.c)
target_link_libraries(main1 liba libb libc)
elseif (PROJECT STREQUAL "important")
add_executable(main2 main2.c)
target_link_libraries(main2 liba libb libc)
else()
message( FATAL_ERROR "project not defined" )
endif()
With the given project structure, if one wants to build both main1 and main2, one needs to configure and build everything twice:
cmake -S source -B build_standard -DPROJECT=standard
cmake --build build_standard --target main1
cmake -S source -B build_important -DPROJECT=important
cmake --build build_important --target main2
# deliver main1 from ./build_standard and main2 from ./build_important
By looking at the project, there is room for improvement.
libc is built twice, even though it does not depend on lib0.
liba and libb depend on lib0, but they could be built only once, because the source code used for compiling those libraries does not depend on how lib0 is configured (remember: API and ABI are stable between different configurations). Ideally, we would build those libraries only once, copy them, and then use two separate linker commands.
What I would like to achieve, from a high-level perspective, is the following workflow:
cmake -S source -B build
cmake --build build --target main1 main2
# deliver main1 and main2 from ./build
It also has the advantage that for the end-user there would be one parameter less when invoking cmake, making the process less error-prone.
On the other hand, I would like to reach such a workflow without rewriting the application or the build system, so these are the (self-imposed) constraints:
No C or C++ code should need any change. The main reason is that a refactor could introduce unwanted changes in the program logic. The second reason is that how the application is built (by configuring and building twice or only once) should not determine how the application works. Thus any application should work unchanged before and after the refactor.
In case some refactoring makes sense (the CMake code might get too complex), I do not want to introduce functions like dlopen or LoadLibrary.
The solution should also work both with and without BUILD_SHARED_LIBS=ON, simply because the current workflow works with and without this configuration.
It should also not be necessary to build liba and libb twice. It would diminish the advantage of a single build, as more components are built unnecessarily more than once.
Since my goal is to reduce build times (and simplify the delivery workflow), it might be faster and less error-prone to just leave the project as is, throw more resources into it, and continue to build everything twice.
Duplicate lib0, first approach
The first thing to try is to duplicate lib0; a naive approach leads to the following project:
cmake_minimum_required(VERSION 3.25.1)
project(example VERSION 0.0.1)
include(GenerateExportHeader)
set(CMAKE_POSITION_INDEPENDENT_CODE ON)
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
set(CMAKE_C_VISIBILITY_PRESET hidden)
set(CMAKE_VISIBILITY_INLINES_HIDDEN 1)
add_library(lib1 lib0.h lib0.c)
generate_export_header(lib1 BASE_NAME LIB0)
target_include_directories(lib1 PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
target_compile_definitions(lib1
PRIVATE PROJ_IMPL_STR="."
)
add_library(lib2 lib0.h lib0.c)
target_include_directories(lib2 PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
target_compile_definitions(lib2
PRIVATE lib1_EXPORTS=1 # hack, use same exports of lib1
PRIVATE PROJ_IMPL_STR="!"
)
add_library(liba1 liba.h liba.c)
target_link_libraries(liba1 PRIVATE lib1)
generate_export_header(liba1 BASE_NAME LIBA)
target_include_directories(liba1 PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
add_library(liba2 liba.h liba.c)
target_link_libraries(liba2 PRIVATE lib2)
target_include_directories(liba2 PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
target_compile_definitions(liba2
PRIVATE liba1_EXPORTS=1 # hack, use same exports of liba1
)
add_library(libb1 libb.h libb.c)
target_link_libraries(libb1 PRIVATE lib1)
generate_export_header(libb1 BASE_NAME LIBB)
target_include_directories(libb1 PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
add_library(libb2 libb.h libb.c)
target_link_libraries(libb2 PRIVATE lib2)
target_include_directories(libb2 PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
target_compile_definitions(libb2
PRIVATE libb1_EXPORTS=1 # hack, use same exports of libb1
)
add_library(libc libc.h libc.c)
generate_export_header(libc)
target_include_directories(libc PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
add_executable(main1 main1.c)
target_link_libraries(main1 liba1 libb1 libc)
add_executable(main2 main2.c)
target_link_libraries(main2 liba2 libb2 libc)
Because we do not want to configure the project twice in the same run, we need to split lib0 into lib1 and lib2.
lib0 does not exist as a target anymore, thus the targets liba and libb must change too.
Since the targets liba and libb cannot depend on both lib1 and lib2, the easiest solution is, at the moment, to duplicate them.
This approach has multiple drawbacks:
It adds a non-trivial amount of complexity in CMake, as the old targets do not exist anymore, and I needed to manually define the macros for exporting the API. It also requires duplicating all targets that depend (directly, but more importantly transitively) on a configurable target, and all dependent targets need to be adapted too.
Last, but not least, those targets are compiled more than once, even if it is the same, unchanged, code.
The advantage is that no changes to the source code are necessary.
Duplicate lib0, second approach
Thanks to OBJECT targets, it is possible to build the source code only once.
cmake_minimum_required(VERSION 3.25.1)
project(example VERSION 0.0.1)
include(GenerateExportHeader)
set(CMAKE_POSITION_INDEPENDENT_CODE ON)
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
set(CMAKE_C_VISIBILITY_PRESET hidden)
set(CMAKE_VISIBILITY_INLINES_HIDDEN 1)
add_library(lib1 lib0.h lib0.c)
generate_export_header(lib1 BASE_NAME LIB0)
target_include_directories(lib1 PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
target_compile_definitions(lib1
PRIVATE PROJ_IMPL_STR="."
)
add_library(lib2 lib0.h lib0.c)
target_include_directories(lib2 PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
target_compile_definitions(lib2
PRIVATE lib1_EXPORTS=1 # hack, use same exports of lib1
PRIVATE PROJ_IMPL_STR="!"
)
add_library(liba OBJECT liba.h liba.c)
target_include_directories(liba PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
generate_export_header(liba)
target_compile_definitions(liba
PRIVATE liba_EXPORTS=1 # should not be necessary...
)
add_library(liba1 liba.h $<TARGET_OBJECTS:liba>)
target_link_libraries(liba1 PRIVATE lib1)
add_library(liba2 $<TARGET_OBJECTS:liba>)
target_link_libraries(liba2 PRIVATE lib2)
add_library(libb OBJECT libb.h libb.c)
target_include_directories(libb PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
generate_export_header(libb)
target_compile_definitions(libb
PRIVATE libb_EXPORTS=1 # should not be necessary...
)
add_library(libb1 $<TARGET_OBJECTS:libb>)
target_link_libraries(libb1 PRIVATE lib1)
add_library(libb2 $<TARGET_OBJECTS:libb>)
target_link_libraries(libb2 PRIVATE lib2)
add_library(libc libc.h libc.c)
generate_export_header(libc)
target_include_directories(libc PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
add_executable(main1 main1.c)
target_link_libraries(main1 liba1 libb1 libc)
add_executable(main2 main2.c)
target_link_libraries(main2 liba2 libb2 libc)
In this case, the targets liba1 and liba2 use the compiled code of the target liba, thus the same code is not built twice anymore.
Note that target_compile_definitions(liba PRIVATE liba_EXPORTS=1) should not be necessary, but without it, the Microsoft compiler warns that the linkage is inconsistent when building the target liba.
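A likely explanation for that warning, sketched in C: the header produced by generate_export_header switches between export and import annotations depending on the liba_EXPORTS macro, and, as far as I understand, CMake defines that macro automatically only when compiling the sources of a shared library target, not those of an OBJECT library. The block below is a simplified, hand-written approximation of such a header, not the generated file itself (which contains more macros); the signature and return value of funa are placeholders.

```c
/* What the hack passes on the command line for the OBJECT target. */
#ifndef liba_EXPORTS
#define liba_EXPORTS 1
#endif

/* Simplified approximation of a generated liba_export.h. */
#if defined(_WIN32)
#  ifdef liba_EXPORTS
#    define LIBA_EXPORT __declspec(dllexport) /* building the library */
#  else
#    define LIBA_EXPORT __declspec(dllimport) /* consuming the library */
#  endif
#elif defined(__GNUC__)
#  define LIBA_EXPORT __attribute__((visibility("default")))
#else
#  define LIBA_EXPORT
#endif

/* liba.h (hypothetical): the declaration carries the annotation. */
LIBA_EXPORT int funa(void);

/* liba.c (hypothetical): without liba_EXPORTS on Windows, this
   definition would follow a dllimport declaration, which is what the
   compiler reports as inconsistent linkage. */
int funa(void)
{
    return 42;
}
```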
While no target is built unnecessarily more than once, it does require even more changes to CMake, and it does not address the other drawback of the first approach: every target that depends on lib0 still has to be duplicated.
Note 📝: The libraries and executables produced with and without an OBJECT target are binary identical.
Inject a dependency
The previous approaches had the disadvantage that libraries need to be duplicated, while ideally we only want to duplicate lib0.
Since at compile time, only the headers with the declarations are needed, what if the targets liba and libb have an INTERFACE target as a dependency?
cmake_minimum_required(VERSION 3.25.1)
project(example VERSION 0.0.1)
include(GenerateExportHeader)
set(CMAKE_POSITION_INDEPENDENT_CODE ON)
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
set(CMAKE_C_VISIBILITY_PRESET hidden)
set(CMAKE_VISIBILITY_INLINES_HIDDEN 1)
add_library(lib0 INTERFACE lib0.h)
target_include_directories(lib0 INTERFACE ${CMAKE_CURRENT_BINARY_DIR})
target_compile_definitions(lib0
INTERFACE lib1_EXPORTS=1 # hack, use same exports of lib1
)
add_library(lib1 lib0.c)
generate_export_header(lib1 BASE_NAME LIB0) # cannot generate them on INTERFACE target
target_link_libraries(lib1 lib0)
target_compile_definitions(lib1
PRIVATE PROJ_IMPL_STR="."
)
add_library(lib2 lib0.c)
target_link_libraries(lib2 lib0)
target_compile_definitions(lib2
PRIVATE lib1_EXPORTS=1 # hack, use same exports of lib1
PRIVATE PROJ_IMPL_STR="!"
)
add_library(liba liba.h liba.c)
target_link_libraries(liba PRIVATE lib0)
generate_export_header(liba)
target_include_directories(liba PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
add_library(libb libb.h libb.c)
target_link_libraries(libb PRIVATE lib0)
generate_export_header(libb)
target_include_directories(libb PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
add_library(libc libc.h libc.c)
generate_export_header(libc)
target_include_directories(libc PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
add_executable(main1 main1.c)
target_link_libraries(main1 liba libb libc lib1)
add_executable(main2 main2.c)
target_link_libraries(main2 liba libb libc lib2)
Contrary to the previous approach, the changes in the CMake code are much more localized.
lib0 changed to an interface library, so it still exists as a target. All targets that depend on it (liba and libb) do not need any change; only the main applications (main1 and main2), which are responsible for injecting the required library, need to specify if they depend on lib1 or lib2.
This approach scales well, as it is possible to support multiple targets that can be configured in multiple ways.
It also makes the dependency between the application and the desired configuration more explicit instead of tracking it by hand with variables.
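Seen from the C side, the reason this works is that a caller only needs a declaration to compile; which definition it ends up with is decided when the final binary is linked. The sketch below squeezes the three roles into one translation unit so that it compiles on its own. The file names in the comments follow the example project, while the signature of mylog and the body of funa are assumptions.

```c
#include <assert.h>
#include <stdio.h>

/* lib0.h (hypothetical): all that liba needs at compile time. */
int mylog(char *buf, size_t size, const char *msg);

/* liba.c (hypothetical): compiled once, against the declaration only. */
int funa(char *buf, size_t size)
{
    return mylog(buf, size, "funa");
}

/* lib0.c as built for lib1 (hypothetical): in the real project this
   definition lives in a separate library that main1 adds at link time;
   main2 would add lib2, whose definition appends "!" instead. */
int mylog(char *buf, size_t size, const char *msg)
{
    assert(buf != NULL && msg != NULL);
    return snprintf(buf, size, "%s%s", msg, ".");
}
```

Nothing in funa refers to lib1 or lib2, which mirrors the CMake file above: liba links against the INTERFACE target lib0, and only the executables name a concrete implementation.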
Is dependency injection the silver bullet?
Is the last approach the perfect solution for avoiding configuring a project twice?
If one does not need to support -DBUILD_SHARED_LIBS=ON, or if the components that can be configured are always linked statically, then the answer is unambiguous: "yes".[1]
A simple approach for ensuring that the application logic did not change is to compare the binaries. With the presented CMake files, the generated binaries are 100% identical, even in different build folders.
This might not be the case in more complex setups; with diffoscope it might be possible to visualize whether the differences are relevant.
Shared libraries on Windows
When building the last example project on Windows with -DBUILD_SHARED_LIBS=ON, the linker fails with the following error message:
liba.obj : error LNK2019: unresolved external symbol mylog referenced in function funa [Y:\tmp\msbuild-shared\liba.vcxproj]
Y:\tmp\msbuild-shared\Debug\liba.dll : fatal error LNK1120: 1 unresolved externals [Y:\tmp\msbuild-shared\liba.vcxproj]
The error makes sense.
If an application loads liba.dll, then this library needs to load either lib1.dll or lib2.dll. But in CMake, the liba target does not have this information.
Shared libraries on GNU/Linux
If you tested the project on a GNU/Linux system, you will not have a linker error, and the code compiles and the binaries run without issues, just like in the case of the static build.
There are some notable differences between this approach (injecting dependencies) and configuring and building the code twice. While in this environment the behavior of the application is the same, it is not necessarily so in all environments.
ldd is useful for verifying, on "trusted" binaries, which libraries are loaded at runtime.
On the example project, the output of ldd looks like:
ldd /tmp/build-shared-standard/main1 /tmp/build-shared-standard/libliba.so /tmp/build-shared-standard/liblib0.so
/tmp/build-shared-standard/main1:
linux-vdso.so.1 (0x00007ffc59fda000)
libliba.so => /tmp/build-shared-standard/libliba.so (0x00007f41a7f67000)
liblibb.so => /tmp/build-shared-standard/liblibb.so (0x00007f41a7f62000)
liblibc.so => /tmp/build-shared-standard/liblibc.so (0x00007f41a7f5d000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f41a7d56000)
liblib0.so => /tmp/build-shared-standard/liblib0.so (0x00007f41a7d51000)
/lib64/ld-linux-x86-64.so.2 (0x00007f41a7f73000)
/tmp/build-shared-standard/libliba.so:
linux-vdso.so.1 (0x00007ffc81bf2000)
liblib0.so => /tmp/build-shared-standard/liblib0.so (0x00007f389eb9d000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f389e996000)
/lib64/ld-linux-x86-64.so.2 (0x00007f389eba9000)
/tmp/build-shared-standard/liblib0.so:
linux-vdso.so.1 (0x00007fffe3147000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ff98af40000)
/lib64/ld-linux-x86-64.so.2 (0x00007ff98b14e000)
The output of ldd, in case of duplicated targets, looks like:
ldd /tmp/build-shared/main1 /tmp/build-shared/libliba1.so /tmp/build-shared/liblib1.so
/tmp/build-shared/main1:
linux-vdso.so.1 (0x00007ffc5ecdc000)
libliba1.so => /tmp/build-shared/libliba1.so (0x00007f844c83b000)
liblibb1.so => /tmp/build-shared/liblibb1.so (0x00007f844c836000)
liblibc.so => /tmp/build-shared/liblibc.so (0x00007f844c831000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f844c62a000)
liblib1.so => /tmp/build-shared/liblib1.so (0x00007f844c625000)
/lib64/ld-linux-x86-64.so.2 (0x00007f844c847000)
/tmp/build-shared/libliba1.so:
linux-vdso.so.1 (0x00007ffcfacb1000)
liblib1.so => /tmp/build-shared/liblib1.so (0x00007f5106411000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f510620a000)
/lib64/ld-linux-x86-64.so.2 (0x00007f510641d000)
/tmp/build-shared/liblib1.so:
linux-vdso.so.1 (0x00007ffc162a0000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd1e02a4000)
/lib64/ld-linux-x86-64.so.2 (0x00007fd1e04b2000)
While in the case of injected libraries it looks like:
ldd /tmp/build-shared/main1 /tmp/build-shared/libliba.so /tmp/build-shared/liblib1.so
/tmp/build-shared/main1:
linux-vdso.so.1 (0x00007ffd7d5d6000)
libliba.so => /tmp/build-shared/libliba.so (0x00007ff5e15b2000)
liblibb.so => /tmp/build-shared/liblibb.so (0x00007ff5e15ad000)
liblibc.so => /tmp/build-shared/liblibc.so (0x00007ff5e15a8000)
liblib1.so => /tmp/build-shared/liblib1.so (0x00007ff5e15a3000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ff5e139c000)
/lib64/ld-linux-x86-64.so.2 (0x00007ff5e15be000)
/tmp/build-shared/libliba.so:
linux-vdso.so.1 (0x00007fffa1325000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe01f13b000)
/lib64/ld-linux-x86-64.so.2 (0x00007fe01f349000)
/tmp/build-shared/liblib1.so:
linux-vdso.so.1 (0x00007ffd56de6000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc139336000)
/lib64/ld-linux-x86-64.so.2 (0x00007fc139544000)
The outputs explain how the code can work correctly. In the case of injected shared libraries, liba and libb are not loading lib1 or lib2 (which would be the case on Windows, and is the case in the original project).
In all cases, main1 also loads lib1 (lib0 in the original project) directly; since this is a private dependency of the target liba, it was, at least for me, unexpected.
Building a shared library without recording all the libraries it needs, as happens here with liba and libb, is known as underlinking, and it is not a well-accepted practice in some environments.
Gentoo recommends avoiding it when enabling hardening flags.
The Mageia wiki lists several disadvantages; the main one is that "`--as-needed` can not be used when compiling a program with an underlinked library", and Mageia uses flags for detecting underlinking when packaging libraries.
According to the ALT Linux Wiki, it is not possible to prelink an underlinked library (untested from my side). Considering that glibc does not support prelink anymore (which is the main reason why it is not packaged in different distributions anymore), this should not be an issue.
The last issue is that an application loading libliba.so also needs to load liblib1.so or liblib2.so, which might be unexpected, especially if the dependency is private. In the case of source code, injecting a dependency is not a surprise as the API requires the user to pass the dependency as a parameter. In the case of binaries loading libraries, there is no API for expressing such intent.
Thus on GNU/Linux systems, injecting dependencies with shared libraries is not necessarily an issue, as long as underlinking is not an issue.
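For comparison, this is roughly what injection looks like at the source level, where the API itself announces the dependency. It is a minimal sketch and not code from the example project: the log_fn type, this variant of funa, and counting_log are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical logger interface: the dependency is an explicit parameter. */
typedef int (*log_fn)(void *ctx, const char *msg);

/* The client does not know which logger it receives; the caller
   (the "injector") decides. */
int funa(log_fn log, void *ctx)
{
    assert(log != NULL);
    return log(ctx, "funa");
}

/* One possible injected implementation: counts the calls it receives
   and returns the running total. */
int counting_log(void *ctx, const char *msg)
{
    (void)msg;
    int *calls = ctx;
    return ++*calls;
}
```

A caller would write something like funa(counting_log, &calls), so the dependency is visible at every call site. A shared library that relies on the executable to bring in liblib1.so or liblib2.so has no equivalent way to state that requirement.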
[1] Pay attention to inline functions, as they can cause ODR violations.