Detect member variables since C++11
Reflection is a language feature that makes it possible to inspect and query information about types and act accordingly.
A common usage of reflection is logging and serialization.
Through reflection, one could iterate over all member variables of a structure, and serialize them recursively. The same approach would then be used for deserializing.
Reflection is of course not strictly necessary. It is possible to write all serialization code by hand, but like most language and library features, it enables code reuse thus avoiding common errors and unneeded complexity.
Many high-level programming languages provide some sort of dynamic reflection. In those languages, it is possible to query types, member variables, functions, and many other pieces of information while the program is running.
In C++ there is no "official" (or full-featured) support for reflection. There are libraries, and there are proposals for adding reflection to the language. All proposals are about reflection at compile time; maybe for C++23 there will be something official. A nice comparison of those proposals can be found at the meetingcpp.com blog.
In the meantime, we should acknowledge that since C++11, the ISO committee added more and more language features and utilities under <type_traits> that can help to achieve similar goals without too much boilerplate (at least compared to previous language revisions).
In the case of C++, contrary to other languages, all utilities work at compile time, not at runtime. That’s actually a very good thing, since runtime reflection, like in Java, bypasses the type system and the safety net provided by a static analyzer like the compiler. Thus it makes it possible to have programming errors that would normally get caught by the compiler, and reduces correctness guarantees.
There are of course libraries that permit the use of reflection at runtime, but they normally come with different costs. The first and probably most important and obvious drawback is that very often it is necessary to rewrite or add code to provide the missing meta-information. The second is that reflection at runtime also has a cost at runtime, which might or might not be negligible depending on the use case.
I had to work with some generated code, and there were many classes with the exact same definition, but some of them had an additional member variable, which needed to be set or queried. So there was some boilerplate code that ideally could be applied to all the different generated structures, except that for some of them the member variable did not exist, so one line of code should not be executed. Since the structures are generated and depend on external data, hand-tracking which structures lack the extra member variable is error-prone. Even if the external input does not change, if only one structure in ten has a member variable that needs to be updated, it’s easy to overlook it when changing the source code (the opposite mistake is hard to overlook, since the compiler would complain).
A possible solution would have been to provide a setter method, that would also set the extra member variable if present, instead of accessing directly the member variable. This means changing the code generator, which might not be easy, or parsing separately the generated code and patching it, which might even be more difficult. Trying to solve the issue at a language level might be a better option, and at least for my use case, the solution was straightforward.
Thanks to templates and function overloading, detecting at compile-time if a structure has a member variable, and acting accordingly, is nearly straightforward:
#include <type_traits>
template <typename T, typename = void>
struct has_member : std::false_type{};
template <typename T>
struct has_member<T, decltype((void)T::member, void())> : std::true_type {};
template <class T>
void foo_impl(const T& t, std::false_type){
// do something, T does not have queried member variable
}
template <class T>
void foo_impl(const T& t, std::true_type){
// do something, T does have queried member variable
}
template <class T>
void foo(const T& t){
return foo_impl(t, has_member<T>{});
}
I’ve provided two overloads for foo_impl, one for structures that have a member variable member, and another for structures that do not have this member variable.
Through has_member we define two different types depending on whether the member variable is there or not.
The non-specialized version is based on std::false_type, as by default a generic structure does not have this specific member variable. The specialization is based on std::true_type, and applies only to types that have the queried member variable.
But how is that achieved?
This is the part where the code looks a little strange, and works thanks to decltype and the comma operator.
T::member might or might not be a valid expression, so using it would normally risk a compilation error. Since we are evaluating a template specialization, we avoid the compilation error because SFINAE (Substitution Failure Is Not An Error) kicks in. If the expression is ill-formed, the more generic has_member structure is selected by the compiler. If the expression T::member is well-formed, then decltype( (void)T::member, void() ) evaluates as decltype(void()), which evaluates to void. Without the comma operator, that is with a plain decltype(T::member), the code would still compile, but has_member<T> would always evaluate to std::false_type, unless T::member is of type void, which is not possible, as a member variable cannot have type void.
If you are asking yourself why T::member is cast to void: it’s for avoiding the compiler warning about unused variables, since we are not using T::member for anything.
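To make the role of the comma operator more concrete, here is a minimal sketch that puts the trait from above next to a variant that drops both the cast and the comma. The names has_member_no_comma, with_member, and without_member are made up for this illustration and are not part of the original code.

```cpp
#include <type_traits>

// Trait as shown above: the comma operator forces the decltype to be void,
// which matches the defaulted second template argument.
template <typename T, typename = void>
struct has_member : std::false_type {};

template <typename T>
struct has_member<T, decltype((void)T::member, void())> : std::true_type {};

// Hypothetical variant without cast and comma: decltype yields the declared
// type of the member, which does not match the defaulted void argument.
template <typename T, typename = void>
struct has_member_no_comma : std::false_type {};

template <typename T>
struct has_member_no_comma<T, decltype(T::member)> : std::true_type {};

struct with_member { int member; };
struct without_member {};

static_assert(has_member<with_member>::value, "member is detected");
static_assert(!has_member<without_member>::value, "there is no member to detect");
static_assert(!has_member_no_comma<with_member>::value,
              "decltype is int, not void, so the specialization is never selected");
```

The last assertion is the interesting one: the member exists, but the variant without the comma still reports std::false_type.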
Beware that GCC, when compiling with -Wall or -Wunused-value, will happily trigger a confounding compiler warning: "The left operand of the comma operator has no effect". My guess is that the warning message wants to express that the left operand is discarded, thus the value is unused. As we are not modifying it, there are normally no side effects. Unfortunately, the warning states that it has no effect, which in this case is not true, as removing the left operand changes the behavior of the program. Clang (tested with -Weverything) does not trigger any warning, and neither does MSVC (tested with -Wall). While the warning could be correct in some situations, inside this decltype expression it does not make much sense. The GCC maintainers already created a patch for this issue; apparently they are much faster at improving the compiler than I am at writing this article or opening the bug.
Note: see the update section, the comma operator is not necessary.
It feels like a dirty hack, and it actually is, but it’s much more readable compared to what one had to do until C++03. While the behavior is well defined in the standard, compilers are not perfect. At least when dealing with non-public member functions, different compilers generate code with different behavior, or no code at all. It seems to me that Clang implements all the cases I’ve tested correctly. For public member variables, I was not able to find discrepancies, so this technique should be robust enough for those use cases. Hopefully, the bugs reported to both GCC and MSVC will be fixed soon, ensuring consistent behavior across all compilers.
Nevertheless, with this hack, we can query if a member variable is available, and its usage for the client is really easy. The templates and overload mechanisms will call the correct function, and the caller does not need to worry about anything (except maybe for compiler messages in case of errors):
struct s1{
unsigned int member;
};
struct s2{
char member;
};
struct s3 {};
int main(){
foo(s1{});
foo(s2{});
foo(s3{});
}
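To tie this back to the generated-code scenario from the introduction, the following sketch shows how the two overloads could set the extra member variable only where it exists. The structures generated_a and generated_b and the functions set_member_impl, init, and demo are hypothetical stand-ins, not the real generated code.

```cpp
#include <cassert>
#include <type_traits>

template <typename T, typename = void>
struct has_member : std::false_type {};

template <typename T>
struct has_member<T, decltype((void)T::member, void())> : std::true_type {};

// Hypothetical stand-ins for two generated structures.
struct generated_a { int id; };
struct generated_b { int id; int member; };

// Structures without the extra member variable: nothing to set.
template <class T>
void set_member_impl(T&, int, std::false_type) {}

// Structures with the extra member variable: assign it.
template <class T>
void set_member_impl(T& t, int value, std::true_type) { t.member = value; }

// Shared boilerplate that compiles for every generated structure.
template <class T>
void init(T& t, int id, int value) {
    t.id = id;
    set_member_impl(t, value, has_member<T>{});
}

// Small self-check that exercises both overloads.
inline void demo() {
    generated_a a{};
    generated_b b{};
    init(a, 1, 7);
    init(b, 2, 7);
    assert(a.id == 1);
    assert(b.id == 2 && b.member == 7);
}
```

If a later version of the generator adds member to generated_a, init picks up the other overload without any change to the calling code.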
Of course, if we do not need to read or write to the queried member variable, we do not need to write different overloads to handle the situation:
#include <cstdio> // for std::puts
struct s1{
unsigned int member;
};
int main(){
if(has_member<s1>::value){
std::puts("structure has member variable");
} else {
std::puts("structure does not have member variable");
}
}
While it is not much code, it would be nice to be able to parameterize the member variable that one wants to query. AFAIK, the only way to remove the boilerplate is to pack it into a macro :-(
#define HAS_MEMBER(m) \
template <typename T, typename = void>\
struct has_member : std::false_type{};\
\
template <typename T>\
struct has_member<T, decltype((void)T::m, void())> : std::true_type {};
The macro can then be reused easily:
HAS_MEMBER(member);
template <class T>
void foo_impl(const T& t, std::false_type){
// ...
}
template <class T>
void foo_impl(const T& t, std::true_type){
// ...
}
template <class T>
void foo(const T& t){
return foo_impl(t, has_member<T>{});
}
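One limitation of the macro as written is that it always defines a structure named has_member, so it can be expanded only once per scope. A possible variation, sketched here as an assumption and not as part of the original code, passes the name of the trait as a second macro parameter. DEFINE_HAS_MEMBER, has_status, has_checksum, packet, and frame are all hypothetical names.

```cpp
#include <type_traits>

// Hypothetical two-parameter variant: `trait` is the name of the generated
// structure, `m` is the name of the member variable to look for.
#define DEFINE_HAS_MEMBER(trait, m) \
    template <typename T, typename = void> \
    struct trait : std::false_type {}; \
    \
    template <typename T> \
    struct trait<T, decltype((void)T::m, void())> : std::true_type {}

DEFINE_HAS_MEMBER(has_status, status);
DEFINE_HAS_MEMBER(has_checksum, checksum);

struct packet { int status; };
struct frame { int status; int checksum; };

static_assert(has_status<packet>::value, "packet has status");
static_assert(!has_checksum<packet>::value, "packet has no checksum");
static_assert(has_status<frame>::value, "frame has status");
static_assert(has_checksum<frame>::value, "frame has checksum");
```

The macro deliberately omits the trailing semicolon, so that each use reads like an ordinary declaration.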
In case we want to check if structures have a member variable of a given type, we need to make two minor adjustments: replace void with the given type, and remove the comma operator.
#include <type_traits>
template <typename T, typename = X>
struct has_member : std::false_type{};
template <typename T>
struct has_member<T, decltype(T::member)> : std::true_type {};
Where X is the type of member we want to check. In case there is a member variable member, but its type is not X, then the specialization is discarded as it is not really a specialization.
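As an illustration, here is a small sketch with X fixed to int. The trait name has_int_member and the test structures are assumptions made for this example. Note that the match is exact: a const int member is not reported as an int member.

```cpp
#include <type_traits>

// X is fixed to int here, purely as an example.
template <typename T, typename = int>
struct has_int_member : std::false_type {};

// Selected only when decltype(T::member) is exactly int.
template <typename T>
struct has_int_member<T, decltype(T::member)> : std::true_type {};

struct int_member { int member; };
struct unsigned_member { unsigned int member; };
struct const_int_member { const int member; };
struct no_member {};

static_assert(has_int_member<int_member>::value, "exact match");
static_assert(!has_int_member<unsigned_member>::value, "unsigned int is not int");
static_assert(!has_int_member<const_int_member>::value, "const int is not int");
static_assert(!has_int_member<no_member>::value, "no member at all");
```

If such qualifiers should be ignored, the type would need to be normalized first, for example with std::remove_cv, which is beyond what the snippet above does.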
Of course, it is also possible to specialize types further.
From now on, for simplicity, I’ll suppose that we only want to verify if a member variable exists and that we are not (yet) interested in its type. Suppose that in one specific case, the member variable exists, but the structure has a different semantic and thus needs special handling.
We can specialize has_status further, and provide another custom structure.
#include <type_traits>
class MySpecialtype;
class MySpecialtype2;
struct data_type_without_status {};
struct data_type_with_status {};
struct data_type_with_special_status {};
template <typename T, typename = void>
struct has_status : data_type_without_status {};
template <typename T>
struct has_status<T, decltype((void)T::member, void())> : data_type_with_status {};
template <>
struct has_status<MySpecialtype> : data_type_with_special_status {};
template <>
struct has_status<MySpecialtype2> : data_type_with_special_status {};
template <class T>
void foo(const T& t, data_type_without_status){
// ...
}
template <class T>
void foo(const T& t, data_type_with_status){
static_assert(not std::is_same<T, MySpecialtype>::value, "MySpecialtype needs to be handled separately");
// ...
}
void foo(const MySpecialtype& t, data_type_with_special_status){
// ...
}
void foo(const MySpecialtype2& t, data_type_with_special_status){
// ...
}
template <class T>
void foo(const T& t){
return foo(t, has_status<T>{});
}
Thanks to template specialization, we can define the most appropriate "category" for our types. In case the code gets more complex, errors tend to be difficult to diagnose. Thus as a safety net, it might make sense to add some static_assert with std::is_same and check the type explicitly, and thus double-check that we have implemented everything as intended.
Doing a runtime check with a hand-made structure is cumbersome, as we need to use std::is_same. In the previous sample with std::true_type and std::false_type, the code was easier to read, as those structures can be converted to boolean types.
With std::integral_constant it is possible to achieve a similar effect by using integral types like int, or by defining an enum:
enum class data_type {without_status, with_status, with_special_status};
using data_type_without_status = std::integral_constant<data_type, data_type::without_status>;
using data_type_with_status = std::integral_constant<data_type, data_type::with_status>;
using data_type_with_special_status = std::integral_constant<data_type, data_type::with_special_status>;
In this case, through std::integral_constant it is possible to define, as before, many different types to exploit the overload mechanism. As in the previous example, data_type_without_status, data_type_with_status, and data_type_with_special_status are unrelated structures, but this time, through the static member variable value, it is possible to use an integral value for writing the logic. Here the values are those of the enum, which are much easier to handle at runtime and in static_assert, and generally == is easier to read compared to std::is_same.
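A minimal sketch of what this buys us, reduced to two categories: the same trait drives overload selection through its type and an ordinary comparison through its value. The structures plain and tracked and the function tracks_status are hypothetical names introduced only for this example.

```cpp
#include <type_traits>

enum class data_type { without_status, with_status };
using data_type_without_status = std::integral_constant<data_type, data_type::without_status>;
using data_type_with_status = std::integral_constant<data_type, data_type::with_status>;

template <typename T, typename = void>
struct has_status : data_type_without_status {};

template <typename T>
struct has_status<T, decltype((void)T::member, void())> : data_type_with_status {};

struct plain {};
struct tracked { int member; };

// The logic is a plain comparison on the enum value instead of std::is_same.
template <class T>
constexpr bool tracks_status() {
    return has_status<T>::value == data_type::with_status;
}

static_assert(tracks_status<tracked>(), "tracked has the member");
static_assert(!tracks_status<plain>(), "plain does not have the member");

// The type is still usable for tag dispatch, exactly as before.
static_assert(std::is_base_of<data_type_with_status, has_status<tracked>>::value,
              "has_status<tracked> converts to data_type_with_status");
```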
A more generic n-ary status with specializations is still relatively easy to handle. The number of statuses grows linearly as long as there are no ambiguities to solve.
#include <type_traits>
class MySpecialtype;
class MySpecialtype2;
enum class data_type {with_status1, with_status2, with_status3, with_status4};
using data_type_with_status1 = std::integral_constant<data_type, data_type::with_status1>;
using data_type_with_status2 = std::integral_constant<data_type, data_type::with_status2>;
using data_type_with_status3 = std::integral_constant<data_type, data_type::with_status3>;
using data_type_with_status4 = std::integral_constant<data_type, data_type::with_status4>;
template <typename T, typename = void>
struct status : data_type_with_status1 {};
template <typename T>
struct status<T, decltype((void)T::member, void())> : data_type_with_status2 {};
template <typename T>
struct status<T, decltype((void)T::another_member, void())> : data_type_with_status3 {};
// has both member and another_member
// we need to solve the ambiguous partial specializations
template <>
struct status<MySpecialtype> : data_type_with_status2 {};
template <>
struct status<MySpecialtype2> : data_type_with_status4 {};
struct s1{
int member;
};
struct s2{
int member;
};
struct s3 {
int another_member;
};
struct s4 {
};
struct MySpecialtype{
int another_member;
int member;
};
struct MySpecialtype2{
int member;
};
int main(){
// the message argument is mandatory before C++17
static_assert(status<s1>::value == data_type::with_status2, "s1 has member");
static_assert(status<s2>::value == data_type::with_status2, "s2 has member");
static_assert(status<s3>::value == data_type::with_status3, "s3 has another_member");
static_assert(status<s4>::value == data_type::with_status1, "s4 has neither");
static_assert(status<MySpecialtype>::value == data_type::with_status2, "explicit specialization");
static_assert(status<MySpecialtype2>::value == data_type::with_status4, "explicit specialization");
}
Update (2024-01-28)
An initial implementation (as documented on the different compiler bug trackers) looked like
#include <type_traits>
template <typename T, typename = void>
struct has_member : std::false_type{};
template <typename T>
struct has_member<T, decltype(T::member, void())> : std::true_type {};
This version, as explained in these notes, needs to use the comma operator to work correctly.
To silence a warning about the unused variable, I’ve subsequently changed the code to
#include <type_traits>
template <typename T, typename = void>
struct has_member : std::false_type{};
template <typename T>
struct has_member<T, decltype((void)T::member, void())> : std::true_type {};
But at this point, the whole ,void() is redundant, as decltype((void)T::member) already evaluates as decltype(void())!
Thus, scratch this section from the valid use cases for the comma operator. I should have noticed it sooner, but better late than never.
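For completeness, this is how the simplified trait could look with the cast alone, together with a few compile-time checks on the s1, s2, and s3 structures used earlier. Treat it as a sketch; it has not been verified against the compiler discrepancies mentioned above.

```cpp
#include <type_traits>

template <typename T, typename = void>
struct has_member : std::false_type {};

// The cast to void alone already makes the decltype evaluate to void.
template <typename T>
struct has_member<T, decltype((void)T::member)> : std::true_type {};

struct s1 { unsigned int member; };
struct s2 { char member; };
struct s3 {};

static_assert(std::is_same<decltype((void)s1::member), void>::value,
              "the cast expression has type void");
static_assert(has_member<s1>::value, "s1 has member");
static_assert(has_member<s2>::value, "s2 has member");
static_assert(!has_member<s3>::value, "s3 does not have member");
```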