C++ Comparison operators

During the last days, I tried to clean up various side projects I had at home. I quickly noticed that a common source of boilerplate code is the implementation of the comparison operators, like operator<, operator> and operator==. The complete list of comparison operators can be found here.

Normally we do not pay a lot of attention to those operators, but when we use them, there are a lot of hidden expectations. And sometimes, unfortunately, those are broken and there are unexpected side effects.

My expectations

I have normally a lot of expectations about those operators. Most of those are written nowhere, but without those writing and reading code would be much harder without any sensible reason.

Consistency

The first expectation is that those operators are consistent to each other. For example, if the expression a != b is true, then we expect a == b to be false and vice versa. If that would not be true, we wouldn’t know how to compare two values for equality. We normally write our programs as

void compare(A a, A b){
	std::cout << "a and b are ";
	if(a == b){
		std::cout << "equal\n";
	} else {
		std::cout << "unequal\n";
	}
}

and not

void compare(A a, A b){
	std::cout << "a and b are ";
	if(a == b){
		std::cout << "equal\n";
	} else if (a != b) {
		std::cout << "unequal\n";
	} else {
		// ???
		std::cout << "not equal, and not unequal\n";
	}
}

Immutability

The second expectation is immutability. If the first time the expression a == b is true, then we expect the second time we evaluate a == b to be true again, unless we have changed the value of a or b. Of course, when calling those operators, our class may update it’s internal state (for example updating a cached value), but those internal changes should be const-correct and not change the comparison semantic. If this is not the case, instead of writing

A a = ...
A b = ...
... // may change a or b, therefore they are not declared as const
if( a < b ){
	...
}

we should always write

A a = ...
A a = ...
... // may change a or b, therefore they are not declared as const
const A& ref_a = &a;
const A& ref_b = &b;
if( ref_a < ref_b ){ // may not compile
	...
}

This way, we would at least get a compile-time error it the operator< tries to modify a parameter. If the value is copyable, we can copy it before comparing it.

A a = ...
A a = ...
...
A cp_a = a;
A cp_b = b;
if( cp_a <= cp_b ){
	// a and b will have their original values, unless the copy operator makes a shallow copy...
}

Even if we can write our programs this way, the code gets cluttered with temporary variables or const references. It will, therefore, be more difficult to read, and more error prone to write. And there may be situations where we can’t copy values before comparing them. Maybe because there is no copy constructor, or maybe because the copy constructor only makes a shallow copy, so any changes made to cp_a will apply to a too.

Value ordering

The ordering of the values is not important, or at least it shouldn’t be. The expressions a == b and b == a should both to be true or false. The same holds for the inequality operator (a != b and b != a). We do not expect the other operators to be symmetric, but to have some relation to when the values are switched. For example, we expect a < b and b > a to yield the same result, and at most only one between a<b and b<a to be true, but not both.

Return value

There is also the expectation that those operators return a bool, or at least a value that can be converted without an explicit cast to bool.

Transitivity

There are other, more hidden expectations, for example, that the operators have the transitivity property. For example if a==b and b==c, we would expect a==c. Same holds for the inequality operator, if a<b and b<c the we expect a<c.

As pure as possible

We expect these operators to be as pure as possible. It may not be possible for them to be completely pure, maybe some intermediary value needs to be calculated and some memory gets allocated, but we would expect that the result of those operators depends only on their input parameters. We also expect the environment to be changed as little as possible, even if it is not completely unreasonable to have some side effects, for example because of logging.

All those properties are taken for granted, and I’ve surely missed others. The problem is that none of those is checked at compile time since they are not enforced by the language.

Horror code

All the following samples have been tested with Clang and the -Weverything parameter and with GCC with -Wall -pedantic -Wextra and a lot of other parameters. None of this examples issued a single warning that operator== is doing something unexpected. Unfortunately, not even Cppcheck did complain. I did not test those samples with other static analysis tools, but I fear that the result would be the same.

For example, following code compiles without a hiccup:

#include <cassert>

struct brokenop{
	int v;
};
bool operator==(brokenop& lhs, brokenop& rhs){
	const bool toret = (lhs.v == rhs.v);
	++lhs.v;
	return toret;
}
int main(){
	brokenop a = {1};
	brokenop b = {2};
	assert(a.v == 1);
	assert(a == b);
	assert(a.v == 2);
}

It’s not wrong, it’s a completely valid operator== and the compiler is not obliged to give any diagnostic. But it breaks a lot of assumptions. And using such an operator will create a lot of headaches when interacting with the rest of the world, for example when using std::find on a std::vector<brokenop>.

This is another, completely valid C++ program that does not issue any warning, but hopefully will never find a place in any codebase.

#include <cstdlib>

struct brokenop{
	int v;
};
bool operator==(const brokenop&, const brokenop&){
	return std::rand() % 2 == 0;
}
int main(){
	const brokenop a = {2};
	const brokenop b = {2};
	return (a == b) ? 0 : 1;
}

If you want someone to go mad, you now know what to do ;-)

Another example that compiles without a warning and thus shows that the return values can be anything

struct brokenop{
	int v;
};
inline brokenop operator==(const brokenop& lhs, const brokenop&){
	return lhs;
}
int main(){
	const brokenop a = {2};
	const brokenop b = {2};
	brokenop c = (a == b);
	(void)c;
}

Of course in normal code basis you won’t find such monstrosities, but maybe something more like

#include <cassert>
struct brokenop{
	int v1;
	int v2;
};
inline bool operator==(const brokenop& lhs, const brokenop& rhs){
	return lhs.v1 == rhs.v1
		&& lhs.v2 == rhs.v1;
}
int main(){
	brokenop a = {2,2};
	brokenop b = {2,2};
	assert(a == b);
}

The error is more subtle, since it pop ups for only some values, which makes it more difficult to notice. If you simply need to compare an array of different values, you can, most of the time, use std::tie. It does not have any cost at runtime, and at compile time it only needs the tuple header:

#include <cassert>
#include <tuple>
struct good_op{
	int v1;
	int v2;
};
inline bool operator==(const good_op& lhs, const good_op& rhs){
	return std::tie(lhs.v1, lhs.v2) == std::tie(rhs.v1, rhs.v2);
}
int main(){
	good_op a = {2,2};
	good_op b = {2,1};
	assert(!(a == b));
}

does the spaceship operator change anything?

Apparently not:

#include <compare>

struct brokenop{
	friend brokenop operator<=>(brokenop, brokenop){
		return {};
	}
};

Don’t repeat yourself

Writing a comparison operator is normally not a difficult task. Just declare it taking two values by copy or const-reference and returning a bool, and half of the expectations are already satisfied. Implementing the symmetry and transitivity property isn’t in general difficult, even if it depends on which operation we are implementing.

The real problem is, you normally want two (for equality) or six (for ordering) of those comparison operators. And it is a lot of code that gets normally copied and pasted for every class and structure.

C++20 and beyond mitigates those problems thanks to operator<=>, unless you have some special requirement, or if you need to tune some operators for performance.

Otherwise given operator< you can normally implement all the others operator. If you only want equality, you can normally implement operator!= on top of operator==:

struct s_order{
	// ....
	friend bool operator<(const s_order& lhs, const s_order& rhs){
		// do the real comparison
	}
}
inline bool operator> (const s_order& lhs, const s_order& rhs){return  (rhs < lhs);}
inline bool operator<=(const s_order& lhs, const s_order& rhs){return !(rhs < lhs);}
inline bool operator>=(const s_order& lhs, const s_order& rhs){return !(lhs < rhs);}
inline bool operator==(const s_order& lhs, const s_order& rhs){return !(rhs < lhs) && !(lhs < rhs);}
inline bool operator!=(const s_order& lhs, const s_order& rhs){return  (lhs < rhs) ||  (rhs < lhs);}

struct s_equal{
	// ....
	friend bool operator==(const s_equal& lhs, const s_equal& rhs){
		// do the real comparison
	}
}
inline bool operator!=(const s_equal& lhs, const s_equal& rhs){return !(rhs == lhs);}

This has the implicit advantage that now all comparison operators will meet our expectations regarding the consistency because they are implemented according to the same rules.

Of course if operator< is doing some complex computation, it will not be a great idea implementing operator== with two comparisons. You will probably prefer to implement an operator== that makes the actual comparison and be potentially faster. Doing so will, of course, increase the chances to introduce some errors, for example, if the structure gets a new member and only operator< gets updated.

A more generic approach (for C++17 and less)

Even if is easy to meet all expectations, we still need to write the definition and implementation of those operators (in some cases you may be able to use the CRTP pattern to avoid this nuisance too). And we should also test them. Because if one operator is not implemented correctly, you will get errors that are hard to find or debug. So better be safe than sorry.

And since most test cases are very similar, and some comparison operators are so simple that it makes no sense to implement a test case, I’ve grouped some of my definitions and implementation together in a single project, and release it under the Boost Software license.

The usage of the library is pretty trivial, just include the header cop.hpp and use the provided macros to generate the comparison functions you want to base on operator< or operator==.

#include "cop.hpp"

// with macros, declare comparison operators as free functions
class myclass1{
	...
	friend bool operator<(const myclass1& lhs, const myclass1& rhs){
		// do actual comparison
	}
}
cop_gen_ordering_ops_fromlessthan(myclass1,inline)


// with CRTP
class myclass2 : private cop::less_comparable<myclass2>{
	...
	friend bool operator<(const myclass2& lhs, const myclass2& rhs){
		// do actual comparison
	}
}

cop works pretty well for "regular types", ie types that you want to be EqualityComparable and/or LessThanComparable. If you have different needs, you might need to define some operators by yourself.

When some assumptions are broken…​

There are actually use cases when some assumptions about the comparison operators are broken. I think that the most famous example are the nan values of type float and double. nan stands for "not a value" and does not represent any numeric value. This is the reason why comparing some number to nan does not really make sense. All comparison operators return false if nan is involved, except for != which always return true, for example

  • 1 == nan is false

  • 1 != nan is true

  • infinity <= nan is false

  • infinity >= nan is false

  • nan == nan is false

  • nan != nan is true

  • nan <= nan is false

  • nan >= nan is false

and so on. This means that most of the relations between the operators are broken for floating point types. For example we can’t rely anymore that a<b and !(a>=b) are equivalent. Interestingly operator< on floating points still defines a strict weak ordering and is therefore LessThanComparable, but unfortunately it is not EqualityComparable.

Unfortunately this also has some consequences.

The first one is that generic algorithms like std::find or std::sort are not able to fulfill their work for some fundamental types. Fortunately those algorithms have a second version that can take an user-provided comparator. This way we can for example, search a nan value inside a std::vector<double>.

The second consequence is that we are not able to completely automate the generation of comparison operators in a sensible way. If our class contain a float or double value, we can’t, or at least we shouldn’t, implement operator>= in terms of operator<, at least if this member variable could be nan.

Floating point is of course not the only type where the comparison rules may not behave as we expect. With C++17 we will have, for example, std::optional, and it’s comparison rules are pretty complex too.

Unfortunately std::array, std::pair, std::tuple and other containers rely on most assumptions listed above for implementing their own comparison operators, instead of using those provided by the type they contain. If you try to compile and execute the following piece of code

#include <array>
#include <tuple>
#include <utility>
#include <vector>

#include <cmath>
#include <iostream>

struct alwaysfalse{};
bool constexpr operator<(alwaysfalse,alwaysfalse){return false;}
bool constexpr operator<=(alwaysfalse,alwaysfalse){return false;}
bool constexpr operator>(alwaysfalse,alwaysfalse){return false;}
bool constexpr operator>=(alwaysfalse,alwaysfalse){return false;}
bool constexpr operator==(alwaysfalse,alwaysfalse){return false;}
bool constexpr operator!=(alwaysfalse,alwaysfalse){return false;}

int main(){
	const double f1 = std::nan("1");
	const double f2 = 1;
	std::tuple<double> t1 = {f1};
	std::tuple<double> t2 = {f2};
	std::vector<double> v1 = {f1};
	std::vector<double> v2 = {f2};
	std::array<double, 1> a1 = {{f1}};
	std::array<double, 1> a2 = {{f2}};
	std::pair<double, double> p1 = {f1, f1};
	std::pair<double, double> p2 = {f2, f2};

	std::cout << "double:\n";
	std::cout << "f1 < f2:  plain: " << (f1<f1)  << ", tuple: "<< (t1<t1)  << ", array: " << (a1<a1)  << ", pair: " << (p1<p1)  << ", vector: " << (v1<v1) << "\n";
	std::cout << "f1 > f2:  plain: " << (f1>f1)  << ", tuple: "<< (t1>t1)  << ", array: " << (a1>a1)  << ", pair: " << (p1>p1)  << ", vector: " << (v1>v1) << "\n";
	std::cout << "f1 <= f2: plain: " << (f1<=f1) << ", tuple: "<< (t1<=t1) << ", array: " << (a1<=a1) << ", pair: " << (p1<=p1) << ", vector: " << (v1<=v1) << "\n";
	std::cout << "f1 => f2: plain: " << (f1>=f1) << ", tuple: "<< (t1>=t1) << ", array: " << (a1>=a1) << ", pair: " << (p1>=p1) << ", vector: " << (v1>=v1) << "\n";
	std::cout << "f1 == f1: plain: " << (f1==f1) << ", tuple: "<< (t1==t1) << ", array: " << (a1==a1) << ", pair: " << (p1==p1) << ", vector: " << (v1==v1) << "\n";
	std::cout << "f1 != f1: plain: " << (f1!=f1) << ", tuple: "<< (t1!=t1) << ", array: " << (a1!=a1) << ", pair: " << (p1!=p1) << ", vector: " << (v1!=v1) <<  "\n";
	std::cout << "f1 != f2: plain: " << (f1!=f2) << ", tuple: "<< (t1!=t2) << ", array: " << (a1!=a2) << ", pair: " << (p1!=p2) << ", vector: " << (v1!=v2) << "\n";
	std::cout << "f1 == f2: plain: " << (f1==f2) << ", tuple: "<< (t1==t2) << ", array: " << (a1==a2) << ", pair: " << (p1==p2) << ", vector: " << (v1==v2) << "\n";

	std::cout << "alwaysfalse:\n";
	alwaysfalse af;
	std::tuple<alwaysfalse> t_af1 = {af};
	std::vector<alwaysfalse> v_af1 = {af};
	std::array<alwaysfalse, 1> a_af = {{af}};
	std::pair<alwaysfalse, alwaysfalse> p_af = {af, af};
	std::cout << "af < af:  plain: " << (af<af)  << ", tuple: "<< (t_af1<t_af1)  << ", array: " << (a_af<a_af)  << ", pair: " << (p_af<p_af)  << ", vector: " << (v_af1<v_af1) << "\n";
	std::cout << "af > af:  plain: " << (af>af)  << ", tuple: "<< (t_af1>t_af1)  << ", array: " << (a_af>a_af)  << ", pair: " << (p_af>p_af)  << ", vector: " << (v_af1>v_af1)<< "\n";
	std::cout << "af <= af: plain: " << (af<=af) << ", tuple: "<< (t_af1<=t_af1) << ", array: " << (a_af<=a_af) << ", pair: " << (p_af<=p_af) << ", vector: " << (v_af1<=v_af1)<< "\n";
	std::cout << "af => af: plain: " << (af>=af) << ", tuple: "<< (t_af1>=t_af1) << ", array: " << (a_af>=a_af) << ", pair: " << (p_af>=p_af) << ", vector: " << (v_af1>=v_af1)<< "\n";
	std::cout << "af == af: plain: " << (af==af) << ", tuple: "<< (t_af1==t_af1) << ", array: " << (a_af==a_af) << ", pair: " << (p_af==p_af) << ", vector: " << (v_af1==v_af1)<< "\n";
	std::cout << "af != af: plain: " << (af!=af) << ", tuple: "<< (t_af1!=t_af1) << ", array: " << (a_af!=a_af) << ", pair: " << (p_af!=p_af) << ", vector: " << (v_af1!=v_af1)<< "\n";
}

You should get the following output:

double:
f1 < f2:  plain: 0, tuple: 0, array: 0, pair: 0, vector: 0
f1 > f2:  plain: 0, tuple: 0, array: 0, pair: 0, vector: 0
f1 <= f2: plain: 0, tuple: 1, array: 1, pair: 1, vector: 1
f1 => f2: plain: 0, tuple: 1, array: 1, pair: 1, vector: 1
f1 == f1: plain: 0, tuple: 0, array: 0, pair: 0, vector: 0
f1 != f1: plain: 1, tuple: 1, array: 1, pair: 1, vector: 1
f1 != f2: plain: 1, tuple: 1, array: 1, pair: 1, vector: 1
f1 == f2: plain: 0, tuple: 0, array: 0, pair: 0, vector: 0
alwaysfalse:
af < af:  plain: 0, tuple: 0, array: 0, pair: 0, vector: 0
af > af:  plain: 0, tuple: 0, array: 0, pair: 0, vector: 0
af <= af: plain: 0, tuple: 1, array: 1, pair: 1, vector: 1
af => af: plain: 0, tuple: 1, array: 1, pair: 1, vector: 1
af == af: plain: 0, tuple: 0, array: 0, pair: 0, vector: 0
af != af: plain: 0, tuple: 1, array: 1, pair: 1, vector: 1

And as you can see, those containers are not simply a collection of values, they do also add a comparison semantic, that is normally different from the underlying type.

If you look at the documentation of std::vector, you’ll notice that it requires the underlying type to be EqualityComparable and LessThanComparable in order to use the corresponding comparison functions. std::vector also requires total ordering, which the float type does not satisfy.

Since double and alwaysfalse do not satisfy those concepts, it is, I guess, undefined behavior to invoke the comparison operator on the std::vector class. Unfortunately compilers (tested with GCC and Clang) are not able to detect (through a warning) this problem, at least for the floating point types, since the compiler knows it’s semantic.

How do we handle that situation?

Personally I do not like containers to adding an unclear semantic on the top of the contained type. I also do not find the provided comparisons (ordering especially) operations to be always sensible. My advice is, when working with containers, to use an algorithm instead of relying on the comparison operators. For example use std::equal instead of operator==, and std::lexicographical_compare instead of operator<, and so on.

Those algorithms will use internally operator== and operator< of the underlying type, and do also have an overload that accepts a comparison function.

Otherwise avoid adding comparison operators to classes where it does not make much sense. If you need to compare your classes in some ways, use a named function.