Thoughts on public virtual inheritance

After realizing how private inheritance can be used to restrict interfaces, it was unavoidable to rethink how and when to use public and virtual inheritance.

Notice that most thoughts mentioned in these notes are not about inheritance per se, but (dynamic) polymorphism. Inheritance is a low-hanging fruit for showcasing such problems, especially because it does not offer good tools for avoiding such problems. The same issues can be replicated with many other programming techniques.

Introduction to public (virtual) inheritance

In Java, polymorphism is mainly introduced through inheritance, even if it can be achieved in other ways.

One of the common examples used to introduce inheritance is through geometric/platonic figures, for example, a circle and an ellipse.

The code samples are in Java, as inheritance there is always public and methods are virtual by default. The same reasoning applies unchanged to C++ and possibly other languages, but it feels less natural, as it is more common to use other paradigms instead of class inheritance.

Ellipse is a superset of a circle

If we look at the mathematical definitions, an ellipse is a generalization of a circle. Thus all circles are also ellipses, while only some ellipses are also circles.

Thus if we have some function foo that works on an ellipse, it feels natural to be able to pass a circle without manually converting it to an ellipse.

If the classes are defined in the following way (implementation left to the reader):

class Ellipse {
	public Ellipse(double a, double b);
	public double a();
	public double b();
}

class Circle extends Ellipse {
	public Circle(double r);
	public double r();
}

Then there is no problem at all.

Circle expands the functionality of Ellipse by providing public double r(), but this is not a problem, as foo sees only the interface of Ellipse. Circle can also provide sensible implementations for a() and b(), i.e. r() == a() == b().

The problem arises if we change the interface of Ellipse, in particular, if we add mutating methods like set_a and set_b.
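One possible way to fill in the implementation left to the reader is sketched below. The field names, the body of foo, and the ImmutableDemo class are assumptions made for illustration; the text above only fixes the public signatures.

```java
// A minimal sketch of the immutable hierarchy; field names are assumptions.
class Ellipse {
	private final double a;
	private final double b;

	public Ellipse(double a, double b) {
		this.a = a;
		this.b = b;
	}
	public double a() { return a; }
	public double b() { return b; }
}

class Circle extends Ellipse {
	public Circle(double r) {
		super(r, r); // a circle is an ellipse with both semi-axes equal
	}
	public double r() { return a(); }
}

class ImmutableDemo {
	// hypothetical foo: it only sees the interface of Ellipse
	static double foo(Ellipse e) {
		return e.a() * e.b() * Math.PI;
	}

	public static void main(String[] args) {
		Circle c = new Circle(2);
		System.out.println(c.r() == c.a() && c.a() == c.b());
		System.out.println(foo(c) == foo(new Ellipse(2, 2)));
	}
}
```

As nothing can change after construction, a Circle passed to foo behaves exactly like the Ellipse with the same semi-axes.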

class Ellipse {
	public Ellipse(double a, double b);
	public double a();
	public double b();
	public void set_a(double a);
	public void set_b(double b);
}

class Circle extends Ellipse {
	public Circle(double r);
	double r();
}

set_a() and set_b() do not have an obvious implementation for Circle.

There are several possibilities:

  1. Not returning normally. The Java library does this in several places by throwing an exception. Unfortunately, this hides the underlying problem, which is a logical issue: a programming error. There is no good reason why a valid Ellipse should be unable to implement such basic functionality without throwing. A bad class hierarchy is IMHO not a good reason.

  2. Do something possibly unexpected, like making set_a equivalent to set_r. This destroys some "implicit" invariants.

The second approach breaks a lot of assumptions and makes the whole class difficult, if not impossible, to use. One might think about the Liskov substitution principle, but in reality, the assumption being broken is a much weaker one:

Ellipse e = new Circle(1);
foo(e);

void foo(Ellipse e){
	e.set_a(1);
	e.set_b(2);
	assert(e.a() == 1); // might fail
}

or

Ellipse e = new Circle(1);
foo(e);

void foo(Ellipse e){
	double a = e.a();
	double b = e.b();

	assert(e.area() == a*b*Math.PI);

	e.set_a(2*a);
	e.set_b(2*b);
	assert(e.area() == 4*a*b*Math.PI); // might fail
}

In those examples, calculating circumference and most other operations would fail.
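The second possibility can be made runnable with a small sketch. The protected fields, the overriding setters, and the SetterDemo helper are assumptions for illustration; the point is only that a caller holding an Ellipse cannot tell that set_b also changes a.

```java
class Ellipse {
	protected double a;
	protected double b;

	public Ellipse(double a, double b) { this.a = a; this.b = b; }
	public double a() { return a; }
	public double b() { return b; }
	public void set_a(double a) { this.a = a; }
	public void set_b(double b) { this.b = b; }
}

class Circle extends Ellipse {
	public Circle(double r) { super(r, r); }
	public double r() { return a; }

	// keep the circle invariant a == b by changing both semi-axes
	@Override public void set_a(double a) { this.a = a; this.b = a; }
	@Override public void set_b(double b) { this.a = b; this.b = b; }
}

class SetterDemo {
	// returns true if the expectation "set_b leaves a untouched" holds
	static boolean axesAreIndependent(Ellipse e) {
		e.set_a(1);
		e.set_b(2);
		return e.a() == 1;
	}

	public static void main(String[] args) {
		System.out.println(axesAreIndependent(new Ellipse(3, 4)));
		System.out.println(axesAreIndependent(new Circle(3)));
	}
}
```

The helper returns a boolean instead of using assert, because Java assertions are disabled unless the JVM is started with -ea.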

Of course, my current class had no documentation at all, so it might be strange to talk about an unexpected functionality or invariants taken for granted. But if I had to write some documentation, it would mimic the mathematical definition and thus state (at least implicitly) that a and b do not depend on each other. The same holds for the documentation of Circle: it would mimic the mathematical definition.

And then we have to define how set_a and set_b work on a Circle, which from a mathematical point of view does not make much sense, as Circle has only a radius.

So adding a mutating function broke our interface. One might argue that mutation seems to be the underlying issue, but reality is more complex.

Ellipse as a subclass of Circle

class Circle {
	public Circle(double r);
	void set_r(double r);
	double area();
}

class Ellipse extends Circle {
	Ellipse(double a, double b);
}

In this case, we have mutators, and made Ellipse a subclass of Circle.

Given foo(Circle), with those interfaces, it is possible to provide a sane implementation that does not break any expectation.

But not every Ellipse is a Circle, so why do we not have any issue with this hierarchy?

Let’s ignore construction, as constructors are not virtual. Querying the area is an operation possible on both mathematical objects, so it’s not problematic.

Setting a radius, contrary to set_a and set_b, is a sensible operation for both classes. In the case of Ellipse it’s like creating an Ellipse with a==b==r. It means that this particular instance of Ellipse is a Circle.

Before, when I added set_a and set_b, I had exactly this problem, as those functions would break the invariant of Circle.

The hierarchy works because, apart from construction, querying the area and setting the radius are operations possible on both platonic objects.
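A possible implementation of this inverted hierarchy is sketched below. The private fields and the InvertedDemo helper are assumptions; note that Ellipse has to carry an unused radius inherited from Circle, which already hints that the hierarchy is used for its interface rather than for its data.

```java
class Circle {
	private double r;

	public Circle(double r) { this.r = r; }
	public void set_r(double r) { this.r = r; }
	public double area() { return r * r * Math.PI; }
}

class Ellipse extends Circle {
	private double a;
	private double b;

	public Ellipse(double a, double b) {
		super(a); // the inherited radius is not used by Ellipse
		this.a = a;
		this.b = b;
	}
	// setting a radius turns this instance into an ellipse with a == b == r
	@Override public void set_r(double r) { this.a = r; this.b = r; }
	@Override public double area() { return a * b * Math.PI; }
}

class InvertedDemo {
	// hypothetical foo(Circle): uses only operations valid for both shapes
	static double foo(Circle c) {
		c.set_r(2);
		return c.area();
	}

	public static void main(String[] args) {
		System.out.println(foo(new Circle(1)) == foo(new Ellipse(1, 3)));
	}
}
```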

Should we thus invert the relation between classes when thinking about platonic objects?

Not every operation can be generalized

Short answer: no, inverting the relation generally does not help.

In this example, adding some non-mutating method (r()) can break our interface:

class Circle {
	public Circle(double r);
	public double r();
	public void set_r(double r);
	double area();
}

class Ellipse extends Circle {
	public Ellipse(double a, double b);
}

As only the Ellipse with a==b is a circle, not every ellipse has a radius. Thus the function r() breaks the class hierarchy.
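To make the problem concrete, the sketch below picks one of the unsatisfying options: r() throws for an Ellipse that is not a circle. The choice of UnsupportedOperationException and the RadiusDemo helper are assumptions for illustration, not a recommendation.

```java
class Circle {
	private double r;

	public Circle(double r) { this.r = r; }
	public double r() { return r; }
	public void set_r(double r) { this.r = r; }
	public double area() { return r * r * Math.PI; }
}

class Ellipse extends Circle {
	private double a;
	private double b;

	public Ellipse(double a, double b) { super(a); this.a = a; this.b = b; }

	// only an ellipse with a == b has a radius
	@Override public double r() {
		if (a != b) throw new UnsupportedOperationException("not a circle");
		return a;
	}
	@Override public void set_r(double r) { this.a = r; this.b = r; }
	@Override public double area() { return a * b * Math.PI; }
}

class RadiusDemo {
	// code written against Circle has no reason to expect r() to fail
	static double diameter(Circle c) { return 2 * c.r(); }

	static boolean diameterWorks(Circle c) {
		try {
			diameter(c);
			return true;
		} catch (UnsupportedOperationException e) {
			return false;
		}
	}

	public static void main(String[] args) {
		System.out.println(diameterWorks(new Circle(1)));
		System.out.println(diameterWorks(new Ellipse(1, 2)));
	}
}
```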

Specializing is easy as long as the data is immutable

In the first example, adding mutating methods broke the class hierarchy. In the second example, it was adding a non-mutating method that broke the hierarchy.

Generally speaking, specializing is not problematic if the class is immutable. If the data is immutable, an instance cannot change at runtime whether it is a special case representable by a subclass.

This means that it is fine to have Circle as a subclass of Ellipse, as long as it is not possible to convert an Ellipse that is a circle (because of its properties, not the type) to an ellipse that is not a circle.
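One way to keep the data immutable while still offering "changed" values is to return a new object instead of mutating the existing one. The with_a method below is a hypothetical addition, not part of the interfaces discussed so far; the sketch only shows that a Circle instance can never stop being a circle this way.

```java
class Ellipse {
	private final double a;
	private final double b;

	public Ellipse(double a, double b) { this.a = a; this.b = b; }
	public double a() { return a; }
	public double b() { return b; }

	// hypothetical non-mutating counterpart of set_a: the receiver stays untouched
	public Ellipse with_a(double newA) { return new Ellipse(newA, b); }
}

class Circle extends Ellipse {
	public Circle(double r) { super(r, r); }
	public double r() { return a(); }
}

class WithDemo {
	public static void main(String[] args) {
		Circle c = new Circle(1);
		Ellipse stretched = c.with_a(3);
		// c is still a circle, stretched is a plain Ellipse
		System.out.println(c.r() + " " + stretched.a() + " " + stretched.b());
		System.out.println(stretched instanceof Circle);
	}
}
```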

This does not generally mean that adding mutating methods is prohibited:

class Ellipse {
	public Ellipse(double a, double b);
	public double a();
	public double b();
	// used for drawing the objects
	void set_center(double c);
};

class Circle extends Ellipse {
	public Circle(double r);
	double r();
};

The problem is that it is generally more difficult to prove or ensure that all subclasses can implement the mutating methods. For example, if someone added CircleCenteredAt0 to the hierarchy, set_center would probably break it. Thus adding mutating methods is safe only if the author of the parent class has control over all subclasses. The reason is that any mutating method might break an invariant of a specialization.

On the other hand, it is easy to see that every subclass can implement all non-mutating methods. A valid implementation is to call the method of the superclass.
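The difference can be sketched as follows. The center() getter, the decision to silently ignore set_center, and the CenterDemo helper are assumptions for illustration; the center is a single double only because the interface above declares it that way.

```java
class Ellipse {
	private final double a;
	private final double b;
	private double center;

	public Ellipse(double a, double b) { this.a = a; this.b = b; }
	public double a() { return a; }
	public double b() { return b; }
	public double center() { return center; } // hypothetical getter
	public void set_center(double c) { this.center = c; }
}

class Circle extends Ellipse {
	public Circle(double r) { super(r, r); }
	public double r() { return a(); }
}

class CircleCenteredAt0 extends Circle {
	public CircleCenteredAt0(double r) { super(r); }

	// the invariant center == 0 leaves no sensible implementation:
	// here the request is silently ignored
	@Override public void set_center(double c) { }
}

class CenterDemo {
	// returns true if set_center has the effect a caller would expect
	static boolean moveWorks(Ellipse e) {
		e.set_center(5);
		return e.center() == 5;
	}

	public static void main(String[] args) {
		System.out.println(moveWorks(new Circle(1)));
		System.out.println(moveWorks(new CircleCenteredAt0(1)));
	}
}
```

The non-mutating methods a(), b(), r() and center() need no override at all in CircleCenteredAt0; the inherited implementations are already valid.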

What about mutating data?

Subclasses can both specialize and extend their parent class. Java uses extends as a keyword for defining subclasses; fortunately, other languages did not use such a specific word. As long as all the invariants, preconditions, and postconditions hold, there will be no problems.

I’ve just shown it with two examples, where the roles of Ellipse and Circle could be swapped depending on the interface of the classes.

Extending a superclass, on the other hand, is generally more difficult.

It seems that in most codebases this difficulty is not recognized. For one thing, every Java program has to deal with a bad class hierarchy because of the collection library. Thus, at the end of the day, even "bad" hierarchies seem to do their job pretty well.

Most programs do not have classes modeling real-world or mathematical entities. Those are, in my experience, mostly toy programs that show how class hierarchies work, and that fail miserably when adding functionalities.

Thus, when reading a class with a couple of getters and setters, we do not have as many expectations as we have when reading about geometrical figures.

The main issue with Ellipse and Circle is exactly that: all the expectations we (implicitly) have. It is of course possible to define an r() method for an ellipse, or a() for a circle, and implement something.

If this is the contract described by the parent class, and the subclass does not violate it, so be it, even if one could complain about the naming of such classes. As my Ellipse and Circle classes did not say anything about calculating an area or circumference, one could blame whoever is using the class for making such an assumption. Hopefully, no one would blame the user, but rather rethink the interface of the provided functionality.

Adding area(), r() and all possible mathematical operations is not ideal, as it is generally not possible to foresee all use cases. This also implies that foo(Ellipse) and foo(Circle) would have no reason to exist, as there should be a member function that already provides the desired functionality. In practice, this is hardly the case even for simple classes.

Ideally, an interface would expose at least the functionality one needs to implement all sensible operations.

We talked about adding or changing methods, what about removing (public) methods?

It is safe if we are talking about compiled languages, as the compiler will verify that no usage of the class still refers to the method. Removing the method might even make the class hierarchy more consistent. But it is safe only if the class hierarchy is used internally; otherwise someone else (a subclass or a consumer of the interface) might rely on the method we want to remove.

Are those issues/difficulties a property of class hierarchies?

No, as I mentioned at the beginning, the same issue appears in other places. With class hierarchies, it’s easy to show the issue because it pops up naturally and subclassing does not provide good solutions.

Actually, all sorts of polymorphic techniques, like implicit conversion, generics (Java), templates (C++), function overloading, macros (C), function pointers, and class hierarchies have the same underlying issue.

Static polymorphism (templates, function overloading, ...) gives the possibility to trigger an error at compile time, and thus to ensure that the issue is somehow tackled or avoided.

Consider for example (in C++)

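constexpr double pi = 3.141592653589793;

struct circle {
	explicit circle(double r) : r_(r) {}
	double a() const { return r_; }
	double b() const { return r_; }
	double r() const { return r_; } // notice: no sensible implementation for ellipse
private:
	double r_;
};

struct ellipse {
	ellipse(double a, double b) : a_(a), b_(b) {}
	double a() const { return a_; }
	double b() const { return b_; }
	void set_a(double a) { a_ = a; } // notice: no sensible implementation for circle
private:
	double a_;
	double b_;
};

// works for every type that provides a() and b()
template <class T>
double area(const T& obj) {
	return obj.a()*obj.b()*pi;
}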

As long as we use functions that are present in both Ellipse and Circle, everything is fine. If we try to use a function that makes sense only for one class, like r() or set_a(), and as long as we did not add a nonsensical/dummy/empty implementation for the other class, we will get a compilation error.

Thus dynamic polymorphism (function pointers, virtual inheritance, ...) has fewer tools for avoiding those errors at compile time. Maybe, in the case of dynamic polymorphism, r() is never used, so making Ellipse a subclass of Circle is not an issue in practice. But it introduces ambiguity, because we cannot rely on the declared interface anymore and need to check every possible implementation. And we cannot rely on the compiler to prevent someone from calling r() on an Ellipse.

Another, maybe more common, example is that a base class might require that all functions are pure (for example because of thread-safety). A subclass might add logging statements, thus its operations are not pure anymore, and thus break the contract.

In this case, we added functionality to existing methods in the subclass, we did not even need to add new functions in the base class. Even for static polymorphism, it is not that easy to ensure that those invariants are held all the time.
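A minimal sketch of such a contract violation is shown below. The class names and the in-memory log are assumptions for illustration; a real logging framework would have the same effect on purity. Nothing in the signature of area() tells the compiler, or the caller, that the override has a side effect.

```java
import java.util.ArrayList;
import java.util.List;

class Circle {
	private final double r;

	public Circle(double r) { this.r = r; }

	// documented contract (not enforceable by the compiler): no side effects
	public double area() { return r * r * Math.PI; }
}

class LoggingCircle extends Circle {
	// shared mutable state, and not thread-safe
	static final List<String> log = new ArrayList<>();

	public LoggingCircle(double r) { super(r); }

	@Override public double area() {
		log.add("area called"); // side effect: the method is not pure anymore
		return super.area();
	}
}

class PurityDemo {
	public static void main(String[] args) {
		Circle c = new LoggingCircle(1);
		c.area();
		c.area();
		// same return values as Circle, but observable state has changed
		System.out.println(LoggingCircle.log.size());
	}
}
```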

Thus, generally speaking, extending in any form is, or at least can be, problematic. This is also what is stated by the Liskov substitution principle, and by the observation that all non-trivial abstractions are, to some degree, leaky.

Taken to an extreme, even specializing can be an issue, as it can change the behavior of the software, like improving performance or reducing CPU consumption (although this is normally considered an advantage), or changing logging statements (which might be an issue).

When should we use class hierarchies?

When talking about inheritance, I find it much easier not to speak about platonic entities or objects in real life.

It is easier to think about what functions are exposed, what the intended usage is, and what their preconditions and postconditions are. Doing so for platonic entities and real-life objects generally leads to infinite lists of examples and counterexamples.

Notice that I’m not proposing to ban all "bad hierarchies", also because it is not always easy to identify them, and as always, it’s a tradeoff between multiple factors.

Apart from code compatibility, which I suppose played the major role for the Java Collection classes, there are other criteria to take into account.

We might be in control of all or most subclasses, and introducing a not-perfect class hierarchy might remove most of the duplicated code. After all, the ability to publicly subclass another class is a form of code reuse. All thoughts and considerations about invariants, conditions, and contracts serve to reduce the complexity of the code and make it easier to reason about; they are not dogmas. Reducing the quantity of code can also help to reduce the workload for humans, thus it should not be dismissed so easily.

Sometimes, introducing an unnecessary class hierarchy (for the scope of the application) helps to ease testing, and for the scope of the test, we do not necessarily need, or even want, to implement all methods:

class Circle {
	public Circle(double r);
	public double r();
	public String label();
}

class Circle_without_label extends Circle {
	public Circle_without_label(double r);
	public String label(){
		System.exit(42); // or something else to ensure that the testsuite does not call this method at runtime
		return null; // never reached, but required by the compiler
	}
}
// ...

void test_area(){
	Circle c = new Circle_without_label(1);
	double area = calculate_area(c);
	assert(area == Math.PI);
}

In this case, we are applying the Liskov substitution principle to check the correctness of our program.

If calculate_area is implemented correctly, then any class, in this case, Circle and Circle_without_label, can be used without affecting the correctness of the program.

Last, but not least, another reason for adding a class hierarchy is to reduce and break dependencies between modules.

So multiple reasons might conflict with each other.

While ideally, we should prioritize correctness, "good enough" is what we get to work with, and sometimes we get even less. Platonic interfaces also do not automatically prove that some piece of code is correct.

For correcting mistakes it is necessary to understand why we have a hierarchy, as there are multiple reasons, and it turns out that many times a hierarchy is not even needed. Classes like Circle and Ellipse do not necessarily need a class hierarchy; it seems nice and polished, but depending on the exposed functionality it is not.

For code reuse, we can also leverage other mechanisms.

Some might need more or less code, and others might need more or less resources at runtime, or more or less time to get compiled.

Make an operation optional

This is a strawman argument, because if every operation is optional, the semantics of the class are not clear.

On the other hand, for some types of operations, it is a good approach to declare nearly all operations as optional or possibly failing. The paradigm "Everything is a file" in POSIX systems (or "Everything is an object" on Windows systems) is what makes it possible to avoid a lot of code duplication, and for such a paradigm to work, it needs to have nearly every function marked as failing or optional.

While the file and object interfaces are not represented by class hierarchies (as those do not exist in C), but by an opaque type, the conditions are the same for class hierarchies.

We use types to demarcate what operations are possible, but then, because there are so many different kinds of instances of this type, we end up marking nearly all of those operations as optional.

Consider a write-only class hierarchy:

class File {
    File(String filename);
    void write(String data);
}

class DevNull extends File {
    DevNull();
}

DevNull would be a class that mimics /dev/null (or NUL on Windows), and compared to File it gives important optimization opportunities, as writing to the disk is generally an expensive operation. So, if we are not interested in the output, just as we redirect the output of programs to /dev/null, we can do something similar by subclassing File and implementing all operations as a no-op.

But adding other methods might make the class hierarchy inconsistent, for example, there is no sensible way to implement a size or read method that would succeed.

void foo(File f){
	String content = f.read();
	assert(content.equals(""));
	f.write("Hello");
	String content2 = f.read();
	assert(content2.equals("Hello"));
}

As read might fail even for File, because a file is an external resource that might get changed by other processes too, having a subclass that always throws is not as bad as it was with Circle and Ellipse.

On the other hand, if we had a "readonly" class hierarchy:

class File {
	File(String filename);
	String read();
}

class DevZero extends File {
	DevZero();
}

where DevZero mimics /dev/zero.

This is also a valid hierarchy, and again provides important optimization possibilities, but adding a write method breaks the hierarchy, as there is no way to implement the functionality without causing some surprises.

Again, marking all operations as optional, and thus implementing a throwing write for DevZero, is not that bad, as write can already fail for other reasons.

In this case, as all those methods can also fail for an implementation that provides the requested functionality, the class hierarchy does not increase the code complexity or add new execution paths.
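The argument can be sketched as follows, under the assumption that every operation is declared as possibly failing with an IOException. InMemoryFile, the return value of DevZero.read, and the FailingDemo helper are hypothetical and only there to keep the example self-contained.

```java
import java.io.IOException;

abstract class File {
	abstract String read() throws IOException;
	abstract void write(String data) throws IOException;
}

// stand-in for a real file, so that the sketch does not touch the disk
class InMemoryFile extends File {
	private final StringBuilder content = new StringBuilder();

	@Override String read() { return content.toString(); }
	@Override void write(String data) { content.append(data); }
}

class DevZero extends File {
	// assumption: a read yields a single zero character
	@Override String read() { return "\0"; }

	@Override void write(String data) throws IOException {
		throw new IOException("write not supported");
	}
}

class FailingDemo {
	// the caller has to handle a failing write for any File anyway,
	// so DevZero does not add a new execution path
	static boolean tryWrite(File f, String data) {
		try {
			f.write(data);
			return true;
		} catch (IOException e) {
			return false;
		}
	}

	public static void main(String[] args) {
		System.out.println(tryWrite(new InMemoryFile(), "Hello"));
		System.out.println(tryWrite(new DevZero(), "Hello"));
	}
}
```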

One could argue that it would be better to provide a class hierarchy for read-only and write-only files:

class ROFile {
	ROFile(String filename);
	String read();
}

class DevZero extends ROFile {
	DevZero();
}

class WOFile {
	WOFile(String filename);
	void write(String data);
}

class DevNull extends WOFile {
	DevNull();
}

but then we are missing a type that can be read and written to.

Removing the "only" attribute permits us to create a subclass that provides both operations:

interface RFile {
	String read();
}

class DevZero implements RFile {
	DevZero();
}

interface WFile {
	void write(String data);
}

class DevNull implements WFile {
	DevNull();
}

interface File extends RFile, WFile {
}

class FileOnDisk implements File {
	FileOnDisk(String filename);
}

In this case, I also needed to change class to interface, as Java does not support multiple inheritance of classes. With this hierarchy, the issues we had before when mixing read and write methods are gone, unless someone adds a method that should not belong to the interface.

An example would be adding a filename() method that returns a representation of the filename on the file system.

This would break the hierarchy if someone provided an implementation that stores the data directly in RAM, as there would be no file on the drive.
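A runnable variant of this interface hierarchy could look like the sketch below. MemoryFile replaces FileOnDisk so that the example does not depend on the file system; it, the copy helper, and the value returned by DevZero.read are assumptions for illustration.

```java
interface RFile {
	String read();
}

interface WFile {
	void write(String data);
}

interface File extends RFile, WFile {
}

class DevZero implements RFile {
	public String read() { return "\0"; } // assumption: a single zero character
}

class DevNull implements WFile {
	public void write(String data) { } // discards everything
}

// hypothetical in-memory implementation supporting both operations
class MemoryFile implements File {
	private final StringBuilder content = new StringBuilder();

	public String read() { return content.toString(); }
	public void write(String data) { content.append(data); }
}

class InterfaceDemo {
	// the signature states exactly which capability each argument needs
	static void copy(RFile from, WFile to) {
		to.write(from.read());
	}

	public static void main(String[] args) {
		MemoryFile source = new MemoryFile();
		source.write("Hello");
		MemoryFile target = new MemoryFile();
		copy(source, target);
		copy(source, new DevNull());
		// copy(new DevNull(), target); // does not compile: DevNull is not an RFile
		System.out.println(target.read());
	}
}
```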

So, again, adding a completely orthogonal method might break some implementations.

Of course, it would be possible to add a new interface, but this approach leads to an explosion of interfaces, while in practice there might be just a couple of implementations. One could argue that the design of those classes is more polished, as every interface has a single responsibility (at the end a single method), but it makes understanding the actual code and documentation more difficult.

For one thing, all methods are decoupled, while in reality there is a certain dependency between them. Thus, without any context, it is harder to explain what the intention of a function is.

For example, suppose we want to have a Writeable interface, with only a write method. We can entangle it together with other interfaces like Readable, with only a read method and Closeable with only a close method. But it is hard to document properly those interfaces, as they could be used also in other places, because the operations they provide, without further context, are too generic.

And if we are not going to use them in other places, then what are the benefits of extracting those functions into separate classes and interfaces? Doing so tends to create a system with thousands of moving parts and interdependencies, where only one or two combinations work by design, are sensible, or are needed. It’s an overengineered system.

In practice, marking all operations as failing gives a more consistent system, especially because most of those operations might fail for different reasons even in an implementation that provides the requested functionality.

How to better restrict interfaces

It depends on the language. When talking about interfaces, function signatures, and documentation, the first thing that comes to mind is

  • the number and types of parameters that a function accepts as input

  • the return type

  • the function name

In Java, we can even statically encode what types of exceptions can be thrown (but there are also drawbacks), while in C++ we can use noexcept to prevent a subclass from throwing something like "not implemented". GCC has language extensions to mark a function as pure/without side effects, but those attributes, unfortunately, are not part of the function signature, thus they do not affect overridden methods. But even so, none of those techniques can prevent a subclass from returning dummy values or doing other nonsensical operations.

While it is generally not possible to avoid such implementations (dynamically and statically), having something like contracts that can be enforced through class hierarchies would prevent some types of unfortunate design decisions and help to detect implementation errors.

Another possibility, without resorting to external tools, is to make all public functions non-virtual (or final) and provide protected or private functions as customization points.

A trivial example would be:

class Ellipse {
	private double a;
	private double b;
	public Ellipse(double a_, double b_){
		this.a = a_;
		this.b = b_;
	}
	final double a(){ return this.a; } // !
	final double b(){ return this.b; } // !
	double area_impl(){
		return this.a * this.b * Math.PI;
	}
	final double area(){
		double area = area_impl();
		assert(area == this.a*this.b*Math.PI);
		assert(area>=0.0);
		return area;
	}
}

class Circle extends Ellipse {
	public Circle(double r){
		super(r, r);
	}
	final double r(){
		return this.a();
	}
	double area_impl(){
		double r = this.r();
		return Math.pow(r, 2)*Math.PI;
	}
}

As area() cannot be overridden, we can put some logic there for testing that preconditions and postconditions hold, for example, that the return value should not be negative.

By doing so, we are restricting and controlling the customization points in the class hierarchy. A public virtual (and non-final) function, in contrast, gives any subclass much more freedom in practice, and makes it harder to detect possible errors.
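The following sketch shows how such a final wrapper detects a misbehaving subclass. BrokenEllipse and the NviDemo helper are hypothetical, and an explicit check with an exception is used instead of assert, since Java assertions are disabled unless the JVM is started with -ea.

```java
class Ellipse {
	private final double a;
	private final double b;

	public Ellipse(double a, double b) { this.a = a; this.b = b; }
	public final double a() { return a; }
	public final double b() { return b; }

	// the only customization point for subclasses
	protected double area_impl() { return a * b * Math.PI; }

	// cannot be overridden, so the postcondition is checked for every subclass
	public final double area() {
		double area = area_impl();
		if (!(area >= 0.0)) {
			throw new IllegalStateException("area must not be negative: " + area);
		}
		return area;
	}
}

// hypothetical subclass with a deliberately wrong implementation
class BrokenEllipse extends Ellipse {
	public BrokenEllipse() { super(1, 1); }
	@Override protected double area_impl() { return -1.0; }
}

class NviDemo {
	static boolean areaIsAccepted(Ellipse e) {
		try {
			e.area();
			return true;
		} catch (IllegalStateException ex) {
			return false;
		}
	}

	public static void main(String[] args) {
		System.out.println(areaIsAccepted(new Ellipse(1, 2)));
		System.out.println(areaIsAccepted(new BrokenEllipse()));
	}
}
```

The check is written as !(area >= 0.0) so that a NaN result is rejected as well.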

As already mentioned, unfortunately, it is not always easy, efficient, or even possible to test everything that is stated in the documentation, or expected by the programmer.