Lazy initialization, an alternative to double-checked locking


Double-checked locking is a programming pattern that emerges when optimizing the lazy initialization of some global data in a multithreaded environment.

One of the disadvantages of global variables is that if they are not trivial to create, they will consume resources even if unused. By initializing those variables only when needed, one can at least ensure that no resources are wasted unnecessarily.

While the pattern does not seem difficult, there are many (subtly) broken implementations, and verifying the correctness of an implementation is not trivial.

Where does this pattern "naturally" come up?

As far as I can see, it is often the result of an iterative approach bundled with the singleton pattern.

Warning ⚠️
I’m not promoting the usage of global variables through the singleton pattern (as described in Design Patterns - Elements of Reusable Object-Oriented Software), just pointing out a place where double-checked locking is often used.

Consider these two equivalent versions:

class Resource {
    private Resource() {}
    public static final Resource resource = new Resource();

    public static String answerOfLife(){
        return "42";
    }
}

The second version is the same as before, except that the instance is reached through an accessor function:

class Resource {
    private Resource() {}
    private static final Resource resource = new Resource();
    public static Resource getResource(){
        return Resource.resource;
    }
    public static String answerOfLife(){
        return "42";
    }
}

Contrary to other languages, globals/statics are initialized lazily in Java: when the class is initialized. A class is loaded and initialized the first time the runtime needs to access a symbol from it (for example, when one of its static functions is called); otherwise it would not be possible to have conditional dependencies.

Thus using a static variable might not be "lazy enough": calling the function Resource.answerOfLife() will initialize the class Resource, and thus Resource.resource too, even if it is never used.
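
To make the "not lazy enough" behavior visible, here is a minimal sketch. The names InitLog, EagerResource and EagerDemo are made up for this example; the flag in InitLog only records whether the constructor has run.

```java
// Hypothetical helper: records whether the EagerResource constructor has run.
class InitLog {
    static boolean resourceCreated = false;
}

// Same shape as the Resource class above, with a constructor that leaves a trace.
class EagerResource {
    private EagerResource() {
        InitLog.resourceCreated = true;
    }
    private static final EagerResource resource = new EagerResource();

    public static EagerResource getResource() {
        return resource;
    }
    public static String answerOfLife() {
        return "42";
    }
}

class EagerDemo {
    public static void main(String[] args) {
        // EagerResource has not been touched yet, so nothing was created.
        System.out.println(InitLog.resourceCreated); // false
        // Calling an unrelated static function initializes the whole class...
        EagerResource.answerOfLife();
        // ...and with it the static instance, even though nobody asked for it.
        System.out.println(InitLog.resourceCreated); // true
    }
}
```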

The laziest possible approach is to initialize Resource.resource when the function Resource.getResource() is called.

class Resource {
    private Resource() {}
    private static Resource resource = null;
    public static Resource getResource(){
        if(Resource.resource == null){
            Resource.resource = new Resource();
        }
        return Resource.resource;
    }
}

This approach works, but only when working in a single-threaded environment.

Since getResource can be called from multiple threads, it had better ensure that the variable is always initialized correctly.

The simplest solution is to synchronize the whole function:

class Resource {
    private Resource() {}
    private static Resource resource = null;
    public static synchronized Resource getResource(){
        if(Resource.resource == null){
            Resource.resource = new Resource();
        }
        return Resource.resource;
    }
}

The implementation is obviously correct (assuming there is no other function that accesses Resource.resource directly, reflection included), but it is inefficient (citation needed): every call acquires the lock, even long after the resource has been initialized.

A more efficient solution would be to verify if Resource.resource has been initialized without synchronizing. If it has already been initialized, then there is nothing to do. If not, lock and initialize the resource.

It might sound simple, but there are different things to pay attention to.

A first naïve implementation:

class Resource {
    private Resource() {}
    private static Resource resource = null;
    public static Resource getResource(){
        if(Resource.resource != null){
            return Resource.resource;
        }
        synchronized(Resource.class){
            Resource.resource = new Resource();
            return Resource.resource;
        }
    }
}

The bug is obvious after reading the code a couple of times: inside the synchronized block we need to verify (again) whether Resource.resource has been initialized in the meantime.

class Resource {
    private Resource() {}
    private static Resource resource = null;
    public static Resource getResource(){
        if(Resource.resource != null){
            return Resource.resource;
        }
        synchronized(Resource.class){
            if(Resource.resource != null){
                return Resource.resource;
            }
            Resource.resource = new Resource();
            return Resource.resource;
        }
    }
}

The implementation looks correct but still has one issue. Resource.resource = new Resource(); does multiple things: it creates a new Resource, and it assigns a value to Resource.resource, but the order in which another thread observes those effects is not specified. Since the first check happens outside the synchronized block, a caller of getResource might get a non-null but uninitialized (or not completely initialized) value back!

This can be fixed with the volatile keyword:

A write to a volatile field (§8.3.1.4 🗄️) happens-before every subsequent read of that field.

Finally, a thread-safe implementation that initializes the variable only once and locks only if necessary:

class Resource {
    private Resource() {}
    private static volatile Resource resource = null;
    public static Resource getResource(){
        if(Resource.resource != null){
            return Resource.resource;
        }
        synchronized(Resource.class){
            if(Resource.resource != null){
                return Resource.resource;
            }
            Resource.resource = new Resource();
            return Resource.resource;
        }
    }
}
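
Since verifying such code is not trivial, a small stress check can at least catch the grossest mistakes (for example a missing second check). The following is only a sketch: CountedResource and StressCheck are made-up names, the constructor counter exists only for the check, and a passing run does not prove that the implementation is correct.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Same double-checked locking as above, plus a counter of constructor calls.
class CountedResource {
    static final AtomicInteger constructions = new AtomicInteger();

    private CountedResource() {
        constructions.incrementAndGet();
    }

    private static volatile CountedResource resource = null;

    public static CountedResource getResource() {
        if (resource != null) {
            return resource;
        }
        synchronized (CountedResource.class) {
            if (resource != null) {
                return resource;
            }
            resource = new CountedResource();
            return resource;
        }
    }
}

class StressCheck {
    // Starts threadCount threads that call getResource() at roughly the same
    // time and returns how many times the constructor has run so far.
    static int run(int threadCount) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                try {
                    start.await(); // wait until all threads are ready
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                CountedResource.getResource();
            });
            threads[i].start();
        }
        start.countDown(); // release all threads at once
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return CountedResource.constructions.get();
    }

    public static void main(String[] args) {
        System.out.println(run(16)); // 1
    }
}
```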

According to Joshua Bloch there is still some room for improvement; the following implementation might be even more performant:

class Resource {
    private Resource() {}
    private static volatile Resource resource = null;
    public static Resource getResource(){
        Resource result = Resource.resource;
        if(result != null){
            return result;
        }
        synchronized(Resource.class){
            result = Resource.resource;
            if(result != null){
                return result;
            }
            result = new Resource();
            Resource.resource = result;
            return result;
        }
    }
}

By using a local copy of the reference (result), the accesses to the volatile field are halved (three accesses instead of six), and accessing a volatile field is slow enough for this to make a relevant difference.

In particular, the need for the local variable (result) may be unclear. What this variable does is to ensure that the field is read only once in the common case where it’s already initialized. While not strictly necessary, this may improve performance and is more elegant by the standards applied to low-level concurrent programming. On my machine, the method above is about 1.4 times as fast as the obvious version without a local variable.

— Joshua Bloch
Effective Java (Third edition)

Now imagine having to write some similar code, every time you want to initialize something lazily.

You might want to try to encapsulate it in a function, but by taking a different route, there is a much simpler solution.
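
For reference, the encapsulation idea could be sketched as a small generic wrapper. Lazy is a hypothetical class written for this note, not part of the Java standard library, and it assumes that the factory never returns null (null is used as the "not initialized yet" marker). If the factory throws, nothing is stored and the next call tries again.

```java
import java.util.function.Supplier;

// Hypothetical helper that hides the double-checked locking in one place.
final class Lazy<T> {
    private final Supplier<T> factory;
    private volatile T value = null;

    Lazy(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        T result = value; // single volatile read in the common case
        if (result != null) {
            return result;
        }
        synchronized (this) {
            result = value;
            if (result != null) {
                return result;
            }
            result = factory.get(); // assumed to return a non-null value
            value = result;
            return result;
        }
    }
}

class LazyDemo {
    // Counts how often the factory ran; only modified while Lazy holds its lock.
    static int calls = 0;

    static final Lazy<String> GREETING = new Lazy<>(() -> {
        ++calls;
        return "hello";
    });

    public static void main(String[] args) {
        System.out.println(GREETING.get()); // hello
        System.out.println(GREETING.get()); // hello
        System.out.println(calls);          // 1
    }
}
```

Even with such a wrapper, the field holding the value cannot be final, and every class that needs lazy data has to go through the helper.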

Alternatives

As mentioned, some languages provide a "built-in" solution to this pattern.

Java

In Java, global variables are already initialized lazily.

It is possible to delay the initialization even further by adding an indirection; a helper class, whose sole purpose is to hold the global variable.

By using a nested class, it also works for classes that do not have a public constructor:

class Resource {
    private Resource() {}
    private static class Helper {
        private static final Resource resource = new Resource();
    }
    public static Resource getResource() {
        return Helper.resource;
    }
}

Another advantage of this approach is that it makes it possible to mark the reference as final, and it is not necessary to use volatile.

The JVM ensures that static initializers run exactly once and that final variables are initialized correctly, even when multiple threads are trying to access them concurrently.

In this case, the class Helper is initialized when Resource.getResource() is executed for the first time, thus only when someone is trying to access the resource.
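
A minimal sketch of this behavior follows. HolderLog, HeldResource and HolderDemo are made-up names, and the flag only records whether the constructor has run.

```java
// Hypothetical helper: records whether the HeldResource constructor has run.
class HolderLog {
    static boolean created = false;
}

class HeldResource {
    private HeldResource() {
        HolderLog.created = true;
    }
    // The nested class is initialized only when Helper.resource is first accessed.
    private static class Helper {
        private static final HeldResource resource = new HeldResource();
    }
    public static HeldResource getResource() {
        return Helper.resource;
    }
    public static String answerOfLife() {
        return "42";
    }
}

class HolderDemo {
    public static void main(String[] args) {
        // Initializes HeldResource, but not its nested Helper class.
        HeldResource.answerOfLife();
        System.out.println(HolderLog.created); // false
        // The first access to Helper.resource creates the instance.
        HeldResource.getResource();
        System.out.println(HolderLog.created); // true
    }
}
```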

If the class Resource does not have any other static member variables or functions, adding the indirection is unnecessary; at that point one can even drop the function for accessing the resource:

class Resource {
    private Resource() {}
    public static final Resource resource = new Resource();
    // no other static data or functions
}

In case of exceptions

What if the initialization of your lazy-initialized variable might throw an exception?

Unfortunately, in this case relying on the JVM loading the class file has some drawbacks.

Take the following snippet as an example:

class Main {
    public static void main(String[] args) {
        System.out.println("Hello, World!");
        Resource.counter = 0; // make the first call to the Resource constructor fail
        Resource.getResource(); // returns null, ExceptionInInitializerError is caught
        Resource.counter = 1;
        Resource.getResource(); // throws NoClassDefFoundError
    }
}

class Resource {
    static public int counter = 0;
    private Resource() {
        if(counter == 0){
            ++counter;
            throw new NullPointerException();
        }

    }
    private static class Helper {
        private static final Resource resource = new Resource();
    }
    public static Resource getResource() {
        try{
            return Helper.resource;
        } catch(ExceptionInInitializerError ex){
            return null;
        }
    }
}

The first time the constructor of Resource is executed, it will throw a NullPointerException.

It is not possible to catch it directly, as the JVM wraps it in an ExceptionInInitializerError, hence the catch(ExceptionInInitializerError ex).

If you call Resource.getResource a second time, you’ll get a NoClassDefFoundError.

The JVM will not try to initialize the static member variable again!

Note 📝
This is a dumb example just to show how the JVM works. A more realistic example might be trying to load a configuration file, or another fallible action.

It is possible to catch the NoClassDefFoundError in Resource.getResource too, and return another object or null, but this is often not the desired approach.

I am not aware of any method for telling the JVM to reload the class file of Helper and execute the Resource constructor again, until Helper.resource is correctly initialized. It might be possible with a custom classloader, but it would hurt the portability of an application taking advantage of this feature.
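
If retrying after a failure is a requirement, one option is to give up on the class-initialization trick and go back to explicit synchronization. The following is only a sketch under that assumption: RetryingResource and RetryDemo are made-up names, counter mirrors the failure switch of the snippet above, and the whole function is synchronized for simplicity.

```java
class RetryingResource {
    // Hypothetical failure switch: the first construction attempt fails.
    static int counter = 0;

    private RetryingResource() {
        if (counter == 0) {
            ++counter;
            throw new IllegalStateException("initialization failed");
        }
    }

    private static RetryingResource resource = null;

    // Returns null if the initialization failed; the next call tries again,
    // because a failed attempt leaves the field untouched.
    public static synchronized RetryingResource getResource() {
        if (resource == null) {
            try {
                resource = new RetryingResource();
            } catch (IllegalStateException ex) {
                return null;
            }
        }
        return resource;
    }
}

class RetryDemo {
    public static void main(String[] args) {
        // First attempt: the constructor throws, nothing is stored.
        System.out.println(RetryingResource.getResource() == null); // true
        // Second attempt: the constructor succeeds.
        System.out.println(RetryingResource.getResource() == null); // false
    }
}
```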

C++

In C++, this pattern relies on local variables with static storage duration:

// header file
const std::string& get_data();

// source file
const std::string& get_data(){
    static const std::string data = "some data";
    return data;
}

Contrary to

// header file
const std::string& get_data();

// source file
const std::string data = "some data";

const std::string& get_data(){
    return data;
}

or

// header file
extern const std::string data;

// source file
const std::string data = "some data";

When initializing the data lazily, std::string data is always initialized before someone tries to access it, and only initialized if someone is accessing it.

This fixes both the initialization order fiasco, and the fact that the application might consume unnecessary memory if it does not need to access data.

The C++ language specifies that, in the absence of exceptions, the constructor of data is executed only once, even if multiple threads are calling get_data without synchronizing. If an exception is thrown, the constructor is executed again the next time the line that initializes data is executed.

In practice, it means that the line static const std::string data = "some data"; behaves as if there is a mutex that protects the initialization of data until data is initialized correctly.

Thus this method does not have the same limitations as the one presented in Java.

This language construct can be used for executing arbitrary code:

#include <stdexcept>

// not thread safe
// returns false on failure
bool init_subsystem();

// thread safe
void init_system(){
    static const int dummy = []{
        if(not init_subsystem()){
            throw std::runtime_error("failed to init subsystem");
        }
        return 0;
    }();
    (void)dummy; // silences unused-variable warnings
}

While init_subsystem cannot be used from multiple threads, init_system can, and also takes failure into account.

Such a pattern can be used for loading, for example, a library at runtime that requires some initialization.

The C++ standard library also provides a solution in the form of a library routine: std::call_once

#include <mutex> // std::call_once, std::once_flag
#include <stdexcept>

// not thread safe
// returns false on error
bool init_subsystem(int);

// thread safe
std::once_flag flag;

void simple_do_once()
{
    std::call_once(flag, []{
        if(not init_subsystem(42)){
            throw std::runtime_error("failed to init subsystem");
        }
    });
}

Note that contrary to the immediately invoked lambda, the return value (if there is any) is discarded, and an additional variable (std::once_flag) with the same scope as the data to be initialized/routine to be executed is necessary.

I do not think there is any situation where std::call_once should be preferred over a static variable unless the std::once_flag is shared between different function calls, or if you want to lazily initialize a (non-static) member variable of an instance of a class shared between multiple threads.

I have never had one of those use cases.

C

The C language lacks the language facilities provided by Java and C++ for lazily initializing global data, but since C11 it provides a library solution similar to the one in C++: call_once

#include <threads.h>

struct data{
    int value; // placeholder member; some data to be filled at runtime
};

struct data d = {0};
void init_data(void){
    // modify d
}
once_flag flag = ONCE_FLAG_INIT;

struct data* get_data() {
    call_once(&flag, init_data);
    return &d;
}

Note that contrary to the C++ facility, the C call_once version takes a function pointer without parameters.

Thus in C the initialization routine can generally only reach for global data.

POSIX and Windows

Both POSIX and Windows provide a library facility for their C ecosystem (thus only useful before C11): pthread_once 🗄️ and InitOnceExecuteOnce 🗄️. It is not specified what happens if an exception is thrown from the initialization routine, thus their usability in C++ is limited.

Does lazy initialization actually use the double-checked locking mechanism?

None of the provided alternatives documents whether the used mechanism is equivalent to the double-checked locking mechanism or not.

Nevertheless, the expectation is that those mechanisms are at least as efficient as the double-checked locking mechanism (otherwise they could be updated to use the double-checked locking mechanism too).

For example, the mechanism used by the class loader is one level below the interpreted bytecode, thus it sounds like it should be more efficient even if using something equivalent (and in fact, the JVM can do some optimization that it will not do when using a non-final volatile reference).

Marking the variables as const/final expresses the intent more effectively, and it also gives the compiler (and optimizers that work on the bytecode level) better opportunities to optimize the code.

Also, note that implementing the double-checked locking pattern is error-prone in C and C++ too 🗄️.

Final considerations

This pattern is often used together with the Singleton pattern, or when initializing libraries loaded at runtime.

Another place where it could come up is when initializing a non-static member variable that is accessed from multiple threads.

I’ve never had such a use case, but in this case the presented language features of C++ and Java do not have you covered, as they only work for global instances.
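
For such a member variable, the double-checked locking from the beginning of these notes can be applied per instance. The following is a sketch, not something I have used in practice: Endpoint, describe() and the computations counter are made-up names, and the computed string stands in for something expensive.

```java
// Hypothetical class with one lazily computed, per-instance value.
class Endpoint {
    private final String host;
    // volatile for the same reason as in the static case
    private volatile String description = null;
    // Counts how often the computation ran; only modified while holding the lock.
    int computations = 0;

    Endpoint(String host) {
        this.host = host;
    }

    String describe() {
        String result = description;
        if (result != null) {
            return result;
        }
        synchronized (this) { // lock on the instance, not on the class
            result = description;
            if (result == null) {
                ++computations;
                result = "connection to " + host;
                description = result;
            }
            return result;
        }
    }
}

class EndpointDemo {
    public static void main(String[] args) {
        Endpoint a = new Endpoint("a.example");
        Endpoint b = new Endpoint("b.example");
        System.out.println(a.describe());   // connection to a.example
        System.out.println(a.describe());   // connection to a.example
        System.out.println(b.describe());   // connection to b.example
        System.out.println(a.computations); // 1
    }
}
```

Each instance has its own lock and its own value, so initializing one instance does not block callers working on another one.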

