Sharing data between threads

There are many hard problems to solve when programming: naming things, manual resource management, error safety (in C++ also known as exception safety, which is unfortunate since it is not limited to exceptions), parallelism and multithreading.

Multi-threaded programs are very difficult to write correctly and to debug. The most useful advice is simply to "avoid it": manage threads at a higher level, and reduce the places where threading issues can hide.

It is also extremely useful to reduce the sharing of mutable data to a minimum (since immutable data is race-free). This is another reason to avoid non-constant globals, and to prefer copying data between threads, even if it might be a little more expensive.

But from time to time there might be a need to share something mutable, and it is important to know what tools and techniques we have at our disposal.

Since C++11 there is, as in many other programming languages, a set of primitives for synchronizing data and managing threads.

The best-known and most widely used tool for ensuring consistency between threads when sharing mutable data is a mutex (std::mutex, std::timed_mutex, std::recursive_mutex and std::recursive_timed_mutex in C++11).

Manual locking is error-prone

Mutexes are fairly easy to use: in the simplest case, just create an std::lock_guard (or std::scoped_lock since C++17), and reads and writes to the data will be consistent across all threads:

#include <mutex>
#include <future>
#include <cassert>

void foo(std::mutex& m, int& d, int v) {
	std::scoped_lock<std::mutex> _(m);
	d += v;
}

int main() {
	int data = 1;
	std::mutex mutex_data;
	auto f1 = std::async(std::launch::async, [&](){foo(mutex_data, data, 5);});
	auto f2 = std::async(std::launch::async, [&](){foo(mutex_data, data, 1);});
	f1.get();
	f2.get();
	assert(data==7);
}

While easy to explain, this technique has two big drawbacks.

  • There is nothing that suggests that, while reading or modifying data, we need to lock an std::mutex, apart from documentation or previously written code.

  • We still need to document how to lock the std::mutex (just lock, use a try_lock, …)

Some of those questions are really similar to those you ask yourself when handling memory manually:

  • how to free a resource (delete, delete[], free, …)

  • when to free it

Thus the answer is to avoid those questions entirely and encapsulate those operations so that we do not perform them manually, just as we do not manage memory manually but have encapsulated that functionality in constructors and destructors.

When locking a mutex, it means we want to synchronize data. Synchronizing data means controlling when a value is read, set or modified.

Thus if we could provide some sort of "getters" or "setters" that ensure that the mutex is locked while the data is used, we would have solved our problem.

Bind getters and mutexes together

Of course, a naive getter like the following would buy us nothing (unless values are copied, but then it would not really be sharing mutable data):

#include <mutex>

struct foo {
		explicit foo(int i_) : i{i_} {}
		int& get(){
			std::scoped_lock<std::mutex> _{m};
			return i;
		}
		const int& get() const {
			std::scoped_lock<std::mutex> _{m};
			return i;
		}
	private:
		int i;
		mutable std::mutex m;
};

When get() returns, the lock is released, so changing the underlying value of foo is still not thread-safe. We could return a more complex object:

#include <mutex>
#include <future>
#include <cassert>

template<class T, class mutex = std::mutex>
struct lock_obj {
	lock_obj(T& obj_, mutex& m_) : obj{obj_}, l{m_} {}
	T& obj;
	std::scoped_lock<mutex> l;
};

struct foo {
		explicit foo(int i_) : i{i_} {}
		lock_obj<int> get(){
			return lock_obj{i, m};
		}
		lock_obj<const int> get() const {
			return lock_obj{i, m};
		}
	private:
		int i;
		mutable std::mutex m;
};

int main(){
	foo f {5};
	auto f1 = std::async(std::launch::async, [&](){f.get().obj++;});
	auto f2 = std::async(std::launch::async, [&](){f.get().obj++;});
	f1.get();
	f2.get();
	assert(f.get().obj==7);
}

This way, as long as the lock_obj is in scope, the value stays locked.

To my eye, this method has two major drawbacks.

The first is that it is possible to create a deadlock very easily:

foo f{5};
auto g = f.get();
f.get(); // deadlock

This is, of course, also true when locking a mutex manually, so we are not losing anything. Maybe we should name the function lock_and_get to make the intent clearer to the reader.

The second issue is that it is easy to save a reference to the internal value and bypass the mutex, while the code looks perfectly safe.

foo f{5};
auto& g = f.get().obj;
g++; // bypassing mutex

Again, we can limit those problems by using a better naming convention.

While lock_obj encapsulates the data and the mutex together, it is only a helper for managing it. We did not really put the shared data with the mutex together. We still need to remember to use lock_obj, instead of modifying the underlying data directly.

Bind the data and mutex together

For the simple case of a shared instance, where we always need to lock a mutex, we can provide a wrapper class

#include <mutex>
#include <type_traits>

template<class T, class mutex = std::mutex>
class mutexed_obj {
		T obj;
		mutable mutex m;
	public:
		template<typename... Args>
		explicit mutexed_obj(Args&&... args) : obj{std::forward<Args>(args)...} {}

		template<class F>
		friend auto lock(const mutexed_obj& mo, F f){
			std::scoped_lock<mutex> l{mo.m};
			return f(mo.obj);
		}
		template<class F>
		friend auto lock(mutexed_obj& mo, F f){
			std::scoped_lock<mutex> l{mo.m};
			return f(mo.obj);
		}
};

The locking logic lives near the data itself, instead of being applied manually wherever the data is accessed, thus removing most possible misuses.

Is this class completely safe? Can the data (obj) escape the scope of the mutex?

Unfortunately, the class is not completely safe: since the callback f has access to obj, it can store a pointer or reference to it somewhere and use it later, bypassing the std::scoped_lock. This will always be possible.

There is also another possible misuse, shown by the following snippet:

mutexed_obj<int> mo{};
lock(mo, [&](int&) {
	lock(mo, [](int&){}); // deadlock
});

This will cause a deadlock; it is the equivalent of calling get() on lock_obj multiple times. Unfortunately, there is no foolproof compile-time method for avoiding this type of issue.

Of course, the following class is also completely sound:

#include <mutex>
#include <type_traits>

template<class T, class mutex = std::mutex>
struct lock_obj {
	lock_obj(T& obj_, mutex& m_) : obj{obj_}, l{m_} {}
	T& obj;
	std::scoped_lock<mutex> l;
};

template<class T, class mutex = std::mutex>
class mutexed_obj2 {
		T obj;
		mutable mutex m;
	public:
		template<typename... Args>
		explicit mutexed_obj2(Args&&... args) : obj{std::forward<Args>(args)...} {}
		auto lock_and_get() {
			return lock_obj{obj, m};
		}
		auto lock_and_get() const {
			return lock_obj{obj, m};
		}
};

While it has the same drawbacks and possible issues, it is, in my opinion, easier to misuse than the callback approach.

Another drawback of both methods, AFAIK unavoidable, is structure padding. It means that generally sizeof(mutexed_obj<foo>) >= sizeof(std::mutex) + sizeof(foo), which may or may not be a problem, depending on your use case.

volatile

volatile is probably the least known qualifier (its big brother is const, while mutable is "just" a specifier), and it has probably also been misused a lot in the past in attempts to get thread safety. Thus today's advice is to avoid it, since it has nothing to do with threads or concurrency. It still seems to be useful for hardware interaction, even if compilers apparently have difficulties respecting the author's intent because of the many different interpretations, and I personally have never really seen it used in practice.

Still, back in 2001, Andrei Alexandrescu wrote an extremely interesting article on Dr. Dobb's on how to exploit volatile and the type system for limiting the use of a class interface.

To sum it up:

  • it is possible to define a class or struct with volatile methods, just as it is possible to define const methods.

  • a volatile instance can only execute volatile methods (similar to const), thus giving us the possibility to provide a subset of the class interface when we have a volatile instance.

Of course the same holds for free functions accepting volatile parameters.

How can this be useful? Andrei Alexandrescu marked the member functions that are thread-safe as volatile. If an instance needs to be shared between threads, just create a volatile instance. As a bonus, we can also add overloads, although I'm not completely sold that overloading on volatile is generally a good thing, just as you do not normally overload on const, except for accessor functions.

For example:

struct foo{
	void bar1() volatile {} // thread-safe
	void bar2() volatile {} // thread-safe
	void bar2() {} // non-thread-safe implementation, but faster than the thread-safe one
	void bar3() {} // not thread-safe
};


int main() {
	foo f1 {};
	f1.bar1();
	f1.bar2(); // calls the fast, non thread-safe implementation
	f1.bar3();

	volatile foo f2 {};
	f2.bar1();
	f2.bar2(); // calls the slow thread-safe implementation
	//f2.bar3(); // this line does not compile, as there is no thread-safe bar3 method
}

Unless we control all instances and classes that are shared between threads, the usability seems limited. This is not a mainstream technique, therefore most classes will not have volatile member functions marking them as thread-safe.

We also want to be able to call thread-unsafe methods, protect them with a synchronization primitive like a mutex, and mark the resulting method as thread-safe.

This is achieved in Andrei Alexandrescu's article through a const_cast inside a LockingPtr, which locks a mutex and provides a non-volatile "view" of the object. Here is a shortened version of LockingPtr; since the article predates C++11, I have also updated it to use the standard library facilities.

/// REM_VOLATILE only casts the volatile qualifier away;
/// it is written in uppercase as it should rarely show up in your code.
template<class T>
T& REM_VOLATILE(volatile T& t){
	return const_cast<T&>(t);
}
template<class T>
T* REM_VOLATILE(volatile T* t){
	return const_cast<T*>(t);
}
template<class T>
const T& REM_VOLATILE(const volatile T& t){
	return const_cast<const T&>(t);
}
template<class T>
const T* REM_VOLATILE(const volatile T* t){
	return const_cast<const T*>(t);
}

template <class T>
struct unique_lock_val {
	unique_lock_val(volatile T& obj, std::mutex& mtx) : obj(REM_VOLATILE(obj)), l(mtx) {
	}
	unique_lock_val(volatile T& obj, volatile std::mutex& mtx) : obj{REM_VOLATILE(obj)}, l{REM_VOLATILE(mtx)} {
	}
	T& obj;
	std::scoped_lock<std::mutex> l;
};

And now we can use those tools as follows:

#include <mutex>
#include <vector>

int main(){
	std::vector<int> tl;
	tl.push_back(1); // this instance is not volatile, so we use it as always

	std::mutex m;
	volatile std::vector<int> ts;
	//ts.push_back(1); // push_back is not marked volatile, thus this line would not compile
	unique_lock_val lv(ts, m);
	lv.obj.push_back(1); // unique_lock_val provides a non-volatile view while holding the mutex, thus this push_back is thread-safe
}

The technique described by Andrei Alexandrescu holds some similarities with lock_obj, but it is not quite the same.

There are also two issues with this method.

The first one is that this method does not work on primitive types like int, char or enums, but only on structures and classes. With this technique, we have no way to prevent errors on those types, thus it is possible to write:

#include <mutex>

struct foo {
	int i;
};

int main() {
	std::mutex m;
	volatile foo f{};
	f.i++; // oops, forgot to lock
}

While it is possible to provide a structure that encapsulates a generic integer type, we still need to remember to actually use it.

The second issue is that compilers today exploit undefined behavior more and more; a common exploitation is to mark code that is undefined as dead and remove it. This makes sense in many situations, but it is also hard to debug when the effect is unintended.

While the methodology described by Andrei Alexandrescu seems appealing, a const_cast is usually a bad sign. And it turns out that, just as casting const away from a variable declared const and mutating it through the non-const reference is undefined, doing the same with volatile is undefined too. Thus, while the example with volatile std::vector<int> and unique_lock_val seems to work, it might break without any apparent change.

But if we think about it, similarly to deprecate and delete, we are not really interested in volatile instances per se. We are not even interested in what a volatile variable really means, or in what optimisations the compiler is allowed to do on it. We just want the compiler to check for us that we lock the mutex. Therefore we can have a "hidden" non-volatile variable and a visible volatile reference to it. In this case, just as with const, casting volatile away will not trigger undefined behavior.

We can achieve this effect with a simple class:

#include <type_traits>
#include <utility>

template<class T>
class volatile_obj {
		T obj;
		static_assert(!std::is_volatile<T>::value, "do not use to store volatile objects");
		static_assert(std::is_class<T>::value, "needs to be a class type");
	public:
		template<typename... Args>
		explicit volatile_obj(Args&&... args):obj{std::forward<Args>(args)...}, value{obj} {}
		volatile T& value;
};

The static_assert on std::is_class may seem out of place.

Since we want to use this structure to exploit the type system around volatile, and that trick does not work on non-class types, the assert will not hinder usability. Of course, it is not a complete check: while we will not be able to create a volatile_obj<int>, we are still able to create a volatile_obj<std::pair<int, int>>, which has exactly the same issues. Still, it might prevent some errors.

While this fixes the UB issue, casting still seems a hacky method: for the reader of a cast, without more context, it is not possible to know whether it is legitimate or not.

But given volatile_obj, we can add a friend function and access the "hidden" non-volatile instance directly, for example:

template<class F, class mutex = std::mutex>
friend auto lock(mutex& m, volatile_obj<T>& o, F f){
	std::scoped_lock<mutex> l{m};
	return f(o.obj);
}

All good then?

In general sizeof(volatile_obj<T>) > sizeof(T), so this class adds overhead, whereas, were it not for UB, volatile alone would add none, only compile-time checks. Since structures with a single member are normally not padded (tested with GCC, Clang, and MSVC), instead of storing a reference we can use a slightly less nice function call:

template<class T>
class volatile_obj {
		T obj;
		static_assert(!std::is_volatile<T>::value, "do not use to store volatile objects");
		static_assert( std::is_class<T>::value, "needs to be a class type");
	public:
		template<typename... Args>
		explicit volatile_obj(Args&&... args) : obj{std::forward<Args>(args)...} {}
		volatile T& value() {
			return obj;
		}
		const volatile T& value() const {
			return obj;
		}
};

and now, with optimizations enabled, we get exactly the same assembly as when using volatile directly (at least in simple examples). In theory, the variant with a real volatile object could behave worse, since the C++ standard says "Access to volatile objects are evaluated strictly according to the rules of the abstract machine", and thus the "as-if" rule cannot be applied.

Can we, therefore, throw REM_VOLATILE away?

No, consider

struct A{};
struct B{A a;};

If we have an instance of volatile_obj<B>, then we have a reference to a volatile B, and through it a reference to a volatile A. We can get a non-volatile reference to the B through the volatile_obj<B> (for example through a friend function like lock), but what if we do not have access to it, or only want a non-volatile reference to the A?

Also, while the article showed only volatile together with a locked mutex, the two things are not coupled. There might be other techniques or lock types; and when working with external libraries, those might provide thread-safe operations, but it is improbable that they will be marked as volatile.

Therefore either we add support for all use cases, like a mutex, directly inside volatile_obj (maybe creating a type per use case in order not to end up with a catch-all class), or we keep REM_VOLATILE.

Notice that adding support for all use cases is hard, since we might need to remove the qualifier as an implementation detail:

struct foo {
	void bar() volatile {
		auto self = unique_lock_val<foo>{*this, this->m};
		// do something else
	}
	std::mutex m;
};

int main() {
	volatile_obj<foo> f;
	f.value().bar();
}

In the member function bar, we unfortunately do not know whether the foo is really volatile or not. And since std::mutex and std::scoped_lock, just like most other classes, do not have volatile member functions, we need to remove the qualifier in order to use them, or provide a volatile-friendly alternative. Thus the cast is still necessary for many use cases.

We need to assume that no one is going to really instantiate a volatile struct or class. Since, AFAIK, there are no valid use cases, and if there are, there will not be many, it should be possible to keep an eye on them in order to ensure that REM_VOLATILE is used safely. In those cases it might even be possible to delete the overload for those types that really are instantiated with the volatile qualifier.

Personally, I also do not like using the volatile keyword, even if Andrei Alexandrescu provides a nice explanation of why it still makes sense:

Outside a critical section, any thread might interrupt any other at any time; there is no control, so consequently variables accessible from multiple threads are volatile. This is in keeping with the original intent of volatile — that of preventing the compiler from unwittingly caching values used by multiple threads at once. Inside a critical section defined by a mutex, only one thread has access. Consequently, inside a critical section, the executing code has single-threaded semantics. The controlled variable is not volatile anymore - you can remove the volatile qualifier.

In short, data shared between threads is conceptually volatile outside a critical section, and non-volatile inside a critical section.

You enter a critical section by locking a mutex. You remove the volatile qualifier from a type by applying a const_cast. If we manage to put these two operations together, we create a connection between C++'s type system and an application’s threading semantics. We can make the compiler check race conditions for us.

— Andrei Alexandrescu

There is a lot of history behind the keyword that adds confusion, while we only want to exploit the type system.

So I would prefer to add a "new" qualifier, which should obviously be orthogonal to volatile (and to const, of course). Since it is not possible to define qualifiers in user code, and since volatile is not used that much, I'll create a BAD macro, and with it a qualifier that clashes with volatile.

To sum it up:

#include <type_traits>
#include <utility>
#include <mutex>

#define THREAD_AWARE   volatile
#define THREAD_UNAWARE // be explicit, state that function is not thread-aware by design

template<class T>
T& REM_VOLATILE(volatile T& t){
	return const_cast<T&>(t);
}
template<class T>
T* REM_VOLATILE(volatile T* t){
	return const_cast<T*>(t);
}
template<class T>
const T& REM_VOLATILE(const volatile T& t){
	return const_cast<const T&>(t);
}
template<class T>
const T* REM_VOLATILE(const volatile T* t){
	return const_cast<const T*>(t);
}

template<class T>
class thread_aware_obj {
	T obj;
	static_assert(!std::is_volatile<T>::value, "do not use to store volatile objects");
	static_assert( std::is_class<T>::value, "needs to be a class type");
	public:
		template<typename... Args>
		explicit thread_aware_obj(Args&&... args) : obj{std::forward<Args>(args)...} {}
		THREAD_AWARE T& value(){
			return obj;
		}
		friend T& REM_THREAD_AWARE(thread_aware_obj<T>& t){
			return t.obj;
		}
		friend const T& REM_THREAD_AWARE(const thread_aware_obj<T>& t){
			return t.obj;
		}
};


// example usage:

struct foo{
	void bar1() THREAD_AWARE {}
	void bar2() THREAD_AWARE {}
	void bar2() THREAD_UNAWARE {}
	void bar3() THREAD_UNAWARE {}
};


// helper for class foo
template <class F>
auto lock(std::mutex& m, thread_aware_obj<foo>& o, F f){
	std::scoped_lock<std::mutex> l{m};
	return f(REM_THREAD_AWARE(o));
}

int main(){
	thread_aware_obj<foo> f{};
	std::mutex m{};
	f.value().bar1();
	f.value().bar2(); // calls THREAD_AWARE overload
	//f.value().bar3(); // does not compile as not thread-aware
	lock(m,f,[](foo& f_){f_.bar2();}); // calls THREAD_UNAWARE overload
}

I have not renamed the REM_VOLATILE casts to REM_THREAD_AWARE, since the latter is always a safe operation, and the two functions do different things. The former would also, hopefully, only be used as an implementation detail.

We can conclude that the volatile qualifier is surely still useful; little or nothing has changed in this regard.

While the first approach is much more generic, the volatile trick permits us to write and define a thread-safe interface. The first technique answers the question: "How do we take a mutable object that needs to be shared between threads and make any modification race-free?"

The volatile trick is orthogonal to that; one method does not necessarily exclude the other. The first approach lets us take something that was not thread-safe and add a safety net around it; with the second approach, we can write a class that does not need such a net when used, although it might use one internally.

The biggest drawback of THREAD_AWARE is that if we would like to exploit volatile for something else, possibly unrelated to concurrency, it would clash with this usage, since it is not really a new qualifier.

Physical and logical races

Being able to create THREAD_AWARE free or member functions is very valuable for creating easier-to-use interfaces, for example, a vector where some operations are thread-safe. But to preserve state consistency, related state variables need to be updated in a single atomic operation. As far as I know, there is no generic way (apart from binding the class to a mutex) to provide it with a minimal subset of operations that achieves that. On the other hand, making every single operation thread-safe will degrade performance and hide concurrency issues. Consider:

void foo(std::vector<int>& v){
	if(v.empty()){
		v.push_back(0);
	}
}

None of these operations is thread-safe. Thus, if we use foo with a variable shared between threads, tools like sanitizers or memory checkers will let us spot the error nearly automatically, without the need to understand the business logic or the surrounding code.

Now suppose there exists a thread-safe vector class tsafe::vector that ensures both empty and push_back are thread-safe:

void foo(tsafe::vector<int>& v){
	if(v.empty()){
		v.push_back(0);
	}
}

The code is still not thread-safe, because we might add an element to v while it is not empty, or add no element while it is empty. But this time, tools like Valgrind or sanitizers will not be able to help us, because there are no physical race conditions, only logical ones.

My conclusion is that unless you can provide all operations the user might want, there should be a way (a getter or a callback) to define atomic operations, since it is otherwise not possible to combine the given operations atomically.

There is also another issue that THREAD_AWARE / volatile unfortunately does not solve, but helps to document better.

Coding hint for private and public functions

If a class needs to be shared between threads and has a single private mutex protecting the data members, a sensible design choice is that the mutex is locked in all public member functions, that all functions called by a public member function assume the mutex is already locked, and that private functions never lock the mutex.

If a public member function needs to call another public member function, split the second function into two parts: a private implementation function that does the work, and a public member function that just locks the mutex and calls the private one. The first member function can then also call the implementation function without having to worry about recursive locking.

A simple way to partially enforce such a policy is to let private member functions take a lock by const reference. They do not need to do anything with it; it is just there to ensure that a lock is held while the function is called:

#include <mutex>

struct foo {
		// public function, thread-safe, no need for the caller to do anything like locking a mutex
		void bar(){
			// signature of baz enforces to create the lock, before calling the function
			// no need to rely on the documentation for remembering code writers to lock it
			baz(std::scoped_lock<std::mutex>{m});
		}
	private:
		// internal function, not thread-safe, thus taking a lock by const-reference
		void baz(const std::scoped_lock<std::mutex>&) {}
		std::mutex m;
};

This is the main reason why I did not make the lock inside lock_obj private.

In the case of class hierarchies, protected functions should be treated like private functions, and for free functions similar guidelines can be enforced.

And again, I have no advice on how to avoid deadlocks at compile time. I have found no way to ensure that a private function does not acquire the mutex. Also, in the case of multiple mutexes, there is still the chance of locking the wrong one.

TIL that one of Rust's selling points is not thread safety, as I believed. It "only" offers race freedom. Apparently deadlocks are still possible, and there seems to be no way to avoid them in a Turing-complete language.

I suppose that, just as RAII gives us memory-leak safety, by carefully designing some building blocks like mutexed_obj and thread_aware_obj we can get pretty close to Rust. Unfortunately, in C++, because of backward compatibility, those conventions are not enforced by the type system. And even when always using those techniques, there are ways to screw things up; but as long as they need to be done with intent, or by writing horrible and fishy code, instead of by accident or with normal-looking code, I think we have achieved our goal.

Reentrant mutexes by default

Of course, it would be possible to avoid many deadlocks, and some of the coding guidelines above, by using a recursive (reentrant) mutex.

While it would solve the problem, its usage is discouraged unless there is a good reason for it.

It resembles the debate on std::unique_ptr versus std::shared_ptr, where the difference is who owns the data (or, here, the mutex). But in the case of unique_ptr, the compiler will catch those who try to copy the pointer, whereas there is no way to test for reentrancy.

So, since there are no compile-time checks, I would not use a recursive mutex by default. It will hide some possible programming errors (in cases where we should not call functions recursively, for example), and it sends the reader the wrong message. We tried to centralize where we synchronize, and by using a recursive mutex we are suggesting that placing locks in every function is fine. When using a recursive mutex, we are stating that our code should work even if THREAD_AWARE functions call other THREAD_AWARE functions, which might not be true, as explained nicely here, because we could be breaking an invariant of our class.

Also, compilers are not (yet) able to deduce whether a lock is necessary and eventually remove it. Since the selling point of multithreading is better performance, it would be a pity to end up with a less performant program because of too many mutexes or locks.

Multiple mutexes

There is another open issue that unfortunately has no easy solution: deadlocks when using multiple mutexes.

Consider:

int main(){
	mutexed_obj<int> mo1{};
	mutexed_obj<int> mo2{};

	// thread 1
	lock(mo1, [&mo2](int&){
		lock(mo2, [](int&){});
	});

	// thread 2
	lock(mo2, [&mo1](int&){
		lock(mo1, [](int&){});
	});
}

Mutexes always need to be locked in the same order. If the mutex is "an implementation detail", then the user has no way to know this. That is another reason why I chose such explicit class and function names: in those generic routines, it cannot really be an implementation detail!

Is there a way to mitigate this issue?

When using multiple mutexed_obj or volatile_obj instances, it is possible to provide a variadic lock friend function:

#include <mutex>
#include <type_traits>

template<class T, class mutex = std::mutex>
class mutexed_obj {
		T obj;
		mutable mutex m;
		public:
			template<typename... Args>
			explicit mutexed_obj(Args&&... args) : obj{std::forward<Args>(args)...} {}

			template <typename... Objs, class F>
			friend auto lock(F f, Objs&&... objs){
				std::scoped_lock lck{objs.m...};
				return f(objs.obj...);
			}
};

int main(){
	mutexed_obj<int> mo1{};
	mutexed_obj<int> mo2{};

	auto v1 = lock([](int&, int&){ return 1;}, mo2, mo1);
	auto v2 = lock([](const int&, int&){return 1;}, mo2, mo1);
	auto v3 = lock([](const int&){return 1;}, mo2);
}

But when mixing those types with others, it is trickier to come up with a generic solution. Some sort of view/reference type for "wrapping" a mutex and an object together might come in handy:

template<class T, class mutex = std::mutex>
struct mutexed_obj_view {
	T& obj;
	mutex& m;
};

template<class T, class mutex = std::mutex>
auto to_mutexed_obj_view(T& obj, mutex& m) {
	return mutexed_obj_view<T, mutex>{obj, m};
}

int main() {
	mutexed_obj<int> mo{};

	int freevalue{};
	std::mutex freemutex{};

	auto v = lock([](const int&, int&){return 1;}, mo, to_mutexed_obj_view(freevalue, freemutex));
}

I'm uncertain whether it is worth the trouble of writing a function along these lines for the case where the number of objects and mutexes differs (only a sketch: two trailing parameter packs cannot both be deduced, so the data references are grouped in a tuple):

template <class F, typename... Objs, typename... Mutexes>
auto lock(F f, std::tuple<Objs&...> objs, Mutexes&... mutexes) {
	static_assert(sizeof...(Objs) >= sizeof...(Mutexes)); // does it otherwise make sense?
	std::scoped_lock l{mutexes...};
	return std::apply(f, objs);
}

Conclusion

Contrary to the algorithm library and techniques like RAII, designing classes for concurrency with an idiomatic and safe-to-use interface seems harder. Those classes do not seem to compose very well, and which approach is better probably depends on your application logic and on the data itself.

There is still a lot to explore, structures and paradigms to try.

Apart from the language, it is important to know which tools can be used to validate assumptions. Thread sanitizers and memory checkers are not tied to a specific language and can be used to verify whether our assumptions are unsound. Static-analysis tools are important too, but unfortunately I do not know of any that could help with concurrency.