Implement a type erased container and iterator
iterator and type erasure
The main reason I needed to write a custom iterator was that I needed a partially type-erased container.
When I wrote down map as an example, I left out the whole iterator interface, as it would not have been possible to write an iterator that was generic enough.
Now I had a couple of use cases where writing such an iterator was possible and a sensible decision.
Before deciding to use a custom container, I already decided to use span, so that my algorithms would support everything convertible to it.
But where span cannot help further, is to
-
provide a common type for
span<s*>andspan<unique_ptr<s>>, where you want to call member functions ofsif the instances are not null -
type-erase containers that do not provide access to the underlying buffer, but are still accessible by index.
Both issues are partially related and can be fixed with the same technique.
Instead of storing a pointer and the size to the internal buffer, store a (void)pointer to the container, and provide an index-based interface for accessing the content.
Since not all containers I wanted to support give access to the underlying buffer, or use a named function instead of operator[], or a different named function than size for querying the size, the constructor of my class needs to create appropriate adaptors. With if constexpr and requires, it can all be done inside the same function:
#include <cstddef>
#include <type_traits>
namespace fek {
template <class T>
class span_ptr {
using void_ptr = std::conditional_t<std::is_const_v<T>, const void*, void*>;
void_ptr data = nullptr;
T* ( *get_fun )( void_ptr, std::size_t ) = +[]( void_ptr, std::size_t) -> T* {return nullptr;};
size_t ( *size_fun )( const void* ) = +[](const void*) -> size_t { return 0; };
public:
constexpr span_ptr() noexcept = default;
template <class C>
explicit constexpr span_ptr( C& c ) noexcept
: data( &c )
, get_fun( []( void_ptr ptr, std::size_t index ) -> T* {
using C2 = std::conditional_t<std::is_const_v<T>, const C*, C*>;
auto typed_data = reinterpret_cast<C2>( ptr );
if constexpr ( requires { typed_data->item( index ); } )
{
return typed_data->item( index );
}
else if constexpr ( requires { typed_data->getElement( index ); } )
{
return typed_data->getElement( index );
}
else
{
return ( *typed_data )[index].get();
}
} )
, size_fun( []( const void* ptr ) -> size_t {
auto typed_data = reinterpret_cast<const C*>( ptr );
if constexpr ( requires { typed_data->length(); } )
{
return typed_data->length();
}
else if constexpr ( requires { typed_data->getAmountOfElements(); } )
{
return typed_data->getAmountOfElements();
}
else
{
return std::size( *typed_data );
}
} )
{}
// element access
T* operator[]( std::size_t i ) const {
return this->get_fun( data, i );
}
// observers
std::size_t size() const noexcept {
return this->size_fun( data );
}
bool empty() const noexcept {
return this->size_fun( data ) == 0;
}
};
} If I just wanted to support things convertible to span<s*> and span<unique_ptr<s>>, then the constructor can be simplified to
template <class C>
explicit constexpr span_ptr( C& c ) noexcept
: data( &c )
, get_fun( []( void_ptr ptr, std::size_t index ) -> T* {
using C2 = std::conditional_t<std::is_const_v<T>, const C*, C*>;
auto typed_data = reinterpret_cast<C2>( ptr );
return ( *typed_data )[index].get();
} )
, size_fun( []( const void* ptr ) -> size_t {
auto typed_data = reinterpret_cast<const C*>( ptr );
return std::size( *typed_data );
} )
{} Once one learns these two patterns, there is not much more to say about this class. It is the most straightforward way to do type-erasure for types with reference semantics.
One could implement the equivalent of subspan, and store an additional offset and maximum size as member variables. At that point, one needs to decide how to react when span_ptr::operator[] is called with an invalid parameter. The current implementation does whatever the underlying container does, but once sub-spans are in play, the range of valid indices for this class is different than the valid indices for the pointed container. I suppose that throwing an std::out_of_range is the easiest thing to do, except for leaving the implementation with no checks and declaring the behaviour undefined.
There are many other possibly useful member functions that can be added; the first that come to mind are front and back. I would try to avoid adding something that span does not have, in order to avoid having too many similar interfaces. An algorithm implemented as a free function is probably a better alternative.
Since most functions can be implemented on top of a common interface, it makes sense to use the same techniques used for writing an iterator from scratch to encapsulate the boilerplate code and make it easier to reuse for writing other classes.
The interesting part was writing the iterator; thanks to the helper classes, the implementation looks straightforward.
The underlying idea is the same as that of span_ptr. Store a pointer to the actual data structure, an index to mark the current position, and a callback for accessing the data.
namespace fek {
//! part of the API of forward_iterator
//! relies on operator++ for defining operator++(int)
struct forward_iterator_api {
template <typename Self>
auto operator++(this Self&& self, int) {
auto tmp = self;
++self;
return tmp;
}
};
//! part of the API of bidirectional_iterator
//! relies on operator-- for defining operator--(int)
struct bidirectional_iterator_api {
template <typename Self>
auto operator--(this Self&& self, int) {
auto tmp = self;
--self;
return tmp;
}
};
//! part of the API of random_access_iterator
//! relies on operator+= for defining symmetrical operator+, operator-=, operator-
//! relies on operator* and operator+= for defining operator[]
//!
//! Ensure that I defines difference_type as std::ptrdiff_t for consistency!
template <class I>
struct random_access_iterator_api {
friend I operator+(I it, std::ptrdiff_t n) {
it += n;
return it;
}
friend I operator+(std::ptrdiff_t n, I it) {
it += n;
return it;
}
I& operator-=( std::ptrdiff_t n ) {
return *this += ( -n );
}
friend I operator-( I it, std::ptrdiff_t n ){
it += ( -n );
return it;
}
template <typename Self>
decltype(auto) operator[](this Self copy, std::ptrdiff_t n) {
copy += n;
return *copy;
}
};
//! type-erased iterator class
//! underlying data structure is accessed by index
template <class T>
class type_erased_iterator
: public forward_iterator_api
, public bidirectional_iterator_api
, public random_access_iterator_api<type_erased_iterator<T>>
{
std::size_t index = 0;
using void_ptr = std::conditional_t<std::is_const_v<T>, const void*, void*>;
void_ptr data = nullptr;
T* ( *get_fun )( void_ptr, std::size_t ) = nullptr;
public:
constexpr type_erased_iterator() noexcept = default;
constexpr explicit type_erased_iterator( std::size_t index_, void_ptr data_, T* ( *get_fun_ )( void_ptr, std::size_t ) ) noexcept
: index( index_ )
, data( data_ )
, get_fun( get_fun_ )
{}
//! API of input_or_output_iterator
//! note that both operator* and operator-> return a pointer
using difference_type = std::ptrdiff_t;
constexpr T* operator*() const {
return this->get_fun( data, index );
}
constexpr T* operator->() const {
return this->get_fun( data, index );
}
constexpr type_erased_iterator& operator++() {
++index;
return *this;
}
//void operator++(int);
//! API of input_iterator
using value_type = T*;
using forward_iterator_api::operator++;
bool operator==(const type_erased_iterator& other) const {
return this->index == other.index and this->data == other.data;
}
//! API of bidirectional_iterator
constexpr type_erased_iterator& operator--() {
--index;
return *this;
}
using bidirectional_iterator_api::operator--;
//! API of random_access_iterator
auto operator<=>( const type_erased_iterator& other ) const {
return this->index <=> other.index;
}
type_erased_iterator& operator+=( difference_type n ) {
index += n;
return *this;
}
friend difference_type operator-( const type_erased_iterator& a, const type_erased_iterator& b ) {
return a.index - b.index;
}
};
static_assert( std::input_or_output_iterator<type_erased_iterator<int>> );
static_assert( std::input_iterator<type_erased_iterator<int>> );
static_assert( std::forward_iterator<type_erased_iterator<int>> );
static_assert( std::bidirectional_iterator<type_erased_iterator<int>> );
static_assert( std::random_access_iterator<type_erased_iterator<int>> );
} One meaningful design decision was to store the callbacks in type_erased_iterator instead of the type-erased container, just like I did here for testing purposes, because I wanted type_erased_iterator to be able to outlive span_ptr.
It turns out that for most use-cases, I could even replace the span_ptr class with a factory function that returns a pair of iterators.
Another example: container of strings
span_ptr might seem an uncommon case; how often does it happen that you have a vector<unique_ptr<T>> and vector<T*> and you want to iterate on those containers and access the pointed data? In fact, not so often, but because normally I try to avoid inconsistent interfaces and too many indirectsions. There is, however, another use case that happens often: a container of strings.
There are a lot of types for working with strings in the language and standard library: char*, const char*, char[], std::string, and string_view.
As long as you work with a single string, it is "easy" to convert between those types; many conversions are also implicit, and it is often possible to avoid unnecessary copies and allocations.
But what happens when you need to work with multiple strings?
There is a combinatorial explosion of types: std::vector<std::string>, std::string[], std::vector<std::string_view>, std::string_view[], std::vector<char*>, std::vector<const char*>, and char** (honorable mention since it is the function signature of main, with all const-qualified variations), and so on. At least with std::span it is possible to type-erase some containers; it makes it possible to wrap std::vector<std::string>, std::array<std::string, N>, and other types without additional overhead, as long as the contained type is the same. But if the contained type is different, then span does not help. It is not possible to wrap std::vector<std::string> and std::vector<std::string_view> in a common std::span type; at least not without creating another temporary container with the same contained type.
Assuming that for the interested use-cases all containers can be accessed by index, with a few changes, it is possible to transform span_ptr to span_string_view:
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <type_traits>
namespace fek {
class span_string_view {
const void* data = nullptr;
std::string_view (*get_fun)(const void*, std::size_t) =
+[](const void*, std::size_t) -> std::string_view { throw std::out_of_range("span_string_view is empty"); };
size_t (*size_fun)(const void*) = +[](const void*) -> size_t { return 0; };
public:
constexpr span_string_view() noexcept = default;
template <class C>
explicit constexpr span_string_view(const C& c) noexcept
: data(&c),
get_fun([](const void* ptr, std::size_t index) -> std::string_view {
auto typed_data = reinterpret_cast<const C*>(ptr);
using ret_type = decltype((*typed_data)[index]);
using element_t =
std::remove_const_t<std::remove_reference_t<ret_type>>;
static_assert(std::is_same_v<std::string_view, element_t> ||
std::is_same_v<char*, element_t> ||
// if string, must return by reference,
// or string_view would dangle
std::is_reference_v<ret_type>);
return std::string_view((*typed_data)[index]);
}),
size_fun([](const void* ptr) -> size_t {
auto typed_data = reinterpret_cast<const C*>(ptr);
return std::size(*typed_data);
}) {}
// element access
std::string_view operator[](std::size_t i) const {
return this->get_fun(data, i);
}
// observers
std::size_t size() const noexcept { return this->size_fun(data); }
bool empty() const noexcept { return this->size_fun(data) == 0; }
};
} The usage would look like the following
int main(int argc, char** argv){
char data[]{"123"};
{
std::vector<std::string> vs;
vs.emplace_back(data);
auto view = fek::span_string_view(vs);
assert(view.size() == 1);
assert(view[0] == std::string_view(data));
}
{
std::vector<std::string_view> vs;
vs.emplace_back(data);
auto view = fek::span_string_view(vs);
assert(view.size() == 1);
assert(view[0] == std::string_view(data));
}
{
std::vector<const char*> vs;
vs.emplace_back(data);
auto view = fek::span_string_view(vs);
assert(view.size() == 1);
assert(view[0] == std::string_view(data));
}
{
std::vector<char*> vs;
vs.emplace_back(data);
auto view = fek::span_string_view(vs);
assert(view.size() == 1);
assert(view[0] == std::string_view(data));
}
{
auto vs = std::span(argv, argv+argc);
auto view = fek::span_string_view(vs);
assert(view.size() == 0);
}
}
If you have questions, comments, or found typos, the notes are not clear, or there are some errors; then just contact me.