Since C++17, the standard library provides
Some people are unhappy with some of its design decisions, for example, that std::string_view, just like std::string, has a bloated interface, or that a std::string is implicitly convertible to std::string_view, which can create dangling views by accident. Another apparently controversial design decision is that std::string_view does not necessarily point to a \0-terminated string, which makes it generally unsuitable as a drop-in replacement for const char* and std::string based interfaces.
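To make the missing \0-termination concrete, here is a small sketch (C++17). The helper length_seen_by_c_api and the demo function are hypothetical names introduced only for this illustration. It shows that a view over a prefix of a literal reports its own size, while a C API reading from data() keeps going until the terminator of the underlying buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical helper: the length a C API would see when handed sv.data().
// Only safe to call when the buffer behind sv is known to be \0-terminated,
// which std::string_view itself does not guarantee.
inline std::size_t length_seen_by_c_api(std::string_view sv) {
    return std::strlen(sv.data());
}

inline void demo_missing_terminator() {
    // The literal is \0-terminated; the view covers only its first five characters.
    std::string_view hello = std::string_view("hello world").substr(0, 5);
    assert(hello.size() == 5);
    // strlen ignores the size stored in the view and reads up to the terminator.
    assert(length_seen_by_c_api(hello) == 11);
}
```

The example is only well defined because the view points into a string literal. With an arbitrary std::string_view, reading past size() may run off the end of the buffer.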
Most of the design decisions can be found in the original proposal; the main guidelines were:
more performant than
make it a drop-in replacement for functions accepting non-owning strings
While it is unlikely that we will get another string_view type in the standard (if we do not count std::span, from C++20), even though the original proposal mentions that one could make sense, it is easy to implement such a class.
\0-termination and implicit conversion are only a small subset of the properties we might want to change in a string class. For example, we might need a string class that ensures there is no such thing as an empty string, or one that handles errors in a different way.
First of all, let’s look at what properties the string types of the standard library and const char* have:
|string type||since||owning|| || || || |
mostly by convention
literals, otherwise not necessarily
| || |
| || |
std::string were made implicit to allow writing:
void foo(const std::string&); void bar(std::string_view); foo("literal"); bar("literal"); std::string str; foo(str); bar(str);
std::string_view for those use-cases would have been very verbose.
const char*, the conversion from the non-owning container (std::string_view) to the owning one (std::string) is explicit:
void foo(const std::string&); std::string_view strv = "..."; foo(std::string(strv));
const char*, the conversion from the owning container (std::string) to the non-owning one (std::string_view) is implicit:
void bar(std::string_view); std::string str = "..."; bar(str);
If we want a non-owning string container type with explicit conversion, to avoid errors like
std::string_view strv = std::string("a");
We need to look outside of the standard library (supposing we want something better than
If we want a non-owning string with an implicit conversion to owning, in order to be able to write
void foo(const std::string&); std::string_view strv = "..."; foo(strv);
just as we can with const char*, we again need to look outside of the standard library.
If we want a non-owning \0-terminated string, then std::string_view is not the right choice; const char* is, unfortunately, the better choice, unless we design another string class.
Probably too many; there are already a lot of string classes in the wild. The MFC library has CString, the Xerces library has XMLString, the Qt framework has QString, the Abseil library has absl::string_view, the EASTL library has eastl::string, and so on.
It is interesting to see that most libraries provide only an owning string class.
For my use cases, something like string_view is the default choice. Most of the time I’m searching for something in a string, handling literals, comparing sequences of characters, or just passing them along, so there is no need to allocate any memory or deep-copy the content, as I’m not mutating them.
So here it is, an incomplete proof of concept of a family of string_view classes!
Obligatory xkcd: https://xkcd.com/927/
Why a family and not a single class that can do it all?
Because some design decisions are incompatible with others. Some use cases want string_view to convert implicitly from std::string, others want the opposite. Some do not want any implicit conversion, while others want it in one direction or the other. A single class cannot satisfy all those requirements at once.
Adding a family of string_view-like types is a major overhead for the developer. Considering that (not counting wchar_t, other character types, and other libraries) we already have literals, the const char* convention, and std::string_view for dealing with strings, there will always be a certain complexity in any project when dealing with them.
I’ve identified (at least for my most common use-cases) the following "policies"
The conversion policy is straightforward. std::string_view needs to be explicitly converted to std::string, while the opposite is not true. In some use cases this is the desired behavior; in others, I want the opposite.
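One way such a conversion policy could be expressed is as a template parameter that selects between an implicit and an explicit constructor. The following is only a sketch under my own assumptions: the names conversion_policy, policy_view, implicit_view and explicit_view are hypothetical and are not taken from the string_views project. It uses the C++14 enable_if idiom; C++20 could use explicit(bool) instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical policy tag, introduced for this illustration only.
enum class conversion_policy { implicit_from_string, explicit_from_string };

template <conversion_policy P>
class policy_view {
    const char* data_;
    std::size_t size_;

public:
    // Converting (implicit) constructor, enabled only for the implicit policy.
    template <conversion_policy Q = P,
              std::enable_if_t<Q == conversion_policy::implicit_from_string, int> = 0>
    policy_view(const std::string& s) noexcept : data_(s.data()), size_(s.size()) {}

    // Explicit constructor, enabled only for the explicit policy.
    template <conversion_policy Q = P,
              std::enable_if_t<Q == conversion_policy::explicit_from_string, int> = 0>
    explicit policy_view(const std::string& s) noexcept : data_(s.data()), size_(s.size()) {}

    const char* data() const noexcept { return data_; }
    std::size_t size() const noexcept { return size_; }
};

using implicit_view = policy_view<conversion_policy::implicit_from_string>;
using explicit_view = policy_view<conversion_policy::explicit_from_string>;

inline void demo_conversion_policy() {
    std::string s = "abc";
    implicit_view a = s;     // copy-initialization compiles: the conversion is implicit
    explicit_view b(s);      // direct-initialization is required here
    // explicit_view c = s;  // would not compile with the explicit policy
    assert(a.size() == 3 && b.size() == 3);
}
```

With the explicit policy, a line like the dangling example with a temporary std::string on the right-hand side of a copy-initialization is rejected at compile time.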
The format policy is also easy. std::string_view is not necessarily \0-terminated. On the one hand, this is very unfortunate, since most OS APIs require a \0-terminated string; on the other, it permits functionality (like substring) that is otherwise not possible. So either we use const char*, or we use std::string, which might do unnecessary allocations, or we use std::string_view and hope that the content is \0-terminated (after all, it’s what we do with const char*, but it is much easier to misuse).
Notice that the trailing
\0 is not part of the content (otherwise
.size() would return one character more).
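A non-owning view that guarantees \0-termination could look roughly like the sketch below. The class name zstring_view and its member suffix_from are hypothetical and only illustrate the idea; they are not the API of any particular library. The view can only be built from sources that are known to be terminated, size() does not count the terminator, and only operations that keep the terminator in place (such as dropping a prefix) are offered.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical class, for illustration only.
class zstring_view {
    const char* data_;
    std::size_t size_;

    // Private: callers cannot pass an arbitrary (pointer, size) pair,
    // because that would not guarantee a terminator at data_[size_].
    zstring_view(const char* d, std::size_t n) noexcept : data_(d), size_(n) {}

public:
    // A C string is \0-terminated by convention.
    explicit zstring_view(const char* s) noexcept : data_(s), size_(std::strlen(s)) {}

    // std::string::c_str() always returns a \0-terminated buffer.
    explicit zstring_view(const std::string& s) noexcept
        : data_(s.c_str()), size_(s.size()) {}

    const char* c_str() const noexcept { return data_; }

    // Number of characters, not counting the trailing \0.
    std::size_t size() const noexcept { return size_; }

    // Dropping a prefix keeps the original terminator, so the result is
    // still \0-terminated. A general substr(pos, len) cannot offer that.
    zstring_view suffix_from(std::size_t pos) const noexcept {
        if (pos > size_) pos = size_;
        return zstring_view(data_ + pos, size_ - pos);
    }
};

inline void demo_zstring_view() {
    std::string path = "path/to/file";
    zstring_view v(path);
    assert(v.size() == 12);
    assert(std::strcmp(v.suffix_from(8).c_str(), "file") == 0);
}
```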
std::string_view has no content policy, but many interfaces do. For example, when creating files, a filename cannot contain an embedded
Either new string types are created with those invariants, or the checks are done outside of the string class. The content policy is a user-defined verification that happens on construction, making it possible to reuse an existing string class without wrapping it. Notice that \0-termination is not a specific content policy, as the trailing \0 is not part of the interface, but a \0-terminated string should probably have a content policy that disallows embedded \0 characters, to avoid silent truncation.
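A content policy of this kind could be sketched as a type with a static validation function that the view calls in its constructor. Everything here is an assumption made for illustration: the names no_embedded_nul, no_check, checked_view and accepts are hypothetical, and reporting a violation by throwing std::invalid_argument is just one possible error-handling choice.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical policy: rejects content that contains an embedded \0.
struct no_embedded_nul {
    static bool is_valid(const char* data, std::size_t size) noexcept {
        for (std::size_t i = 0; i != size; ++i) {
            if (data[i] == '\0') return false;
        }
        return true;
    }
};

// Hypothetical policy: accepts everything.
struct no_check {
    static bool is_valid(const char*, std::size_t) noexcept { return true; }
};

template <class ContentPolicy>
class checked_view {
    const char* data_;
    std::size_t size_;

public:
    // The user-defined verification runs once, on construction.
    explicit checked_view(const std::string& s) : data_(s.data()), size_(s.size()) {
        if (!ContentPolicy::is_valid(data_, size_)) {
            throw std::invalid_argument("content policy violated");
        }
    }

    const char* data() const noexcept { return data_; }
    std::size_t size() const noexcept { return size_; }
};

// Returns true if s satisfies the no_embedded_nul policy.
inline bool accepts(const std::string& s) {
    try {
        checked_view<no_embedded_nul> v(s);
        (void)v;
        return true;
    } catch (const std::invalid_argument&) {
        return false;
    }
}

inline void demo_content_policy() {
    assert(accepts("file.txt"));
    assert(!accepts(std::string("a\0b", 3)));  // three characters, \0 in the middle
}
```

Because the view is immutable after construction, the check does not need to be repeated later.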
The allocation/copy policy is the big difference between
std::string performs deep copies, while std::string_view performs shallow copies. std::string allocates, thus copying is costly, while std::string_view does not allocate, thus passing by value is preferred over passing by reference.
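The difference between deep and shallow copies can be observed by comparing the data() pointers of a copy and its original. The two helper functions below are hypothetical names used only for this sketch (C++17), and the result for std::string assumes a standard library without copy-on-write strings, which is what C++11 and later require.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// A copy of a std::string owns its own buffer, so the pointers differ.
inline bool copy_is_deep(const std::string& original) {
    std::string copy = original;
    return copy.data() != original.data();
}

// A copy of a std::string_view refers to the very same characters.
inline bool copy_is_shallow(std::string_view original) {
    std::string_view copy = original;
    return copy.data() == original.data() && copy.size() == original.size();
}

inline void demo_copy_semantics() {
    std::string s = "a string that is long enough to be heap allocated";
    assert(copy_is_deep(s));
    assert(copy_is_shallow(s));  // s converts implicitly to std::string_view
}
```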
string_views does not have an allocation policy. It is (given the name I chose for the project) out-of-scope, and has profound implications on how the class should be used.
To sum it up: most string classes would have the same underlying implementation; the main differences are the constructors/conversions, the presence or absence of a trailing \0, content validation, and value/reference semantics.
The library targets C++14 and later; it could be backported to C++11 and C++03, but some features might be missing.
It should make it possible to cover many more use cases that are not covered by
There is one use case that is not covered (at least not yet): a mutable string_view (as all methods are const). A mutable string_view still has reference semantics, but it imposes some design decisions, like the absence of operator== (it’s not a coincidence that std::span does not have it).
It would also be more difficult to enforce a content policy, as there are different functions (
iterators, …) that are able to change the content directly.
Thus it does not make much sense to enforce a content policy on mutable strings inside the class itself.
Another design decision was not to duplicate every function of
Most functionality can be implemented as free functions, making it easily reusable not only for string_view, but also for most other string-like classes. This also has the advantage of providing a more uniform API when dealing with different string-like classes. The main disadvantage is that it does not make
string_views a drop-in replacement for the string classes inside
As some functionality depends on the invariants of the class (a generic substring for a \0-terminated string cannot work), and other functions are just redundant (.length()), trying to add all possible functionality would just bloat the API without much benefit.
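As an illustration of the free-function approach, a prefix check can be written once against the minimal data()/size() interface and then reused with any string-like class that provides it. The name has_prefix is hypothetical and chosen for this sketch (C++17 is only needed here for std::string_view in the demo).

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <string_view>

// Works with any two types that expose data() and size() over char,
// for example std::string, std::string_view, or a custom view class.
template <class Str, class Prefix>
bool has_prefix(const Str& s, const Prefix& prefix) {
    return s.size() >= prefix.size() &&
           std::equal(prefix.data(), prefix.data() + prefix.size(), s.data());
}

inline void demo_free_function() {
    std::string owning = "string_view";
    std::string_view non_owning = "string";
    assert(has_prefix(owning, non_owning));   // std::string vs std::string_view
    assert(!has_prefix(non_owning, owning));  // the prefix is longer than the string
}
```

The same function does not need to know about \0-termination, conversion rules, or content policies, which is what keeps it reusable across the whole family.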