Tutorial of the Strf library

1. Syntax

The "destination" is the part of the expression that defines where the content goes and what the expression returns. Many destinations are supported, and you can create your own. However, for convenience, most code samples in this tutorial use to_string:

#include <strf/to_string.hpp>
#include <cassert>

void sample() {
    int x = 200;
    std::string str = strf::to_string(x, " in hexadecimal is ", strf::hex(x));
    assert(str == "200 in hexadecimal is c8");
}

You can see that there is no format string here, as there is in printf. Instead, format functions (like hex above) specify the formatting. The expression strf::hex(x) is equivalent to strf::fmt(x).hex(). The object returned by strf::fmt(x) holds the value of x together with format information, which can be edited with member (format) functions following the named parameter idiom, like this: strf::fmt(255).hex().p(4).fill(U'.') > 10
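
To make the named parameter idiom concrete, here is a minimal sketch in standard C++. The names int_format and fmt below are hypothetical stand-ins written for this illustration; they are not Strf's real types, and the sketch only stores the format information without printing anything.

```cpp
#include <cassert>

// Hypothetical stand-in for the object returned by strf::fmt(x).
// The real Strf type is different; only the chaining style is the point.
struct int_format {
    int value;
    int base = 10;
    unsigned precision = 0;
    char32_t fill_char = U' ';
    int width = 0;
    bool align_right = false;

    // Each format function returns a modified copy, so calls can be chained.
    constexpr int_format hex() const {
        int_format copy = *this;
        copy.base = 16;
        return copy;
    }
    constexpr int_format p(unsigned digits) const {
        int_format copy = *this;
        copy.precision = digits;
        return copy;
    }
    constexpr int_format fill(char32_t ch) const {
        int_format copy = *this;
        copy.fill_char = ch;
        return copy;
    }
    // In this sketch, `> w` requests right alignment within w columns.
    constexpr int_format operator>(int w) const {
        int_format copy = *this;
        copy.width = w;
        copy.align_right = true;
        return copy;
    }
};

constexpr int_format fmt(int value) { return int_format{value}; }

void sample() {
    constexpr int_format f = fmt(255).hex().p(4).fill(U'.') > 10;
    static_assert(f.base == 16 && f.precision == 4, "chained edits are kept");
    assert(f.value == 255 && f.fill_char == U'.');
    assert(f.width == 10 && f.align_right);
}
```

Because every function returns a new value instead of mutating shared state, the whole chain can be evaluated in a constant expression.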

To use a translation tool like gettext, you need to use the tr function, which employs what this library calls the tr-string:

auto s = strf::to_string.tr("{} in hexadecimal is {}", x, strf::hex(x));

The reserve, no_reserve and reserve_calc functions are only available for some destinations, like those that allocate memory, which is the case of to_string. Using reserve(size) causes the destination to reserve enough space to store size characters. reserve_calc() has the same effect, except that it calculates the number of characters for you. reserve_calc() is currently the default, but this may change in the future.
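
The idea behind calculating the size before writing can be sketched in standard C++ as a two-pass concatenation. The helper concat_with_reserve is hypothetical and written only for this illustration; it is not part of Strf and says nothing about how reserve_calc is implemented internally.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of Strf: concatenates `parts` after
// reserving the total size, so the string allocates at most once.
inline std::string concat_with_reserve(const std::vector<std::string>& parts) {
    std::size_t total = 0;
    for (const std::string& part : parts) {
        total += part.size();   // first pass: calculate the size
    }
    std::string result;
    result.reserve(total);      // reserve the space up front
    for (const std::string& part : parts) {
        result += part;         // second pass: write the content
    }
    return result;
}

void sample() {
    const std::string str =
        concat_with_reserve({"200", " in hexadecimal is ", "c8"});
    assert(str == "200 in hexadecimal is c8");
}
```

The trade-off is the same one the paragraph above hints at: the extra pass costs time, but it avoids repeated reallocation while the content is written.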

The with function receives facet objects, which complement format functions: they also influence how the data is printed. An example of a facet is the lettercase enumeration:

namespace strf {
  enum class lettercase { /* ... */ };
  constexpr lettercase lowercase = /* ... */;
  constexpr lettercase mixedcase = /* ... */;
  constexpr lettercase uppercase = /* ... */;
}

It affects numeric and boolean values:

auto str_uppercase = strf::to_string.with(strf::uppercase)
    ( true, ' ', *strf::hex(0xab), ' ', 1.0e+50 );

auto str_mixedcase = strf::to_string.with(strf::mixedcase)
    ( true, ' ', *strf::hex(0xab), ' ', 1.0e+50 );

assert(str_uppercase == "TRUE 0XAB 1E+50");
assert(str_mixedcase == "True 0xAB 1e+50");

1.2. Constrained facets

You can constrain facets to a set of input types:

auto str = strf::to_string
    .with(strf::constrain<std::is_floating_point>(strf::uppercase))
    ( true, ' ', *strf::hex(0xab), ' ', 1.0e+50 );

assert(str == "true 0xab 1E+50");

, or to a set of arguments:

auto str = strf::to_string
    ( true, ' ', 1.0e+50, " / "
    , strf::with(strf::uppercase) (true, ' ', 1.0e+50, " / ")
    , true, ' ', 1.0e+50 );

assert(str == "true 1e+50 / TRUE 1E+50 / true 1e+50");

When there are multiple facet objects of the same category, the order matters. The later one wins:

auto fa = strf::mixedcase;
auto fb = strf::constrain<std::is_floating_point>(strf::uppercase);

using namespace strf;
auto str_ab = to_string .with(fa, fb) (true, ' ', *hex(0xab), ' ', 1e+9);
auto str_ba = to_string .with(fb, fa) (true, ' ', *hex(0xab), ' ', 1e+9);

// In str_ab, fb overrides fa, but only for floating-point values.
// In str_ba, fa overrides fb for all types, so fb has no effect.

assert(str_ab == "True 0xAB 1E+9");
assert(str_ba == "True 0xAB 1e+9");
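
The "later one wins" rule can be modeled at run time with a short sketch in standard C++. Everything below (constrained_case, input_kind, resolve) is hypothetical and written for this illustration; it models only the rule and is not how Strf stores or looks up facets.

```cpp
#include <cassert>
#include <vector>

enum class lettercase { lower, mixed, upper };
enum class input_kind { boolean, integer, floating_point };

// Hypothetical model of a (possibly constrained) facet value:
// a value plus a filter that says which inputs it applies to.
struct constrained_case {
    lettercase value;
    bool (*applies_to)(input_kind);
};

inline bool applies_to_all(input_kind) { return true; }
inline bool only_floating_point(input_kind kind) {
    return kind == input_kind::floating_point;
}

// Scans from the last facet to the first. The first match found this way
// is the latest applicable facet, so later facets override earlier ones.
inline lettercase resolve(const std::vector<constrained_case>& facets,
                          input_kind kind) {
    for (auto it = facets.rbegin(); it != facets.rend(); ++it) {
        if (it->applies_to(kind)) {
            return it->value;
        }
    }
    return lettercase::lower; // default when no facet applies
}

void sample() {
    const constrained_case fa{lettercase::mixed, applies_to_all};
    const constrained_case fb{lettercase::upper, only_floating_point};

    // Order (fa, fb): fb wins, but only for floating-point input.
    assert(resolve({fa, fb}, input_kind::floating_point) == lettercase::upper);
    assert(resolve({fa, fb}, input_kind::boolean) == lettercase::mixed);

    // Order (fb, fa): fa comes later and applies to everything.
    assert(resolve({fb, fa}, input_kind::floating_point) == lettercase::mixed);
}
```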

1.3. Facet categories

But what does it mean for two facet objects to belong to the same facet category? In this library, the term facet always refers to types. So the type strf::lettercase is a facet, while strf::uppercase is a facet value. In addition, a facet is always associated with one, and only one, facet category. However, several facets can "belong" to the same category.

For each facet category there is a class or struct with a public static member function get_default() that returns the default facet value of that category. By convention, the name of this class or struct is the name of the category, and it has the “_c” suffix. For example, the category of strf::lettercase is strf::lettercase_c, and strf::lettercase_c::get_default() returns strf::lowercase.

Informally (perhaps in the future it will be formal thanks to C++20 Concepts), for each facet category there is a list of requirements a type must satisfy to be a facet of the category. In the case of strf::lettercase_c, the requirement is, well, to be the strf::lettercase type, since this is the only facet of this category by design. However, other categories require the facet to contain member functions with specified signatures, effects, preconditions, postconditions and so on.

The design of the facets varies a lot according to their categories. But all facets currently available in the library have something in common: they are all small types (in terms of sizeof()) and provide a fast copy constructor. In addition, most of them can be instantiated as constexpr values.

The facet_traits struct template provides the category of a given facet.
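
The relation between a facet, its category and a traits template can be sketched in standard C++ as follows. The sketch namespace mirrors the names used above, but it is written only for this illustration: the member name category and the helper default_of are assumptions, not Strf's documented interface.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

// The facet: a type whose values influence how data is printed.
enum class lettercase { lower, mixed, upper };

// The facet category: named after the facet, with the "_c" suffix,
// and exposing the default facet value through get_default().
struct lettercase_c {
    static constexpr lettercase get_default() { return lettercase::lower; }
};

// Maps a facet type to its single category. The primary template is
// left undefined so that using a non-facet type fails to compile.
template <typename Facet>
struct facet_traits;

template <>
struct facet_traits<lettercase> {
    using category = lettercase_c; // member name is an assumption
};

// Hypothetical helper: the default value of whatever category Facet has.
template <typename Facet>
constexpr Facet default_of() {
    return facet_traits<Facet>::category::get_default();
}

} // namespace sketch

void sample() {
    static_assert(std::is_same<sketch::facet_traits<sketch::lettercase>::category,
                               sketch::lettercase_c>::value,
                  "lettercase belongs to exactly one category");
    assert(sketch::default_of<sketch::lettercase>() == sketch::lettercase::lower);
}
```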

1.4. Facet packs

To avoid retyping all the facet objects that you commonly use, you can store them in a facets_pack, which you create with the pack function template:

constexpr auto my_facets = strf::pack
    ( strf::mixedcase
    , strf::constrain<strf::is_bool>(strf::uppercase)
    , strf::numpunct<10>{3}.thousands_sep(U'.').decimal_point(U',')
    , strf::numpunct<16>{4}.thousands_sep(U'\'')
    , strf::windows_1252<char> );


auto str1 = strf::to_string.with(my_facets) (/* ... */);
// ...
auto str2 = strf::to_string.with(my_facets) (/* ... */);
// ...

Any value that can be passed to the with function can also be passed to pack, and vice versa. This means a facets_pack can contain another facets_pack.

So the expression:

destination .with(f1, f2, f3, f4, f5) (/* args... */);

is equivalent to

destination .with(strf::pack(f1, strf::pack(f2, f3), f4), f5) (/* args... */);

, which is also equivalent to:

destination .with(f1).with(f2).with(f3).with(f4).with(f5) (/* args... */);

1.5. Locales

Strf is a locale-independent library. When you don’t specify any facet object, everything is printed as in the "C" locale. However, the header <strf/locale.hpp> provides the function locale_numpunct, which returns a numpunct<10> object that reflects the numeric punctuation of the current locale (decimal point, thousands separator and digit grouping). Note that locale_numpunct() is not thread safe; in fact, using locales in general is not thread safe. However, once you store its returned value in a numpunct<10> object, that object is no longer affected when the locale changes. Also, numpunct<10> is a facet.

#include <strf/locale.hpp>
#include <strf/to_string.hpp>
#include <cassert>
#include <clocale>

void sample() {
    if (setlocale(LC_NUMERIC, "de_DE")) {
        const auto punct_de = strf::locale_numpunct();
        auto str = strf::to_string.with(punct_de) (*strf::fixed(10000.5));
        assert(str == "10.000,5");

        // Changing locale does not affect punct_de
        // So using it is thread safe
        setlocale(LC_NUMERIC, "C");
        auto str2 = strf::to_string.with(punct_de) (*strf::fixed(20000.5));
        assert(str2 == "20.000,5");
    }
}
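
The reason a stored punctuation object stays valid after the locale changes is that it holds its own copies of the data. A minimal sketch of that idea in standard C++, using std::localeconv, is shown here. The type punct_snapshot and the function capture_punct are hypothetical and are not how locale_numpunct is implemented; the sample sticks to the "C" locale so that its result does not depend on which locales are installed.

```cpp
#include <cassert>
#include <clocale>
#include <string>

// Hypothetical value type, not Strf's numpunct<10>: it owns copies of the
// punctuation strings, so later setlocale calls cannot change it.
struct punct_snapshot {
    std::string decimal_point;
    std::string thousands_sep;
};

// Reads the current C locale. Like any locale query, this call itself
// should not race with setlocale running in another thread.
inline punct_snapshot capture_punct() {
    const std::lconv* lc = std::localeconv();
    return punct_snapshot{lc->decimal_point, lc->thousands_sep};
}

void sample() {
    std::setlocale(LC_NUMERIC, "C");
    const punct_snapshot snapshot = capture_punct();

    // In the "C" locale the decimal point is "." and there is no
    // thousands separator.
    assert(snapshot.decimal_point == ".");
    assert(snapshot.thousands_sep.empty());
}
```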

2. Other destinations

So far, we have only covered things that define the content to be printed, not where it is printed. Strf provides other expressions besides to_string to select the destination. Many of them are overloads of the to function template. You can just replace the to_string expression with to(dest), where dest can be, for example, an array of char:

#include <strf.hpp> // another header!
#include <cassert>
#include <cstring>

void sample() {
    int x = 200;
    char buff[200];
    auto res = strf::to(buff) (x, " in hexadecimal is ", strf::hex(x));
    assert(0 == strcmp(buff, "200 in hexadecimal is c8"));
    assert(strlen(buff) == (res.ptr - buff));
    assert( ! res.truncated);

    // now with a buffer that is too small
    char small_buff[16];
    auto res2 = strf::to(small_buff) (x, " in hexadecimal is ", strf::hex(x));
    assert(res2.truncated);
    assert(res2.ptr == small_buff + 15);
    assert(*res2.ptr == '\0');
    assert(0 == strcmp(small_buff, "200 in hexadeci"));
}
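
The contract shown by the sample above (a pointer to the terminating null character plus a truncation flag) can be reproduced in a few lines of standard C++. The names write_result and copy_truncating are hypothetical and exist only for this illustration; they are not part of Strf.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical result type modeled on the sample above.
struct write_result {
    char* ptr;       // points at the terminating '\0'
    bool truncated;  // true when the content did not fit
};

// Copies as much of `src` as fits into `dest`, always leaving room
// for the terminating null character.
template <std::size_t N>
write_result copy_truncating(char (&dest)[N], const char* src) {
    static_assert(N > 0, "destination needs room for the terminator");
    const std::size_t len = std::strlen(src);
    const std::size_t count = len < N - 1 ? len : N - 1;
    std::memcpy(dest, src, count);
    dest[count] = '\0';
    return write_result{dest + count, count < len};
}

void sample() {
    char small_buff[16];
    const write_result res =
        copy_truncating(small_buff, "200 in hexadecimal is c8");
    assert(res.truncated);
    assert(res.ptr == small_buff + 15);
    assert(*res.ptr == '\0');
    assert(0 == std::strcmp(small_buff, "200 in hexadeci"));
}
```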

However, there is another overload of to that deserves a special mention: the one that writes to basic_outbuff references:

namespace strf {

template <typename CharT>
class basic_outbuff;

using     outbuff = basic_outbuff<char>;
using   u8outbuff = basic_outbuff<char8_t>;
using  u16outbuff = basic_outbuff<char16_t>;
using  u32outbuff = basic_outbuff<char32_t>;
using    woutbuff = basic_outbuff<wchar_t>;
using bin_outbuff = basic_outbuff<std::byte>;

template <typename CharT>
/* ... */ to(strf::basic_outbuff<CharT>&);

}

For every destination, there is a concrete class that derives from the basic_outbuff abstract class template. For example, when you use to_string, the library internally instantiates a string_maker. When writing to a raw string, it is a cstr_writer.

So the statement:

std::string str = strf::to_string(arg1, arg2, arg3, arg4);

is equivalent to:

strf::string_maker str_maker;
strf::to(str_maker) (arg1, arg2, arg3, arg4);
std::string str = str_maker.finish();

What makes the second form so interesting is that it doesn’t force you to pass all arguments in a single statement. So you have the same flexibility as when writing into a std::ostream:

strf::string_maker str_maker;
auto print = strf::to(str_maker).with(f1, f2, f3);

if (condition1) {
    print(arg1, arg2);
}
while (condition2) {
    print(arg3, arg4, arg5);
    //...
}
print.with(f4) (arg6, arg7);
// ...
auto str = str_maker.finish();

Another reason to use basic_outbuff is when you don’t want to commit yourself to a destination type. Suppose you need to create a function that provides a textual message whose content and size are known only at run time.

Instead of returning a string object:

std::string get_message();

, or writing to a caller-supplied char*:

void get_message(char* dest, std::size_t dest_size);

, you can design your function like this:

void get_message(strf::outbuff& dest);

This way you let the caller decide which outbuff implementation to use. It could be string_maker, cstr_writer or another one. There is no significant performance difference between writing into a cstr_writer and directly into a char*.

However, when writing to a string (either a raw string or a std::string), note that such a string needs to be further sent to some other destination (a file, a log system, or whatever); otherwise it is useless, right? So the caller can also implement a new outbuff that writes directly into that final destination, thus avoiding the need for an intermediate string, which in turn avoids heap allocation (which can happen when using std::string) or content truncation (which can happen when using char*).
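
The design choice described above, where the function takes an abstract destination and the caller picks the implementation, can be sketched in standard C++. The classes text_sink, string_sink and byte_counter are hypothetical and written for this illustration; the real basic_outbuff interface is different and is described in the <strf/outbuff.hpp> reference.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical abstract destination, standing in for strf::outbuff.
class text_sink {
public:
    virtual ~text_sink() = default;
    virtual void write(const char* data, std::size_t size) = 0;
};

// One implementation: collects everything into a std::string.
class string_sink final : public text_sink {
public:
    void write(const char* data, std::size_t size) override {
        result_.append(data, size);
    }
    const std::string& str() const { return result_; }
private:
    std::string result_;
};

// Another implementation: stands for a "final destination" that consumes
// the content directly, without any intermediate string.
class byte_counter final : public text_sink {
public:
    void write(const char*, std::size_t size) override { count_ += size; }
    std::size_t count() const { return count_; }
private:
    std::size_t count_ = 0;
};

// The function does not commit to a destination type.
void get_message(text_sink& dest) {
    const char message[] = "disk almost full\n";
    dest.write(message, sizeof(message) - 1); // exclude the terminator
}

void sample() {
    string_sink as_string;
    get_message(as_string);
    assert(as_string.str() == "disk almost full\n");

    byte_counter counter;
    get_message(counter);
    assert(counter.count() == 17);
}
```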

3. Error handling policy

Strf does not throw exceptions. When something goes wrong, the library usually prints the replacement character (U+FFFD), or a question mark when the encoding can’t represent it. There are two situations where this can happen: when using the tr-string and when converting a string from one encoding to another (see charset conversion). In addition, for each of these cases there is a facet category (tr_error_notifier_c and invalid_seq_notifier_c) that lets you specify a callback to be called when the error occurs, and that callback can throw an exception if you want.
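
The policy of "print a replacement, and optionally notify a callback" can be sketched in standard C++. The function sanitize_ascii and the alias invalid_seq_notifier are hypothetical and written for this illustration; they only model the policy and do not show the interface of the facets in invalid_seq_notifier_c.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical callback type, called once per invalid input byte.
using invalid_seq_notifier = std::function<void()>;

// Copies `input`, replacing every non-ASCII byte with '?'. By default
// nothing is thrown; a notifier, when provided, is free to throw.
inline std::string sanitize_ascii(const std::string& input,
                                  const invalid_seq_notifier& notify = {}) {
    std::string output;
    output.reserve(input.size());
    for (unsigned char ch : input) {
        if (ch < 0x80) {
            output.push_back(static_cast<char>(ch));
        } else {
            if (notify) {
                notify(); // may throw, which aborts the conversion
            }
            output.push_back('?');
        }
    }
    return output;
}

void sample() {
    // Default policy: no exception, the bad byte becomes '?'.
    assert(sanitize_ascii("a\xFFz") == "a?z");

    // Opt-in policy: the callback turns the error into an exception.
    bool thrown = false;
    try {
        sanitize_ascii("a\xFFz", [] {
            throw std::runtime_error("invalid sequence");
        });
    } catch (const std::runtime_error&) {
        thrown = true;
    }
    assert(thrown);
}
```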

4. What’s next?

The quick reference should explain most of the things you need to know about the library. This is the document you will probably use most of the time.

For more specific things, there are the header references:

`<strf/outbuff.hpp>`	This lightweight header can be used in freestanding environments and is the cornerstone of the library. All other headers include it.
`<strf.hpp>`	Defines most of the library, including the main usage syntax, all printable types and all facets.
`<strf/to_string.hpp>`	Provides utilities to write to `std::basic_string`. Includes `<strf.hpp>`.
`<strf/to_streambuf.hpp>`	Provides utilities to write to `std::basic_streambuf`. Includes `<strf.hpp>`.
`<strf/to_cfile.hpp>`	Provides utilities to write to `FILE*`. Includes `<strf.hpp>`.