Format Functions
The expression fmt(arg) returns a printable object that contains a copy of, or a reference to, arg, as well as format information that can be edited with member functions like sci and fill:
auto str = strf::to_string( +*strf::fmt(1.0).sci().fill(U'~') ^ 15 );
assert(str == "~~~~+1.e+00~~~~");
In this library, these member functions are called format functions.
There are also some global function templates that work as aliases for format functions:
Expression | Equivalent Expression |
---|---|
auto str = strf::to_string( +*strf::center(1.0, 9, U'~') );
assert(str == "~~~+1.~~~");
Alignment formatting
Format functions | Effect |
---|---|
 | Aligns to the left (or to the right in right-to-left (RTL) scripts, like Arabic) |
 | Aligns to the right (or to the left in RTL scripts) |
 | Center alignment |
 | Sets the fill character. |
 | Set all alignment formatting options simultaneously. |
 | Set all alignment formatting options to default. |
 | Set all alignment formatting options to default. |
- Note
  The table above has no equivalent to std::internal. If you want to pad with zeros when printing integers or floating-point values, use the pad0 format function instead.
Floating-point formatting
Member function | Effect |
---|---|
 | Equivalent to the |
 | Equivalent to the |
 | Similar to the |
 | Equivalent to |
 | Applies the numeric punctuation according to the |
 | Equivalent to |
 | Similar to the For NaN and infinity, causes the width (from alignment formatting) to be at least equal to For valid numbers, prints zeros after the sign and the base indication and before the digits such that at least |
 | Sets the precision. Effect varies according to the notation (see below). |
 | Sets the float notation (see below). |
 | Equivalent to |
 | Equivalent to |
 | Equivalent to |
 | Equivalent to |
 | Set all floating-point formatting options simultaneously. |
 | Reset all floating-point formatting options to default. |
float_notation::hex
-
Hexadecimal
float_notation::fixed
-
If precision is not set, prints the smallest number of digits such that the floating-point value can be exactly recovered. If precision is set, it is the number of fractional digits.
float_notation::scientific
-
If precision is not set, prints the smallest number of digits such that the floating-point value can be exactly recovered. If precision is set, it is the number of fractional digits.
float_notation::general
-
If precision is not set, chooses the notation (scientific or fixed) that leads to the smallest number of characters such that the floating-point value can be exactly recovered.
If precision is set, same effect as the 'g' format flag in printf (except that the lettercase is specified by the lettercase facet):
- The precision is the number of significant digits.
- If precision is 0, it is treated as 1.
- Trailing fractional zeros are removed unless operator* is used.
- Selects the scientific notation iff the resulting exponent is less than -4 or greater than or equal to the precision.
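Since the behaviour of float_notation::general with a precision is described as matching the 'g' conversion of printf, the rules listed above can be explored with the standard library alone. The sketch below does not use strf: the helper name general is hypothetical, and the results mentioned afterwards are those of printf, not output verified against the library.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats `value` with printf's %g conversion and the given precision.
// Hypothetical helper for illustration; it is not part of strf.
std::string general(double value, int precision) {
    assert(precision >= 0);
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.*g", precision, value);
    return buf;
}
```

For example, general(123456.0, 3) gives "1.23e+05" because the exponent 5 is not less than the precision 3, general(0.0001, 3) gives "0.0001" because the exponent -4 is not less than -4, and general(0.00001, 3) gives "1e-05". Trailing fractional zeros are dropped, so general(100.0, 5) gives "100".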
pad0 is independent of alignment formatting:
auto s = strf::to_string( strf::center(-1.25, 12, '_').pad0(8) );
assert(s == "__-0001.25__");
auto nan = std::numeric_limits<double>::quiet_NaN();
s = strf::to_string( strf::center(-nan, 12, '_').pad0(8) );
assert(s == "____-nan____");
s = strf::to_string( strf::center(-nan, 8, '_').pad0(12) );
assert(s == "____-nan____");
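The way pad0 and alignment compose in the example above can be modelled with two small string helpers. This is a rough sketch, not the library's implementation: pad_zeros and center are hypothetical names, the input is assumed to be an already formatted number, and giving the odd leftover fill character to the right side is an assumption that the text above does not state.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Inserts zeros after an optional sign so that the result has at least
// `width` characters. Hypothetical helper; not part of strf.
std::string pad_zeros(const std::string& number, std::size_t width) {
    if (number.size() >= width) return number;
    const bool has_sign = !number.empty() && (number[0] == '-' || number[0] == '+');
    std::string out = number;
    out.insert(has_sign ? 1 : 0, width - number.size(), '0');
    assert(out.size() == width); // postcondition of the padding step
    return out;
}

// Centers `text` in a field of `width` characters using `fill`.
// Assumption: when the padding is odd, the extra character goes to the right.
std::string center(const std::string& text, std::size_t width, char fill) {
    if (text.size() >= width) return text;
    const std::size_t total = width - text.size();
    const std::size_t left = total / 2;
    return std::string(left, fill) + text + std::string(total - left, fill);
}
```

With these helpers, center(pad_zeros("-1.25", 8), 12, '_') gives "__-0001.25__": the zeros are inserted first and the fill characters are added around the padded number. For NaN no zeros are inserted; the pad0 value only acts as a lower bound on the field width, which is why both NaN cases above correspond to center("-nan", 12, '_').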
Integer formatting
Member function | Effect |
---|---|
 | Uses the binary base. |
 | Uses the octal base. |
 | Uses the decimal base. |
 | Uses the hexadecimal base. |
 | Equivalent to the |
 | Equivalent to the |
 | Equivalent to |
 | Similar to the |
 | Applies the numeric punctuation according to the |
 | Equivalent to |
 | Inserts zeros after the sign or base indication and before the digits such that at least |
 | Inserts zeros after the sign or base indication and before the digits such that at least |
 | Set all integer formatting options simultaneously. |
 | Reset all integer formatting options to default. |
Char formatting
Member function | Effect |
---|---|
 | Prints the argument |
String formatting
Member function | Effect |
---|---|
 | Sets string precision |
 | Transcodes the input string if |
 | Same as in |
 | If |
 | Equivalent to |
 | Equivalent to |
 | Equivalent to |
Printing target expressions
Expression | Header |
---|---|
where:
- CharT is a character type.
- Traits is a CharTraits type.
- A is an Allocator type.
- char_ptr is a CharT* value, where CharT is a character type.
- end is a CharT* value, where CharT is a character type.
- count is a std::size_t value.
- streambuf_ptr is a std::basic_streambuf<CharT, Traits>*.
- streambuf_ref is a std::basic_streambuf<CharT, Traits>&.
- cfile is a FILE*.
- destination_ref is a destination<CharT>&, where CharT is a character type.
- args... is an argument list of printable values.
strf::to(destination_ref) (args...)
Return type | |
Supports reserve | No |
See the list of types that derive from destination<CharT>.
Header file | |
Preconditions | |
Return type | |
Return value | a value |
Note | The termination character |
Supports reserve | No |
- Header file
- Preconditions
  - end >= char_ptr
- Return type
- Return value
  - a value r, such that:
    - r.ptr is the one-past-the-end pointer of the characters written.
    - r.truncated is true when the destination is too small. In this case, the number of characters written is unspecified.
- Note
  - The termination character '\0' is not appended to the content.
- Supports reserve
  - No
strf::to_basic_string <CharT, Traits_opt, A_opt> ( args... )
Return type | |
Supports reserve | Yes |
strf::to_string ( args... )
Return type | |
Supports reserve | Yes |
strf::to_u8string ( args... )
Return type | |
Supports reserve | Yes |
strf::to_u16string ( args... )
Return type | |
Supports reserve | Yes |
strf::to_u32string ( args... )
Return type | |
Supports reserve | Yes |
strf::to_wstring ( args... )
Return type | |
Supports reserve | Yes |
Return type | |
Return value | A value |
Supports reserve | No |
to<CharT_opt>(cfile) (args...)
- Effect
  Successively calls std::fwrite(buffer, sizeof(CharT), /*...*/, cfile) until the whole content is written or until an error happens, where buffer is an internal array of CharT.
Return type | |
Return value | |
Supports reserve | No |
wto(cfile) (args...)
Header file | |
Return type | |
destination classes
The table below lists the concrete types that derive from the destination<CharT> abstract class.
Type | Description |
---|---|
 | Writes C strings. Always writes the termination character |
 | Writes to also |
 | Discards the content. The analogous of |
 | Appends to |
 | Creates |
 | Creates |
 | Writes to |
 | Writes to |
 | Writes to |
where:
- CharT is a character type.
- Traits is a CharTraits type.
- A is an Allocator type.
- BufferSize is a std::size_t constexpr value.
Tr-string
auto s = strf::to_string.tr("{} in hexadecimal is {}", x, strf::hex(x));
The tr-string is what other formatting libraries would call the format string, except that it does not specify any formatting. Its purpose is to enable your program to provide multilingual support by using translation tools like gettext.
Since the person who writes the string to be translated is often not the person who translates it, the tr-string syntax allows the insertion of comments.
A '{' followed by | until | means |
---|---|---|
'-' | the next '}' | a comment |
another '{' | the second '{' | an escaped '{' |
a digit | the next '}' | a positional argument reference |
any other character | the next '}' | a non positional argument reference |
- Comments
  auto str = strf::to_string.tr
      ( "You can learn more about python{-the programming language, not the animal species} at {}"
      , "www.python.org" );
  assert(str == "You can learn more about python at www.python.org");
- Escapes
  Note that there is no way and no need to escape the '}' character, since it has a special meaning only when it corresponds to a previous '{'.
  auto str = strf::to_string.tr("} {{x} {{{} {{{}}", "aaa", "bbb");
  assert(str == "} {x} {aaa {bbb}");
- Positional arguments
  Position zero refers to the first input argument. The characters after the digits are ignored, so they can also be used as comments.
  auto str = strf::to_string.tr("{1 a person name} likes {0 a food name}.", "sandwich", "Paul");
  assert(str == "Paul likes sandwich.");
- Non positional arguments
  The characters after the '{' are ignored as well.
  auto str = strf::to_string.tr("{a person} likes {a food type}.", "Paul", "sandwich");
  assert(str == "Paul likes sandwich.");
Tr-string error handling
When the argument associated with a "{" does not exist, the library does two things:
- It prints a replacement character "\uFFFD" (�) (or "?" when the charset can't represent it) where the missing argument would be printed.
- It calls the handle function on the facet object corresponding to the tr_error_notifier_c category, which, by default, does nothing.
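To make the tr-string rules concrete, here is a minimal standalone sketch of an expander that works on arguments already converted to strings. It is not how strf implements tr-strings: expand_tr is a hypothetical name, the output is assumed to be UTF-8, an unclosed '{' is assumed to extend to the end of the string, and non positional references are assumed to be counted independently of positional ones.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Expands a tr-string using pre-formatted arguments (hypothetical helper).
std::string expand_tr(const std::string& tr, const std::vector<std::string>& args) {
    const std::string replacement = "\xEF\xBF\xBD"; // U+FFFD encoded in UTF-8
    std::string out;
    std::size_t next_arg = 0; // argument used by the next non positional reference
    std::size_t i = 0;
    while (i < tr.size()) {
        if (tr[i] != '{') {   // ordinary character, including a lone '}'
            out += tr[i++];
            continue;
        }
        ++i;                  // skip the '{'
        if (i < tr.size() && tr[i] == '{') { // "{{" is an escaped '{'
            out += '{';
            ++i;
            continue;
        }
        std::size_t close = tr.find('}', i);
        if (close == std::string::npos) close = tr.size(); // assumption: unclosed '{'
        assert(i <= close && close <= tr.size());
        const bool is_comment = i < close && tr[i] == '-';
        if (!is_comment) {
            std::size_t index = next_arg;
            if (i < close && std::isdigit(static_cast<unsigned char>(tr[i]))) {
                index = 0;    // positional: only the leading digits matter
                std::size_t k = i;
                while (k < close && std::isdigit(static_cast<unsigned char>(tr[k]))) {
                    index = index * 10 + static_cast<std::size_t>(tr[k] - '0');
                    ++k;
                }
            } else {
                ++next_arg;   // non positional: consume the next argument
            }
            // A missing argument is shown as the replacement character.
            out += index < args.size() ? args[index] : replacement;
        }
        i = close < tr.size() ? close + 1 : close;
    }
    return out;
}
```

Calling expand_tr("{1 a person name} likes {0 a food name}.", {"sandwich", "Paul"}) produces "Paul likes sandwich.", and a reference to an argument that does not exist is rendered as the replacement character, mirroring the first of the two behaviours listed above. Notifying an error handler is left out of the sketch.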
Facet Categories
Category | Constrainable | Description |
---|---|---|
 | Yes | Numeric punctuation for decimal base |
 | Yes | Numeric punctuation for hexadecimal base |
 | Yes | Numeric punctuation for octal base |
 | Yes | Numeric punctuation for binary base |
 | Yes | Letter case for printing numeric and boolean values |
 | No | The character encoding corresponding to character type |
 | Yes | Notifies nonconformities to the character encoding. |
 | No | Notifies errors on the tr-string |
 | Yes | Defines how the width is calculated |
 | Yes | Overrides printable types |
Numeric punctuation
To apply numeric punctuation to integer and floating-point arguments, you need to invoke the punct or operator! format function. You also need to pass a numpunct facet object to specify the "thousands" separator, the decimal point and the grouping pattern.
The integer sequence passed to the constructor of numpunct defines the grouping, starting from the least significant digits. The last value is repeated, unless it is equal to -1.
auto s1 = strf::to_string.with(strf::numpunct<10>(1, 2, 3))(strf::punct(1000000000000ll));
assert(s1 == "1,000,000,000,00,0");
auto s2 = strf::to_string.with(strf::numpunct<10>(1, 2, 3, -1))(!strf::dec(1000000000000ll));
assert(s2 == "1000000,000,00,0");
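The grouping rule can be sketched without the library. The function below is a hypothetical helper (apply_grouping is not part of strf) that inserts a separator into a string of decimal digits: the first value sizes the rightmost group, and the last value repeats unless it is -1.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Inserts `sep` into `digits` according to `groups` (hypothetical helper).
// groups[0] is the size of the rightmost group; the last value repeats,
// and a trailing -1 stops the grouping.
std::string apply_grouping(const std::string& digits,
                           const std::vector<int>& groups,
                           char sep = ',') {
    assert(!digits.empty());
    std::string out = digits;
    std::size_t pos = digits.size(); // separators are inserted right to left
    std::size_t g = 0;
    while (!groups.empty()) {
        const int size = groups[g];
        // Stop on -1, or when the remaining digits fit in the current group.
        if (size < 1 || static_cast<std::size_t>(size) >= pos) break;
        pos -= static_cast<std::size_t>(size);
        out.insert(pos, 1, sep);
        if (g + 1 < groups.size()) ++g; // the last value is reused
    }
    return out;
}
```

With the values from the example above, apply_grouping("1000000000000", {1, 2, 3}) returns "1,000,000,000,00,0" and apply_grouping("1000000000000", {1, 2, 3, -1}) returns "1000000,000,00,0". An empty group list leaves the digits untouched, which corresponds to a numpunct with no grouping.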
The constructor of numpunct has some preconditions:
- No more than six arguments can be passed.
- No argument can be greater than 30.
- No argument can be less than 1, unless it is the last argument and it's equal to -1.
When default constructed, the numpunct has no grouping, i.e. the thousands separator is never printed.
The default thousands separator and decimal point are U',' and U'.', respectively. To change them, use the thousands_sep and decimal_point member functions:
auto my_punct = strf::numpunct<10>{3}.thousands_sep(U'\'').decimal_point(U':');
auto str = strf::to_string.with(my_punct) (strf::punct(1000000.5));
assert(str == "1'000'000:5");
// or as an lvalue:
auto my_punct2 = strf::numpunct<10>(3);
my_punct2.thousands_sep(U';');
my_punct2.decimal_point(U'^');
auto str2 = strf::to_string.with(my_punct2) (strf::punct(1000000.5));
assert(str2 == "1;000;000^5");
Numeric punctuation from locale
The header file <strf/locale.hpp> declares the locale_numpunct function, which returns a numpunct<10> object that reflects the current locale:
#include <cassert>
#include <clocale>
#include <strf/locale.hpp>
#include <strf/to_string.hpp>
void sample() {
    if (std::setlocale(LC_NUMERIC, "de_DE")) {
        const auto punct_de = strf::locale_numpunct();
        auto str = strf::to_string.with(punct_de) (*!strf::fixed(10000.5));
        assert(str == "10.000,5");
    }
}
Letter case
The lettercase facet affects the letter case when printing numeric values. The default value is strf::lowercase.
namespace strf {
enum class lettercase { lower = /*...*/, mixed = /*...*/, upper = /*...*/ };
constexpr lettercase lowercase = lettercase::lower;
constexpr lettercase mixedcase = lettercase::mixed;
constexpr lettercase uppercase = lettercase::upper;
}
Value | Result examples |
---|---|
auto str_upper = strf::to_string.with(strf::uppercase)
    ( *strf::hex(0xabc), ' '
    , 1.0e+50, ' '
    , std::numeric_limits<double>::infinity() );
assert(str_upper == "0XABC 1E+50 INF");
auto str_mixed = strf::to_string.with(strf::mixedcase)
    ( *strf::hex(0xabc), ' '
    , 1.0e+50, ' '
    , std::numeric_limits<double>::infinity() );
assert(str_mixed == "0xABC 1e+50 Inf");
Character encodings
The following variable templates can be used as facet objects that specify the character encoding associated with the character type passed as the template parameter.
namespace strf {
template <typename CharT> constexpr ascii_t<CharT> ascii {};
template <typename CharT> constexpr iso_8859_1_t<CharT> iso_8859_1 {};
template <typename CharT> constexpr iso_8859_2_t<CharT> iso_8859_2 {};
// ... up to iso_8859_16
template <typename CharT> constexpr windows_1250_t<CharT> windows_1250 {};
template <typename CharT> constexpr windows_1251_t<CharT> windows_1251 {};
// ... up to windows_1258
template <typename CharT> constexpr utf8_t<CharT> utf8 {};
template <typename CharT> constexpr utf16_t<CharT> utf16 {};
template <typename CharT> constexpr utf32_t<CharT> utf32 {};
template <typename CharT> constexpr utf_t<CharT> utf {};
} // namespace strf
auto s = strf::to_string
    .with(strf::windows_1252<char>)
    .with(strf::numpunct<10>{4, 3, 2}.thousands_sep(0x2022))
    ("one hundred billions = ", strf::punct(100000000000ll));
// The character U+2022 is encoded as '\225' in Windows-1252
assert(s == "one hundred billions = 1\22500\22500\225000\2250000");
Charset conversion
Since the library knows the encoding corresponding to each character type, and knows how to convert from one to another, it is possible to mix input strings of different character types, though you need to use the function transcode:
auto str = strf::to_string( "aaa-"
, strf::transcode(u"bbb-")
, strf::transcode(U"ccc-")
, strf::transcode(L"ddd") );
auto str16 = strf::to_u16string( strf::transcode("aaa-")
, u"bbb-"
, strf::transcode(U"ccc-")
, strf::transcode(L"ddd") );
assert(str == "aaa-bbb-ccc-ddd");
assert(str16 == u"aaa-bbb-ccc-ddd");
The transcode function can also specify an alternative encoding for a specific input string argument:
auto str_utf8 = strf::to_u8string
( strf::transcode("--\xA4--", strf::iso_8859_1<char>)
, strf::transcode("--\xA4--", strf::iso_8859_15<char>));
assert(str_utf8 == u8"--\u00A4----\u20AC--");
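As a rough illustration of what the first transcode call above has to do, the sketch below converts ISO-8859-1 bytes to UTF-8 by hand. It relies only on the fact that every ISO-8859-1 byte maps to the Unicode code point with the same numeric value. The function name latin1_to_utf8 is hypothetical and the code is not taken from the library. ISO-8859-15 would need a lookup table instead, since it assigns 0xA4 to U+20AC as the example shows.

```cpp
#include <cassert>
#include <string>

// Converts ISO-8859-1 bytes to UTF-8 (hypothetical helper, not part of strf).
// Code points below 0x80 take one byte; the rest (0x80..0xFF) take two.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    for (unsigned char b : in) {
        if (b < 0x80) {
            out += static_cast<char>(b);
        } else {
            out += static_cast<char>(0xC0 | (b >> 6));   // leading byte
            out += static_cast<char>(0x80 | (b & 0x3F)); // continuation byte
        }
    }
    assert(out.size() >= in.size()); // the conversion never shrinks the input
    return out;
}
```

For the input "--\xA4--" this yields "--\xC2\xA4--", which is the UTF-8 encoding of U+00A4 surrounded by the unchanged ASCII characters.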
The sani function has the same effect as transcode, except when the input encoding is the same as the output. In this case sani causes the input to be sanitized, whereas transcode does not:
auto str = strf::to_string
.with(strf::iso_8859_3<char>) // the output charset
( strf::transcode("--\xff--") // not sanitized
, strf::transcode("--\xff--", strf::iso_8859_3<char>) // not sanitized ( same charset )
, strf::transcode("--\xff--", strf::utf8<char>) // sanitized ( different charset )
, strf::sani("--\xff--") // sanitized
, strf::sani("--\xff--", strf::iso_8859_3<char>) ); // sanitized
assert(str == "--\xff----\xff----?----?----?--");
The library replaces invalid sequences with the replacement character � if the destination charset supports it (and with '?' otherwise).
An "invalid sequence" is any input that is non-conformant to the source character encoding, or that cannot be encoded into the destination encoding.
When the input is UTF-8, the library follows the practice recommended by the Unicode Standard for calculating how many replacement characters to print for each non-conformant input sequence (see "Best Practices for Using U+FFFD" in Chapter 3).
The library does not sanitize non-conformities when converting a single character, like punctuation characters or the fill character (they are in UTF-32). In this case the replacement character is only used when the destination charset is not able to print the codepoint.
For example, if you use (char32_t)0xFFFFFFF as the decimal point, then it will be printed as "\uFFFD" if the destination is UTF-8 or UTF-16, but if the destination is UTF-32, then the library just writes (char32_t)0xFFFFFFF verbatim.
Transcoding error handling
Sometimes just printing a replacement character to signal an encoding conversion error may not be enough. In this case, you can create a class deriving from transcoding_error_notifier to do something more, like throwing an exception or logging the failure.
The transcoding_error_notifier::invalid_sequence virtual function is called when the source has an invalid sequence, while transcoding_error_notifier::unsupported_codepoint is called when the destination encoding is not able to represent a codepoint.
But pay attention: this type is not a facet. The actual facet must be a TranscodingErrorNotifierPtr type, which is supposed to hold a pointer (raw or smart) to your transcoding_error_notifier object.
#include <strf/to_cfile.hpp>
#include <strf/to_string.hpp>
class my_notifier: public strf::transcoding_error_notifier {
    //...
    // (implementation omitted)
    //...
};
void sample() {
    strf::narrow_cfile_writer<char> err_dest{stderr};
    my_notifier notifier{err_dest};
    strf::transcoding_error_notifier_ptr notifier_ptr{&notifier};
    auto str = strf::to_string.with(notifier_ptr) ( /* ... */ );
    // ...
}
Width Calculation
The width_calculator_c facet value enables you to choose how the width of a string is calculated when using alignment formatting. You have five options:
- The fast_width facet object assumes that the width of a string is equal to its size. This is the least accurate method, but it's the fastest.
  Example
  auto str = "15.00 \xE2\x82\xAC \x80"; // "15.00 € \x80"
  auto result = strf::to_string.with(strf::fast_width) ( strf::right(str, 12, '*') );
  assert(result == "*15.00 \xE2\x82\xAC \x80"); // width calculated as 11
- The width_as_fast_u32len facet value evaluates the width of a string as the number of Unicode code points. However, unlike width_as_u32len, to gain performance, it assumes that the measured string conforms to its charset. Nonconformities do not cause undefined behaviour, but lead to incorrect values. For example, the width of a UTF-8 string may simply be calculated as the number of bytes that are not in the range [0x80, 0xBF], i.e., bytes that are not continuation bytes. This way, any extra continuation byte (which would be replaced by a "\uFFFD" during sanitization) is not counted.
  Example
  auto str = "15.00 \xE2\x82\xAC \x80"; // "15.00 € \x80"
  auto result = strf::to_string.with(strf::width_as_fast_u32len) ( strf::right(str, 12, '*') );
  assert(result == "****15.00 \xE2\x82\xAC \x80"); // width calculated as 8
- The width_as_u32len facet value also evaluates the width of a string as the number of Unicode code points. But each nonconformity to the charset is counted as an extra code point (as if it were replaced by the replacement character �).
  Example
  auto str = "15.00 \xE2\x82\xAC \x80"; // "15.00 € \x80"
  auto result = strf::to_string.with(strf::width_as_u32len) ( strf::right(str, 12, '*') );
  assert(result == "***15.00 \xE2\x82\xAC \x80"); // width calculated as 9
- std_width_calc is the default. It calculates the width just as specified for std::format, which means it is aware of grapheme clustering and also that the width of some codepoints is equal to 2. However, it doesn't require the string to be encoded in UTF (internally, it is converted to UTF-32).
- The fifth option is to implement your own width calculator. This requires creating a class that satisfies the WidthCalculator type requirements.
The width calculation algorithm is applied on the input, not the output string. Keep that in mind when converting from one charset to another.
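The difference between the first two options can be reproduced with a few lines of standard C++. The functions below are hypothetical stand-ins written for this illustration, not the library's facets: one returns the size in bytes, the other counts the bytes outside the continuation range [0x80, 0xBF], which is the shortcut described for width_as_fast_u32len.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Width as the number of bytes (the idea behind fast_width).
std::size_t width_as_size(const std::string& s) {
    return s.size();
}

// Width as the number of bytes that are not UTF-8 continuation bytes.
// Assumes the input is valid UTF-8; stray continuation bytes are not counted.
std::size_t width_as_fast_count(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char b : s) {
        if (b < 0x80 || b > 0xBF) ++count;
    }
    assert(count <= s.size()); // never more code points than bytes
    return count;
}
```

For the string "15.00 \xE2\x82\xAC \x80" used in the examples above, width_as_size returns 11 and width_as_fast_count returns 8: the two continuation bytes of the euro sign and the stray 0x80 are skipped. Counting the stray byte as one extra code point, as width_as_u32len does, would require actually validating the sequence.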
Ranges
Without formatting
where
- range_obj is an object whose type is a Container type.
- begin and end are iterators.
- separator is a raw string of CharT, where CharT is the destination character type.
- func is a unary function object such that the type of the expression func(x) is printable, where x is an element of the range.
int arr[3] = { 11, 22, 33 };
auto str = strf::to_string(strf::range(arr));
assert(str == "112233");
str = strf::to_string(strf::separated_range(arr, ", "));
assert(str == "11, 22, 33");
auto op = [](auto x){ return strf::join('(', +strf::fmt(x * 10), ')'); };
str = strf::to_string(strf::separated_range(arr, ", ", op));
assert(str == "(+110), (+220), (+330)");
With formatting
Any format function applicable to the element type of the range can also be applied to the expression strf::fmt_range(/*...*/) or strf::fmt_separated_range(/*...*/). It causes the formatting to be applied to each element.
std::vector<int> vec = { 11, 22, 33 };
auto str1 = strf::to_string("[", +strf::fmt_separated_range(vec, " ;") > 6, "]");
assert(str1 == "[   +11 ;   +22 ;   +33]");
int array[] = { 250, 251, 252 };
auto str2 = strf::to_string
    ( "["
    , *strf::fmt_separated_range(array, " / ").fill('.').hex() > 6
    , "]" );
assert(str2 == "[..0xfa / ..0xfb / ..0xfc]");
Joins
Simple joins
Joins enable you to group a set of input arguments as one:
auto str = strf::to_string.tr("Blah blah blah {}.", strf::join("abc", '/', 123));
assert(str == "Blah blah blah abc/123.");
They can be handy to create aliases:
struct date{ int day, month, year; };
auto as_yymmdd = [](date d) {
return strf::join( strf::dec(d.year % 100).p(2), '/'
, strf::dec(d.month).p(2), '/'
, strf::dec(d.day).p(2) );
};
date d {1, 1, 1999};
auto str = strf::to_string("The day was ", as_yymmdd(d), '.');
assert(str == "The day was 99/01/01.");
Aligned joins
You can apply any of the alignment format functions to the expression join(args...):
auto str = strf::to_string(strf::join("abc", "def", 123) > 15);
assert(str == "      abcdef123");
Or use any of the expressions below:
where:
- args... are the values to be printed.
- width is a value of type width_t.
- alignment is a value of type text_alignment.
- ch is a value of type char32_t.
auto str = strf::to_string(strf::join_center(15, U'.')("abc", "def", 123));
assert(str == "...abcdef123...");