"
s-char-seq (optional)"    (1)
R"d-char-seq (optional)(r-char-seq (optional))d-char-seq (optional)"    (2) (since C++11)
L"s-char-seq (optional)"    (3)
LR"d-char-seq (optional)(r-char-seq (optional))d-char-seq (optional)"    (4) (since C++11)
u8"s-char-seq (optional)"    (5) (since C++11)
u8R"d-char-seq (optional)(r-char-seq (optional))d-char-seq (optional)"    (6) (since C++11)
u"s-char-seq (optional)"    (7) (since C++11)
uR"d-char-seq (optional)(r-char-seq (optional))d-char-seq (optional)"    (8) (since C++11)
U"s-char-seq (optional)"    (9) (since C++11)
UR"d-char-seq (optional)(r-char-seq (optional))d-char-seq (optional)"    (10) (since C++11)

Explanation

Syntax | Kind | Type | Encoding
(1,2) | ordinary string literal | const char[N] | ordinary literal encoding
(3,4) | wide string literal | const wchar_t[N] | wide literal encoding
(5,6) | UTF-8 string literal | const char[N] (until C++20), const char8_t[N] (since C++20) | UTF-8
(7,8) | UTF-16 string literal | const char16_t[N] | UTF-16
(9,10) | UTF-32 string literal | const char32_t[N] | UTF-32

In the types listed in the table above, N is the number of encoded code units, which is determined below.
Ordinary and UTF-8 (since C++11) string literals are collectively referred to as narrow string literals.
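The types in the table can be checked at compile time. The following is a minimal sketch, not part of the original page; it relies on the fact that a string literal is an lvalue, so decltype yields a reference to the array type, and it uses the __cpp_char8_t feature-test macro to pick the UTF-8 element type.

```cpp
#include <type_traits>

// A string literal is an lvalue, so decltype yields a reference to the array type.
static_assert(std::is_same<decltype("abc"),  const char (&)[4]>::value,     "ordinary");
static_assert(std::is_same<decltype(L"abc"), const wchar_t (&)[4]>::value,  "wide");
static_assert(std::is_same<decltype(u"abc"), const char16_t (&)[4]>::value, "UTF-16");
static_assert(std::is_same<decltype(U"abc"), const char32_t (&)[4]>::value, "UTF-32");

// The element type of a UTF-8 literal depends on the language version.
#if defined(__cpp_char8_t)
static_assert(std::is_same<decltype(u8"abc"), const char8_t (&)[4]>::value, "UTF-8, C++20");
#else
static_assert(std::is_same<decltype(u8"abc"), const char (&)[4]>::value, "UTF-8, before C++20");
#endif

// N counts code units including the terminating null, so "" has N == 1.
static_assert(sizeof("") == 1, "an empty literal still holds the terminator");
```

Here N is 4 for "abc": three code units plus the terminating null.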
Evaluating a string literal results in a string literal object with static storage duration. Whether all string literals are stored in nonoverlapping objects and whether successive evaluations of a string literal yield the same or a different object is unspecified.
The effect of attempting to modify a string literal object is undefined.
bool b = "bar" == 3 + "foobar"; // can be true or false, unspecified

const char* pc = "Hello";
char* p = const_cast<char*>(pc);
p[0] = 'M'; // undefined behavior

Raw string literals
Raw string literals are string literals with a prefix containing R (syntaxes (2,4,6,8,10)). They do not escape any character, which means anything between the delimiters d-char-seq ( and ) d-char-seq becomes part of the string. The terminating d-char-seq is the same sequence of characters as the initial d-char-seq.
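As an illustration of the rule above, the sketch below compares raw literals with their escaped equivalents. The helper names (escaped_path, raw_path, raw_with_delimiter, escaped_equivalent) and the path itself are hypothetical and exist only for this example.

```cpp
#include <string>

// Hypothetical helpers, named here only for illustration.
inline std::string escaped_path() { return "C:\\new\\table.txt"; }
inline std::string raw_path()     { return R"(C:\new\table.txt)"; }

// A non-empty delimiter ("x" here) lets the body contain )" itself.
inline std::string raw_with_delimiter() { return R"x(say "(hi)")x"; }
inline std::string escaped_equivalent() { return "say \"(hi)\""; }
```

In the raw forms, \n and \t stay as two characters each, and the )" inside the delimited literal does not end it because it is not followed by the delimiter x.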
// OK: contains one backslash,
// equivalent to "\\"
R"(\)";

// OK: contains four \n pairs,
// equivalent to "\\n\\n\\n\\n"
R"(\n\n\n\n)";

// OK: contains one close-parenthesis, two double-quotes and one open-parenthesis,
// equivalent to ")\"\"("
R"-()""()-";

// OK: equivalent to "\n)\\\na\"\"\n"
R"a(
)\
a""
)a";

// OK: equivalent to "x = \"\"\\y\"\""
R"(x = ""\y"")";

// R"<<(-_-)>>"; // Error: begin and end delimiters do not match
// R"-()-"-()-"; // Error: )-" appears in the middle and terminates the literal

(since C++11)

Initialization
String literal objects are initialized with the sequence of code unit values corresponding to the string literal's sequence of s-chars and r-chars (since C++11), plus a terminating null character (U+0000), in order as follows:
1) For each contiguous sequence of basic-s-chars, r-chars (since C++11), simple escape sequences and universal character names, the sequence of characters it denotes is encoded to a code unit sequence using the string literal's associated character encoding. If a character lacks representation in the associated character encoding, then the program is ill-formed.
If the associated character encoding is stateful, the first such sequence is encoded beginning with the initial encoding state and each subsequent sequence is encoded beginning with the final encoding state of the prior sequence.
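2) For each numeric escape sequence, given v as the integer value represented by the octal or hexadecimal number comprising the sequence of digits in the escape sequence, and T as the string literal's array element type (see the table above):
- If v does not exceed the range of representable values of T, then the escape sequence contributes a single code unit with value v.
- Otherwise, if the string literal is of syntax (1) or (3), and v does not exceed the range of representable values of the corresponding unsigned type for the underlying type of T, then the escape sequence contributes a single code unit with a unique value of type T that is congruent to v mod 2^S, where S is the width of T.
- Otherwise, the program is ill-formed.
If the associated character encoding is stateful, all such sequences have no effect on encoding state.
3) Each conditional escape sequence contributes an implementation-defined code unit sequence.
If the associated character encoding is stateful, it is implementation-defined what effect these sequences have on encoding state.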
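The initialization rules can be observed through sizeof and constant indexing. This is a sketch under stated assumptions: char is 8 bits wide and the ordinary literal encoding is ASCII-compatible, which holds on common platforms but is not guaranteed.

```cpp
// Universal character names are encoded with the literal's associated encoding,
// so the same character may need a different number of code units.
static_assert(sizeof(u8"\u00E9") == 3, "U+00E9: two UTF-8 code units + null");
static_assert(sizeof(u"\U0001F600") / sizeof(char16_t) == 3, "surrogate pair + null");
static_assert(sizeof(U"\U0001F600") / sizeof(char32_t) == 2, "one UTF-32 code unit + null");

// A numeric escape sequence always contributes exactly one code unit.
static_assert(sizeof("\x41\102") == 3, "two code units + null");
static_assert("\x41\102"[0] == 'A' && "\x41\102"[1] == 'B',
              "assumes an ASCII-compatible ordinary literal encoding");

// 0xFF fits in unsigned char, so in an ordinary literal it is reduced modulo 2^8
// into the range of char (assumption: char is 8 bits wide).
static_assert(static_cast<unsigned char>("\xFF"[0]) == 0xFF, "congruent to 255 mod 256");
```

The last assertion holds whether char is signed or unsigned: with a signed char the stored value is -1, which is still congruent to 255 modulo 256.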
Concatenation
Adjacent string literals are concatenated at translation phase 6 (after preprocessing):

"Hello, " "world!" // at phase 6, the 2 string literals form "Hello, world!"

L"Δx = %" PRId16   // at phase 4, PRId16 expands to "d"
                   // at phase 6, L"Δx = %" and "d" form L"Δx = %d"
The following contexts expect a string literal, but do not evaluate it:
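A few such contexts, sketched here from the C++ language rules rather than from this page: a language linkage specification, a static_assert message, and an attribute argument. The function names add_ints and old_add are hypothetical.

```cpp
// Language linkage specification: "C" is never evaluated as an array.
extern "C" int add_ints(int a, int b) { return a + b; } // hypothetical function

// static_assert message: only used for the diagnostic text.
static_assert(sizeof(int) >= 2, "int is at least 16 bits wide");

// Attribute argument (C++14): the text is shown in compiler warnings.
[[deprecated("use add_ints instead")]]
inline int old_add(int a, int b) { return add_ints(a, b); }
```

None of these literals produces a string literal object at run time; the compiler consumes the text directly.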
It is unspecified whether non-ordinary string literals are allowed in these contexts, except that a literal operator name must use an ordinary string literal (since C++11). (until C++26)
Only ordinary string literals are allowed in these contexts. Each universal character name and each simple escape sequence in an unevaluated string is replaced by the member of the translation character set it denotes. An unevaluated string that contains a numeric escape sequence or a conditional escape sequence is ill-formed. (since C++26)

Notes
String literals can be used to initialize character arrays. If an array is initialized like char str[] = "foo";, str will contain a copy of the string "foo".
String literals are convertible and assignable to non-const char* or wchar_t* in order to be compatible with C, where string literals are of types char[N] and wchar_t[N]. Such implicit conversion is deprecated. (until C++11)
String literals are not convertible or assignable to non-const CharT*. An explicit cast (e.g. const_cast) must be used if such conversion is wanted. (since C++11)
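When modifiable text is needed, copying the literal into an array avoids both the cast and the undefined behavior of writing to a string literal object. A minimal sketch, with hypothetical helper names capitalized_copy and greeting:

```cpp
#include <string>

// Hypothetical helper: edits a copy of the literal, never the literal itself.
inline std::string capitalized_copy()
{
    char buffer[] = "Hello"; // the array holds its own copy of the 6 code units
    buffer[0] = 'M';         // well-defined: buffer is not a string literal object
    return buffer;
}

// Since C++11 an ordinary literal binds to const char*, not to char*.
inline const char* greeting() { return "Hello"; }
```

The literal itself is untouched, so greeting() still yields "Hello" after capitalized_copy() has run.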
A string literal is not necessarily a null-terminated character sequence: if a string literal has embedded null characters, it represents an array which contains more than one string.
const char* p = "abc\0def"; // std::strlen(p) == 3, but the array has size 8
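The difference between the array size and the C-string length can be checked directly. In this sketch the helper names c_string_length and whole_literal are hypothetical; passing an explicit length to std::string is one way to keep the characters after the embedded null.

```cpp
#include <cstring>
#include <string>

// The array type records all 8 code units, including both nulls.
static_assert(sizeof("abc\0def") == 8, "7 characters plus the terminating null");

// Hypothetical helpers for illustration.
inline std::size_t c_string_length()
{
    return std::strlen("abc\0def"); // stops at the first null
}

inline std::string whole_literal()
{
    const char literal[] = "abc\0def";
    // Passing the length explicitly keeps the embedded null and what follows it.
    return std::string(literal, sizeof literal - 1);
}
```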
If a valid hexadecimal digit follows a hexadecimal escape sequence in a string literal, the digit is treated as part of the escape sequence, and the literal can fail to compile because the resulting value is out of range. String concatenation can be used as a workaround:
//const char* p = "\xfff"; // error: hexadecimal escape sequence out of range
const char* p = "\xff""f"; // OK: the literal is const char[3] holding {'\xff','f','\0'}

Example
#include <iostream>

// array1 and array2 contain the same values:
char array1[] = "Foo" "bar";
char array2[] = {'F', 'o', 'o', 'b', 'a', 'r', '\0'};

const char* s1 = R"foo(
Hello
 World
)foo";
// same as
const char* s2 = "\nHello\n World\n";
// same as
const char* s3 = "\n"
                 "Hello\n"
                 " World\n";

const wchar_t* s4 = L"ABC" L"DEF"; // OK, same as
const wchar_t* s5 = L"ABCDEF";
const char32_t* s6 = U"GHI" "JKL"; // OK, same as
const char32_t* s7 = U"GHIJKL";
const char16_t* s9 = "MN" u"OP" "QR"; // OK, same as
const char16_t* sA = u"MNOPQR";

// const auto* sB = u"Mixed" U"Types";
        // before C++23 may or may not be supported by
        // the implementation; ill-formed since C++23

const wchar_t* sC = LR"--(STUV)--"; // OK, raw string literal

int main()
{
    std::cout << array1 << ' ' << array2 << '\n'
              << s1 << s2 << s3 << std::endl;
    std::wcout << s4 << ' ' << s5 << ' ' << sC
               << std::endl;
}
Output:
Foobar Foobar

Hello
 World

Hello
 World

Hello
 World

ABCDEF ABCDEF STUV

Defect reports
The following behavior-changing defect reports were applied retroactively to previously published C++ standards.
DR | Applied to | Behavior as published | Correct behavior
CWG 411