'c-char' (1)
u8'c-char' (2) (since C23)
u'c-char' (3) (since C11)
U'c-char' (4) (since C11)
L'c-char' (5)
'c-char-sequence' (6)
L'c-char-sequence' (7)
u'c-char-sequence' (8) (since C11)(removed in C23)
U'c-char-sequence' (9) (since C11)(removed in C23)
where
c-char is either a character from the source character set other than the single quote ('), backslash (\), or the newline character, or an escape sequence (such as \n or \13);
c-char-sequence is a sequence of two or more c-chars.

1) single-byte integer character constant, e.g. 'a' or '\n' or '\13'. Such a constant has type int and a value equal to the representation of c-char in the execution character set as a value of type char mapped to int. If c-char is not representable as a single byte in the execution character set, the value is implementation-defined.
2) UTF-8 character constant, e.g. u8'a'. Such a constant has type char8_t and a value equal to the ISO 10646 code point value of c-char, provided that the code point value is representable with a single UTF-8 code unit (that is, c-char is in the range 0x0-0x7F, inclusive). If c-char is not representable with a single UTF-8 code unit, the program is ill-formed.
3) (until C23) 16-bit wide character constant, e.g. u'貓', but not u'🍌' (u'\U0001f34c'). Such a constant has type char16_t and a value equal to the value of c-char in the 16-bit encoding produced by mbrtoc16 (normally UTF-16). If c-char is not representable or maps to more than one 16-bit character, the value is implementation-defined.
4) (until C23) 32-bit wide character constant, e.g. U'貓' or U'🍌'. Such a constant has type char32_t and a value equal to the value of c-char in the 32-bit encoding produced by mbrtoc32 (normally UTF-32). If c-char is not representable or maps to more than one 32-bit character, the value is implementation-defined.
3) (since C23) UTF-16 character constant, e.g. u'貓', but not u'🍌' (u'\U0001f34c'). Such a constant has type char16_t and a value equal to the ISO 10646 code point value of c-char, provided that the code point value is representable with a single UTF-16 code unit (that is, c-char is in the range 0x0-0xD7FF or 0xE000-0xFFFF, inclusive). If c-char is not representable with a single UTF-16 code unit, the program is ill-formed.
4) (since C23) UTF-32 character constant, e.g. U'貓' or U'🍌'. Such a constant has type char32_t and a value equal to the ISO 10646 code point value of c-char, provided that the code point value is representable with a single UTF-32 code unit (that is, c-char is in the range 0x0-0xD7FF or 0xE000-0x10FFFF, inclusive). If c-char is not representable with a single UTF-32 code unit, the program is ill-formed.
5) wide character constant, e.g. L'β' or L'貓'. Such a constant has type wchar_t and a value equal to the value of c-char in the execution wide character set (that is, the value that would be produced by mbtowc). If c-char is not representable or maps to more than one wide character (e.g. a non-BMP value on Windows where wchar_t is 16-bit), the value is implementation-defined.
6) multicharacter constant, e.g. 'AB', has type int and implementation-defined value.
7) wide multicharacter constant, e.g. L'AB', has type wchar_t and implementation-defined value.
8) 16-bit multicharacter constant, e.g. u'CD', has type char16_t and implementation-defined value.
9) 32-bit multicharacter constant, e.g. U'XY', has type char32_t and implementation-defined value.
Notes
Multicharacter constants were inherited by C from the B programming language. Although not specified by the C standard, most compilers (MSVC is a notable exception) implement multicharacter constants as specified in B: the value of each char in the constant initializes successive bytes of the resulting integer, in big-endian zero-padded right-adjusted order, e.g. the value of '\1' is 0x00000001 and the value of '\1\2\3\4' is 0x01020304.
In C++, encodable ordinary character literals have type char, rather than int.
Unlike integer constants, a character constant may have a negative value if char is signed: on such implementations '\xFF' is an int with the value -1.
When used in a controlling expression of #if or #elif, character constants may be interpreted in terms of the source character set, the execution character set, or some other implementation-defined character set.
16/32-bit multicharacter constants are not widely supported and were removed in C23. Some common implementations (e.g. clang) do not accept them at all.
Example
#include <stddef.h>
#include <stdio.h>
#include <uchar.h>

int main(void)
{
    printf("constant value \n");
    printf("-------- ----------\n");

    // integer character constants
    int c1='a'; printf("'a':\t %#010x\n", c1);
    int c2='🍌'; printf("'🍌':\t %#010x\n\n", c2); // implementation-defined

    // multicharacter constant
    int c3='ab'; printf("'ab':\t %#010x\n\n", c3); // implementation-defined

    // 16-bit wide character constants
    char16_t uc1 = u'a'; printf("'a':\t %#010x\n", (int)uc1);
    char16_t uc2 = u'¢'; printf("'¢':\t %#010x\n", (int)uc2);
    char16_t uc3 = u'猫'; printf("'猫':\t %#010x\n", (int)uc3);
    // implementation-defined (🍌 maps to two 16-bit characters)
    char16_t uc4 = u'🍌'; printf("'🍌':\t %#010x\n\n", (int)uc4);

    // 32-bit wide character constants
    char32_t Uc1 = U'a'; printf("'a':\t %#010x\n", (int)Uc1);
    char32_t Uc2 = U'¢'; printf("'¢':\t %#010x\n", (int)Uc2);
    char32_t Uc3 = U'猫'; printf("'猫':\t %#010x\n", (int)Uc3);
    char32_t Uc4 = U'🍌'; printf("'🍌':\t %#010x\n\n", (int)Uc4);

    // wide character constants
    wchar_t wc1 = L'a'; printf("'a':\t %#010x\n", (int)wc1);
    wchar_t wc2 = L'¢'; printf("'¢':\t %#010x\n", (int)wc2);
    wchar_t wc3 = L'猫'; printf("'猫':\t %#010x\n", (int)wc3);
    wchar_t wc4 = L'🍌'; printf("'🍌':\t %#010x\n\n", (int)wc4);
}
Possible output:
constant value
-------- ----------
'a':     0x00000061
'🍌':    0xf09f8d8c

'ab':    0x00006162

'a':     0x00000061
'¢':     0x000000a2
'猫':    0x0000732b
'🍌':    0x0000df4c

'a':     0x00000061
'¢':     0x000000a2
'猫':    0x0000732b
'🍌':    0x0001f34c

'a':     0x00000061
'¢':     0x000000a2
'猫':    0x0000732b
'🍌':    0x0001f34c