C API: This file defines an immutable Unicode code point trie.
Definition in file ucptrie.h.
◆ UCPTRIE_16
#define UCPTRIE_16(trie, i) ((trie)->data.ptr16[i])
Macro parameter value for a trie with 16-bit data values.
Use the name of this macro as a "dataAccess" parameter in other macros. Do not use this macro in any other way.
Definition at line 326 of file ucptrie.h.
◆ UCPTRIE_32
#define UCPTRIE_32(trie, i) ((trie)->data.ptr32[i])
Macro parameter value for a trie with 32-bit data values.
Use the name of this macro as a "dataAccess" parameter in other macros. Do not use this macro in any other way.
Definition at line 336 of file ucptrie.h.
◆ UCPTRIE_8
#define UCPTRIE_8(trie, i) ((trie)->data.ptr8[i])
Macro parameter value for a trie with 8-bit data values.
Use the name of this macro as a "dataAccess" parameter in other macros. Do not use this macro in any other way.
Definition at line 346 of file ucptrie.h.
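The "dataAccess" convention can be easier to follow with a small stand-alone illustration. The sketch below does not use ICU: `ToyTrie`, `TOY_8`, `TOY_16`, `TOY_ASCII_GET` and the lookup functions are hypothetical names invented here, and the structure only mimics the `data.ptr8` / `data.ptr16` members visible in the macro definitions above. It shows how a lookup macro can receive the name of an accessor macro and apply it.

```c
#include <stdint.h>

/* Hypothetical toy structure, NOT ICU's UCPTrie: one data pointer per width. */
typedef struct {
    union {
        const uint8_t  *ptr8;
        const uint16_t *ptr16;
    } data;
} ToyTrie;

/* Accessor macros in the style of UCPTRIE_8 and UCPTRIE_16. */
#define TOY_8(trie, i)  ((trie)->data.ptr8[i])
#define TOY_16(trie, i) ((trie)->data.ptr16[i])

/* A lookup macro that receives the accessor's name and applies it. */
#define TOY_ASCII_GET(trie, dataAccess, c) dataAccess(trie, c)

/* Sample data: unlisted entries are zero. */
static const uint16_t toy_values16[128] = { [0x41] = 700, [0x61] = 900 };
static const uint8_t  toy_values8[128]  = { [0x41] = 3 };

/* Looks up c (0..0x7f) in the 16-bit table by passing TOY_16 as dataAccess. */
uint16_t toy_lookup16(int c) {
    ToyTrie trie;
    trie.data.ptr16 = toy_values16;
    return TOY_ASCII_GET(&trie, TOY_16, c);
}

/* Same lookup macro, but with the 8-bit accessor. */
uint8_t toy_lookup8(int c) {
    ToyTrie trie;
    trie.data.ptr8 = toy_values8;
    return TOY_ASCII_GET(&trie, TOY_8, c);
}
```

The accessor name is passed without arguments, so the preprocessor only expands it after the lookup macro has supplied `(trie, c)`. Choosing the accessor that matches the trie's value width is the caller's responsibility in this sketch.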
◆ UCPTRIE_ASCII_GET
#define UCPTRIE_ASCII_GET(trie, dataAccess, c) dataAccess(trie, c)
Returns a trie value for an ASCII code point, without range checking.
Definition at line 517 of file ucptrie.h.
◆ UCPTRIE_FAST_BMP_GET
#define UCPTRIE_FAST_BMP_GET(trie, dataAccess, c) dataAccess(trie, _UCPTRIE_FAST_INDEX(trie, c))
Returns a trie value for a BMP code point (U+0000..U+FFFF), without range checking.
Can be used to look up a value for a UTF-16 code unit if other parts of the string processing check for surrogates.
Definition at line 530 of file ucptrie.h.
◆ UCPTRIE_FAST_GET
#define UCPTRIE_FAST_GET(trie, dataAccess, c) dataAccess(trie, _UCPTRIE_CP_INDEX(trie, 0xffff, c))
Returns a trie value for a code point, with range checking.
Returns the trie error value if c is not in the range U+0000..U+10FFFF.
Definition at line 358 of file ucptrie.h.
◆ UCPTRIE_FAST_SUPP_GET
#define UCPTRIE_FAST_SUPP_GET(trie, dataAccess, c) dataAccess(trie, _UCPTRIE_SMALL_INDEX(trie, c))
Returns a trie value for a supplementary code point (U+10000..U+10FFFF), without range checking.
Definition at line 542 of file ucptrie.h.
◆ UCPTRIE_FAST_U16_NEXT
#define UCPTRIE_FAST_U16_NEXT(trie, dataAccess, src, limit, c, result)
UTF-16: Reads the next code point (UChar32 c, out), post-increments src, and gets a value from the trie.
Sets the trie error value if c is an unpaired surrogate.
Value:
UPRV_BLOCK_MACRO_BEGIN { \
    (c) = *(src)++; \
    int32_t __index; \
    if (!U16_IS_SURROGATE(c)) { \
        __index = _UCPTRIE_FAST_INDEX(trie, c); \
    } else { \
        uint16_t __c2; \
        if (U16_IS_SURROGATE_LEAD(c) && (src) != (limit) && U16_IS_TRAIL(__c2 = *(src))) { \
            ++(src); \
            (c) = U16_GET_SUPPLEMENTARY((c), __c2); \
            __index = _UCPTRIE_SMALL_INDEX(trie, c); \
        } else { \
            __index = (trie)->dataLength - UCPTRIE_ERROR_VALUE_NEG_DATA_OFFSET; \
        } \
    } \
    (result) = dataAccess(trie, __index); \
} UPRV_BLOCK_MACRO_END
Macros referenced in the value:
#define UPRV_BLOCK_MACRO_BEGIN
Defined as the "do" keyword by default.
#define UPRV_BLOCK_MACRO_END
Defined as "while (false)" by default.
#define U16_IS_SURROGATE(c)
Is this code unit a surrogate (U+d800..U+dfff)?
#define U16_IS_SURROGATE_LEAD(c)
Assuming c is a surrogate code point (U16_IS_SURROGATE(c)), is it a lead surrogate?
#define U16_IS_TRAIL(c)
Is this code unit a trail surrogate (U+dc00..U+dfff)?
#define U16_GET_SUPPLEMENTARY(lead, trail)
Get a supplementary code point value (U+10000..U+10ffff) from its lead and trail surrogates.
Definition at line 386 of file ucptrie.h.
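The surrogate handling described for UCPTRIE_FAST_U16_NEXT can be sketched without the trie lookup. The code below is a minimal stand-alone illustration, not ICU code: `is_surrogate`, `is_surrogate_lead`, `is_trail`, `next_code_point` and `decode_first` are hypothetical names, and the bit tests are assumed to be equivalent to the U16_* helpers listed above. A return value of 0 marks the unpaired-surrogate case, where the macro would deliver the trie error value.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the U16_* helpers (hypothetical names). */
int is_surrogate(uint32_t c)      { return (c & 0xfffff800u) == 0xd800u; }
int is_surrogate_lead(uint32_t c) { return (c & 0x400u) == 0; } /* c must be a surrogate */
int is_trail(uint32_t c)          { return (c & 0xfffffc00u) == 0xdc00u; }

/*
 * Reads the next code point from [*src, limit) and post-increments *src.
 * Returns 1 for a well-formed code unit or surrogate pair,
 * 0 for an unpaired surrogate (*c then holds the lone surrogate).
 */
int next_code_point(const uint16_t **src, const uint16_t *limit, int32_t *c) {
    *c = *(*src)++;
    if (!is_surrogate((uint32_t)*c)) {
        return 1;
    }
    if (is_surrogate_lead((uint32_t)*c) && *src != limit && is_trail(**src)) {
        uint32_t trail = *(*src)++;
        /* Combine lead and trail into a supplementary code point. */
        *c = (int32_t)((((uint32_t)*c) << 10) + trail
                       - ((0xd800u << 10) + 0xdc00u - 0x10000u));
        return 1;
    }
    return 0;
}

/* Convenience wrapper: first code point of s[0..length), or -1 if unpaired. */
int32_t decode_first(const uint16_t *s, size_t length) {
    const uint16_t *p = s;
    int32_t c;
    return next_code_point(&p, s + length, &c) ? c : -1;
}
```

As in the macro, the caller must make sure that src is not already at limit before reading; only the trail unit is checked against limit.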
◆ UCPTRIE_FAST_U16_PREV
#define UCPTRIE_FAST_U16_PREV(trie, dataAccess, start, src, c, result)
UTF-16: Reads the previous code point (UChar32 c, out), pre-decrements src, and gets a value from the trie.
Sets the trie error value if c is an unpaired surrogate.
Value:
UPRV_BLOCK_MACRO_BEGIN { \
    (c) = *--(src); \
    int32_t __index; \
    if (!U16_IS_SURROGATE(c)) { \
        __index = _UCPTRIE_FAST_INDEX(trie, c); \
    } else { \
        uint16_t __c2; \
        if (U16_IS_SURROGATE_TRAIL(c) && (src) != (start) && U16_IS_LEAD(__c2 = *((src) - 1))) { \
            --(src); \
            (c) = U16_GET_SUPPLEMENTARY(__c2, (c)); \
            __index = _UCPTRIE_SMALL_INDEX(trie, c); \
        } else { \
            __index = (trie)->dataLength - UCPTRIE_ERROR_VALUE_NEG_DATA_OFFSET; \
        } \
    } \
    (result) = dataAccess(trie, __index); \
} UPRV_BLOCK_MACRO_END
Macros referenced in the value:
#define U16_IS_SURROGATE_TRAIL(c)
Assuming c is a surrogate code point (U16_IS_SURROGATE(c)), is it a trail surrogate?
#define U16_IS_LEAD(c)
Is this code unit a lead surrogate (U+d800..U+dbff)?
Definition at line 417 of file ucptrie.h.
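The multi-statement macros on this page are wrapped in UPRV_BLOCK_MACRO_BEGIN and UPRV_BLOCK_MACRO_END, described above as "do" and "while (false)" by default. The following stand-alone sketch shows why such a wrapper is useful. `DEMO_BLOCK_BEGIN`, `DEMO_BLOCK_END`, `DEMO_CLAMP_TO_BYTE` and `clamp_or_default` are hypothetical names, and plain C `while (0)` is used here in place of `while (false)`.

```c
#include <stdint.h>

/* Hypothetical stand-ins for UPRV_BLOCK_MACRO_BEGIN / UPRV_BLOCK_MACRO_END. */
#define DEMO_BLOCK_BEGIN do
#define DEMO_BLOCK_END while (0)

/* A multi-statement macro with a local variable, wrapped as one statement. */
#define DEMO_CLAMP_TO_BYTE(value, result) DEMO_BLOCK_BEGIN { \
    int32_t demo_tmp = (value); \
    if (demo_tmp < 0) { demo_tmp = 0; } \
    if (demo_tmp > 255) { demo_tmp = 255; } \
    (result) = demo_tmp; \
} DEMO_BLOCK_END

/* Returns value clamped to 0..255, or -1 when use_value is 0. */
int32_t clamp_or_default(int use_value, int32_t value) {
    int32_t out;
    if (use_value)
        DEMO_CLAMP_TO_BYTE(value, out);  /* expands to a single statement */
    else                                 /* so this else still pairs with the if */
        out = -1;
    return out;
}
```

With a bare `{ ... }` body instead of the do/while wrapper, the semicolon after the macro call would end the `if` statement and the `else` would fail to compile. The wrapper also keeps the macro's local variable out of the caller's scope.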
◆ UCPTRIE_FAST_U8_NEXT
#define UCPTRIE_FAST_U8_NEXT(trie, dataAccess, src, limit, result)
UTF-8: Post-increments src and gets a value from the trie.
Sets the trie error value for an ill-formed byte sequence.
Unlike UCPTRIE_FAST_U16_NEXT(), this UTF-8 macro does not provide the code point because it would be more work to do so and is often not needed. If the trie value differs from the error value, then the byte sequence is well-formed, and the code point can be assembled without revalidation.
Value:
UPRV_BLOCK_MACRO_BEGIN { \
    int32_t __lead = (uint8_t)*(src)++; \
    if (!U8_IS_SINGLE(__lead)) { \
        uint8_t __t1, __t2, __t3; \
        if ((src) != (limit) && \
            (__lead >= 0xe0 ? \
                __lead < 0xf0 ? \
                    U8_LEAD3_T1_BITS[__lead &= 0xf] & (1 << ((__t1 = *(src)) >> 5)) && \
                    ++(src) != (limit) && (__t2 = *(src) - 0x80) <= 0x3f && \
                    (__lead = ((int32_t)(trie)->index[(__lead << 6) + (__t1 & 0x3f)]) + __t2, 1) \
                : \
                    (__lead -= 0xf0) <= 4 && \
                    U8_LEAD4_T1_BITS[(__t1 = *(src)) >> 4] & (1 << __lead) && \
                    (__lead = (__lead << 6) | (__t1 & 0x3f), ++(src) != (limit)) && \
                    (__t2 = *(src) - 0x80) <= 0x3f && \
                    ++(src) != (limit) && (__t3 = *(src) - 0x80) <= 0x3f && \
                    (__lead = __lead >= (trie)->shifted12HighStart ? \
                        (trie)->dataLength - UCPTRIE_HIGH_VALUE_NEG_DATA_OFFSET : \
                        ucptrie_internalSmallU8Index((trie), __lead, __t2, __t3), 1) \
            : \
                __lead >= 0xc2 && (__t1 = *(src) - 0x80) <= 0x3f && \
                (__lead = (int32_t)(trie)->index[__lead & 0x1f] + __t1, 1))) { \
            ++(src); \
        } else { \
            __lead = (trie)->dataLength - UCPTRIE_ERROR_VALUE_NEG_DATA_OFFSET; \
        } \
    } \
    (result) = dataAccess(trie, __lead); \
} UPRV_BLOCK_MACRO_END
Macros referenced in the value:
#define U8_IS_SINGLE(c)
Does this code unit (byte) encode a code point by itself (US-ASCII 0..0x7f)?
#define U8_LEAD3_T1_BITS
Internal bit vector for 3-byte UTF-8 validity check, for use in U8_IS_VALID_LEAD3_AND_T1.
#define U8_LEAD4_T1_BITS
Internal bit vector for 4-byte UTF-8 validity check, for use in U8_IS_VALID_LEAD4_AND_T1.
Definition at line 451 of file ucptrie.h.
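The note that the code point "can be assembled without revalidation" can be illustrated with a short stand-alone sketch. `assemble_utf8` is a hypothetical helper, not an ICU function. It assumes the caller saved a pointer to the first byte of the sequence before invoking the macro (the macro advances src past the sequence) and has already confirmed that the trie value differs from the error value.

```c
#include <stdint.h>

/*
 * Assembles a code point from a UTF-8 byte sequence that is already
 * known to be well-formed. No validation is performed: ill-formed
 * input produces a meaningless result or reads past the sequence.
 */
int32_t assemble_utf8(const uint8_t *s) {
    uint8_t lead = s[0];
    if (lead < 0x80) {
        /* 1 byte: U+0000..U+007F */
        return lead;
    }
    if (lead < 0xe0) {
        /* 2 bytes: U+0080..U+07FF */
        return ((lead & 0x1f) << 6) | (s[1] & 0x3f);
    }
    if (lead < 0xf0) {
        /* 3 bytes: U+0800..U+FFFF */
        return ((lead & 0x0f) << 12) | ((s[1] & 0x3f) << 6) | (s[2] & 0x3f);
    }
    /* 4 bytes: U+10000..U+10FFFF */
    return ((lead & 0x07) << 18) | ((s[1] & 0x3f) << 12) |
           ((s[2] & 0x3f) << 6) | (s[3] & 0x3f);
}
```

Because well-formedness was already established by the lookup, the helper only masks and shifts; it never checks trail bytes or ranges.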
◆ UCPTRIE_FAST_U8_PREV
#define UCPTRIE_FAST_U8_PREV(trie, dataAccess, start, src, result)
UTF-8: Pre-decrements src and gets a value from the trie.
Sets the trie error value for an ill-formed byte sequence.
Unlike UCPTRIE_FAST_U16_PREV(), this UTF-8 macro does not provide the code point because it would be more work to do so and is often not needed. If the trie value differs from the error value, then the byte sequence is well-formed, and the code point can be assembled without revalidation.
Value:
UPRV_BLOCK_MACRO_BEGIN { \
    int32_t __index = (uint8_t)*--(src); \
    if (!U8_IS_SINGLE(__index)) { \
        __index = ucptrie_internalU8PrevIndex((trie), __index, (const uint8_t *)(start), \
                                              (const uint8_t *)(src)); \
        (src) -= __index & 7; \
        __index >>= 3; \
    } \
    (result) = dataAccess(trie, __index); \
} UPRV_BLOCK_MACRO_END
Definition at line 497 of file ucptrie.h.
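The two statements `(src) -= __index & 7;` and `__index >>= 3;` in the macro value suggest that the internal helper returns two pieces of information in one integer: a data index in the upper bits and, in the low three bits, the number of additional bytes to step back. The sketch below illustrates that packing convention in isolation. It is an inference from the macro body rather than a documented contract, and `pack_index`, `unpack_extra_bytes` and `unpack_data_index` are hypothetical names.

```c
#include <stdint.h>

/*
 * Packs a data index together with a small byte count (0..7).
 * Hypothetical helper illustrating the convention only.
 */
int32_t pack_index(int32_t dataIndex, int32_t extraBytes) {
    return (dataIndex << 3) | (extraBytes & 7);
}

/* Low three bits: how many more bytes the caller should step back. */
int32_t unpack_extra_bytes(int32_t packed) {
    return packed & 7;
}

/* Remaining bits: the index into the data array. */
int32_t unpack_data_index(int32_t packed) {
    return packed >> 3;
}
```

Returning one packed integer lets a macro obtain both results from a single function call without an extra output parameter.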
◆ UCPTRIE_SMALL_GET
#define UCPTRIE_SMALL_GET(trie, dataAccess, c) dataAccess(trie, _UCPTRIE_CP_INDEX(trie, UCPTRIE_SMALL_MAX, c))
Returns a trie value for a code point, with range checking.
Returns the trie error value if c is not in the range U+0000..U+10FFFF.
Definition at line 370 of file ucptrie.h.
◆ UCPTrieType
◆ UCPTrieValueWidth
◆ ucptrie_close()
Closes a trie and releases associated memory.
◆ ucptrie_get()
Returns the value for a code point as stored in the trie, with range checking.
Returns the trie error value if c is not in the range U+0000..U+10FFFF.
Easier to use than UCPTRIE_FAST_GET() and similar macros, but slower: unlike the macros, this function works on all UCPTrie objects, for all types and value widths.
◆ ucptrie_getRange()
U_CAPI UChar32 ucptrie_getRange(const UCPTrie *trie, UChar32 start, UCPMapRangeOption option, uint32_t surrogateValue, UCPMapValueFilter *filter, const void *context, uint32_t *pValue)
Returns the last code point such that all those from start to there have the same value.
Can be used to efficiently iterate over all same-value ranges in a trie. (This is normally faster than iterating over code points and calling ucptrie_get() for each one, but much slower than a data structure that stores ranges directly.)
If the UCPMapValueFilter function pointer is not NULL, then the value to be delivered is passed through that function, and the return value is the end of the range where all values are modified to the same actual value. The value is unchanged if that function pointer is NULL.
Example:
UChar32 start = 0, end;
uint32_t value;
while ((end = ucptrie_getRange(trie, start, UCPMAP_RANGE_NORMAL, 0,
                               NULL, NULL, &value)) >= 0) {
    // Work with the range start..end and its value.
    start = end + 1;
}
Referenced in the example:
UCPMAP_RANGE_NORMAL
ucpmap_getRange() enumerates all same-value ranges as stored in the map.
typedef int32_t UChar32
Define UChar32 as a type for single Unicode code points.
#define NULL
Define NULL if necessary, to nullptr for C++ and to ((void *)0) for C.
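The range semantics and the effect of a value filter can be tried out on a plain array. The sketch below is a toy model and does not call ICU: `toy_get_range`, `ToyValueFilter`, `low_bit`, `toy_count_ranges` and `toy_values` are hypothetical, the filter signature is only modeled on the description above, and returning -1 for an out-of-range start is an assumption chosen so that the `>= 0` loop from the example terminates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical filter type: maps a stored value to the value to deliver. */
typedef uint32_t ToyValueFilter(const void *context, uint32_t value);

/*
 * Returns the last index such that all entries from start to there have
 * the same (filtered) value, or -1 if start is out of range.
 */
int32_t toy_get_range(const uint32_t *values, int32_t length, int32_t start,
                      ToyValueFilter *filter, const void *context,
                      uint32_t *pValue) {
    if (start < 0 || start >= length) {
        return -1;
    }
    uint32_t value = values[start];
    if (filter != NULL) {
        value = filter(context, value);
    }
    int32_t end = start;
    while (end + 1 < length) {
        uint32_t next = values[end + 1];
        if (filter != NULL) {
            next = filter(context, next);
        }
        if (next != value) {
            break;
        }
        ++end;
    }
    if (pValue != NULL) {
        *pValue = value;
    }
    return end;
}

/* Example filter: keeps only the low bit, so ranges merge by parity. */
uint32_t low_bit(const void *context, uint32_t value) {
    (void)context;
    return value & 1u;
}

/* Counts ranges using the same loop shape as the example above. */
int32_t toy_count_ranges(const uint32_t *values, int32_t length,
                         ToyValueFilter *filter) {
    int32_t start = 0, end, count = 0;
    uint32_t value;
    while ((end = toy_get_range(values, length, start, filter, NULL, &value)) >= 0) {
        ++count;
        start = end + 1;
    }
    return count;
}

/* Sample data: four same-value ranges unfiltered, two ranges by parity. */
static const uint32_t toy_values[6] = { 5, 5, 7, 9, 2, 2 };
```

With the filter in place, neighboring ranges whose values map to the same result are reported as one longer range, which matches the description of the return value as the end of the range after filtering.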
◆ ucptrie_openFromBinary()
Opens a trie from its binary form, stored in 32-bit-aligned memory.
Inverse of ucptrie_toBinary().
The memory must remain valid and unchanged as long as the trie is used. You must call ucptrie_close() on the trie once you are done using it.
◆ ucptrie_toBinary()
Writes a memory-mappable form of the trie into 32-bit-aligned memory.
Inverse of ucptrie_openFromBinary().