The character type Char is an enumeration whose values represent Unicode (or equivalently ISO/IEC 10646) characters (see http://www.unicode.org/ for details). This set extends the ISO 8859-1 (Latin-1) character set (the first 256 characters), which is itself an extension of the ASCII character set (the first 128 characters). A character literal in Haskell has type Char.
To convert a Char to or from the corresponding Int value defined by Unicode, use fromEnum and toEnum from the Enum class respectively (or equivalently ord and chr).
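As a small illustration of these conversions, the sketch below converts in both directions with ord and chr and with the Enum methods. The helper name roundTrips is hypothetical and only serves the example.

```haskell
import Data.Char (chr, ord)

-- Hypothetical helper: converting to a code point and back
-- should give the original character.
roundTrips :: Char -> Bool
roundTrips c = chr (ord c) == c && toEnum (fromEnum c) == c

main :: IO ()
main = do
  print (ord 'A')            -- 65
  print (fromEnum 'A')       -- 65
  print (chr 97)             -- 'a'
  print (toEnum 97 :: Char)  -- 'a'
  print (all roundTrips "Haskell")
```

Note that chr and toEnum are partial: an Int outside the valid code point range raises an error.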
Unicode characters are divided into letters, numbers, marks, punctuation, symbols, separators (including spaces) and others (including control characters).
isControl :: Char -> Bool
Selects control characters, which are the non-printing characters of the Latin-1 subset of Unicode.
isUpper :: Char -> Bool
Selects upper-case or title-case alphabetic Unicode characters (letters). Title case is used by a small number of letter ligatures like the single-character form of Lj.
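The following sketch checks isUpper from Data.Char against an upper-case letter, a lower-case letter, and the title-case ligature U+01C5 (the single-character form of Dž). The code point is written as an escape to avoid depending on source-file encoding.

```haskell
import Data.Char (isUpper)

-- U+01C5 LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON (title case).
titleCaseDz :: Char
titleCaseDz = '\x01C5'

main :: IO ()
main = do
  print (isUpper 'A')          -- True
  print (isUpper 'a')          -- False
  print (isUpper titleCaseDz)  -- True: title-case letters are selected too
```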
isAlpha :: Char -> Bool
Selects alphabetic Unicode characters (lower-case, upper-case and title-case letters, plus letters of caseless scripts and modifier letters). This function is equivalent to isLetter.
isAlphaNum :: Char -> Bool
Selects alphabetic or numeric digit Unicode characters.
Note that numeric digits outside the ASCII range are selected by this function but not by isDigit. Such digits may be part of identifiers but are not used by the printer and reader to represent numbers.
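To make the difference between isAlphaNum and isDigit concrete, the sketch below compares an ASCII digit with U+0663 (ARABIC-INDIC DIGIT THREE), chosen here only as an example of a non-ASCII decimal digit.

```haskell
import Data.Char (isAlphaNum, isDigit)

-- U+0663 ARABIC-INDIC DIGIT THREE, a decimal digit outside ASCII.
arabicIndicThree :: Char
arabicIndicThree = '\x0663'

main :: IO ()
main = do
  print (isDigit '3', isAlphaNum '3')    -- (True,True)
  print (isDigit arabicIndicThree)       -- False
  print (isAlphaNum arabicIndicThree)    -- True
```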
isPrint :: Char -> Bool
Selects printable Unicode characters (letters, numbers, marks, punctuation, symbols and spaces).
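One possible use of isPrint from Data.Char is dropping non-printing characters from untrusted text. The function name stripNonPrintable is hypothetical, and this is a minimal sketch rather than a complete sanitiser.

```haskell
import Data.Char (isControl, isPrint)

-- Hypothetical helper: keep only printable characters.
stripNonPrintable :: String -> String
stripNonPrintable = filter isPrint

main :: IO ()
main = do
  print (isPrint 'a', isPrint ' ')        -- (True,True)
  print (isPrint '\n', isControl '\n')    -- (False,True)
  putStrLn (stripNonPrintable "ab\n\tcd\DEL!")  -- prints abcd!
```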
isLetter :: Char -> Bool
Selects alphabetic Unicode characters (lower-case, upper-case and title-case letters, plus letters of caseless scripts and modifier letters). This function is equivalent to isAlpha.
isMark :: Char -> Bool
Selects Unicode mark characters, e.g. accents and the like, which combine with preceding letters.
isNumber :: Char -> Bool
Selects Unicode numeric characters, including digits from various scripts, Roman numerals, etc.
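The sketch below contrasts isNumber with isDigit on a Roman numeral (U+2167) and a vulgar fraction (U+00BD). Both code points are picked purely as examples of numeric characters that are not ASCII digits.

```haskell
import Data.Char (isDigit, isNumber)

romanEight, oneHalf :: Char
romanEight = '\x2167'  -- ROMAN NUMERAL EIGHT
oneHalf    = '\x00BD'  -- VULGAR FRACTION ONE HALF

main :: IO ()
main = do
  print (isNumber '7', isDigit '7')                  -- (True,True)
  print (isNumber romanEight, isDigit romanEight)    -- (True,False)
  print (isNumber oneHalf, isDigit oneHalf)          -- (True,False)
```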
isPunctuation :: Char -> Bool
Selects Unicode punctuation characters, including various kinds of connectors, brackets and quotes.
isSymbol :: Char -> Bool
Selects Unicode symbol characters, including mathematical and currency symbols.
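The division of Unicode characters into letters, marks, numbers, punctuation and symbols can be sketched with the predicates above. The function classify and its string labels are hypothetical; anything not matched by the five predicates is lumped together as a separator or other character.

```haskell
import Data.Char (isLetter, isMark, isNumber, isPunctuation, isSymbol)

-- Hypothetical helper mapping a character to a coarse category name.
classify :: Char -> String
classify c
  | isLetter c      = "letter"
  | isMark c        = "mark"
  | isNumber c      = "number"
  | isPunctuation c = "punctuation"
  | isSymbol c      = "symbol"
  | otherwise       = "separator or other"

main :: IO ()
main = mapM_ (putStrLn . classify) ['a', '\x0301', '5', '!', '+', ' ']
-- prints: letter, mark, number, punctuation, symbol, separator or other
-- (one per line; '\x0301' is COMBINING ACUTE ACCENT)
```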
Subranges
isAscii :: Char -> Bool
Selects the first 128 characters of the Unicode character set, corresponding to the ASCII character set.
isLatin1 :: Char -> Bool
Selects the first 256 characters of the Unicode character set, corresponding to the ISO 8859-1 (Latin-1) character set.
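A quick check of the two subranges, using isAscii and isLatin1 from Data.Char on characters at and around the boundaries. The helper asciiOnly is hypothetical.

```haskell
import Data.Char (isAscii, isLatin1)

-- Hypothetical helper: True when every character is ASCII.
asciiOnly :: String -> Bool
asciiOnly = all isAscii

main :: IO ()
main = do
  print (isAscii '\x7F', isAscii '\x80')      -- (True,False)
  print (isLatin1 '\xFF', isLatin1 '\x100')   -- (True,False)
  print (isAscii '\xE9', isLatin1 '\xE9')     -- (False,True): U+00E9 is Latin-1 only
  print (asciiOnly "plain text")              -- True
```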
Unicode general categories
data GeneralCategory
Unicode General Categories (column 2 of the UnicodeData table) in the order they are listed in the Unicode standard.
Constructors
UppercaseLetter       Lu: Letter, Uppercase
LowercaseLetter       Ll: Letter, Lowercase
TitlecaseLetter       Lt: Letter, Titlecase
ModifierLetter        Lm: Letter, Modifier
OtherLetter           Lo: Letter, Other
NonSpacingMark        Mn: Mark, Non-Spacing
SpacingCombiningMark  Mc: Mark, Spacing Combining
EnclosingMark         Me: Mark, Enclosing
DecimalNumber         Nd: Number, Decimal
LetterNumber          Nl: Number, Letter
OtherNumber           No: Number, Other
ConnectorPunctuation  Pc: Punctuation, Connector
DashPunctuation       Pd: Punctuation, Dash
OpenPunctuation       Ps: Punctuation, Open
ClosePunctuation      Pe: Punctuation, Close
InitialQuote          Pi: Punctuation, Initial quote
FinalQuote            Pf: Punctuation, Final quote
OtherPunctuation      Po: Punctuation, Other
MathSymbol            Sm: Symbol, Math
CurrencySymbol        Sc: Symbol, Currency
ModifierSymbol        Sk: Symbol, Modifier
OtherSymbol           So: Symbol, Other
Space                 Zs: Separator, Space
LineSeparator         Zl: Separator, Line
ParagraphSeparator    Zp: Separator, Paragraph
Control               Cc: Other, Control
Format                Cf: Other, Format
Surrogate             Cs: Other, Surrogate
PrivateUse            Co: Other, Private Use
NotAssigned           Cn: Other, Not Assigned
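The constructors above can be inspected with generalCategory :: Char -> GeneralCategory, which Data.Char exports but which is not described in this section. The sketch below looks up a few ASCII characters; the helper isOpenOrClose is hypothetical.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- Hypothetical helper: True for opening or closing brackets.
isOpenOrClose :: Char -> Bool
isOpenOrClose c =
  generalCategory c `elem` [OpenPunctuation, ClosePunctuation]

main :: IO ()
main = do
  print (generalCategory 'a')   -- LowercaseLetter
  print (generalCategory 'A')   -- UppercaseLetter
  print (generalCategory '0')   -- DecimalNumber
  print (generalCategory '_')   -- ConnectorPunctuation
  print (generalCategory '$')   -- CurrencySymbol
  print (generalCategory ' ')   -- Space
  print (generalCategory '\n')  -- Control
  print (filter isOpenOrClose "f(x) + [y]")  -- "()[]"
```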
Case conversion
toUpper :: Char -> Char
Convert a letter to the corresponding upper-case letter, if any. Any other character is returned unchanged.
toLower :: Char -> Char
Convert a letter to the corresponding lower-case letter, if any. Any other character is returned unchanged.
toTitle :: Char -> Char
Convert a letter to the corresponding title-case or upper-case letter, if any. (Title case differs from upper case only for a small number of ligature letters.) Any other character is returned unchanged.
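The case conversion functions toUpper, toLower and toTitle from Data.Char work one character at a time. The sketch below shows them on ASCII text and on the ligature U+01C6, where title case and upper case differ. The helper capitalise is hypothetical.

```haskell
import Data.Char (toLower, toTitle, toUpper)

-- Hypothetical helper: title-case the first character, lower-case the rest.
capitalise :: String -> String
capitalise ""       = ""
capitalise (c : cs) = toTitle c : map toLower cs

main :: IO ()
main = do
  putStrLn (map toUpper "hello, world 42")  -- prints HELLO, WORLD 42
  putStrLn (capitalise "hASKELL")           -- prints Haskell
  -- U+01C6 (lower-case dz-caron ligature) has distinct upper and title forms.
  print (toUpper '\x01C6' == '\x01C4')      -- True
  print (toTitle '\x01C6' == '\x01C5')      -- True
```

Because these functions map single characters to single characters, they cannot express case mappings that change the length of a string.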
Single digit characters
digitToInt :: Char -> Int
Convert a single digit Char to the corresponding Int. This function fails unless its argument satisfies isHexDigit, but recognises both upper- and lower-case hexadecimal digits (i.e. '0'..'9', 'a'..'f', 'A'..'F').
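Since digitToInt is partial, one way to use it safely is to guard it with isHexDigit. The names safeDigitToInt and parseHex below are hypothetical, and the parser is a minimal sketch that ignores overflow.

```haskell
import Data.Char (digitToInt, isHexDigit)

-- Hypothetical total wrapper around digitToInt.
safeDigitToInt :: Char -> Maybe Int
safeDigitToInt c
  | isHexDigit c = Just (digitToInt c)
  | otherwise    = Nothing

-- Hypothetical parser for a non-empty string of hexadecimal digits.
parseHex :: String -> Maybe Int
parseHex "" = Nothing
parseHex s  = foldl step (Just 0) s
  where
    step acc c = do
      n <- acc
      d <- safeDigitToInt c
      pure (n * 16 + d)

main :: IO ()
main = do
  print (map digitToInt "7fA")  -- [7,15,10]
  print (parseHex "ff")         -- Just 255
  print (parseHex "1A")         -- Just 26
  print (parseHex "xyz")        -- Nothing
```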
intToDigit :: Int -> Char
Convert an Int in the range 0..15 to the corresponding single digit Char. This function fails on other inputs, and generates lower-case hexadecimal digits.
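As an illustration of intToDigit, the sketch below renders a non-negative Int in hexadecimal. The function toHex is hypothetical and assumes its argument is non-negative; a negative argument would reach intToDigit with an out-of-range value and fail.

```haskell
import Data.Char (intToDigit)

-- Hypothetical helper: hexadecimal rendering of a non-negative Int.
toHex :: Int -> String
toHex n
  | n < 16    = [intToDigit n]
  | otherwise = toHex (n `div` 16) ++ [intToDigit (n `mod` 16)]

main :: IO ()
main = do
  putStrLn (map intToDigit [0 .. 15])  -- prints 0123456789abcdef
  putStrLn (toHex 255)                 -- prints ff
  putStrLn (toHex 26)                  -- prints 1a
```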
showLitChar :: Char -> ShowS
Convert a character to a string using only printable characters, using Haskell source-language escape conventions. For example:
showLitChar '\n' s = "\\n" ++ s
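Because showLitChar returns a ShowS, it composes naturally with foldr to escape a whole string. The name escapeString is hypothetical; this sketch escapes characters one by one and does not add surrounding quotes.

```haskell
import Data.Char (showLitChar)

-- Hypothetical helper: escape every character of a string.
escapeString :: String -> String
escapeString = foldr showLitChar ""

main :: IO ()
main = do
  putStrLn (showLitChar '\n' "")       -- prints a backslash followed by n
  putStrLn (escapeString "tab\there")  -- prints tab\there
  putStrLn (escapeString "\233")       -- prints \233 (non-ASCII shown as a decimal escape)
```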
lexLitChar :: ReadS String
Read a string representation of a character, using Haskell source-language escape conventions. For example:
lexLitChar "\\nHello" = [("\\n", "Hello")]
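Applied repeatedly, lexLitChar can split an escaped string into its individual character lexemes without decoding them. The function lexemes is hypothetical, and this sketch simply stops at the first position where lexLitChar does not return exactly one result.

```haskell
import Data.Char (lexLitChar)

-- Hypothetical helper: split a string into character-literal lexemes.
lexemes :: String -> [String]
lexemes "" = []
lexemes s  = case lexLitChar s of
  [(tok, rest)] -> tok : lexemes rest
  _             -> []  -- stop on input that is not a valid character lexeme

main :: IO ()
main = do
  print (lexLitChar "\\nHello")  -- [("\\n","Hello")]
  print (lexemes "a\\nb")        -- ["a","\\n","b"]
```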
readLitChar :: ReadS Char
Read a string representation of a character, using Haskell source-language escape conventions, and convert it to the character that it encodes. For example:
readLitChar "\\nHello" = [('\n', "Hello")]
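A whole escaped string can be decoded by applying readLitChar until the input is exhausted. The function unescape is hypothetical; it is a minimal sketch that returns Nothing on any malformed escape and does not handle string-only syntax such as the empty escape or string gaps.

```haskell
import Data.Char (readLitChar)

-- Hypothetical helper: decode Haskell-style character escapes in a string.
unescape :: String -> Maybe String
unescape "" = Just ""
unescape s  = case readLitChar s of
  [(c, rest)] -> (c :) <$> unescape rest
  _           -> Nothing

main :: IO ()
main = do
  print (readLitChar "\\nHello")  -- [('\n',"Hello")]
  print (unescape "a\\tb")        -- Just "a\tb"
  print (unescape "bad\\q")       -- Nothing
```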