Stay organized with collections Save and categorize content based on your preferences.
GoogleSQL for Spanner supports collation. You can learn more about collation in this topic.
About collationCollation determines how strings are sorted and compared in an ORDER BY
operation. If you would like to use custom collation in the operation, you can include the COLLATE
clause with a collation specification.
In the ORDER BY
clause, you can specify a collation specification for a collation-supported column. This overrides any collation specifications set previously.
For example:
SELECT Place
FROM Locations
ORDER BY Place COLLATE "und:ci"
Query statements Collation specification details
A collation specification determines how strings are sorted and compared in collation-supported operations. You can define a collation specification for collation-supported types. These types of collation specifications are available:
If a collation specification isn't defined, the default collation specification is used. To learn more, see the next section.
Default collation specificationWhen a collation specification isn't assigned or is empty, the ordering behavior is identical to 'unicode'
collation, which you can learn about in the Unicode collation specification.
collation_specification:
'language_tag[:collation_attribute]'
A unicode collation specification indicates that the operation should use the Unicode Collation Algorithm to sort and compare strings. The collation specification can be a STRING
literal or a query parameter.
The language tag determines how strings are generally sorted and compared. Allowed values for language_tag
are:
en_US
. These names are defined by the Common Locale Data Repository (CLDR).und
: A locale string representing the undetermined locale. und
is a special language tag defined in the IANA language subtag registry and used to indicate an undetermined locale. This is also known as the root locale and can be considered the default Unicode collation. It defines a reasonable, locale agnostic collation. It differs significantly from unicode
.unicode
: Returns data in Unicode code point order, which is identical to the ordering behavior when COLLATE
isn't used. The sort order will look largely arbitrary to human users.In addition to the language tag, the unicode collation specification can have an optional collation_attribute
, which enables additional rules for sorting and comparing strings. Allowed values are:
ci
: Collation is case-insensitive.cs
: Collation is case-sensitive. By default, collation_attribute
is implicitly cs
.If you're using the unicode
language tag with a collation attribute, these caveats apply:
unicode:cs
is identical to unicode
.unicode:ci
is identical to und:ci
. It's recommended to migrate unicode:ci
to binary
.This is what the ci
collation attribute looks like when used with the und
language tag in the ORDER BY
clause:
SELECT Place
FROM Locations
ORDER BY Place COLLATE 'und:ci'
Caveats
Differing strings can be considered equal. For instance, ẞ
(LATIN CAPITAL LETTER SHARP S) is considered equal to 'SS'
in some contexts. The following expressions both evaluate to TRUE
:
COLLATE('ẞ', 'und:ci') > COLLATE('SS', 'und:ci')
COLLATE('ẞ1', 'und:ci') < COLLATE('SS2', 'und:ci')
This is similar to how case insensitivity works.
In search operations, strings with different lengths could be considered equal. To ensure consistency, collation should be used without search tailoring.
There are a wide range of unicode code points (punctuation, symbols, etc), that are treated as if they aren't there. So strings with and without them are sorted identically. For example, the format control code point U+2060
is ignored when the following strings are sorted:
SELECT *
FROM UNNEST([
'oran\u2060ge1',
'\u2060orange2',
'orange3'
]) AS fruit
ORDER BY fruit COLLATE 'und'
/*---------*
| fruit |
+---------+
| orange1 |
| orange2 |
| orange3 |
*---------*/
Ordering may change. The Unicode specification of the und
collation can change occasionally, which can affect sorting order. If you need a stable sort order that's guaranteed to never change, use unicode
collation.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-07 UTC.
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-08-07 UTC."],[],[]]
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4