This document defines a generic, abbreviated syntax for expressing URIs. This syntax is intended to be used as a common element by language designers. Target languages include, but are not limited to, XML languages. The intended audience for this document is Language designers, not the users of those Languages.
Table of Contents 1. IntroductionThis section is informative.
More and more languages need a mechanism to permit the use of extensible name collections. These are primarily found in XML attribute values, but also found in other, similar spaces in non-XML languages (e.g., [SPARQL]). Typically such extension mechanisms utilize the concept of scoping, where names are created within a unique scope, and that scope's collection is managed by the group that defines it. Using such a mechanism allows independent organizations to define names without the risk of collision.
At the same time, language designers are trying to ensure that their languages mesh smoothly into the semantic web. Since the basis of the semantic web is the notion that meaning can be derived through the relationship among resources, these extension mechanisms need a ready way of mapping their scoped names to resources (via URIs).
In many cases, language designers are attempting to use QNames for this extension mechanism [XML-SCHEMA-QNAME]. QNames do permit independent management of the name collection, and can map the names to a resource. Unfortunately, QNames are unsuitable in most cases because 1) the use of QName as identifiers in attribute values and element content is problematic as discussed in [QNAMES], and 2) the syntax of QNames is overly-restrictive and does not allow all possible URIs to be expressed.
A specific example of the problem this causes comes from attempting to define the name collection for books. In a QName, the part after the colon must be a valid element name, making an example such as the following invalid: isbn:0321154991
This is not a valid QName simply because '0321154991' is not a valid element name. Yet, in the example given, we don't really want to define a valid element name anyway. The whole reason for using a QName was to reference an item in a private scope - that of ISBNs. Moreover, in this example, we want the names within that scope to a URI that will reveal the meaning of that ISBN. As you can see, the definition of QNames and this (relatively common) use case are in conflict with one another.
This specification addresses the problem by creating a new data type whose purpose is specifically to allow for the definition of scoped names that map to URIs in exactly this way. This type is called a "CURIE" or a "Compact URI". Syntactically, CURIEs are a superset of QNames.
Note that this specification is targeted at language designers, not document authors. Any language designer considering the use of QNames as a way to represent URIs or unique tokens should consider instead using CURIEs:
Although they are not currently called CURIEs, the technique described here is in widespread usage. However, taken literally, QNames would not support many of the examples that we would find 'in the wild' — the fact that they do is mainly because systems and authors take a very lax approach to QNames.
In other words, the principle used in QNames — that of combining a namespace name with a local part to generate a URI — is widely used, but little checking is done on the local part to ensure that the string is a valid element name. However, this does mean that CURIEs can be easily used in a number of places, since there is already a large amount of 'mind-share'.
2. UsageThis section is informative.
CURIEs can be used in exactly the same syntactic way QNames have been used in attribute values, with the modification that the format of the strings after the colon is looser. In all cases a parsed CURIE will produce an IRI. However, the process of evaluating involves replacing the CURIE with a concatenation of the value represented by the prefix and the part after the colon (the reference).
Note that if CURIEs are to be used in the context of scripting, accessing a CURIE via standard mechanisms such as the XML DOM will return the lexical form, not its value as IRI. In order to develop portable applications that evaluate CURIEs, a script author must transform CURIEs into their value as IRI before evaluating them (e.g., dereferencing the resulting IRI or comparing two CURIEs).
Also note that it is possible to define a CURIE prefix such that the resulting CURIEs in a document would resemble URIs (e.g., a prefix named 'mailto' would engender CURIEs that look like mailto:someone
but would expand to something else). Such CURIEs would only occur in contexts where a normal URI COULD NOT occur, but might still cause confusion for people reviewing the source of a document.
All of the following are valid CURIEs — even though they are not valid QNames — and they take advantage of the fact that the part after the colon no longer needs to conform to the rules for element names:
home:#start joseki: google:xforms+or+'xml+forms'2.2. Ambiguities Between CURIEs and URIs
CURIEs and SafeCURIEs map to IRIs, but neither a CURIE nor a Safe_CURIE is an IRI or URI. Accordingly, CURIEs and Safe_CURIEs MUST NOT be used as values for attributes or other content that are specified to contain only URIs, IRIs, URI-references, IRI-references, etc. Specifications for particular attribute values or other content MAY be written to allow either CURIEs or IRIs (or URIs, etc.). The specifications for such languages MUST provide rules for disambiguation in situations where the same string could be interpreted as either a CURIE or an IRI. One way to do this is to require that all CURIEs be expressed as Safe_CURIEs, implying that all unbracketed strings are to be interpreted directly as IRIs.
3. SyntaxThis section is normative.
The following definition makes the set of CURIEs a syntactic superset of the set of QNames. It is comprised of two components, a prefix and a reference. The prefix is separated from the reference by a colon (:
). It is possible to omit both the prefix and the colon, or to omit just the prefix and leave the colon. To disambiguate a CURIE when it appears in a context where a normal [URI] may also be used, the entire CURIE is permitted to be enclosed in brackets ([
, ]
).
safe_curie := '[' curie ']' curie := [ [ prefix ] ':' ] reference prefix := NCName reference := irelative-ref (as defined in IRI)
Note that while the empty string matches the production for curie above, an empty string is NOT a valid CURIE.
When CURIES are used in an XML-based host language, and that host language supports XML Namespaces, prefix values MUST be able to be defined using the 'xmlns:' syntax specified in [XMLNAMES]. Such host languages MAY also provide additional prefix mapping definition mechanisms.
The XML Namespaces specification states that prefix names are not permitted to begin with the characters 'xml' (see Leading XML). While this specification does not impose such a restriction, when CURIEs are used in an XML language this restriction is effectively inherited from XML Namespaces. If such a language defines an additional mechanism for defining prefixes, that mechanism SHOULD impose a similar restriction so there is no possibility of conflict between prefixes defined using the two mechanisms.
When CURIES are used in a non-XML host language, the host language MUST provide a mechanism for defining the mapping from the prefix
to an IRI.
A host language MAY interpret a reference value that is not preceded by a prefix and a colon as being a member of a host-language defined set of reserved values. Such reserved values MUST translate into an IRI, just as with any other CURIE.
A host language MAY declare a default prefix value, or MAY provide a mechanism for defining a default prefix value. This default prefix value MAY be different than the language's default namespace. In such a host language, when the prefix
is omitted from a CURIE, the default prefix value MUST be used. Conversely, if such a language does not define a default prefix value mechanism and does not define a set of reserved values, CURIEs MUST NOT be used without a leading prefix and colon.
Host languages that support XML Namespaces always have a default namespace. This default namespace MUST NOT be used as the default prefix for CURIEs because the default namespace in such a language can change at any time in a document, and such a change would mean that seemingly identical CURIEs might map to different IRIs.
CURIEs are an abbreviation for strings which are intended to represent IRIs (as defined by the IRI production in [IRI]), but checking that intent is not part of CURIE conformance. The intended IRI is constructed by concatenating the prefix binding with the reference part, if any. There MUST be a prefix binding for the prefix (or the default prefix, if the prefix is absent) in scope. The concatenation of the prefix value associated with a CURIE and its reference
MUST be an IRI (as defined by the IRI production in [IRI]).
The CURIE prefix '_' is reserved for use by languages that support RDF. For this reason, the prefix '_' SHOULD be avoided by authors.
Host languages MAY define additional constraints on these syntax rules when CURIES are used in the context of those host languages. Host languages MUST NOT relax the constraints defined this specification.
It is an error if a string required by a host language to be a CURIE or safe_curie fails to satisfy the constraints defined above. Rules for error reporting and/or recovery SHOULD be provided in the specification for the host language.
The safe_curie
production is for use in attribute values where it would be otherwise impossible to disambiguate between a CURIE and a URI. Host languages are NOT required to use safe_curie
s other than in such a context.
When revising a language that has historically permitted URIs in certain locations (e.g., as values of a specific attribute), to ensure backward compatibility, language designers SHOULD NOT permit CURIEs (or safe_curies) as the datatype in the corresponding location, but SHOULD provide a new mechanism (e.g., a new attribute).
4. Incorporating CURIEs into Host LanguagesThis section is informative.
CURIEs can be used in a variety of ways in host languages. This section shows a few simple examples.
4.1. SPARQLThe [SPARQL] language provides a PREFIX
keyword for defining the prefix used in their CURIE-like identifiers. SPARQL would need to constrain the reference portion of a CURIE to get identical syntax, but such a constraint is permitted by this specification.
PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?x ?name WHERE { ?x foaf:name ?name }4.2. XHTML+RDFa
The RDFa Syntax specification defines an extended version of XHTML 1.1 called XHTML+RDFa. Because XHTML+RDFa is an XML markup language, the CURIE prefixes are defined using the xmlns:
attribute.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd"> <html version="XHTML+RDFa 1.0" xmlns="http://www.w3.org/1999/xhtml" xmlns:dc="http://purl.org/dc/elements/1.1/"> <head version="XHTML+RDFa 1.0" profile="http://www.w3.org/1999/xhtml/vocab"> <title>An XHTML+RDFa document using CURIEs</title> </head> <body> <p rel="cite"> this content was written by <span property="dc:creator">some author</span> </p> </body> </html>4.3. XHTML 2
XHTML 2 includes the role
attribute. This attribute takes advantage of CURIEs to permit the easy definition of additional roles for the content of a page.
<html version="XHTML2" xmlns="http://www.w3.org/1999/xhtml" xmlns:MR="http://www.example.org/roles/myRoles#"> <head profile="http://www.w3.org/1999/xhtml/vocab"> <title>An XHTML 2 document using Role</title> </head> <body> <p role="MR:main">The main content</p> <p role="MR:music">Some musical support for the page</p> </body> </html>5. Conformance Requirements
This section is normative.
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
5.1. CURIE Processor ConformanceA conforming CURIE processor must support all of the features required in this specification.
A. CURIE DatatypesThis section is informative.
In order to facilitate the use of CURIEs in markup languages, this specification defines some additional datatypes in the XHTML datatype space (http://www.w3.org/1999/xhtml/datatypes/
). Markup languages that use XHTML Modularization [XHTMLMOD] can find these normative definitions in the Modularization support file "datatypes" for their schema grammar:
Specifically, the following datatypes are introduced:
The following informative XML Schema definition for these datatypes is included as an example:
<?xml version="1.0" encoding="UTF-8"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://www.w3.org/1999/xhtml/datatypes/" xmlns:xh11d="http://www.w3.org/1999/xhtml/datatypes/" targetNamespace="http://www.w3.org/1999/xhtml/datatypes/" elementFormDefault="qualified" > <xs:simpleType name="CURIE"> <xs:restriction base="xs:string"> <xs:pattern value="(([\i-[:]][\c-[:]]*)?:)?.+" /> <xs:minLength value="1"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="CURIEs"> <xs:list itemType="xh11d:CURIE"/> </xs:simpleType> <xs:simpleType name="SafeCURIE"> <xs:restriction base="xs:string"> <xs:pattern value="\[(([\i-[:]][\c-[:]]*)?:)?.+\]" /> <xs:minLength value="3"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="SafeCURIEs"> <xs:list itemType="xh11d:SafeCURIE"/> </xs:simpleType> <xs:simpleType name="URIorSafeCURIE"> <xs:union memberTypes="xs:anyURI xh11d:SafeCURIE" /> </xs:simpleType> <xs:simpleType name="URIorSafeCURIEs"> <xs:list itemType="xh11d:URIorSafeCURIE"/> </xs:simpleType> </xs:schema>A.2. XML DTD Definition
The following informative XML DTD definition for these datatypes is included as an example:
<!ENTITY % CURIE.datatype "CDATA" > <!ENTITY % CURIEs.datatype "CDATA" > <!ENTITY % SafeCURIE.datatype "CDATA" > <!ENTITY % SafeCURIEs.datatype "CDATA" > <!ENTITY % URIorSafeCURIE.datatype "CDATA" > <!ENTITY % URIorSafeCURIEs.datatype "CDATA" >B. References B.1. Required Specifications
This section is normative.
This section is informative.
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4