In #7834 we resolved to relax the Nesting syntax to basically "`&` is required to indicate a descendant selector only when the selector starts with an ident" (e.g. in `& h1` the `&` is necessary, but `& .foo` can become just `.foo` and `& > h1` can be just `> h1`).
I opened this issue to brainstorm about ways to relax the syntax even further and do away with the `&` for all descendants, without introducing infinite lookahead, so we can have our 🍰 and eat it too.
If we do not require a `&` before descendant element selectors, then when followed by a pseudo-class they can look like declarations (which are `<ident>` `:` and then anything goes, including another `<ident>`) to the parser. The parser cannot know if it's dealing with a declaration or a nested rule until it sees a `;` or `{`, which could come after an arbitrary number of tokens, hence unbounded lookahead.
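To make the ambiguity concrete, here is a minimal Python sketch. The tokenizer is a deliberately crude stand-in (a hypothetical regex, not the real CSS tokenizer), and `classify` is an illustrative helper name, not anything from a spec or an engine. It shows that a declaration and a nested rule can share the same leading tokens, and that only the terminating `;` or `{` tells them apart, however far away it is.

```python
import re

# Crude stand-in for a CSS tokenizer (an assumption for illustration only):
# an identifier, or any single non-space character as its own token.
TOKEN = re.compile(r"[A-Za-z_-][\w-]*|\S")

def classify(source):
    """Scan until ';' or '{' and report what the construct turned out to be,
    plus how many tokens had to be examined before we could tell."""
    tokens = TOKEN.findall(source)
    for count, token in enumerate(tokens, start=1):
        if token == ";":
            return "declaration", count
        if token == "{":
            return "rule", count
    return "declaration", len(tokens)  # end of input closes a declaration

print(classify("color:hover;"))                   # ('declaration', 4)
print(classify("strong:hover {"))                 # ('rule', 4)
print(classify("a:not(.b):hover span:focus {"))   # ('rule', 13)
```

The first two inputs have the same token shape (`<ident>`, `:`, `<ident>`) up to the final token, and the third shows that the deciding token can sit arbitrarily far from the start.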
Requiring whitespace after `:` for declarations is not an option: minifiers currently remove it, so there is a lot of code out there with declarations that have no whitespace after the `:`.
Nor can we tell the two apart by name: `font` is both a valid property name and an HTML element, and with custom elements any valid property name can also be an element selector.
One way this problem is bounded is that there are only two distinct possibilities: either you have a declaration, or a selector.
In CSS, tokenization is context-less, i.e. parsing a declaration or a rule produces the same tokens; only the higher-level structures are different.
Assuming parsing a declaration takes O(M) time and parsing a rule takes O(N) time, it would theoretically solve the problem to naively parse every rule-or-declaration twice (once as a declaration, once as a rule), and then throw away the structure we don't need. Clearly, that's a silly idea, because it would take O(M+N) time for every rule-or-declaration.
One optimization would be to parse as a declaration (there are far more declarations than nested rules), and keep the list of raw tokens around until the `;` or `{`. Then declarations continue to be parsed in O(M) time, and rules are parsed in O(M+N) time. The extra space needed is minimal, since we don't need to keep these tokens around after the current structure is parsed.
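A rough sketch of that optimization in Python might look like the following. Everything here is an assumption for illustration: the regex tokenizer is a toy, `parse_declaration_or_rule` is a hypothetical name, and "re-parsing as a rule" is reduced to relabelling the buffered tokens. The driver loop is also flat and does not track nesting depth.

```python
import re

TOKEN = re.compile(r"[A-Za-z_-][\w-]*|\S")  # toy tokenizer, illustration only

def parse_declaration_or_rule(tokens, start):
    """Optimistically consume tokens as a declaration while buffering them.
    If '{' shows up before ';', reinterpret the buffered tokens as a selector.
    Returns (kind, buffered_tokens, index_after_terminator)."""
    buffered = []
    i = start
    while i < len(tokens):
        token = tokens[i]
        i += 1
        if token == ";":
            # Common case: it really was a declaration, one pass was enough.
            return "declaration", buffered, i
        if token == "{":
            # Rare case: the same tokens go to the selector parser
            # (represented here by simply relabelling them).
            return "rule-prelude", buffered, i
        buffered.append(token)
    return "declaration", buffered, i

tokens = TOKEN.findall("color:red; strong:hover { margin:0; }")
pos, results = 0, []
while pos < len(tokens) and tokens[pos] != "}":
    kind, buffered, pos = parse_declaration_or_rule(tokens, pos)
    results.append((kind, "".join(buffered)))  # buffer is dropped after each item
print(results)
```

The buffer only lives for the duration of one declaration-or-rule, which is why the extra space stays small.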
But also, as discussed in #7834, we can rule out the possibility of being a declaration very early for nearly every selector. The only exception is element selectors followed by a pseudo-class (e.g. `strong:hover`) which are fairly rare in nested stylesheets (you usually want to style the base selector as well, so it's usually `& { /* ... */ &:hover {...} }`).
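The early rule-out can be sketched as a two-token check. This is a simplified illustration, not how any engine does it: the regexes are toy approximations of CSS identifiers, whitespace is ignored, and `may_be_declaration` is a hypothetical helper name.

```python
import re

TOKEN = re.compile(r"[A-Za-z_-][\w-]*|\S")  # toy tokenizer, illustration only
IDENT = re.compile(r"[A-Za-z_-][\w-]*")     # toy approximation of <ident>

def may_be_declaration(source):
    """Return True only if the first two tokens are <ident> ':'.
    Anything else cannot be a declaration, so it can go straight
    to the selector parser with no buffering at all."""
    tokens = TOKEN.findall(source)
    return (
        len(tokens) >= 2
        and IDENT.fullmatch(tokens[0]) is not None
        and tokens[1] == ":"
    )

for prelude in [".foo", "> h1", "h1 span", "strong:hover", "color:red"]:
    print(prelude, may_be_declaration(prelude))
```

Only the last two inputs come back `True`, so only those would need the buffered fallback. Selectors starting with `.`, `>`, or an ident followed by something other than `:` are decided after at most two tokens.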
So in the end, declarations still take O(M) time, nearly all rules still take O(N) time, and a small minority of rules take O(M + N) time.
And there's probably more room for optimizations.
I'd love to hear from implementers whether this is feasible, and whether I'm missing something.
Edit: Another restriction of going that way is that we'd need to forbid `{}` from property values, including custom properties (having it in strings is fine). But usage of that in the wild seems very low (and a lot of that includes the braces in strings, which will always be fine). Note we can reserve `property: {` for future syntax, since pseudo-classes cannot be empty so there's no ambiguity there.