Showing content from https://www.mail-archive.com/html5lib-discuss@googlegroups.com/msg00266.html below:

Re: Lots of fails/exceptions

On Wed, Dec 24, 2008 at 10:11 PM, Edward Z. Yang
<edwardzy...@thewritingpot.com> wrote:
>
> Philip Taylor wrote:
>> The case you mentioned is:
>>
>> {"description":"<!DOCTYPEa SYSTEM\"!",
>> "input":"<!DOCTYPEa SYSTEM\"!",
>> "output":["ParseError", "ParseError", ["DOCTYPE", "a", null, "!", false]]},
>>
>> so it's expecting a doctype with correctness=false (i.e.
>> force-quirks=true), which is sensible because it's got an EOF while in
>> "DOCTYPE system identifier (double-quoted) state" which makes it "Set
>> the DOCTYPE token's force-quirks flag to on". So I don't see a problem
>> here yet...
>
> The test-case I'm looking at is:
>
> {"description":"<!DOCTYPE a SYSTEM''!",
> "input":"<!DOCTYPE a SYSTEM''!",
> "output":["ParseError", ["DOCTYPE", "a", null, "", true]]},

Oops, sorry - I'll blame Gmail for using a font that makes '' look
identical to ".

In that case, the final ! arrives in the "After DOCTYPE system
identifier state" and matches "Anything else", so the tokenizer does
"Parse error. Switch to the bogus DOCTYPE state. (This does /not/ set
the DOCTYPE token's force-quirks flag to on.)". The EOF is then
consumed in the bogus DOCTYPE state, and the token is emitted without
force-quirks ever being set. So I still don't see a problem :-)
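
The difference between the two inputs can be sketched in Python. This
is a minimal sketch, not html5lib's tokenizer: the function name
tokenize_system_id and the state names are hypothetical, the input is
the text that follows the SYSTEM keyword, and only the transitions
discussed in this thread are taken from the spec text quoted above.
The remaining transitions are assumptions and are marked as such in
the comments.

```python
EOF = None  # sentinel appended after the last input character
WHITESPACE = "\t\n\f "


def tokenize_system_id(rest):
    """Tokenize the text that follows the SYSTEM keyword of a DOCTYPE.

    Returns (parse_errors, system_id, correct), where `correct` is the
    inverse of the force-quirks flag, as in the test format.
    """
    chars = list(rest) + [EOF]
    state = "before_system_id"
    quote = None          # which quote character opened the identifier
    system_id = None
    force_quirks = False
    errors = 0

    for ch in chars:
        if state == "before_system_id":
            if ch is not EOF and ch in WHITESPACE:
                continue
            if ch is not EOF and ch in "\"'":
                system_id = ""
                quote = ch
                state = "system_id_quoted"
            else:
                # Assumption: EOF, ">" or any other character here is a
                # parse error that turns force-quirks on.
                errors += 1
                force_quirks = True
                if ch is EOF or ch == ">":
                    break
                state = "bogus"
        elif state == "system_id_quoted":
            if ch is EOF:
                # Quoted in the thread: EOF inside the identifier sets
                # the force-quirks flag to on.
                errors += 1
                force_quirks = True
                break
            if ch == quote:
                state = "after_system_id"
            else:
                # Simplification: ">" is not special-cased here.
                system_id += ch
        elif state == "after_system_id":
            if ch is EOF:
                # Assumption: EOF here also turns force-quirks on.
                errors += 1
                force_quirks = True
                break
            if ch in WHITESPACE:
                continue
            if ch == ">":
                break
            # Quoted in the thread: parse error, switch to the bogus
            # DOCTYPE state, force-quirks is NOT set.
            errors += 1
            state = "bogus"
        else:  # bogus DOCTYPE state
            if ch is EOF or ch == ">":
                break  # emit the token as it stands

    return errors, system_id, not force_quirks


print(tokenize_system_id("''!"))  # single-quoted, closed, then "!"
print(tokenize_system_id('"!'))   # double-quoted, never closed
```

Under these assumptions the first call reports one parse error, an
empty system identifier and correctness true, while the second reports
one parse error, the identifier "!" and correctness false. The second
"ParseError" in the other test case comes from the missing space after
DOCTYPE, which this sketch does not cover.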

My (non-html5lib) implementation says it's doing:

DataState
Matched: (char == 60 && content-model-flag == PCDATA)
Actions: switch-to-state(TagOpenState)

TagOpenState
Matched: (content-model-flag == PCDATA && char == 33)
Actions: switch-to-state(MarkupDeclarationOpenState)

MarkupDeclarationOpenState
Matched: (! (next-chars-are("--")) && next-chars-are("DOCTYPE"))
Actions: consume-character; consume-character; consume-character;
consume-character; consume-character; consume-character;
switch-to-state(DoctypeState)

DoctypeState
Matched: (((char == 9 || char == 10) || char == 12) || char == 32)
Actions: switch-to-state(BeforeDoctypeNameState)

BeforeDoctypeNameState
Matched: anything-else
Actions: create-doctype-token; append-char-to-doctype-name;
switch-to-state(DoctypeNameState)

DoctypeNameState
Matched: (((char == 9 || char == 10) || char == 12) || char == 32)
Actions: switch-to-state(AfterDoctypeNameState)

AfterDoctypeNameState
Matched: next-chars-are("SYSTEM")
Actions: consume-character; consume-character; consume-character;
consume-character; consume-character;
switch-to-state(BeforeDoctypeSystemIdentifierState)

BeforeDoctypeSystemIdentifierState
Matched: char == 39
Actions: set-doctype-sysid-empty-string;
switch-to-state(DoctypeSystemIdentifierSingleQuotedState)

DoctypeSystemIdentifierSingleQuotedState
Matched: char == 39
Actions: switch-to-state(AfterDoctypeSystemIdentifierState)

AfterDoctypeSystemIdentifierState
Matched: anything-else
Actions: parse-error; switch-to-state(BogusDoctypeState)

BogusDoctypeState
Matched: EOF
Actions: emit-doctype-token; unconsume-character; switch-to-state(DataState)

DataState
Matched: EOF
Actions: emit-eof-token

Output: "ParseError", ["DOCTYPE", "a", null, "", true]
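
For comparison with the test file, the output above can be checked
against the quoted test case in Python. This is a small sketch: the
`actual` list is transcribed by hand from the trace, and the helper
name summarize is hypothetical. The field order of the DOCTYPE token
(name, public identifier, system identifier, correctness) is inferred
from the test cases quoted in this thread.

```python
import json

# The test case quoted earlier in this message, as JSON text.
test_json = """{"description":"<!DOCTYPE a SYSTEM''!",
"input":"<!DOCTYPE a SYSTEM''!",
"output":["ParseError", ["DOCTYPE", "a", null, "", true]]}"""

test = json.loads(test_json)

# Output of the trace, transcribed by hand (JSON null/true become
# Python None/True).
actual = ["ParseError", ["DOCTYPE", "a", None, "", True]]


def summarize(output):
    """Split a test-format output list into (parse_error_count, tokens)."""
    errors = sum(1 for item in output if item == "ParseError")
    tokens = [item for item in output if item != "ParseError"]
    return errors, tokens


errors, tokens = summarize(actual)
kind, name, public_id, system_id, correct = tokens[0]

print("matches expected output:", actual == test["output"])
print("parse errors:", errors)
print("force-quirks:", not correct)
```

The last field is the correctness flag, so true means that force-quirks
was never set.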

-- 
Philip Taylor
exc...@gmail.com
