On Wed, Dec 24, 2008 at 10:11 PM, Edward Z. Yang
<edwardzy...@thewritingpot.com> wrote:
>
> Philip Taylor wrote:
>> The case you mentioned is:
>>
>> {"description":"<!DOCTYPEa SYSTEM\"!",
>> "input":"<!DOCTYPEa SYSTEM\"!",
>> "output":["ParseError", "ParseError", ["DOCTYPE", "a", null, "!", false]]},
>>
>> so it's expecting a doctype with correctness=false (i.e.
>> force-quirks=true), which is sensible because it's got an EOF while in
>> "DOCTYPE public identifier (double-quoted) state" which makes it "Set
>> the DOCTYPE token's force-quirks flag to on". So I don't see a problem
>> here yet...
>
> The test-case I'm looking at is:
>
> {"description":"<!DOCTYPE a SYSTEM''!",
> "input":"<!DOCTYPE a SYSTEM''!",
> "output":["ParseError", ["DOCTYPE", "a", null, "", true]]},

Oops, sorry - I'll blame Gmail for using a font that makes '' look
identical to ".

In that case, the final ! looks like it comes in "After DOCTYPE system
identifier state" and is "Anything else", so it does "Parse error.
Switch to the bogus DOCTYPE state. (This does /not/ set the DOCTYPE
token's force-quirks flag to on.)", and then the EOF is consumed in the
bogus doctype state and the token is emitted without force-quirks ever
being set. So I still don't see a problem :-)

My (non-html5lib) implementation says it's doing:

DataState
  Matched: (char == 60 && content-model-flag == PCDATA)
  Actions: switch-to-state(TagOpenState)
TagOpenState
  Matched: (content-model-flag == PCDATA && char == 33)
  Actions: switch-to-state(MarkupDeclarationOpenState)
MarkupDeclarationOpenState
  Matched: (! (next-chars-are("--")) && next-chars-are("DOCTYPE"))
  Actions: consume-character; consume-character; consume-character;
           consume-character; consume-character; consume-character;
           switch-to-state(DoctypeState)
DoctypeState
  Matched: (((char == 9 || char == 10) || char == 12) || char == 32)
  Actions: switch-to-state(BeforeDoctypeNameState)
BeforeDoctypeNameState
  Matched: anything-else
  Actions: create-doctype-token; append-char-to-doctype-name;
           switch-to-state(DoctypeNameState)
DoctypeNameState
  Matched: (((char == 9 || char == 10) || char == 12) || char == 32)
  Actions: switch-to-state(AfterDoctypeNameState)
AfterDoctypeNameState
  Matched: next-chars-are("SYSTEM")
  Actions: consume-character; consume-character; consume-character;
           consume-character; consume-character;
           switch-to-state(BeforeDoctypeSystemIdentifierState)
BeforeDoctypeSystemIdentifierState
  Matched: char == 39
  Actions: set-doctype-sysid-empty-string;
           switch-to-state(DoctypeSystemIdentifierSingleQuotedState)
DoctypeSystemIdentifierSingleQuotedState
  Matched: char == 39
  Actions: switch-to-state(AfterDoctypeSystemIdentifierState)
AfterDoctypeSystemIdentifierState
  Matched: anything-else
  Actions: parse-error; switch-to-state(BogusDoctypeState)
BogusDoctypeState
  Matched: EOF
  Actions: emit-doctype-token; unconsume-character;
           switch-to-state(DataState)
DataState
  Matched: EOF
  Actions: emit-eof-token

Output: "ParseError", ["DOCTYPE", "a", null, "", true]

-- 
Philip Taylor
exc...@gmail.com

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "html5lib-discuss" group.
To post to this group, send email to html5lib-discuss@googlegroups.com
To unsubscribe from this group, send email to html5lib-discuss+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/html5lib-discuss?hl=en-GB
-~----------~----~----~----~------~----~------~--~---
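
For anyone who wants to replay the state walk in the trace above by hand, here is a rough Python sketch of just the DOCTYPE part of the tokenizer. It is not html5lib's code and not the implementation that produced the trace. The function name `tokenize_doctype` and the string state labels are made up for illustration (they mirror the state names in the trace). The single- and double-quoted system identifier states are merged into one state with a `quote` variable. PUBLIC identifiers are not handled at all. Transitions that do not appear in the trace (for example what `>` does in each state) are my reading of the tokenizer draft and should be treated as assumptions. The last field of the emitted token is "correctness", i.e. the inverse of force-quirks, as in the test format.

```python
EOF = None
WHITESPACE = {"\t", "\n", "\x0c", " "}  # chars 9, 10, 12, 32 in the trace


def tokenize_doctype(text):
    """Tokenize input starting with '<!DOCTYPE'; return test-style output.

    Hypothetical helper: stops as soon as the DOCTYPE token is emitted.
    """
    prefix = "<!DOCTYPE"
    if text[:len(prefix)].upper() != prefix:
        raise ValueError("sketch only handles input starting with <!DOCTYPE")

    pos = len(prefix)
    output = []
    name, system_id, force_quirks, quote = "", None, False, None
    state = "Doctype"

    while True:
        char = text[pos] if pos < len(text) else EOF
        pos += 1  # consume; 'pos -= 1' below means "reconsume"
        error = quirks = emit = False

        if state == "Doctype":
            if char not in WHITESPACE:
                error = True
                pos -= 1
            state = "BeforeDoctypeName"
        elif state == "BeforeDoctypeName":
            if char in WHITESPACE:
                pass
            elif char in (">", EOF):
                error = quirks = emit = True
            else:
                name = char
                state = "DoctypeName"
        elif state == "DoctypeName":
            if char in WHITESPACE:
                state = "AfterDoctypeName"
            elif char == ">":
                emit = True
            elif char is EOF:
                error = quirks = emit = True
            else:
                name += char
        elif state == "AfterDoctypeName":
            keyword = text[pos - 1:pos + 5].upper()
            if char in WHITESPACE:
                pass
            elif char == ">":
                emit = True
            elif char is EOF:
                error = quirks = emit = True
            elif keyword == "SYSTEM":
                pos += 5  # first char already consumed, skip the other five
                state = "BeforeDoctypeSystemIdentifier"
            elif keyword == "PUBLIC":
                raise NotImplementedError("PUBLIC is outside this sketch")
            else:
                error = quirks = True
                state = "BogusDoctype"
        elif state == "BeforeDoctypeSystemIdentifier":
            if char in WHITESPACE:
                pass
            elif char in ('"', "'"):
                system_id, quote = "", char
                state = "DoctypeSystemIdentifierQuoted"
            elif char in (">", EOF):
                error = quirks = emit = True
            else:
                error = quirks = True
                state = "BogusDoctype"
        elif state == "DoctypeSystemIdentifierQuoted":
            if char == quote:
                state = "AfterDoctypeSystemIdentifier"
            elif char in (">", EOF):
                error = quirks = emit = True
            else:
                system_id += char
        elif state == "AfterDoctypeSystemIdentifier":
            if char in WHITESPACE:
                pass
            elif char == ">":
                emit = True
            elif char is EOF:
                error = quirks = emit = True
            else:
                error = True  # parse error, but force-quirks is NOT set
                state = "BogusDoctype"
        else:  # BogusDoctype
            if char in (">", EOF):
                emit = True

        if error:
            output.append("ParseError")
        force_quirks = force_quirks or quirks
        if emit:
            output.append(["DOCTYPE", name, None, system_id, not force_quirks])
            return output
```

Feeding it the two inputs from this thread should show the difference: with SYSTEM"! the EOF arrives inside the quoted identifier and sets force-quirks, while with SYSTEM''! the identifier is already closed, so the ! only sends the tokenizer to the bogus DOCTYPE state and the token comes out with correctness=true.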