[FIX]Implement support for HTTP 1.1 Content-location header
Product: Core
Component: DOM: HTML Parser
Type: defect
Priority: P3
Reporter: andreas.otte
Assignee: bzbarsky
Currently, handling of the HTTP 1.1 header Content-Location from RFC 2616 is not supported. It is similar to the handling of the deprecated Content-Base header, but can also handle relative URLs. See bug 94096 for a related discussion based on the removal of Content-Base support.
->Networking: HTTP
Assignee: attinasi → darin
Component: Layout → Networking: HTTP
QA Contact: petersen → tever
Sorry, this has to be owned by whoever owns the HTMLContentSink, and that is not networking; back to layout.
Assignee: darin → attinasi
Component: Networking: HTTP → Layout
QA Contact: tever → petersen
Harish, do you own the content sink?
Assignee: attinasi → harishd
Target Milestone: --- → Future
from cvsblame it appears that jst owns that file.
Assignee: harishd → jst
Component: Layout → DOM HTML
QA Contact: petersen → stummala
Target Milestone: Future → ---
qa -> me
QA Contact: stummala → ian
->parser. this isn't dom.
Assignee: dom_bugs → harishd
Component: DOM HTML → Parser
*** Bug 230035 has been marked as a duplicate of this bug. ***
No. Opera supports this fine.
So what happens with multiple content-location headers, as follows:

Content-Location: http://foo.com/bar/baz.html
<base href="http://bar.com/foo/" />
<meta http-equiv="content-location" content="bar.html" />

Do we end up with the document basically having a base of http://bar.com/foo/bar.html? Or what? It seems to me that this whole thing is badly underspecified... That is, would it be sufficient to simply resolve the content-location relative to whatever the current base URI is and set that as the new base URI?
Another quote, from <http://diveintomark.org/archives/2004/01/02/relative-uris>:

> Section 14.14 of RFC 2616 defines the Content-Location: HTTP header. If an
> HTML document is served without a BASE element but with a Content-Location:
> HTTP header, then that is the base URI (test page). Just to make this more
> interesting, Content-Location: may itself be a relative URI, in which case it
> is resolved according to RFC 2396, with the URI of the HTML document as its
> base URI. The resolved URI then serves as the base URI for other relative URIs
> within the HTML document.

It looks to me like the BASE element is more important, since the quote states that if the BASE element is not set, the content-location must be used.
Let's please stick to quoting specs, not blogs. Blogs do not have any normative status and will only confuse matters further. RFC 2396, section 5.1, "Establishing a Base URI", is what defines the interactions of the various levels: http://www.ietf.org/rfc/rfc2396

In the presence of multiple sources at the same level, I would suggest doing what bz suggested, namely just resolving each base relative to the previous base and then setting that new URI as the new base for future elements.
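The chained resolution suggested above can be sketched with the WHATWG `URL` constructor (a minimal illustration using the example URLs from the earlier comment, not Mozilla's actual code):

```javascript
// Each source of base information is resolved against the previous base,
// in the order the browser encounters them.
const docUrl = "http://foo.com/bar/baz.html";           // URL the document was fetched from
let base = new URL("http://foo.com/bar/baz.html", docUrl).href; // Content-Location header
base = new URL("http://bar.com/foo/", base).href;       // <base href="...">
base = new URL("bar.html", base).href;                  // <meta http-equiv="content-location">
console.log(base); // http://bar.com/foo/bar.html
```

Each step simply applies RFC 2396-style reference resolution against whatever base the previous step produced.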
Yes, that is a known bug. I'm dealing with the bookwire people to get them to resolve their problem.
Ian, you propose supporting relative values for <base href="">? I don't believe that's a good idea -- we don't support it now and neither do any other browsers; furthermore the HTML spec clearly says the href must be an absolute URI.
No, indeed, I'd recommend keeping <base> working only for absolute URIs as per the spec.
Taking.
Assignee: harishd → bz-vacation
Priority: -- → P3
Summary: Implement support for HTTP 1.1 Content-location header → [FIX]Implement support for HTTP 1.1 Content-location header
Target Milestone: --- → mozilla1.7alpha
(Does that patch also implement the real HTTP Content-Location header? Or only the META one? I hope it does both, because then that would mean our code was well designed, but...)
Yeah, that'll do both headers and meta tags. Amazing isn't it? :-)
Checked in.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Something went wrong here. Take a look at bug 231072.
Nothing seems to have gone wrong that I can see, past a broken server....
Status: RESOLVED → VERIFIED
Also broke a bank: bug 238626. I really wish this fix was backed out!! It's a dreadful experience to wait over 10 minutes to load pages. In the case of bug 238626 one also suffers from ever more timeout alerts after 5-10 minutes have passed, but the page does continue "loading"/spinning. I suspect any other "fix" causing a 10-minute performance hit per page would have been made a blocker and the offending code backed out.
The bug you are seeing is that we block on stylesheets. That is what should be fixed, this particular fix just made it more visible in certain cases where we were doing the wrong thing before.
When the page finally loads one ulcer later, it doesn't even USE a stylesheet. It would have used one without this fix. So things got both slower and uglier. The problems this bug triggered just spawned bug 238654, btw, but more should probably be filed.
I've backed out this patch due to all the servers that send bogus content-location headers and the fact that the HTTP spec's treatment of content-location for content-negotiated pages breaks anchor traversals on such pages. See bug 238654 and its various dependencies and dups. The other option would be to implement what Opera implements, and that feels like a lot of work for very low benefit, so I don't plan to do that.
I guess this is wontfix, instead.
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → RESOLVED
Closed: 21 years ago → 21 years ago
Resolution: --- → WONTFIX
(In reply to comment #32)

> I've backed out this patch due to all the servers that send bogus
> content-location headers

This is not an argument to make the client buggy as well. If the server is buggy, then the client may reject the server and display an alert asking the user whether to send an email to the server administrator to replace the buggy server with one without bugs. It is not the job of the client to fix the bugs of the server. If the server serves stuff that does not conform to the RFCs instead of presenting an error message to the author, then this behaviour is a bug in the server and does not concern you, because your client simply has to reject such buggy responses.

> and the fact that the HTTP spec's treatment of
> content-location for content-negotiated pages breaks anchor traversals on such
> pages. See bug 238654 and its various dependencies and dups.

This is also the responsibility of the page authors. If you cannot detect the problem automatically, then ignore it. It is not your duty to fix authors' errors. If you can automatically fix it, then you may implement an auto-fix as a configurable extra. If you can detect the problem but cannot fix it, then simply reject such buggy pages and ask whether to send an email to the author or server administrator.

> The other option would be to implement what Opera implements, and that feels
> like a lot of work for very low benefit, so I don't plan to do that.

Emulation of other browsers may be an optional extra but nothing more. It is not a duty. The duty is only to support and enforce the standards. If the standard is buggy (i.e. unclear), then the standard must be reviewed and rewritten, which results in an errata RFC or a completely new RFC. If such a necessary review of a standard blocks fixing, then you should not set the bug to WONTFIX but set the target milestone to FUTURE, because it is unknown when the fix will be done.
George: when people browse the Web with Mozilla 1.7 and find that thousands of pages render incorrectly but work fine in Mozilla 1.6, Opera, IE, Safari, and every other browser, they think it is a bug in Mozilla 1.7, and stop using us. Supporting more standards is great, but only when it makes the user experience better. When it makes things worse, it is a bad thing. Note that not supporting this doesn't make us non-compliant, it just means we don't support it. Since as far as I can tell nobody supports this (no browsers do it right, no servers do it right), what's the point?
(In reply to comment #34)

> This is not an argument to make the client buggy as well. If the server is
> buggy, then the client may reject the server and display an alert asking the
> user whether to send an email to the server administrator to replace the buggy
> server with one without bugs.

1. That would annoy users.
2. It is impossible to find the email address of the server administrator.
3. It is impossible to detect whether the Content-Location the server sends is a "good" one or a "bad" one.

> > and the fact that the HTTP spec's treatment of content-location [etc]
> If you can detect the problem but can not fix it, then simply reject such
> buggy pages

The point is, these pages ARE NOT BUGGY. They are following the RFC to the letter. This header, if implemented by the server as specified in the RFC (which it is in Apache) and if implemented in the client as specified in the RFC (which it was in Mozilla), breaks all sorts of user expectations with respect to a web site's URI. Did you even read the bug I pointed to AND ITS DEPENDENCIES like I asked? Or did you just spew about standards on general principle? The point, as I said in one of the bugs you clearly did not read, is that this header is broken-by-design. What the RFC specifies is simply not workable when correctly implemented on both the server and client side.

> If such a necessary review of a standard blocks fixing

Then the bug is wontfix, and a new bug should be filed if the standard is ever updated. Note that updating the standard would involve creating a new header with a different name that does sorta what Content-Location does but is treated differently by servers. So the _HTTP/1.1_ header _Content-Location_ (what this bug is about) will not be implemented no matter what. I don't have time to deal with yet another standards committee to resolve this issue; if you do, please feel free to contact whoever is responsible for the HTTP RFC and talk to them about the problem.

Ian, Apache does actually do this header right, which is what led us to find the problem in the RFC in the first place.
*** Bug 303552 has been marked as a duplicate of this bug. ***
The post above from Boris on 2004-05-01 states that "the RFC is broken-by-design" but doesn't directly describe why. For future reference, I believe the bugzilla entry that Boris refers to as describing the problem is bug 241981.

It's a real shame this functionality isn't available, as it is extremely useful when internal forwards are occurring within a webserver. In particular, JavaServer Faces applications perform internal forwards regularly; a post to /alpha/beta.jsf will often forward to /gamma/delta.jsp. The latter page often wants to reference resources using relative paths but cannot, as the browser will resolve paths relative to /alpha (the last URL it knew about), not /gamma. Sending a "redirect" is one solution but has significant implications, esp. with respect to "request-scope variables". The HTML BASE tag is of no use here as that requires a hostname and port which are not available to the code being executed. The Content-Location header would solve this issue as it allows relative paths, and having this info in a header rather than an HTML tag is far more convenient too. Ah well...
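The internal-forward problem above can be illustrated with a small sketch (hypothetical URLs; `style.css` and `example.com` are just placeholders, not from the original bug):

```javascript
const posted = "http://example.com/alpha/beta.jsf"; // URL the browser requested
// The server forwards internally to /gamma/delta.jsp; the browser never
// learns this, so relative paths still resolve against the posted URL:
const withoutCL = new URL("style.css", posted).href;
console.log(withoutCL); // http://example.com/alpha/style.css -- wrong directory

// A (relative) Content-Location of "/gamma/delta.jsp" would have moved the base:
const base = new URL("/gamma/delta.jsp", posted).href;
const withCL = new URL("style.css", base).href;
console.log(withCL); // http://example.com/gamma/style.css
```

Note the header value itself can be relative, since the server-side code may not know its own scheme/host/port.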
> The html BASE tag is of no use here as that requires a hostname and port which
> are not available to the code being executed.

That's really odd; I can't think of a single sane server-side solution that doesn't know its own hostname and port. But even if true, if you're willing to rely on JavaScript you can output JavaScript which will document.write() the relevant <base> tag. Of course, long-term the answer is still to get the HTTP folks to provide a better alternative....
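The document.write() workaround mentioned above might look something like this (a sketch; `baseTag` and the `/gamma/` path are made-up names for illustration):

```javascript
// The server-side page knows only its path; the browser supplies the
// scheme and host:port at runtime via window.location.
function baseTag(protocol, host, path) {
  return '<base href="' + protocol + '//' + host + path + '" />';
}

// In the emitted page this would run as:
//   document.write(baseTag(location.protocol, location.host, "/gamma/"));
console.log(baseTag("http:", "example.com", "/gamma/"));
// → <base href="http://example.com/gamma/" />
```

This sidesteps the "don't know my own hostname" objection, at the cost of requiring JavaScript.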