Showing content from https://github.com/AngleSharp/AngleSharp.Xml/issues/14 below:

Invalid XML should not break parsing when IsSuppressingErrors = true · Issue #14 · AngleSharp/AngleSharp.Xml · GitHub

Description

When IsSuppressingErrors = true is set in XmlParserOptions, parsing an invalid XML document still throws an exception instead of suppressing the error.

The stack trace:

AngleSharp.Xml.Parser.XmlParseException: Error while parsing the provided XML document.
   at AngleSharp.Xml.Parser.XmlTokenizer.TagSelfClosing(XmlTagToken tag)
   at AngleSharp.Xml.Parser.XmlDomBuilder.ParseAsync(XmlParserOptions options, CancellationToken cancelToken)
   at AngleSharp.Xml.Parser.XmlParser.ParseAsync(XmlDocument document, CancellationToken cancel)

Steps to Reproduce

Given the following XML:

<P>
    <P>
        <FONT FACE="calibri" SIZE="14.666666666666666" COLOR="#000000"></FONT>
    </P>
    <P>
        <FONT FACE="calibri" SIZE="14.666666666666666" COLOR="#000000"></FONT>
    </P>
    <P>
        <FONT FACE="calibri" SIZE="14.666666666666666" COLOR="#0000ff">
            <U>
                <https://some.url.example.com></U>
            </FONT>
            <FONT FACE="calibri" SIZE="14.666666666666666" COLOR="#000000">
                <B></B>
            </FONT>
            <FONT FACE="calibri" SIZE="14.666666666666666" COLOR="#000000"></FONT>
        </P>
        <P>
            <FONT FACE="calibri" SIZE="14.666666666666666" COLOR="#000000"></FONT>
        </P>
        <P>
            <FONT FACE="calibri" SIZE="14.666666666666666" COLOR="#000000"></FONT>
        </P>
    </P>

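The problem seems to be the <https://some.url.example.com> token inside <U>, which is not a valid tag (the stack trace points at TagSelfClosing). The <P> tags themselves are balanced, even though the indentation suggests a missing closing tag.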
Parsing the XML as follows throws the exception shown in the description above:

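// Placeholder: substitute the XML shown above.
var xml = "xml from above";

// Register the XML services on the default configuration.
var config = Configuration.Default.WithXml();
var context = BrowsingContext.New(config);

// IsSuppressingErrors is set, yet parsing still throws XmlParseException.
var parser = new XmlParser(new XmlParserOptions { IsSuppressingErrors = true }, context);
var document = await parser.ParseDocumentAsync(xml, cancellationToken);
var html = document.ToHtml();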

I know this sounds quite stupid, but I need to actually parse invalid XML data and convert it to HTML afterwards.
Is there a way to parse and/or repair invalid XML with AngleSharp.Xml?
