M.-A. Lemburg wrote: > > To reinforce Fredrik's point here, note that XML only supports > > encodings at the level of an entire file (or external entity). You > > can't tell an XML parser that a file is in UTF-8, except for this = one > > element whose contents are in Latin1. >=20 > Hmm, this would mean that someone who writes: >=20 > """ > #pragma script-encoding utf-8 >=20 > u =3D u"\u1234" > print u > """ >=20 > would suddenly see "\u1234" as output. not necessarily. consider this XML snippet: <?xml version=3D'1.0' encoding=3D'utf-8'?> <body>ሴ</body> if I run this through an XML parser and write it out as UTF-8, I get: <body>=E1^=B4</body> in other words, the parser processes "&#x" after decoding to unicode, not before. I see no reason why Python cannot do the same. </F>
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4