I came across this myself before I joined the list. My general rule was to always convert unicode to strings with something like:

    "%s" %unicode

(I don't remember if I avoided str because it also returned unicode) for any internal use. I think the context was that I wanted to use a type attribute from an xml tag to instantiate an object whose class I retrieved from a dict. So I had something like:

    module = self.record['module']
    if not resources.__dict__.has_key (module):
        raise RuntimeError, "Attempted to retrieve data from non-existent resource module: %s" %module
    code = resources.__dict__[module]
    obj = apply (code, [ ], self.record['args'])
    ...

where module was a unicode string. This was an example where unicode sorta transparently pissed me off, because it behaved just like a string in so many ways but wasn't.

                    -- Mike

On Sat, Apr 27 @ 21:30, Matthias Urlichs wrote:
> Playing around with xml.dom.minidom, I noticed that this beast is
> perfectly able to read HTML which it can't print:
>
> >>> import xml.dom.minidom as md
> >>> d=md.parseString("<foo>bߐ</foo>"))
> >>> d.writexml(sys.stdout)
> ...
> UnicodeError: ASCII encoding error: ordinal not in range(128)
>
> Ouch.

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html
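
For anyone reading this on a current Python 3 (an assumption; the thread predates it), the printing failure quoted above can usually be sidestepped by asking minidom for encoded bytes explicitly instead of writing text to a stream with an ASCII default. A minimal sketch, where `serialize_utf8` is a hypothetical helper name and not part of minidom:

```python
import xml.dom.minidom as md


def serialize_utf8(xml_text):
    """Parse xml_text and return the whole document as UTF-8 encoded bytes."""
    doc = md.parseString(xml_text)
    try:
        # Passing an encoding makes toxml() return bytes rather than str,
        # so no implicit ASCII conversion happens on output.
        return doc.toxml(encoding="utf-8")
    finally:
        # Release the DOM tree once we are done with it.
        doc.unlink()


# Same shape as the quoted example: one non-ASCII character inside <foo>.
data = serialize_utf8("<foo>b\u07d0</foo>")
print(data.decode("utf-8"))
```

The bytes can then be written to a binary stream such as `sys.stdout.buffer`, which keeps the choice of encoding in one visible place.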