mal wrote:

> > Given the new 7-bit-ASCII-as-default-encoding-for-8-bit-strings
> > convention, shouldn't just hashing the character values work
> > fine? That is, hash('abc') should == hash(u'abc'), no conversion
> > required.
>
> Yes, and it does so already for pure ASCII values. The problem
> comes from the fact that the default encoding can be changed to
> a locale specific value (site.py does the lookup for you), e.g.
> given you have defined LANG to be us_en, Python will default
> to Latin-1 as default encoding.

footnote: in practice, this is a Unix-only feature.

I suggest adding code to the _locale module (or maybe sys is
better?) which can be used to dig up a suitable encoding for
non-Unix platforms. On Windows, the code page should be
"cp%d" % GetACP().

I'll look into this later today.

</F>
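A rough sketch of the Windows lookup suggested above, for illustration only. It is not the code that went into _locale or sys; the function names code_page_name and guess_platform_encoding are hypothetical, and reaching GetACP through ctypes is an assumption made here to keep the example self-contained.

```python
import locale
import sys


def code_page_name(acp):
    """Format a Windows ANSI code page number as a codec name."""
    # Same formatting as suggested in the message: "cp%d" % GetACP()
    return "cp%d" % acp


def guess_platform_encoding():
    """Return a plausible default encoding name for the current platform.

    Hypothetical helper: on Windows it asks kernel32 for the active ANSI
    code page; elsewhere it falls back to the locale-derived encoding.
    """
    if sys.platform == "win32":
        import ctypes  # imported lazily; windll only exists on Windows
        return code_page_name(ctypes.windll.kernel32.GetACP())
    # False: do not call setlocale() as a side effect of the lookup
    return locale.getpreferredencoding(False)


if __name__ == "__main__":
    print(guess_platform_encoding())
```

On a Windows box whose ANSI code page is 1252, code_page_name would yield "cp1252", which is a codec name Python already knows.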