M.-A. Lemburg wrote:

> Simply going with UCS-4 does not solve the problem, since
> even with UCS-4 storage, you can still have surrogates in your
> Python Unicode string.

Yes, but in that case, you presumably *intend* them to be treated
as separate indexing units. If you didn't, there would be no need
to use surrogates in the first place.

--
Greg
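
For illustration only (this is not part of the original message): a minimal sketch of the point about indexing units, assuming a current Python 3 interpreter where strings are indexed by code point. The variable names are hypothetical. A string written with an explicit surrogate pair keeps two indexing units, while the same character written as a single code point occupies one.

```python
# Assumption: Python 3.4+, where str is indexed by code point and the
# "surrogatepass" error handler is available for UTF-16 codecs.

# One supplementary-plane character stored as a single code point.
astral = "\U00010000"

# The same character spelled as two explicit surrogate code points.
pair = "\ud800\udc00"

# The single code point is one indexing unit; the surrogates are two.
astral_len = len(astral)
pair_len = len(pair)

# Each surrogate is individually addressable.
high, low = pair[0], pair[1]

# The two strings are not equal, although they serialize to the same
# UTF-16 bytes when lone surrogates are allowed through.
same_string = astral == pair
same_utf16 = (
    astral.encode("utf-16-be")
    == pair.encode("utf-16-be", "surrogatepass")
)
```

Here the explicit pair behaves as two separate units because that is how the string was built, which is the situation the reply describes.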