> I started such hack for the UTF-8 codec... It is really tricky, we should not
> do that!

With the proper encapsulation, it's not that tricky. I have written
functions PyUnicode_IndexToWCharIndex and PyUnicode_WCharIndexToIndex,
and PyUnicodeEncodeError_GetStart and friends would use that function.
I'd also need new functions PyUnicodeEncodeError_GetStartIndex to access
the "true" start field.

>> That would be expensive to compute
>
> Yeah, O(n) should be avoided when is it possible.

Ok. I'll wait half a day or so for people to reconsider (now knowing
that it's actually feasible to be fully backwards compatible); if nobody
speaks up, I go ahead and accept the breakage.

Regards,
Martin
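
For readers following the thread, here is a minimal sketch of the kind of index mapping discussed above, assuming a 2-byte wchar_t (UTF-16) where every character above U+FFFF occupies two units. The Python names index_to_wchar_index and wchar_index_to_index are hypothetical stand-ins chosen for illustration; they are not the C functions named in the message, whose signatures are not shown in this thread. Both mappings are linear scans, which is the O(n) cost raised in the quoted text.

```python
def index_to_wchar_index(s: str, index: int) -> int:
    """Map a code point index in s to a UTF-16 unit index (hypothetical helper)."""
    if not 0 <= index <= len(s):
        raise IndexError("code point index out of range")
    # Each non-BMP character before `index` adds one extra unit (surrogate pair).
    return index + sum(1 for ch in s[:index] if ord(ch) > 0xFFFF)


def wchar_index_to_index(s: str, windex: int) -> int:
    """Map a UTF-16 unit index back to a code point index (hypothetical helper).

    An index that points at the second half of a surrogate pair is mapped
    to the character that owns that pair.
    """
    if windex < 0:
        raise IndexError("wchar index out of range")
    units = 0
    for i, ch in enumerate(s):
        width = 2 if ord(ch) > 0xFFFF else 1
        if windex < units + width:
            return i
        units += width
    if windex == units:
        return len(s)  # one-past-the-end is a valid position
    raise IndexError("wchar index out of range")
```

With s = "a\U0001F600b", the code point at index 2 ("b") sits at unit index 3 in the UTF-16 view, and unit indices 1 and 2 both map back to code point index 1.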