Showing content from http://mail.python.org/pipermail/python-dev/2011-November/114327.html below:

[Python-Dev] Unicode exception indexing

"Martin v. Löwis" martin at v.loewis.de
Thu Nov 3 22:47:00 CET 2011
>> On the one hand, these indices are used in formatting error messages such as
>> "codec can't encode character \u%04x in position %d", suggesting they are
>> regular indices into the string (counting code points).
>>
>> On the other hand, they are used by error handlers to look up the character,
>> and existing error handlers (including the ones we have now) use
>> PyUnicode_AsUnicode to find the character. This suggests that the indices
>> should be Py_UNICODE indices, for compatibility (and they currently do
>> work in this way).
> 
> But what about error handlers written in Python?

I'm working on a patch where a C error handler using
PyUnicodeEncodeError_GetStart gets a different value than a Python
error handler accessing .start. The _GetStart/_GetEnd functions would
take the value from the exception object, and adjust it before returning
it.
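
For reference, the Python side of this might look roughly like the sketch
below. It assumes an interpreter where str is indexed by code point (CPython
3.3 or later), so .start and .end can be used directly to slice .object. The
handler name hex_replace and the registered name "hex_replace_demo" are
made up for illustration.

```python
import codecs


def hex_replace(exc):
    """Hypothetical handler: replace unencodable characters with <U+XXXX>."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # .start and .end index into .object; on a code-point-indexed str
    # this slice yields whole characters, including non-BMP ones.
    bad = exc.object[exc.start:exc.end]
    replacement = "".join("<U+%04X>" % ord(ch) for ch in bad)
    # Return the replacement text and the position at which to resume.
    return replacement, exc.end


# "hex_replace_demo" is an illustrative name, not a standard error handler.
codecs.register_error("hex_replace_demo", hex_replace)

text = "a\U0001F600b\u00e9"
encoded = text.encode("ascii", errors="hex_replace_demo")
```

A Python handler written this way never sees Py_UNICODE offsets; it only
relies on .start and .end agreeing with ordinary string indexing.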

The implementation is fairly straightforward, just a little expensive
(in the case of non-BMP strings on Windows).
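
To illustrate the kind of adjustment meant here, the following is a rough
Python model, not the actual C patch. It assumes the Py_UNICODE view is
UTF-16 (as on Windows), where every non-BMP character takes two code units.
The function name code_point_to_utf16_index is hypothetical.

```python
def code_point_to_utf16_index(s, index):
    """Map a code point index into s to the matching UTF-16 code unit index.

    Sketch only: models a 16-bit Py_UNICODE buffer in pure Python.
    """
    # Each non-BMP character before `index` adds one extra code unit
    # (its second surrogate), so the prefix has to be scanned.
    extra = sum(1 for ch in s[:index] if ord(ch) > 0xFFFF)
    return index + extra


s = "a\U0001F600b"
start_for_c_handler = code_point_to_utf16_index(s, 2)  # position of "b"
```

The scan over the prefix is what makes this a little expensive: the cost is
linear in the index, and it is only needed when the string contains non-BMP
characters.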

Regards,
Martin