On Sun, Jul 20, 2003 at 02:29:26PM -0400, Barry Warsaw wrote:
>
> Can we perhaps have a PEP for the 2.4 timeframe?

Sure. Reviews would be really appreciated.

PEP: XXX
Title: Be Honest about LC_NUMERIC (to the C library)
Version: $Revision: 1.9 $
Last-Modified: $Date: 2002/08/26 16:29:31 $
Author: Christian R. Reis <kiko at async.com.br>
Status: Draft
Type: Standards Track
Content-Type: text/plain <pep-xxxx.html>
Created: 19-July-2003
Post-History:

------------------------------------------------------------------------

Abstract

    Support in Python for the LC_NUMERIC locale category is currently
    implemented only in Python-space, which causes inconsistent
    behavior and thread-safety issues for applications that use
    extension modules and libraries implemented in C.  This document
    proposes a plan for removing this inconsistency by providing and
    using substitute locale-agnostic functions as necessary.

Introduction

    Python currently provides generic localization services through
    the locale module, which among other things allows localizing the
    display and conversion process of numeric types.  Locale
    categories, such as LC_TIME and LC_COLLATE, allow configuring
    precisely what aspects of the application are to be localized.

    The LC_NUMERIC category specifies formatting for non-monetary
    numeric information, such as the decimal separator in float and
    fixed-precision numbers.  Localization of the LC_NUMERIC category
    is currently implemented only in Python-space; the C libraries
    are unaware of the application's LC_NUMERIC setting.  This is done
    to avoid changing the behavior of certain low-level functions that
    are used by the Python parser and related code [2].
    However, this presents a problem for extension modules that wrap
    C libraries; applications that use these extension modules will
    inconsistently display and convert numeric values.

    James Henstridge, the author of PyGTK [3], has additionally
    pointed out that the setlocale() function also presents
    thread-safety issues, since a thread may call the C library
    setlocale() outside of the GIL, and cause Python to function
    incorrectly.

Rationale

    The inconsistency between Python and C library localization for
    LC_NUMERIC is a problem for any localized application using C
    extensions.  The exact nature of the problem will vary depending
    on the application, but it will most likely occur when parsing or
    formatting a numeric value.

Example Problem

    The initial problem that motivated this PEP is related to the
    GtkSpinButton [4] widget in the GTK+ UI toolkit, wrapped by PyGTK.
    The widget can be set to numeric mode, and when this occurs,
    characters typed into it are evaluated as a number.

    Because LC_NUMERIC is not set in libc, float values are displayed
    incorrectly, and it is impossible to enter values using the
    localized decimal separator (for instance, `,' for the Brazilian
    locale pt_BR).  This small example demonstrates reduced usability
    for localized applications using this toolkit when coded in
    Python.

Proposal

    Martin V. Löwis commented on the initial constraints for an
    acceptable solution to the problem on python-dev:

        - LC_NUMERIC can be set at the C library level without
          breaking the parser.
        - float() and str() stay locale-unaware.

    The following seems to be the current practice:

        - locale-aware str() and float() [XXX: atof(), currently?]
          stay in the locale module.
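
    The split between the locale-unaware builtins and the locale-aware
    helpers can be illustrated with a short sketch.  It is not part of
    the proposal; it assumes a current Python 3 interpreter, whose
    locale module provides locale.str() and locale.atof().  The helper
    name compare_conversions and the list of candidate locales are
    hypothetical, and the helper returns None when none of the
    candidates is installed on the system.

```python
import locale

# Hypothetical candidate list; which locales are installed varies by system.
CANDIDATES = ("pt_BR.UTF-8", "pt_BR", "de_DE.UTF-8", "de_DE")


def compare_conversions(value=1.5, candidates=CANDIDATES):
    """Format and parse `value` with builtins and with the locale module.

    Returns a dict describing the results under the first candidate
    locale that can be activated, or None if none is available.
    """
    saved = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_NUMERIC, name)
            except locale.Error:
                continue  # locale not installed; try the next candidate
            localized = locale.str(value)  # follows LC_NUMERIC
            return {
                "locale": name,
                "decimal_point": locale.localeconv()["decimal_point"],
                "plain_str": str(value),                    # locale-unaware
                "plain_parse": float(str(value)),           # locale-unaware
                "localized_str": localized,                 # locale-aware
                "localized_parse": locale.atof(localized),  # locale-aware
            }
        return None
    finally:
        # Always restore the previous LC_NUMERIC setting.
        locale.setlocale(locale.LC_NUMERIC, saved)


if __name__ == "__main__":
    print(compare_conversions())
```

    On a system where pt_BR is available, plain_str should keep the
    `.' separator while localized_str uses the locale's own decimal
    separator, which is the behavior the constraints above ask to
    preserve.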
    An analysis of the Python source suggests that the following
    functions currently depend on LC_NUMERIC being set to the C
    locale:

        - Python/compile.c:parsenumber()
        - Python/marshal.c:r_object()
        - Objects/complexobject.c:complex_to_buf()
        - Objects/complexobject.c:complex_subtype_from_string()
        - Objects/floatobject.c:PyFloat_FromString()
        - Objects/floatobject.c:format_float()
        - Modules/stropmodule.c:strop_atof()
        - Modules/cPickle.c:load_float()

    [XXX: still need to check if any other occurrences exist]

    The proposed approach is to implement LC_NUMERIC-agnostic
    functions for converting from (strtod()/atof()) and to
    (snprintf()) float formats, using these functions where the
    formatting should not vary according to the user-specified
    locale.

    This change should also solve the aforementioned thread-safety
    problems.

Potential Code Contributions

    This problem was initially reported as a problem in the GTK+
    libraries [5]; since then it has been correctly diagnosed as an
    inconsistency in Python's implementation.  However, in a fortunate
    coincidence, the glib library implements a number of
    LC_NUMERIC-agnostic functions (for an example, see [6]) for
    reasons similar to those presented in this paper.

    In the same GTK+ problem report, Havoc Pennington has suggested
    that the glib authors would be willing to contribute this code to
    the PSF, which would simplify implementation of this PEP
    considerably.

    [XXX: I believe the code is cross-platform, since glib in part
    was devised to be cross-platform.  Needs checking.]

    [XXX: I will check if Alex Larsson is willing to sign the PSF
    contributor agreement [7] to make sure the code is safe to
    integrate.]

Risks

    There may be cross-platform issues with the provided
    locale-agnostic functions.  This needs to be tested further.

    Martin has pointed out potential copyright problems with the
    contributed code.  I believe we will have no problems in this area
    as members of the GTK+ and glib teams have said they are fine with
    relicensing the code.

Code

    An implementation is being developed by Gustavo Carneiro
    <gjc at inescporto.pt>.  It is currently attached to
    Sourceforge.net bug 774665 [8]

    [XXX: The SF.net tracker is horrible 8(]

References

    [1] PEP 1, PEP Purpose and Guidelines, Warsaw, Hylton
        http://www.python.org/peps/pep-0001.html

    [2] Python locale documentation for embedding,
        http://www.python.org/doc/current/lib/embedding-locale.html

    [3] PyGTK homepage,
        http://www.daa.com.au/~james/pygtk/

    [4] GtkSpinButton screenshot (demonstrating problem),
        http://www.async.com.br/~kiko/spin.png

    [5] GNOME bug report,
        http://bugzilla.gnome.org/show_bug.cgi?id=114132

    [6] Code submission of g_ascii_strtod and g_ascii_dtostr (later
        renamed g_ascii_formatd) by Alex Larsson,
        http://mail.gnome.org/archives/gtk-devel-list/2001-October/msg00114.html

    [7] PSF Contributor Agreement,
        http://www.python.org/psf/psf-contributor-agreement.html

    [8] Python bug report,
        http://www.python.org/sf/774665

Copyright

    This document has been placed in the public domain.

Take care,
--
Christian Reis, Senior Engineer, Async Open Source, Brazil.
http://async.com.br/~kiko/ | [+55 16] 261 2331 | NMFL