Java uses ResourceBundles, which are identified by a base name plus a two-character locale ID (e.g. "en", "fr"). The content of a resource bundle is essentially a dictionary of name/value pairs.

MS Visual C++ uses #pragma code_page(windows_code_page_id) in resource files to indicate which code page was used to generate the subsequent text.

In both cases, an application relies on a fixed (7-bit ASCII) subset to provide the well-known key used to find the localized text for the current locale. Any "hardcoded" string literals would be mangled when displayed under a different locale.

So essentially, one could take the view that correct support for localization is a runtime issue affecting the user of an application, not the developer. Hence, myfile.py may contain 8-bit string literals encoded in my current Windows encoding (1252), but my user may be running Japanese Windows in code page 932. All I can guarantee is that the first 128 characters (notwithstanding BACKSLASH) will be rendered correctly; other characters will be interpreted as half-width Katakana or worse.

Any literal strings one embeds in code should be purely for the benefit of the code, not for the end user, who should be seeing properly localized text, pulled from a localized text resource file _NOT_ Python code, and automatically pumped through the appropriate native <--> Unicode translations as required by the code.

So to sum up:

1. Hardcoded strings are evil in source code unless they use the invariant ASCII (and by extension UTF-8) character set.
2. A proper localized resource loading mechanism is required to fetch genuine localized text from a static resource file (i.e. not myfile.py).
3. All transformations of 8-bit strings to and from Unicode should explicitly specify the 8-bit encoding for the source/target of the conversion, as appropriate.
4. Assume that a Japanese or Chinese programmer will find it easier to code using the invariant ASCII subset than a Western European or American will find it to read hanzi in source code.

Regards,
Mike da Silva

-----Original Message-----
From: Ka-Ping Yee [mailto:ping@lfw.org]
Sent: Wednesday, April 12, 2000 6:45 PM
To: Fred L. Drake, Jr.
Cc: Python Developers @ python.org
Subject: Re: [Python-Dev] #pragmas in Python source code

On Wed, 12 Apr 2000, Fred L. Drake, Jr. wrote:
> > Or do we need to separate out two categories of pragmas --
> > pre-parse and post-parse pragmas?
>
> Eeeks! We don't need too many special forms! That's ugly!

Eek indeed. I'm tempted to suggest we drop the multiple-encoding issue (i can hear the screams now). But you're right, i've never heard of another language that can handle configurable encodings right in the source code. Is it really necessary to tackle that here?

Gak, what do Japanese programmers do? Has anyone seen any of that kind of source code?

-- ?!ng

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://www.python.org/mailman/listinfo/python-dev
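
The code-page point and summary item 3 in the message above can be illustrated with a small sketch. It assumes Python 3 and its standard cp1252 and cp932 codecs; the helper name roundtrip and the sample strings are hypothetical, chosen only for illustration, and are not from the original message.

```python
def roundtrip(text, written_as, read_as):
    """Encode text with one code page, then decode the bytes with another.

    Hypothetical helper: it mimics a string literal saved under one
    Windows code page and displayed under a different one.
    """
    raw = text.encode(written_as)
    # errors="replace" keeps the demo from raising on undecodable bytes.
    return raw.decode(read_as, errors="replace")


# Pure ASCII survives the trip from code page 1252 to code page 932.
ascii_only = roundtrip("hello", "cp1252", "cp932")

# A byte above 0x7F does not: 0xD1 is N-with-tilde in cp1252 but a
# half-width Katakana letter in cp932.
mangled = roundtrip("SEÑOR", "cp1252", "cp932")

# Naming the 8-bit encoding explicitly on both sides is lossless.
explicit = roundtrip("SEÑOR", "cp1252", "cp1252")

print(ascii_only)
print(mangled)
print(explicit)
```

Only the conversion that names the same encoding on both sides returns the original text for non-ASCII characters, which is the argument for always stating the 8-bit encoding explicitly.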