Showing content from https://mail.python.org/pipermail/python-dev/2008-October/082720.html below:

[Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

Michael Urman murman at gmail.com
Wed Oct 1 02:16:19 CEST 2008
On Tue, Sep 30, 2008 at 7:04 PM, Steven D'Aprano <steve at pearwood.info> wrote:
>> I believe on disk it uses UTF-16.
>
> Which is made up of bytes. There may be byte sequences that are illegal
> UTF-16, but that's not what Martin said. I don't understand how there
> can be UTF-16 sequences which don't correspond to some sequence of
> bytes. How would they be represented in memory? Is this to do with the
> endianness of the UTF-16 sequence?

It has to do with the internal mapping between the ANSI and Unicode
functions. On NT systems, CreateFileA will map the ANSI bytestring to
a Unicode filename via the active code page, and call CreateFileW
accordingly. The active code page cannot be set to something as useful
as UTF-8, so given any actual code page (1252, 932, etc.) there are
Unicode strings that cannot be represented with a bytestring provided
to the ANSI function.
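
A rough illustration of the point, using Python's own codecs as a stand-in for the Windows code page conversion. This is only a sketch: the helper name `representable` and the sample filenames are made up for the example, and the real ANSI-to-Unicode mapping inside Windows may differ in detail (for instance, it can substitute best-fit characters rather than fail).

```python
def representable(name: str, code_page: str) -> bool:
    """Return True if `name` can be encoded as a bytestring in `code_page`.

    Hypothetical helper: it approximates the question "could this Unicode
    filename be passed to an ANSI function under this code page?" by
    attempting a strict encode with Python's codec of the same name.
    """
    try:
        name.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


# Sample filenames chosen for illustration only.
samples = ["caf\u00e9.txt", "\u65e5\u672c\u8a9e.txt"]  # "café.txt", "日本語.txt"

for name in samples:
    for code_page in ("cp1252", "cp932"):
        print(name, code_page, representable(name, code_page))
```

Under code page 1252 the Japanese name has no bytestring form at all, while code page 932 handles it. Whichever single code page is active, some valid Unicode filenames are out of reach of the ANSI function.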
-- 
Michael Urman