Showing content from https://github.com/microsoft/terminal/issues/4551 below:

COOKED_READ doesn't return UTF-8 on *A APIs in CP_UTF8 · Issue #4551 · microsoft/terminal · GitHub

Environment

Microsoft Windows [Version 10.0.18363.592]

Impact

This issue also affects reading console input through the Universal C Runtime: _read, getchar, fread, scanf, etc. Using _cgets_s works around it only because that function calls ReadConsoleW instead of ReadFile. The same problem is reported against the UCRT on Developer Community: _read() cannot read UTF-8 but _cgets_s() can.
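The workaround mentioned above (reading with ReadConsoleW rather than ReadFile) might be sketched as follows. This is only an illustration, not how the UCRT implements _cgets_s: the names append_utf8, utf16_to_utf8 and read_console_line_utf8 are hypothetical, the 256-unit buffer is an arbitrary assumption, and the conversion is written by hand so that it is visible and can be checked off Windows. On Windows, WideCharToMultiByte with CP_UTF8 is the usual way to do the same conversion.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

#ifdef _WIN32
#include <Windows.h>
#endif

// Append one Unicode code point to `out` as UTF-8 (hypothetical helper).
static void append_utf8(std::string& out, std::uint32_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Convert UTF-16 code units to UTF-8. CharT is wchar_t when the input comes
// from ReadConsoleW. Unpaired surrogates are replaced with U+FFFD.
template <class CharT>
std::string utf16_to_utf8(const CharT* units, std::size_t count)
{
    std::string out;
    for (std::size_t i = 0; i < count; ++i) {
        std::uint32_t cp = static_cast<std::uint16_t>(units[i]);
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < count) {
            const std::uint32_t lo = static_cast<std::uint16_t>(units[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                // Combine a high/low surrogate pair into one code point.
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;
        }
        append_utf8(out, cp);
    }
    return out;
}

#ifdef _WIN32
// Read console input as UTF-16 with ReadConsoleW and return it as UTF-8.
// Returns false when the read fails, for example when stdin is redirected
// to a file or pipe and is therefore not a console handle.
bool read_console_line_utf8(std::string& line)
{
    const HANDLE console_stdin = GetStdHandle(STD_INPUT_HANDLE);
    wchar_t wide[256]; // assumed buffer size; longer lines need a loop
    DWORD units_read = 0;
    if (!ReadConsoleW(console_stdin, wide, 256, &units_read, nullptr)) {
        return false;
    }
    line = utf16_to_utf8(wide, units_read);
    return true;
}
#endif
```

A caller would try read_console_line_utf8 first and fall back to ReadFile when it returns false, since ReadFile on files and pipes already returns UTF-8 correctly. The sketch reads at most 256 UTF-16 code units per call and does not handle a surrogate pair split across two reads.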

Steps to reproduce

When ReadFile is used to read from a console handle, UTF-8 input is not returned correctly. ReadFile on other types of handles (files, pipes) returns UTF-8 without issue. SetConsoleCP and SetConsoleOutputCP do not appear to affect this behavior.

C:\Users\stwish\source\read_utf8>type win32_test.cpp
#include <Windows.h>
#include <stdio.h>

int main()
{
    SetConsoleCP(65001);
    SetConsoleOutputCP(65001);
    const HANDLE console_stdin = GetStdHandle(STD_INPUT_HANDLE);

    const size_t buf_count = 20;
    char buffer[buf_count]{};

    DWORD num_read;

    BOOL result = ReadFile(
        console_stdin,
        buffer,
        buf_count,
        &num_read,
        nullptr
        );

    printf("ReadFile returned '%d'\n", result);
    for (int i = 0; i < 20; i++)
    {
        printf("%02x ", (unsigned char)buffer[i]);
    }

    return 0;
}
C:\Users\stwish\source\read_utf8>cl /nologo /EHsc /MT win32_test.cpp /Zi
win32_test.cpp
C:\Users\stwish\source\read_utf8>win32_test.exe
我是中文字符
ReadFile returned '1'
00 00 00 00 00 00 0d 0a 00 00 00 00 00 00 00 00 00 00 00 00
C:\Users\stwish\source\read_utf8>echo 我是中文字符 | win32_test.exe
ReadFile returned '1'
e6 88 91 e6 98 af e4 b8 ad e6 96 87 e5 ad 97 e7 ac a6 20 0d
C:\Users\stwish\source\read_utf8>type input.txt
我是中文字符

C:\Users\stwish\source\read_utf8>type input.txt | win32_test.exe
ReadFile returned '1'
e6 88 91 e6 98 af e4 b8 ad e6 96 87 e5 ad 97 e7 ac a6 00 00

Expected behavior

Running win32_test.exe and entering '我是中文字符' input on the console should return e6 88 91 e6 98 af e4 b8 ad e6 96 87 e5 ad 97 e7 ac a6 0d 0a as this is the UTF-8 representation of that string, plus CR LF.

Actual behavior

Running win32_test.exe and entering '我是中文字符' on the console returns 6 null bytes followed by CR LF, yet ReadFile still reports that the read succeeded.
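The byte counts behind the expected and actual results can be checked with a little arithmetic. The sketch below uses two hypothetical helpers, utf8_length_of_code_point and utf8_length, to count how many bytes UTF-8 needs for the six characters typed. It does not call any console API and makes no claim about why the console returns null bytes.

```cpp
#include <cstddef>
#include <string>

// Number of bytes UTF-8 needs for one Unicode code point (hypothetical helper).
std::size_t utf8_length_of_code_point(char32_t cp)
{
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

// Total UTF-8 byte count for a sequence of code points.
std::size_t utf8_length(const std::u32string& text)
{
    std::size_t total = 0;
    for (char32_t cp : text) {
        total += utf8_length_of_code_point(cp);
    }
    return total;
}

// The six characters from the report, written as code points:
// utf8_length(U"\u6211\u662f\u4e2d\u6587\u5b57\u7b26") is 6 * 3 = 18 bytes.
```

Each of the six characters lies between U+0800 and U+FFFF, so each takes three bytes. Eighteen bytes plus CR LF gives the 20 bytes of the expected result, which exactly fills the 20-byte buffer in the test program. In the actual result, CR LF sits at offsets 6 and 7, after one null byte per character typed.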
