On 13/07/2020 02:59, Scott Morgan wrote:
On 12/07/2020 21:04, Milind Gupta wrote:
How can I emulate the python encode function? Why does that work on the windows cmd and utf8.codepoint does not?
tl;dr CMD.EXE isn't fully UTF8 compliant. No idea what tricks Python is pulling (recoding from a local codepage? Possible if you didn't chcp 65001 first)
As far as I remember, cmd.exe has no UTF-8 support at all (besides that MS effort for Win10 you quote below - BTW, thanks, I didn't know anything about that - I'm still on Win7 most of the time).
cmd.exe can run in a UTF-16 compliant mode, but it does no UTF-8 processing. CHCP 65001 only changes the active code page, and thus some console handling for console applications (which must do the right thing themselves anyway).
cmd.exe /is/ a console application, but doesn't handle UTF-8, i.e. it doesn't do the right thing.
See this: https://ss64.com/nt/chcp.html and this SO answer in particular: https://stackoverflow.com/a/47843552/2633423
AFAIK, this is the last word on the issue: https://devblogs.microsoft.com/commandline/windows-command-line-unicode-and-utf-8-output-text-buffer/
The current changes also don’t cover what is required for our “processed input mode” that presents an editable input line for applications like CMD.exe.
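Coming back to the original question: a minimal Python sketch of what encode produces, compared with the code points that Lua's utf8.codepoint returns. This is not from the thread. The sample string and the cp437 code page are assumptions picked purely for illustration, and it says nothing about what Python actually does when writing to the console.

```python
# Hypothetical sample text (U+00E9, U+20AC); the thread does not say
# which string was actually used.
text = "\u00e9\u20ac"

# Code points: one integer per character. This is the kind of value
# Lua's utf8.codepoint returns.
code_points = [ord(ch) for ch in text]

# UTF-8 encoding: the byte sequence that str.encode produces.
utf8_bytes = text.encode("utf-8")

# Re-encoding for a legacy console code page. cp437 is an assumption;
# characters missing from that page are replaced with "?".
cp437_bytes = text.encode("cp437", errors="replace")

print(code_points)
print(list(utf8_bytes))
print(list(cp437_bytes))
```

The point is only that code points and encoded bytes are different things: two characters here become five UTF-8 bytes, while a legacy code page may map a character to a single byte or have no mapping for it at all.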