Description
@bagder and @jay & the curl team:
Hi - I am the PM for Windows Console. Since adding curl to Windows 10 in the Windows 10 Spring 2018 Update, we've seen and received reports of issues with curl not behaving as expected on Windows, e.g. curl issues #731 and #345, and known-issue 5.5.
The Windows Console team and I are in the process of overhauling Windows Console.
For example, we recently added a Pseudo Console (ConPTY) API to Windows for the first time. This API was inspired by Linux's openpty(3) APIs and enables terminal/server applications to create and supply (via a call to CreatePseudoConsole()) the pipes via which they communicate directly with command-line apps (e.g. Cmd, PowerShell, curl.exe, etc.), using UTF-8 encoded text/VT.
We're also in the process of overhauling how the Console's internal buffers store and handle Unicode and UTF-8. In the upcoming Windows 10 Fall 2018 release, Console will be able to store and retrieve Unicode text in codepage 65001 (though it'll still struggle to display emoji, chars outside the currently selected font, etc. - that's a whole other set of issues 😜).
To accomplish the above, I hacked a couple of quick changes into my own fork of curl to enable VT mode in the Console when curl starts up.
However, these changes are far from complete:
- I am sure you'd want such changes implemented in a different way?!
- I haven't yet figured out where to add code that changes the codepage to 65001 at startup on Windows 10. Alas, without switching to codepage 65001 for UTF-8 HTTP responses, Console garbles its output.
Could you steer me towards the changes you'd recommend to enable curl to light-up on Windows?
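For readers following along, here is a minimal sketch of what enabling VT mode and selecting codepage 65001 at startup might look like. It is not the code from the fork mentioned above: the struct and function names (console_state, console_enable_vt_utf8, console_restore) are hypothetical, and only the Win32 calls (GetStdHandle, GetConsoleMode, SetConsoleMode, GetConsoleOutputCP, SetConsoleOutputCP) are real. On non-Windows builds the functions are no-ops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#ifndef ENABLE_VIRTUAL_TERMINAL_PROCESSING
#define ENABLE_VIRTUAL_TERMINAL_PROCESSING 0x0004
#endif
#endif

/* Hypothetical holder for the console settings we change, kept so that
   they can be put back before the program exits. */
struct console_state {
  bool changed;      /* true only if both settings were modified */
#ifdef _WIN32
  HANDLE out;        /* console output handle */
  DWORD orig_mode;   /* console mode before our change */
  UINT orig_cp;      /* output codepage before our change */
#endif
};

/* Try to enable VT processing and UTF-8 (65001) output.
   Returns true only if both changes were applied. */
static bool console_enable_vt_utf8(struct console_state *st)
{
  assert(st != NULL);
  st->changed = false;
#ifdef _WIN32
  st->out = GetStdHandle(STD_OUTPUT_HANDLE);
  if(st->out == INVALID_HANDLE_VALUE || st->out == NULL)
    return false;
  if(!GetConsoleMode(st->out, &st->orig_mode))
    return false;  /* stdout is not a console (file or pipe) */
  st->orig_cp = GetConsoleOutputCP();
  if(!SetConsoleMode(st->out,
                     st->orig_mode | ENABLE_VIRTUAL_TERMINAL_PROCESSING))
    return false;  /* e.g. an older Windows without VT support */
  if(!SetConsoleOutputCP(CP_UTF8)) {
    SetConsoleMode(st->out, st->orig_mode);  /* undo the partial change */
    return false;
  }
  st->changed = true;
  return true;
#else
  return false;    /* nothing to do on other platforms */
#endif
}

/* Put back whatever console_enable_vt_utf8() changed. */
static void console_restore(struct console_state *st)
{
  assert(st != NULL);
#ifdef _WIN32
  if(st->changed) {
    SetConsoleMode(st->out, st->orig_mode);
    SetConsoleOutputCP(st->orig_cp);
  }
#endif
  st->changed = false;
}
```

A caller would invoke console_enable_vt_utf8() once at startup and console_restore() on every exit path. Saving the original mode and codepage is a design choice of this sketch, so the shell is left as it was found.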
Activity
bagder commented on Sep 17, 2018
Hey @bitcrazed, this looks awesome.
I think that looks like a great start. If you make a PR out of that I'm sure we'll nitpick some details but the general approach should be fine.
Can't it be done in those functions or in functions adjacent to configure_terminal from (1)? (I feel I'm missing some finer point.)
bitcrazed commented on Sep 18, 2018
Thanks @badger. Great to hear - I'll submit a PR in a moment and let the fun begin :)
Good question - we could just set the CP to 65001 at startup, but I was originally thinking to try to set the CP to more closely match the CP expressed by an HTML doc, as per W3C guidelines on encoding declaration, by looking for a <meta charset="utf-8" /> declaration in the first 2K or so of any response. Is this overkill?
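To make the idea concrete, here is a rough sketch of such a sniffing step, not anything that exists in curl. The function names (match_ci, sniff_meta_charset) and the SNIFF_LIMIT constant are hypothetical. It only recognises the literal form `<meta charset=...>` with a single space, and ignores the http-equiv variant, extra whitespace, BOMs and the HTTP Content-Type header, all of which a real implementation would need to consider.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SNIFF_LIMIT 2048  /* "first 2K or so" of the response body */

/* Case-insensitive match of the lowercase ASCII keyword kw at p,
   where n bytes remain in the buffer. */
static bool match_ci(const char *p, size_t n, const char *kw)
{
  size_t k = strlen(kw);
  size_t i;
  if(n < k)
    return false;
  for(i = 0; i < k; i++) {
    if(tolower((unsigned char)p[i]) != kw[i])
      return false;
  }
  return true;
}

/* Look for <meta charset=...> in the first SNIFF_LIMIT bytes of buf.
   On success, copies the lowercased charset name into out (truncated to
   fit outlen) and returns true. Otherwise sets out to "" and returns
   false. */
static bool sniff_meta_charset(const char *buf, size_t len,
                               char *out, size_t outlen)
{
  static const char key[] = "<meta charset=";
  size_t i;
  assert(buf != NULL && out != NULL && outlen > 0);
  if(len > SNIFF_LIMIT)
    len = SNIFF_LIMIT;
  for(i = 0; i < len; i++) {
    size_t j;
    size_t n = 0;
    if(!match_ci(buf + i, len - i, key))
      continue;
    j = i + sizeof(key) - 1;
    /* skip an optional opening quote */
    if(j < len && (buf[j] == '"' || buf[j] == '\''))
      j++;
    /* copy the charset token: letters, digits, '-' and '_' */
    while(j < len && n + 1 < outlen &&
          (isalnum((unsigned char)buf[j]) ||
           buf[j] == '-' || buf[j] == '_')) {
      out[n++] = (char)tolower((unsigned char)buf[j]);
      j++;
    }
    out[n] = '\0';
    return n > 0;
  }
  out[0] = '\0';
  return false;
}
```

The caller would still have to map the returned name (for example "utf-8") to a Windows codepage number, which is a separate lookup not shown here.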
bagder commented on Sep 18, 2018
Ah, now I get you, thanks. Okay, I see what you're saying but that's an approach we don't do on any system so it would be a first -- and quite frankly that's a route full of traps and sorry faces. I'm not convinced that road actually leads to happiness without an awful lot of work and trial and error.
So, perhaps not "overkill" but quite complicated. But of course all other approaches will mean more shortcuts and guessing with less actual input to base the guesses on...
bagder commented on Sep 18, 2018
(oh btw, my nickname is a dyslexic animal, not the correctly spelled version!)
bitcrazed commented on Sep 18, 2018
OMG! Sorry @ba-gd-er - Hadn't even noticed that - I think I need new glasses!
bitcrazed commented on Sep 27, 2018
Okay, I think we're close.
Here's what the version of curl currently shipping in Windows 10 looks like when querying http://wttr.in/seattle sans VT and UTF-8 support:

And here's what it looks like with VT enabled and the UTF-8 codepage (65001) selected:

For comparison, here's what curl for Linux looks like built from the same sources and running in Ubuntu on WSL:

If the user is running an older version of Windows that doesn't support VT then they'll see something very similar to the first screenshot above, which is no worse than what they'll see today anyhow.
mattn commented on Nov 1, 2018
Please revert this change. Please Please Please!
This is a very ugly change. Changing the console codepage affects the console font. Also, you have probably noticed that curl doesn't restore the original font.
@bitcrazed This is NOT the right way to show Unicode. You should use the wide string APIs.
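As an illustration of the wide string approach, here is a minimal sketch that leaves the console codepage untouched. It is not code from curl, and the function name console_write_utf8 is hypothetical. It converts UTF-8 to UTF-16 with MultiByteToWideChar and writes it with WriteConsoleW when stdout is a console, and falls back to plain fwrite otherwise. It assumes each call receives whole UTF-8 sequences and a length that fits in an int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Write len bytes of UTF-8 text to stdout. On a Windows console the text
   is converted to UTF-16 and written with WriteConsoleW, so the console
   codepage is never changed. Returns true if everything was written. */
static bool console_write_utf8(const char *s, size_t len)
{
  assert(s != NULL);
  if(len == 0)
    return true;
#ifdef _WIN32
  {
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD mode;
    /* GetConsoleMode succeeds only when stdout is a real console */
    if(out != INVALID_HANDLE_VALUE && out != NULL &&
       GetConsoleMode(out, &mode)) {
      wchar_t *w;
      DWORD written = 0;
      BOOL ok;
      /* first call: ask how many UTF-16 units are needed */
      int wlen = MultiByteToWideChar(CP_UTF8, 0, s, (int)len, NULL, 0);
      if(wlen <= 0)
        return false;
      w = malloc((size_t)wlen * sizeof(wchar_t));
      if(!w)
        return false;
      /* second call: do the actual conversion */
      MultiByteToWideChar(CP_UTF8, 0, s, (int)len, w, wlen);
      ok = WriteConsoleW(out, w, (DWORD)wlen, &written, NULL);
      free(w);
      return ok && written == (DWORD)wlen;
    }
  }
#endif
  /* not a Windows console: pass the bytes through unchanged */
  return fwrite(s, 1, len, stdout) == len;
}
```

The fallback matters because WriteConsoleW fails when output is redirected to a file or pipe. In that case the raw UTF-8 bytes are what the receiving program expects anyway.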
bagder commented on Nov 1, 2018
I turned that comment into a proper issue. Take follow-ups there.