Re: patch for file:// encoding on Windows
Date: Sun, 28 Sep 2014 10:44:03 +0200 (CEST)
On Sat, 27 Sep 2014, clinton_at_elemtech.com wrote:
>> It feels like something with a much larger scope than just file:// URLs
>> that I feel very scared of even considering. Please provide a proper
>> motivation for why we want this! URLs are not UTF-8, they're a sequence of
> The raw "sequence of bytes" idea doesn't work on Windows.
Sure it does. See below.
> From the current code page
That's not a very workable approach. What if you copy the URL from somewhere?
Assuming a "current code page" is asking for non-deterministic behaviors in
how the input is treated.
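
To illustrate the ambiguity, here is a minimal Python sketch (not curl code). The byte value and the three code pages are arbitrary picks for illustration only; the point is that the same input byte decodes to a different character depending on which code page happens to be active.

```python
# One raw byte, interpreted under three different 8-bit code pages.
# The byte and the code pages are arbitrary examples, not taken from the thread.
raw = b"\xe9"

decoded = {name: raw.decode(name) for name in ("cp1252", "cp1251", "cp437")}

for name, text in decoded.items():
    # ascii() keeps the output printable on any console encoding.
    print(name, ascii(text))
```

Each code page yields a different character for the identical byte, so the meaning of the input depends on the machine it is typed on.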
> Not all files are accessible this way when you have an NTFS file system that
> supports file names that can't be represented with the default 8 bit
It is a mistake to think that you should be able to feed in the "raw 8 bit
encoding" in the URL to start with. Also, a URL should work the same no matter
which OS you enter it on, so treating it differently on Windows than on
non-Windows systems is asking for trouble.
> This problem has been brought up before:
... and never properly dealt with in any of those situations.
"This problem" is at least two separate ones: 1 - what the URL should look
like to allow a unicode file name to get opened and 2 - have the actual file:
code understand and work with a file name provided according to (1).
So a question that would help me at least form my opinion on this better:
given a unicode file name example like "ŕéüñíöñ", what does a file: URL that
works with IE, Firefox and Chrome look like? I don't mean what it looks like
in the URL bar, but if you copy it and paste it somewhere, what does that look
like?
In both Firefox and Chrome on Linux, such a file name in my home directory
uses this URL:
Percent-encoded UTF-8 it looks like to me.
No "current code page" necessary. A single defined way to decode it.
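
As a rough sketch of what "percent-encoded UTF-8" means in practice, the Python below round-trips a path through a file:// URL. The helper names `path_to_file_url` and `file_url_to_path` and the sample path are hypothetical, made up for this illustration; this is not curl code and not the exact URL the browsers produced, and it assumes a POSIX-style absolute path.

```python
from urllib.parse import quote, unquote

PREFIX = "file://"


def path_to_file_url(path: str) -> str:
    """Encode the path as UTF-8 bytes, then percent-encode every byte that
    is not an unreserved character or a path separator."""
    return PREFIX + quote(path, safe="/", encoding="utf-8")


def file_url_to_path(url: str) -> str:
    """Reverse the encoding: percent-decode to bytes, then decode as UTF-8.
    No code page is consulted at any point."""
    if not url.startswith(PREFIX):
        raise ValueError("not a file:// URL")
    return unquote(url[len(PREFIX):], encoding="utf-8", errors="strict")


# Hypothetical sample path containing non-ASCII characters.
sample_path = "/home/user/\u00e9\u00fc\u00f1.txt"
url = path_to_file_url(sample_path)
print(url)
print(ascii(file_url_to_path(url)))
```

The URL itself is plain ASCII, so it survives copy and paste unchanged, and it decodes to the same file name on every OS.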
-- / daniel.haxx.se
List admin: http://cool.haxx.se/list/listinfo/curl-library
Received on 2014-09-28