Re: Escaping URL's (was Japanese characters in URL)
Date: Tue, 08 May 2001 14:04:33 +1000
..... utf8 encoding elided...
> Hehe. No clue....
> What I'm more thinking is if someone sets CURLOPT_URL (I know very little
> about the command line tool. ;) to:
> http://www.google.com?search=Johnny Carson's Last %
> That the url get automagically encoded by cURL...
Not automagically please - I don't want curl to manipulate the URL
unless I say so... It's good/fine to have a function (curl_escape?) which
normalises URLs to canonical/legal form, but if I've already done that
to my URL, it's wrong/broken to do it again. The problem is that you
don't know if a URL is escaped or unescaped, and can't tell by looking.
http://www.google.com?search=Johnny Carson's Last %
should obviously be escaped as:
http://www.google.com?search=Johnny%20Carson's%20Last%20%25
but if curl did this automagically, what happens when I give it this:
http://www.google.com?search=Johnny%20Carson's%20Last%20%25
Does it escape it or not? Why should it, it's already correct, but how do
you know? If it comes out as:
http://www.google.com?search=Johnny%2520Carson's%2520Last%2520%2525
then you broke it....
Curl could have a switch to say 'escape/normalize this url', but it
should default to off.
If you want to turn multibyte characters into escaped strings, the same
switch might suffice, but I suspect what you really want is a
multibyte->UTF-8 normalizer, followed by selective escaping of the UTF-8
bytes. Remember, the URL might already have other (legal) %-encoded
strings, so you can't just run the general escaping switch over the
string - only over your newly normalized binary characters, otherwise
you'll double escape the '%20's etc again...
Received on 2001-05-08