Re: Escaping URL's (was Japanese characters in URL)

From: Cris Bailiff
Date: Tue, 08 May 2001 14:04:33 +1000

..... utf8 encoding elided...

> Hehe. No clue....
> What I'm more thinking is if someone sets CURLOPT_URL (I know very little
> about the command line tool. ;) to:
> Carson's Last %
> That the url get automagically encoded by cURL...

Not automagically please - I don't want curl to manipulate the URL
unless I say so... It's good/fine to have a function (curl_escape?) which
normalises URLs to canonical/legal form, but if I've already done that
to my URL, it's wrong/broken to do it again. The problem is that you
don't know if a URL is escaped or unescaped, and can't tell by looking.

E.g. Carson's Last %

should obviously be escaped as:

but if curl did this automa(tg)ically, what happens when I give it this

Does it escape it or not? Why should it? It's already correct, but how do
you know? If it comes out as:

then you broke it....

Curl could have a switch to say 'escape/normalize this url', but it
should default to off.
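
To make the double-escaping problem concrete, here is a minimal Python
sketch. It uses Python's urllib.parse.quote as a stand-in for an escaping
function; that is an assumption for illustration only, not what curl does,
and the helper name escape_url is hypothetical.

```python
from urllib.parse import quote, unquote

def escape_url(text: str) -> str:
    # Hypothetical helper: percent-encode every character outside the
    # unreserved set (letters, digits, "-", ".", "_", "~").
    return quote(text, safe="")

raw = "Carson's Last %"
once = escape_url(raw)    # escaped a single time
twice = escape_url(once)  # escaped again: every "%" becomes "%25"

print(once)
print(twice)

# Unescaping the doubly escaped string only gets back to the singly
# escaped form, not to the original text.
print(unquote(twice) == once)
```

Nothing in the string itself tells the escaper which of the two inputs it
was handed, which is why the caller has to say so explicitly.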

If you want to turn multibyte characters into escaped strings, the same
switch might suffice, but I suspect what you really want is a
multibyte->UTF-8 normalizer, followed by selective escaping of the UTF-8
bytes. Remember, the URL might already have other (legal) %-encoded
strings, so you can't just run the general escaping switch over the
string - only over your newly normalized binary characters, otherwise
you'll double escape the '%20's etc again...
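
A rough Python sketch of that two-step idea, under stated assumptions: the
input arrives as bytes in a known multibyte encoding (Shift_JIS here, purely
as an example), the function name escape_multibyte and the example.com URL
are hypothetical, and only non-ASCII characters are escaped so existing
%-sequences pass through untouched.

```python
def escape_multibyte(raw: bytes, source_encoding: str) -> str:
    # Step 1: normalize the multibyte input to Unicode text.
    text = raw.decode(source_encoding)
    out = []
    for ch in text:
        if ord(ch) < 128:
            # ASCII, including any existing "%20"-style escapes, is left alone.
            out.append(ch)
        else:
            # Step 2: escape only the UTF-8 bytes of non-ASCII characters.
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)

# Hypothetical URL that already contains a legal "%20" escape.
url_bytes = b"http://example.com/a%20b/" + "日本".encode("shift_jis")
escaped = escape_multibyte(url_bytes, "shift_jis")
print(escaped)
```

Because the result is pure ASCII, running it through the same function
again (as ASCII input) changes nothing, so the existing '%20' is never
turned into '%2520'.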

Received on 2001-05-08