Re: Escaping URL's (was Japanese characters in URL)

From: Cris Bailiff <>
Date: Wed, 09 May 2001 11:41:25 +1000

Nielsen Linus wrote:
> > I believe that there are routines out there that will correctly handle
> > this situation, and will only encode data that needs encoding, in which
> > case this wouldn't be a problem...
> But the question remains. How is this routine supposed to know
> when the data "needs encoding"?
> How does it decide that the string "Hello%20World" is to be
> escaped or not? Imagine a CGI that takes a URL formatted string
> as input. Then curl would have to escape the "%", otherwise the
> CGI would receive "Hello World", which would be wrong.
> Curl can't possibly know that this particular CGI _wants_ the
> string in this URL-encoded format.
> I can't think of any automatic way of solving this.

Exactly. There is no way of knowing automatically (a weakness in the URL
escaping specification I think).
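
To make the ambiguity concrete, here is a small sketch in Python (not curl's own code; the variable names are only for illustration). It uses the standard `urllib.parse` functions to show that the same input string leads to two different requests, depending on whether the caller meant it as literal data or as already-escaped data:

```python
from urllib.parse import quote, unquote

raw = "Hello%20World"

# Interpretation 1: the string is literal data, so the '%' itself is escaped.
as_literal = quote(raw, safe="")

# Interpretation 2: the string is already escaped, so it is sent unchanged.
as_escaped = raw

# What a receiving CGI would see after one round of decoding.
seen_if_literal = unquote(as_literal)
seen_if_escaped = unquote(as_escaped)

print(as_literal, "->", seen_if_literal)
print(as_escaped, "->", seen_if_escaped)
```

Nothing in the string itself says which of the two results the CGI expects; only the caller knows.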

You can certainly 'normalize' the escaping: escape things that should be
escaped but aren't, unescape (if you like) things that don't need
escaping (for security or other reasons; %2e%2e == .. for example), and
leave alone things that are already correct. But that routine must know
that the URL is already escaped, so that it doesn't double-escape the '%'
characters. You still need a switch, even to do this level of
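
A rough sketch of such a normalizing routine, again in Python rather than curl's C, might look like the following. The function name `normalize_path` and the `already_escaped` switch are hypothetical, and the choice of which characters count as "not needing escaping" is an assumption (the unreserved set from RFC 3986). A stray '%' that is not followed by two hex digits is simply left alone here.

```python
import re
from urllib.parse import quote

# Assumption: characters that never need escaping (RFC 3986 "unreserved").
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)

def normalize_path(path, already_escaped):
    """Hypothetical helper: normalize the escaping of a URL path."""
    if not already_escaped:
        # Every character is literal data, so '%' is escaped like anything else.
        return quote(path, safe="/")

    def fix(match):
        # Decode %XX only when it stands for an unreserved character
        # (so %2e%2e becomes ..); otherwise keep it, with uppercase hex.
        ch = chr(int(match.group(1), 16))
        return ch if ch in UNRESERVED else "%" + match.group(1).upper()

    path = re.sub(r"%([0-9A-Fa-f]{2})", fix, path)
    # Escape whatever is still unescaped, but leave existing '%' untouched.
    return quote(path, safe="/%")

print(normalize_path("a b/%2e%2e/x%20y", already_escaped=True))
print(normalize_path("Hello%20World", already_escaped=True))
print(normalize_path("Hello%20World", already_escaped=False))
```

The point of the sketch is the second argument: without it the function cannot choose between the two branches, which is the switch described above.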

Received on 2001-05-09