curl-users

Re: Escaping URL's (was Japanese characters in URL)

From: Cris Bailiff <c.bailiff_at_awayweb.com>
Date: Wed, 09 May 2001 11:41:25 +1000

Nielsen Linus wrote:
>
> > I believe that there are routines out there that will correctly
> > handle this situation, and will only encode data that needs
> > encoding, in which case this wouldn't be a problem...
>
> But the question remains. How is this routine supposed to know
> when the data "needs encoding"?
>
> How does it decide that the string "Hello%20World" is to be
> escaped or not? Imagine a CGI that takes a URL formatted string
> as input. Then curl would have to escape the "%", otherwise the
> CGI would receive "Hello World", which would be wrong.
>
> Curl can't possibly know that this particular CGI _wants_ the
> string in this URL-encoded format.
>
> I can't think of any automatic way of solving this.

Exactly. There is no way of knowing automatically (a weakness in the URL
escaping specification, I think).
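
To make the ambiguity concrete, here is a minimal sketch in Python (not
curl code; it assumes only the standard urllib.parse module). The same
input string gives two different results on the receiving end, depending
on which reading the sender picks:

```python
from urllib.parse import quote, unquote

raw = "Hello%20World"

# Reading 1: the string is literal data, so the '%' itself gets escaped.
as_literal = quote(raw, safe="")

# Reading 2: the string is already escaped, so it is sent unchanged.
as_escaped = raw

# What a receiver sees after decoding once, for each reading.
print(unquote(as_literal))   # Hello%20World
print(unquote(as_escaped))   # Hello World
```

Both readings are valid, and nothing in the string itself says which one
was intended.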

You can certainly 'normalize' the escaping: escape things that should be
escaped but aren't, unescape (if you like) things that don't need
escaping (for security or other reasons; %2e%2e == .. for example), and
leave alone things that are already correct. But that routine must know
that the URL is already escaped, so that it doesn't double-escape the
'%' characters. You still need a switch, even to do this level of
escaping.
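
A rough sketch of such a normalizing routine, again in Python rather than
anything from curl itself. The function name normalize_path, the
already_escaped switch, and the choice of "never needs escaping"
characters (letters, digits, and - . _ ~) are all assumptions made for
illustration:

```python
import re
from urllib.parse import quote

# Characters assumed never to need escaping (hypothetical, conservative set).
UNRESERVED = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_path(path, already_escaped):
    """Normalize the escaping of a URL path.

    already_escaped is the switch: the caller must say whether '%'
    introduces an escape or is literal data.
    """
    if not already_escaped:
        # Every '%' is literal data, so it is escaped along with the rest.
        return quote(path, safe="/")

    def fix(match):
        ch = chr(int(match.group(1), 16))
        # Unescape what never needed escaping; keep the rest, uppercased.
        return ch if ch in UNRESERVED else match.group(0).upper()

    partly_decoded = _ESCAPE.sub(fix, path)
    # Escape whatever is still unescaped, but leave '%' alone so the
    # existing escapes are not escaped a second time.
    return quote(partly_decoded, safe="/%")


print(normalize_path("/a b/%2e%2e/c%2fd", True))   # /a%20b/../c%2Fd
print(normalize_path("Hello%20World", True))       # Hello%20World
print(normalize_path("Hello%20World", False))      # Hello%2520World
```

The last two calls differ only in the switch, which is the point: the
routine cannot pick the right answer by inspecting the string.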

Cris
Received on 2001-05-09