curl-users

Re: Japanese characters in URL

From: Sterling Hughes <sterling_at_designmultimedia.com>
Date: Mon, 7 May 2001 05:17:44 -0400 (EDT)

On Mon, 7 May 2001, Daniel Stenberg wrote:

> On Sun, 6 May 2001, Sterling Hughes wrote:
>
> > Perhaps a function should be added to cURL to urlencode these strings
> > (either as an option or by default)...
>
> libcurl actually already has a function that URL-encodes strings; it's named
> curl_escape() (lib/escape.c). It is still very byte-oriented and not much use
> for Japanese characters, as they would probably be using Unicode or something.
>

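To make "byte-oriented" concrete, here is a minimal sketch in C of what such escaping does to multibyte text. The helper names `is_unreserved_byte()` and `percent_encode_bytes()` are hypothetical and are not taken from lib/escape.c; the sketch simply emits %XX for every byte outside a small safe ASCII set, without knowing which character encoding the bytes are in.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: bytes that are copied through unchanged
 * (ASCII letters, digits, and "-._~"). Everything else is escaped. */
static int is_unreserved_byte(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '.' || c == '_' || c == '~';
}

/* Hypothetical sketch, not libcurl's curl_escape(): percent-encode a
 * NUL-terminated string one byte at a time.
 * Returns a malloc()ed string the caller must free(), or NULL on failure. */
char *percent_encode_bytes(const char *in)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t len = strlen(in);
    char *out = malloc(len * 3 + 1); /* worst case: every byte becomes %XX */
    char *p = out;
    size_t i;

    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)in[i];
        if (is_unreserved_byte(c)) {
            *p++ = (char)c;
        } else {
            *p++ = '%';
            *p++ = hex[c >> 4];   /* high nibble */
            *p++ = hex[c & 0x0F]; /* low nibble */
        }
    }
    *p = '\0';
    return out;
}
```

Given the two characters 日本 as UTF-8 bytes ("\xE6\x97\xA5\xE6\x9C\xAC"), this yields "%E6%97%A5%E6%9C%AC"; the same text as Shift_JIS bytes ("\x93\xFA\x96\x7B") yields "%93%FA%96%7B". The escaping itself works on any byte string, but the result depends entirely on which encoding produced the bytes, and the server has to assume the same one.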
The PHP code isn't much different. I've contacted Rui Hirowaka (the
editor/translator for the Japanese PHP manual, and the author of the
multibyte string manipulation functions for PHP4); hopefully she'll know
of an algorithm for URL-encoding multibyte strings.

> > We have such a function in the PHP source that could easily be ported to
> > cURL (it's a pretty standard routine, and I'll do it if you want).
>
> I am of course always happy to receive improvements/corrections to existing
> functions.
>

Well, I don't know how much of an improvement it would be. I didn't think
of curl_escape(); it's certainly a different way of doing the escaping (the
same way Apache does it, afaik).

> > It wouldn't affect existing code (i.e., it won't improperly munge correctly
> > encoded URLs), and might be of convenience to users...
>
> If I were to add "URL encoding" to the curl command line tool, it would be
> made with a special option that would switch it on. Leaving it to work
> exactly as today without the option. (I'm not totally convinced this needs
> to be added.)
>
> Now, how the heck do Japanese guys enter a Japanese URL in a command line? I
> mean, what kind of byte-stream will be read from the argv[] array?
>

Hehe. No clue....

What I'm thinking of is more the case where someone sets CURLOPT_URL (I
know very little about the command line tool ;) to:

 http://www.google.com?search=Johnny Carson's Last %

and the URL then gets automagically encoded by cURL...
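A minimal sketch in C of the kind of encoding meant here, applied to the query value only. The function name `encode_preserving_escapes()` and its rule for spotting existing escapes are assumptions for illustration, not anything libcurl or PHP provides: a '%' followed by two hex digits is treated as already encoded and copied through, and every other unsafe byte becomes %XX.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: bytes that never need escaping. */
static int keep_as_is(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '.' || c == '_' || c == '~';
}

/* Hypothetical sketch: percent-encode a query value, but leave any
 * existing "%XX" sequence (X = hex digit) untouched.
 * Returns a malloc()ed string the caller must free(), or NULL on failure. */
char *encode_preserving_escapes(const char *in)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t len = strlen(in);
    char *out = malloc(len * 3 + 1); /* worst case: every byte becomes %XX */
    char *p = out;
    size_t i;

    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)in[i];
        /* The terminating NUL is not a hex digit, so the checks below
         * never read past the end of the string. */
        if (c == '%' && isxdigit((unsigned char)in[i + 1]) &&
            isxdigit((unsigned char)in[i + 2])) {
            *p++ = '%';
            *p++ = in[i + 1];
            *p++ = in[i + 2];
            i += 2; /* skip the two hex digits just copied */
        } else if (keep_as_is(c)) {
            *p++ = (char)c;
        } else {
            *p++ = '%';
            *p++ = hex[c >> 4];
            *p++ = hex[c & 0x0F];
        }
    }
    *p = '\0';
    return out;
}
```

With this sketch, the value "Johnny Carson's Last %" becomes "Johnny%20Carson%27s%20Last%20%25", while an already encoded "a%20b" comes back unchanged. The catch is ambiguity: a user who really means the literal text "%41" gets it passed through as an escape for "A", which is one argument for keeping any such encoding behind an explicit option.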

-Sterling
Received on 2001-05-07