curl-users

Re: libcurl and IDNA

From: Daniel Stenberg <daniel-curl_at_haxx.se>
Date: Wed, 7 Apr 2004 08:11:39 +0200 (CEST)

On Tue, 6 Apr 2004, Gisle Vanem wrote:

> With the recent possibility to register domain-names with non-ASCII
> characters it would be nice if libcurl would support that in some way.

I agree completely!

> What would happen in curl now if one enters some IDN in some East Asian
> encoding? I guess it would break in sscanf() etc. (but maybe UTF-8 works?)

I haven't paid enough attention to how the URL would be formatted in these
cases. Do you have any examples?

> This should ideally be the task of the OS or tcp/ip stack, but
> since the standard is very new, it's not. Besides it would not work for
> protocols that expose hostnames in the application layer.
> E.g. the HTTP 1.1 "Host" header must include the domain name in ACE
> (ASCII Compatible Encoding) form. So this won't work if the web server
> serves multiple domains:
> GET /some/document HTTP/1.1
> Host: www.tromsø.no
>
> But this should work:
> GET /some/document HTTP/1.1
> Host: www.xn--troms-zuA.no
>
> (try it in curl and you'll see you get two different outputs).
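
To illustrate the point about the "Host" header, here is a minimal Python
sketch, not anything libcurl does today. It assumes Python's built-in "idna"
codec (which implements the IDNA 2003 rules) is an acceptable stand-in for a
library such as GNU libidn. The helper names to_ace() and build_request() are
made up for this example, and the host name used is 'www.tromsø.no', the name
whose ACE form is quoted above.

```python
def to_ace(hostname: str) -> str:
    """Convert a host name to ACE form with Python's built-in IDNA 2003 codec."""
    return hostname.encode("idna").decode("ascii")


def build_request(hostname: str, path: str) -> bytes:
    """Build a minimal HTTP/1.1 GET request whose Host header is in ACE form."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {to_ace(hostname)}",  # only ASCII goes on the wire
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# "\u00f8" is the letter ø, so this is www.tromsø.no
request = build_request("www.troms\u00f8.no", "/some/document")
print(request.decode("ascii"))
```

Names that are already plain ASCII pass through to_ace() unchanged, so the
same code path could serve both kinds of host names.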

'www.tromsø.no' is not really a host name I can use with curl on my Linux box
since the resolver refuses to resolve it to an IP address:

curl: (6) Couldn't resolve host 'www.tromsø.no'

www.xn--troms-zuA.no works though.
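
One likely reason for the difference is that the ACE name consists only of
ASCII letters, digits and hyphens, which is what traditional host name rules
allow. The Python sketch below checks just that rule. It is an assumption
about why a resolver might reject the raw name, not a description of what any
particular resolver does, and is_ldh_hostname() is a made-up helper name.

```python
import re

# One label: letters, digits and hyphens, no hyphen at either end, 1-63 chars.
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_ldh_hostname(hostname: str) -> bool:
    """Return True if every dot-separated label obeys the letters-digits-hyphen rule."""
    labels = hostname.split(".")
    return all(LDH_LABEL.fullmatch(label) is not None for label in labels)


# "\u00f8" is the letter ø, so the first name is www.tromsø.no
for name in ("www.troms\u00f8.no", "www.xn--troms-zuA.no"):
    print(name, is_ldh_hostname(name))
```

The raw name fails the check because of the 'ø', while the ACE name passes.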

> There are several IDN libraries around, but GNU libidn looks promising.
> http://josefsson.org/libidn/. A drawback is that it requires iconv, which
> adds approx. 1 MByte of data/code for all those charset tables. On Windows
> it would be better to use WinNLS. But that's another matter.
>
> Any comments?

I think we are gonna see quite a lot of domains using non-ASCII characters
shortly, so I think getting curl to work with them is a rather
important task.

-- 
     Daniel Stenberg -- http://curl.haxx.se -- http://daniel.haxx.se
      Dedicated custom curl help for hire: http://haxx.se/curl.html
Received on 2004-04-07