curl-users
Re: libcurl and IDNA
Date: Wed, 7 Apr 2004 08:11:39 +0200 (CEST)
On Tue, 6 Apr 2004, Gisle Vanem wrote:
> With the recent possibility to register domain-names with non-ASCII
> characters it would be nice if libcurl would support that in some way.
I agree completely!
> What would happen in curl now if one enters some IDN in some east-asian
> encoding? I guess it would break in sscanf() etc. (but maybe UTF-8 works?)
I haven't paid enough attention to how the URL would be formatted in these
cases. Do you have any examples?
> This should ideally be the task of the OS or tcp/ip stack, but
> since the standard is very new, it's not. Besides it would not work for
> protocols that expose hostnames in the app. layer.
> E.g. HTTP 1.1 "Host" header must include the domain-name in ACE
> (ASCII Compatible Encoding) form. So this won't work if the web-server
> serves multiple domains:
> GET /some/document HTTP/1.1
> Host: www.tromsø.no
>
> But this should work:
> GET /some/document HTTP/1.1
> Host: www.xn--troms-zuA.no
>
> (try it in curl and see you'll get 2 different outputs).
'www.tromsø.no' is not really a host name I can use with curl on my Linux box
since the resolver refuses to resolve it to an IP address:
curl: (6) Couldn't resolve host 'www.tromsø.no'
www.xn--troms-zuA.no works though.
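For illustration, the ACE name can be derived from the Unicode name with an IDNA ToASCII conversion. Below is a minimal sketch using Python's built-in "idna" codec (IDNA as in RFC 3490); it is not something curl or libcurl does here, and the helper names to_ace and build_request are made up for this example. Note that the codec emits the ACE label in lower case.

```python
def to_ace(hostname: str) -> str:
    """Convert a Unicode host name to its ASCII Compatible Encoding (ACE) form."""
    # The "idna" codec applies nameprep and Punycode to each label (RFC 3490).
    return hostname.encode("idna").decode("ascii")


def build_request(hostname: str, path: str = "/some/document") -> str:
    """Build a minimal HTTP/1.1 request whose Host header carries the ACE name."""
    return f"GET {path} HTTP/1.1\r\nHost: {to_ace(hostname)}\r\n\r\n"


if __name__ == "__main__":
    print(to_ace("www.tromsø.no"))  # www.xn--troms-zua.no
    print(build_request("www.tromsø.no"), end="")
```

Plain ASCII host names pass through unchanged, so a conversion like this could in principle be applied to every host name before it is handed to the resolver or written into the Host header.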
> There are several IDN libraries around, but GNU libidn looks promising.
> http://josefsson.org/libidn/. A drawback is that it requires iconv that adds
> approx 1MByte of data/code for all those charset tables. On Windows it would
> be better off with using WinNLS. But that's another matter.
>
> Any comments?
I think we are going to see quite a lot of domains using non-ASCII characters
shortly, so getting curl to work with them is a rather important task.
-- Daniel Stenberg -- http://curl.haxx.se -- http://daniel.haxx.se
Dedicated custom curl help for hire: http://haxx.se/curl.html
Received on 2004-04-07