curl-library

Re: More fun with NTLM: Unicode support

From: Joshua Kwan <jkwan_at_vmware.com>
Date: Fri, 2 Oct 2009 11:01:40 -0700

On 10/2/09 10:31, "Daniel Stenberg" <daniel_at_haxx.se> wrote:
> I don't know. The NTLM docs by Eric Glass says "An OEM string is a string in
> which each character is represented as an 8-bit value from the local machine's
> native character set (DOS codepage).". Which makes it feel even older ;-)

Well, if the patch works out, we'll never need to send the name and
password over the wire in OEM format again.
 
>> I'm just trying to gauge what the official opinion of cURL developers is on
>> the iconv front before I get to coding.
>
> Would it be a working approach to instead offer an API for providing the name
> and password as UTF-8 or UTF-16 encoded and thus avoid having to recode them
> within libcurl?

That would work, but then you have the inverse problem: what if the caller
supplied UTF-16 and you need to pass that information into something that
wants UTF-8? Then you'd need to downconvert it, or have the caller pass it
in twice using two different curl_easy_setopt() options. I think that gets
messy. The best thing to do, in my opinion, is to establish a contract (in
the documentation, etc.) that the canonical codeset used within cURL is
UTF-8 unless you specify otherwise.

If you're afraid of introducing a link-time dependency on iconv into cURL,
we could twiddle the configure script this way:

1. If iconv is not available, don't define HAVE_ICONV or
CURL_DOES_CONVERSIONS (identical to the old behavior).
2. If iconv is only available as a separate shared library, enable it only
if --enable-iconv (for example) is passed on the configure command line.
3. If iconv is part of the C library you're going to link against anyway,
enable it automatically unless it is explicitly disabled with
--disable-iconv.
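
To make the three rules concrete, here is a rough sketch of the decision
written as a plain C function. It is only an illustration: the real check
would live in the configure script, and the enum and function names below
are made up for this example and are not part of cURL.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum iconv_source {
  ICONV_NONE,        /* no iconv found at all */
  ICONV_SHARED_LIB,  /* iconv only available as a separate shared library */
  ICONV_IN_LIBC      /* iconv provided by the C library itself */
};

enum user_choice {
  CHOICE_DEFAULT,    /* neither flag given */
  CHOICE_ENABLE,     /* --enable-iconv given */
  CHOICE_DISABLE     /* --disable-iconv given */
};

/* Returns true if the build should define HAVE_ICONV. */
bool use_iconv(enum iconv_source src, enum user_choice choice)
{
  switch(src) {
  case ICONV_NONE:
    return false;                      /* rule 1: old behavior */
  case ICONV_SHARED_LIB:
    return choice == CHOICE_ENABLE;    /* rule 2: opt-in only */
  case ICONV_IN_LIBC:
    return choice != CHOICE_DISABLE;   /* rule 3: on unless opted out */
  }
  assert(!"unknown iconv_source");
  return false;
}
```

The point of the split is that nobody gets a new shared-library dependency
without asking for it, while platforms that already carry iconv in libc get
the conversions for free.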

If you're still afraid, as a stopgap measure it shouldn't be too hard to
provide a UTF-8 -> UTF-16 upconverter in http_ntlm.c. In this case it would
still be useful to state that all curl_easy_setopt() input passed in as
char * is UTF-8 (or, at the very least, the input to the username/password
options).
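
As a rough idea of what such an upconverter might look like, here is a
minimal standalone sketch. It is not the actual patch and not existing
libcurl code: the function name utf8_to_utf16le() and its return convention
are assumptions made for this example. It emits little-endian UTF-16, which
I am assuming is the byte order NTLM wants for its Unicode strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libcurl: convert the NUL-terminated
 * UTF-8 string 'src' into UTF-16LE bytes in 'dst' (no terminator written).
 * Returns the number of bytes written, or (size_t)-1 if the input is
 * malformed or 'dst' is too small. */
size_t utf8_to_utf16le(const char *src, unsigned char *dst, size_t dstlen)
{
  /* smallest code point allowed for each sequence length (overlong check) */
  static const uint32_t min_cp[] = { 0, 0x80, 0x800, 0x10000 };
  const unsigned char *s = (const unsigned char *)src;
  size_t out = 0;

  assert(src != NULL && dst != NULL);

  while(*s) {
    uint32_t cp;
    int extra; /* number of continuation bytes expected */
    int i;

    if(*s < 0x80) { cp = *s; extra = 0; }
    else if((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
    else if((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
    else if((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
    else return (size_t)-1; /* stray continuation or invalid lead byte */
    s++;

    for(i = 0; i < extra; i++, s++) {
      if((*s & 0xC0) != 0x80)
        return (size_t)-1; /* also catches a truncated sequence (NUL) */
      cp = (cp << 6) | (uint32_t)(*s & 0x3F);
    }

    /* reject overlong forms, surrogates and out-of-range values */
    if(cp < min_cp[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
      return (size_t)-1;

    if(cp < 0x10000) {
      if(dstlen - out < 2)
        return (size_t)-1;
      dst[out++] = (unsigned char)(cp & 0xFF);
      dst[out++] = (unsigned char)(cp >> 8);
    }
    else {
      /* outside the BMP: encode as a surrogate pair */
      uint32_t v = cp - 0x10000;
      uint32_t hi = 0xD800 | (v >> 10);
      uint32_t lo = 0xDC00 | (v & 0x3FF);
      if(dstlen - out < 4)
        return (size_t)-1;
      dst[out++] = (unsigned char)(hi & 0xFF);
      dst[out++] = (unsigned char)(hi >> 8);
      dst[out++] = (unsigned char)(lo & 0xFF);
      dst[out++] = (unsigned char)(lo >> 8);
    }
  }
  return out;
}
```

A caller would size the output buffer at two bytes per input byte, which is
always enough, since each UTF-8 byte yields at most one 16-bit unit.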

Let me know what you think.
-Josh

Received on 2009-10-02