curl-library

Re: Building libcurl on MS-Windows with UNICODE defined

From: Vincent Torri <vincent.torri_at_gmail.com>
Date: Tue, 1 Nov 2011 20:15:35 +0100

On Tue, Nov 1, 2011 at 7:53 PM, Tom Bishop, Wenlin Institute <tangmu_at_wenlin.com> wrote:

>
> On Oct 15, 2011, at 2:57 PM, Daniel Stenberg wrote:
>
> > On Mon, 10 Oct 2011, Tom Bishop, Wenlin Institute wrote:
> >
> > Thanks for your work and research!
> >
> >> I don't know whether this significantly affects the operation of
> >> libcurl as it is actually used. If libcurl needs any of these functions
> >> to handle non-Latin strings, it will presumably fail.
> >
> > ...
> >
> >> Is there any documentation of this issue for libcurl? I don't find any
> >> mention of it in the source code itself. Excuse me if the issue has
> >> already been addressed on this mailing list.
> >
> > No, I don't believe we've discussed this to any particular degree in the
> > past. At least I can't recall it.
> >
> >> And, is there any interest in adding support for UNICODE? MS-Windows
> >> has supported Unicode since 1995, sixteen years ago. Half the world's
> >> population uses non-Latin scripts, and even for languages such as
> >> English, Unicode provides useful characters that aren't in the
> >> ANSI/Windows code pages.
> >
> > Well, does the current code cause some kind of problem? The way I read
> > your mail is that you think there _might_ be problems, but I don't think
> > anyone has reported/mentioned any up until now and you're not being very
> > specific either, so in my view this is not a critical issue.
>
> Please excuse the lateness of this reply. I agree this does not appear to
> be a critical issue and I understand that a new version is coming soon, so
> I don't suggest making any immediate changes. I'm not aware of any problem
> with the current code, provided that it's built with the makefile (with
> UNICODE not defined), and assuming that user-names, etc., passed to these
> MS-Windows functions, are always ASCII in actual current usage (probably
> true since nobody has complained).
>
> > But if we can fix problems by altering the code, and not cause
> > backward-compatibility problems, then I'm all for it!
> >
> > The only Unicode-related issue on Windows that I can recall is people
> > trying to use curl_formadd() and pass in a Unicode file name path.
> >
> > I'll certainly appreciate a patch and I hope I can get more Windows-savvy
> > people than me to help me review it for correctness.
>
> Thank you! When I have more time to spare I'd like to study it further and
> possibly make some suggestions. The only change I'd suggest testing after
> the next version is to replace FormatMessage(), etc., with
> FormatMessageA(), etc., so that the code will compile and run correctly
> (for code points < U+0080) regardless of whether UNICODE is defined, for
> the benefit of people who compile CURL with different makefiles. (I've done
> that already in my own copy, and it compiles without warnings, but so far
> I'm using only a tiny part of CURL's functionality so I can't say it's
> well-tested. Still, the logic is simple: if UNICODE is not defined, then
> FormatMessage is defined as FormatMessageA anyway, so the replacement has
> no effect. If UNICODE is defined, then FormatMessage is defined as
> FormatMessageW, and it triggers a compiler warning and run-time failure if
> called with "char *". Therefore it's better to use the name FormatMessageA
> explicitly, as long as you're calling it with "char *".)
>
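
For illustration, the explicit-A replacement suggested in the paragraph above
might look like the sketch below. The helper name win_strerror_a() is
hypothetical and is not part of libcurl; only FormatMessageA() and its flags
come from the Windows API. Because the A variant is named explicitly, the call
takes a "char *" buffer whether or not UNICODE is defined.

```c
#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper (not a libcurl function): writes the system message
 * for a Windows error code into a caller-supplied char buffer.
 * Returns buf, or NULL if no usable buffer was given. */
static const char *win_strerror_a(DWORD err, char *buf, size_t buflen)
{
  DWORD n;

  if(!buf || buflen == 0)
    return NULL;
  buf[0] = '\0';

  /* Calling FormatMessageA() by name avoids the UNICODE-dependent
   * FormatMessage macro, so "char *" is always the right buffer type. */
  n = FormatMessageA(FORMAT_MESSAGE_FROM_SYSTEM |
                     FORMAT_MESSAGE_IGNORE_INSERTS,
                     NULL,          /* no message source, use the system table */
                     err,
                     0,             /* default language lookup */
                     buf,
                     (DWORD)buflen, /* buffer size in chars */
                     NULL);

  /* System messages usually end in CR/LF; trim it for one-line output. */
  while(n > 0 && (buf[n - 1] == '\r' || buf[n - 1] == '\n'))
    buf[--n] = '\0';

  return buf; /* empty string if the lookup failed */
}
#endif /* _WIN32 */
```

A caller would pass something like GetLastError() as err. As noted above, this
only yields correct text for code points below U+0080 plus whatever the active
ANSI code page can represent.
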
> Probably there are also potential improvements that would use the
> Unicode-capable versions of the MS-Windows functions, possibly supporting
> UTF-8 strings that get converted to UTF-16 so that CURL's API can still use
> "char *" but non-ASCII characters will get passed correctly to the
> MS-Windows functions.
>

In case such a transformation is necessary, I wrote one based on the code
linked below. Mine is a bit different, though: I compute the size of the
destination string with MultiByteToWideChar(), and I use plain char * and
wchar_t * rather than CString:

http://msmvps.com/blogs/gdicanio/archive/2010/01/04/conversion-between-unicode-utf-16-and-utf-8-in-c-win32.aspx
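
A minimal sketch of the kind of conversion described above, assuming a
NUL-terminated UTF-8 input and a heap-allocated result. This is not the code
from the linked article nor the exact code mentioned in this mail; the function
name utf8_to_utf16() is hypothetical. It calls MultiByteToWideChar() twice:
once to get the required size, once to convert.

```c
#ifdef _WIN32
#include <windows.h>
#include <stdlib.h>

/* Hypothetical helper: converts a NUL-terminated UTF-8 string into a newly
 * allocated, NUL-terminated UTF-16 string. The caller releases the result
 * with free(). Returns NULL on bad input or allocation failure. */
static wchar_t *utf8_to_utf16(const char *utf8)
{
  wchar_t *wide;
  int len;

  if(!utf8)
    return NULL;

  /* First call: with a zero-sized output buffer the function returns the
   * required size in wchar_t units. Passing -1 as the input length means
   * "NUL-terminated", so the count includes the terminator.
   * Flags are 0 here; MB_ERR_INVALID_CHARS could be passed instead to
   * reject malformed UTF-8 on Windows versions that support it. */
  len = MultiByteToWideChar(CP_UTF8, 0, utf8, -1, NULL, 0);
  if(len <= 0)
    return NULL;

  wide = malloc((size_t)len * sizeof(wchar_t));
  if(!wide)
    return NULL;

  /* Second call: perform the actual conversion into the sized buffer. */
  if(MultiByteToWideChar(CP_UTF8, 0, utf8, -1, wide, len) == 0) {
    free(wide);
    return NULL;
  }
  return wide;
}
#endif /* _WIN32 */
```

The result could then be handed to a W function such as CreateFileW(), which
is how an API that keeps "char *" parameters could still pass non-ASCII
names through to MS-Windows.
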

regards

Vincent Torri

-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette: http://curl.haxx.se/mail/etiquette.html
Received on 2011-11-01