curl-library

Re: Status of IDN support?

From: Tim Ruehsen <tim.ruehsen_at_gmx.de>
Date: Wed, 11 Jan 2017 12:23:34 +0100

On Tuesday, January 10, 2017 11:40:49 PM CET Daniel Stenberg wrote:
> On Tue, 10 Jan 2017, Alessandro Ghedini wrote:
> >> TESTFAIL: These test cases failed: 165 1034 1035 2046 2047
> >
> > Note that this is with curl 7.52.1 and libidn2 0.14 from Debian unstable.
>
> I suspect this has something to do with libidn2's limitations, but we
> haven't changed any IDN code in curl since 7.52.1 that I can recall and I
> use 0.14 too.

Sorry for dropping in late... I made the recent changes to libidn2, which
basically add TR46 support.

Now the bad news: I introduced a bug (regression) regarding NFC conversion in
libidn2 0.14. A fix is already in the upstream repo but has not been released yet.
This might be causing the test failures you are seeing: on some systems
UTF-8/Unicode input is decomposed and on others it is composed. Using decomposed
codepoints with IDN2_NFC_INPUT fails with libidn2 0.14.

But if you enable the new TR46 feature, the input is NFCed (and lowercased)
automatically:

Example:
$ printf "\x62\x6c\x61\xcc\x8a\x62\xc3\xa6\x72\x67\x72\xc3\xb8\x64\x2e\x6e\x6f"|idn2
idn2: lookup: string is not in Unicode NFC format

$ printf "\x62\x6c\x61\xcc\x8a\x62\xc3\xa6\x72\x67\x72\xc3\xb8\x64\x2e\x6e\x6f"|idn2 -T
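
To see why the first call complains, note that the test string spells the
"å" as "a" followed by U+030A (COMBINING RING ABOVE, bytes cc 8a) rather than
as the precomposed U+00E5. Below is a minimal sketch in plain C (no libidn2
needed) that flags such input. The function names utf8_decode() and
count_combining_marks() are made up for illustration, and checking only the
U+0300..U+036F block is a rough heuristic, not a real NFC check: a combining
mark can legitimately remain in NFC text when no precomposed form exists.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode one UTF-8 sequence of at most len bytes starting at s.
 * Stores the code point in *cp and returns the number of bytes consumed,
 * or 0 on malformed input. Deliberately minimal: it does not reject
 * overlong encodings or surrogates. */
static size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
    size_t need, i;

    if (len == 0)
        return 0;
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {
        *cp = s[0] & 0x1F;
        need = 1;
    } else if ((s[0] & 0xF0) == 0xE0) {
        *cp = s[0] & 0x0F;
        need = 2;
    } else if ((s[0] & 0xF8) == 0xF0) {
        *cp = s[0] & 0x07;
        need = 3;
    } else {
        return 0;                      /* stray continuation or invalid byte */
    }
    if (len < need + 1)
        return 0;                      /* truncated sequence */
    for (i = 1; i <= need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                  /* not a continuation byte */
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    return need + 1;
}

/* Hypothetical helper: count code points in the Combining Diacritical
 * Marks block (U+0300..U+036F). Returns -1 on malformed UTF-8.
 * A non-zero result hints that the input may be decomposed. */
int count_combining_marks(const char *utf8)
{
    const unsigned char *p = (const unsigned char *)utf8;
    size_t len = strlen(utf8);
    int count = 0;

    while (len > 0) {
        uint32_t cp;
        size_t n = utf8_decode(p, len, &cp);

        if (n == 0)
            return -1;
        if (cp >= 0x0300 && cp <= 0x036F)
            count++;
        p += n;
        len -= n;
    }
    return count;
}
```

When writing the test string as a C literal, split it wherever a hex escape is
followed by a hex digit, e.g. "bla\xcc\x8a" "b\xc3\xa6rgr\xc3\xb8" "d.no",
otherwise the compiler reads "\x8ab" as a single (out of range) escape.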

You can check for TR46 availability:

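#if IDN2_VERSION_NUMBER >= 0x00140000
        /* TR46 available: the input is NFCed and lowercased automatically */
        if ((rc = idn2_lookup_u8((uint8_t *)utf8, (uint8_t **)ascii,
                                 IDN2_TRANSITIONAL)) == IDN2_OK)
...
#else
        /* older libidn2 without TR46 support */
...
#endif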

IDN2_TRANSITIONAL enables TR46 'transitional' conversion, which tries to be
compatible with both IDNA2008 and IDNA2003 as much as possible.
IDN2_NONTRANSITIONAL enables TR46 'non-transitional' conversion (IDNA2008, the
way every app should go, although some incompatibilities may arise with
IDNA2003, which is still in heavy use).

Hope that helps.

Tim


Received on 2017-01-11