curl-library
Re: [SECURITY NOTICE] libidn with bad UTF8 input
Date: Thu, 2 Jul 2015 08:42:42 -0600
A clarification question:
After reading some of the links in Dennis's message, I am under the
impression that the problem is purely one of encoding; that is, we have no
problem as long as what's passed is properly encoded utf8. This is in
contrast to a "well formed but unsupported input" scenario where we can
pass validly formed utf8, but it encapsulates a unicode sequence that
libidn can't use.
Which issue are we worried about?
An example of an encoding problem would be passing the 2-byte sequence 0xE4
0x35, which isn't well formed and doesn't map to anything. An example of
"well formed but unsupported" would be passing the 2-byte sequence "\r\n",
which is valid utf8 but which can't possibly map to a valid domain name.
Other familiar corner cases in the latter category include encoding
codepoints from the private use range, codepoints > 0xFFFF, and so forth.
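To make the second category concrete, here is a minimal sketch in C that
flags the corner cases listed above for an already-decoded code point. The
function name is hypothetical, and the ranges only mirror the examples in
this message; they are not libidn's actual acceptance rules, which I don't
know.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true for code points that are perfectly
 * encodable in UTF-8 but fall into the "well formed but unsupported"
 * corner cases named in this message. This is an illustration only,
 * NOT libidn's real rule set. */
static bool is_corner_case_codepoint(uint32_t cp)
{
    if (cp < 0x20 || cp == 0x7F)
        return true;            /* control characters such as \r and \n */
    if (cp >= 0xE000 && cp <= 0xF8FF)
        return true;            /* private use area in the BMP */
    if (cp > 0xFFFF)
        return true;            /* anything beyond the BMP */
    return false;
}
```

Note that every code point this flags still has a valid utf8 encoding, so
an encoding check alone would let all of them through.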
The links in Dennis's message give examples of encoding problems, but the
pushback against doing validation (in libidn, in libcurl, or both) seems to
assume that validation is a difficult task--which, if it's true at all,
can only be true of the "well formed but unsupported" case. Validating utf8
encoding is incredibly easy, because of the way utf8 is designed. It can be
done definitively, quickly, and with perfect confidence with a single for
loop over a buffer. unicode.org has such code in its samples. I can track
that code down, if someone else doesn't.
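For illustration, a single-pass check along these lines might look like the
sketch below. This is not the unicode.org sample; the function name is my
own, and the byte ranges are taken from the UTF-8 syntax in RFC 3629
section 4, so overlong forms, surrogates, and values above U+10FFFF are
rejected along with stray or missing continuation bytes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Sketch (hypothetical name): returns true if buf[0..len) is well-formed
 * UTF-8 according to the byte ranges in RFC 3629 section 4. */
static bool utf8_is_well_formed(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char b0 = buf[i];
        size_t need;                        /* continuation bytes expected */
        unsigned char lo = 0x80, hi = 0xBF; /* range allowed for the first one */

        if (b0 <= 0x7F) { i++; continue; }                 /* ASCII */
        else if (b0 >= 0xC2 && b0 <= 0xDF) need = 1;       /* C0, C1 are overlong */
        else if (b0 == 0xE0) { need = 2; lo = 0xA0; }      /* no overlong 3-byte */
        else if (b0 >= 0xE1 && b0 <= 0xEC) need = 2;
        else if (b0 == 0xED) { need = 2; hi = 0x9F; }      /* no surrogates */
        else if (b0 >= 0xEE && b0 <= 0xEF) need = 2;
        else if (b0 == 0xF0) { need = 3; lo = 0x90; }      /* no overlong 4-byte */
        else if (b0 >= 0xF1 && b0 <= 0xF3) need = 3;
        else if (b0 == 0xF4) { need = 3; hi = 0x8F; }      /* cap at U+10FFFF */
        else return false;                  /* 0x80-0xC1 or 0xF5-0xFF as lead */

        if (len - i <= need)
            return false;                   /* sequence runs off the buffer */
        if (buf[i + 1] < lo || buf[i + 1] > hi)
            return false;                   /* first continuation out of range */
        for (size_t k = 2; k <= need; k++)
            if ((buf[i + k] & 0xC0) != 0x80)
                return false;               /* not of the form 10xxxxxx */
        i += need + 1;
    }
    return true;
}
```

With this, the 0xE4 0x35 example is rejected (0xE4 announces a three-byte
sequence and the buffer ends early), as is the overlong 0xC0 0x80, while
"\r\n" passes because it is plain ASCII.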
Given the ease and speed of validation, if we're only worried about an
encoding issue, I think both libidn and libcurl ought to validate.
If the issue is one of unsupported input, then libidn ought to do the work,
because no other codebase knows what libidn's requirements are.
--Daniel
On Thu, Jul 2, 2015 at 8:15 AM, dev <dev_at_cor0.com> wrote:
>
>
> > On June 29, 2015 at 5:09 PM Daniel Stenberg <daniel_at_haxx.se> wrote:
> >
> >
> > Hi all libcurl users.
> >
> > Here's a little problem many of us need to be aware of!
> <snip>
> > RECOMMENDATION
> >
> > Rebuild libcurl with libidn support disabled.
> <snip>
>
> Another, better way to proceed would be to call a routine to clean up,
> check, fix, or trap bad UTF-8 data.
>
> > REFERENCES
> >
> > [1] =
> >
> https://blog.thijsalkema.de/me/blog//blog/2015/04/17/validate-the-encoding-before-passing-strings-to-libcurl-or-glibc/
>
>
> This was discussed on the libidn list with Simon Josefsson :
> see https://tools.ietf.org/html/rfc3629#section-10
>
> That was back in November last year and the general feeling within the
> libidn world is that it is the responsibility of the application to
> detect and prep utf-8 and not the responsibility of libidn.
>
> As seen at
>
> https://blog.thijsalkema.de/me/blog//blog/2015/04/17/validate-the-encoding-before-passing-strings-to-libcurl-or-glibc/
> :
>
>
> So who should check it?
>
> The libidn developers show little motivation to fix this, pointing
> the blame to applications instead:
>
> Applications should not pass unvalidated strings to stringprep(),
> it must be checked to be valid UTF-8 first. If stringprep()
> receives non-UTF8 inputs, I believe there are other similar
> serious things that can happen.
> Simon Josefsson
>
> http://lists.gnu.org/archive/html/help-libidn/2014-11/msg00002.html
>
>
> Serious things? Yes, you can bet on it. My response to this was that
> we should be checking for bad utf-8, and possibly even doing repair to
> avoid security leakage :
>
> http://lists.gnu.org/archive/html/help-libidn/2014-11/msg00003.html
>
> Wherein I said :
>
> I wrote a UTF-8 check routine on a recent project and it did require
> a fair amount of thought and was not a perfect check algorithm by
> any means. Things such as the byte sequence 0xC0 0x80 (as mentioned in section 10
> of RFC 3629 ) would be reasonable to check but a complete and
> stringent check for compliance could be a fair chunk of work.
>
> Oddly enough I have done some more work and am still not at a "stringent
> check" and perhaps I should get back onto that.
>
> Now then, the key thought on my mind at the moment is "finger pointing"
> where we are all pointing elsewhere and saying "it's your job not mine"
> along with "why didn't you do this?" as opposed to just sitting down and
> reading all of RFC 3629 and then with coffee in hand code out a check
> routine. Yep, that sounds like lots of fun but it is getting to be a
> valid "necessity" as opposed to a fun "want". Also, it just feels so
> wrong to strip away functions from libcurl to protect ourselves from a
> problem that can be solved.
>
> I have code now, working nicely for at least a year, which catches just
> about all four byte utf-8 bit sequence issues and neatly repairs the
> damage. I feel that I should get this code bit out in the open and let
> good people such as you hack at it and maybe we can provide a final and
> reasonable solution to the dreaded nasty UTF-8 bad bits issue. Mostly I
> don't like finger pointing and would rather just put fingers to keys and
> write a solution that works. Mine does. Not perfectly and not in a
> really strict fashion but it is better than being nowhere on this. Also
> I took the approach of "repair" as opposed to just signaling an error. I
> was wrong to do that. Bad utf-8 is bad. However it works for my code
> world and traps and fixes bad data being fired into a database backend.
>
> You can see some of what I mean by looking at :
>
> http://lists.gnu.org/archive/html/help-libidn/2014-11/msg00003.html
>
>
> Let me know your thoughts.
>
> Dennis Clarke
> -------------------------------------------------------------------
> List admin: http://cool.haxx.se/list/listinfo/curl-library
> Etiquette: http://curl.haxx.se/mail/etiquette.html
>
-- "We must remember that those mortals we meet in parking lots, offices, elevators, and elsewhere are that portion of mankind God has given us to love and to serve. It will do us little good to speak of the general brotherhood of mankind if we cannot regard those who are all around us as our brothers and sisters.” --Spencer W. Kimball
Received on 2015-07-02