curl-users

Re: Slow DNS lookups compared to wget

From: Rick Richardson <rickr_at_mn.rr.com>
Date: Mon, 18 Feb 2002 09:06:23 -0600

On Mon, Feb 18, 2002 at 09:54:24AM +0100, Daniel Stenberg wrote:
> On Fri, 15 Feb 2002, Rick Richardson wrote:
>
> > DNS lookups of some (but not all) URLs are extremely slow using cURL 7.8,
> > but are not slow when using wget:
>
> Judging from other comments in your mail, I'm guessing that you're using a
> default Redhat 7.2 curl installation, right? That would mean that curl says
> it is "ipv6 enabled", right (in the curl -V output)?

Yes.

> > # wget is nice and fast....
> >
> > $ time wget -q -O- "http://quote.bloomberg.com/markets/earnings/ecal.cgi" > xxx
> > real 0m2.511s
> > user 0m0.000s
> > sys 0m0.010s
> >
> >
> > # cURL 7.8 is picking lint from its bellybutton...
> >
> > $ time curl -s "http://quote.bloomberg.com/markets/earnings/ecal.cgi" > xxx
> > real 0m31.857s
> > user 0m0.010s
> > sys 0m0.000s
>
> This certainly indicates a problem somewhere, and I'll be much surprised if
> this is anything that the curl source code does wrong! :-/
>
> The main difference between wget and curl in the DNS resolving parts is that
> wget uses the good old traditional gethostbyname() for name resolves, while
> curl uses getaddrinfo() if compiled "IPv6 enabled", and otherwise uses
> gethostbyname_r() (on systems that offer it).
>
> Is your kernel IPv6 enabled at all?

It's the latest RH 7.2 kernel, 2.4.9-21, which has a module for ipv6. The
module is not insmod'ed by default. Insmod'ing ipv6.o doesn't change
the results.

> Does anyone else have more clues to fill in on this subject here? We could
> check for glibc-related notes, or Redhat errata or whatever...
>
> > Are there any workarounds besides a painful domino-like upgrade of the RPM
> > and the several co-dependent RPMs (libssl, libcrypto, php)?
>
> You either do as Troy suggests, or you grab a source archive and build
> yourself. It really isn't hard. Even for someone who never did it before
> (which doesn't necessarily mean you, I just mean that it is easy).

In the end, it's not me I'm worried about, it's the end users of my software
having to perform the incantations.

I rebuilt curl version 7.9.3 binaries from the Redhat rawhide sources as Troy
suggested, but the problem still remains:

$ cd /tmp/curl
$ LD_PRELOAD=usr/lib/libcurl.so.2 usr/bin/curl --version
curl 7.9.3 (i386-redhat-linux-gnu) libcurl 7.9.3 (OpenSSL 0.9.6b) (ipv6 enabled)

$ LD_PRELOAD=usr/lib/libcurl.so.2 time \
usr/bin/curl -s "http://quote.bloomberg.com/markets/earnings/ecal.cgi" > xxx
0.01user 0.00system 0:30.89elapsed

I think that you may be on to something with this getaddrinfo() lead.

I'll try writing a short test program to see if getaddrinfo() is
broken when trying to resolve this particular DNS name. N.B. the
problem only occurs with *some* DNS names, not all.

-Rick

-- 
Rick Richardson  rickr@mn.rr.com        http://home.mn.rr.com/richardsons/
"other e-mail programs like Eudora are not designed to enable virus replication"
  -- advice from Microsoft Support
Received on 2002-02-18