Re: Happy Eyeballs doesn't seem to work with c-ares when IPv6 name servers on top of the name server list don't respond
From: Timothe Litt <litt_at_acm.org>
Date: Thu, 25 Nov 2021 08:51:02 -0500
On 25-Nov-21 03:01, Daniel Stenberg via curl-library wrote:
> On Thu, 25 Nov 2021, Dmitry Karpov via curl-library wrote:
>
>> It seems the problem is that even though libcurl implements Happy
>> Eyeballs for DNS queries and runs A and AAAA queries in parallel,
>> c-ares still goes through the list of name servers as they are listed
>> in the resolv.conf.
>
> All the name resolver solutions libcurl uses perform the entire name
> resolve phase first, before libcurl gets a chance to start trying to
> connect to the first address. This makes a slow name resolve affect
> the starting point of all connect attempts libcurl makes.
>
> In an ideal world, libcurl could start trying to connect to addresses
> as they trickle in from the DNS servers, but since they need to be
> sorted and provided in a certain order, that's far easier said than
> done and the regular libc getaddrinfo() API doesn't allow for it either.
>
>> Is it possible to work around this issue somehow?
>
> My immediate thought is that it needs to be dealt with in c-ares
> somehow. How does the regular getaddrinfo() function behave in this
> situation?
>
This is a resolution issue, not a connection issue.
Performant resolvers such as BIND do (roughly) the following (a small
code sketch follows the list):
* Query the nameservers for a domain in parallel for all requested
record types
* Assuming both IPv4 and IPv6 are desired: once any server replies
  with the A record(s) and the same (or any other) server replies
  with the AAAA, return the results
o Connection follows
o Note that once you have both record types, further replies will
(modulo DNS consistency) return identical information - there's
no point in waiting for them
* Any replies that come in later are ignored
* A cache of nameserver response times is maintained; future queries
for a domain go (only) to the fastest server
* Periodically, the slow nameservers are included in future queries,
since network congestion or operational issues may have cleared,
speeding up the "slow" server(s).
* This process applies equally to final resolution of the desired host
  and any intermediate queries (e.g. to the root nameservers, the TLD
  servers, and any intermediates).
The problem with putting name resolution into an application such as
cURL is that you typically don't have a good way to persist the
knowledge about nameserver performance or the reply cache across
invocations. This is why site caching resolvers (e.g. BIND) are used,
and why the systemd-resolved daemon was invented as an alternative.
It's also worth noting that these solutions provide DNSSEC validation
and resolution via other protocols (e.g. LLMNR, mDNS), which custom
resolvers typically don't.
The async resolver does prevent blocking when talking to multiple hosts,
but in a modern environment it would do better to achieve its goals by
using threads to invoke getaddrinfo. That is, create a thread for each
resolution request and let the thread block. Then wait for any thread to
complete and proceed with that connection; keep waiting for the next
completion (DNS or I/O) until done. That way, the issues with caching
and server selection are handled better - better yet, by someone else!
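
For what it's worth, here is a rough sketch of that thread-per-request
shape, using pthreads and getaddrinfo. It is not libcurl's actual
threaded resolver; the hostnames and port are just examples, and error
handling is minimal. Build with -pthread.

    /* Each worker blocks in getaddrinfo(); the main thread is woken as
     * soon as ANY lookup finishes, so it can start connecting right away. */
    #include <netdb.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>

    #define NHOSTS 3

    struct lookup {
        const char *host;
        struct addrinfo *res;   /* filled in by the worker thread */
        int err;                /* getaddrinfo() return code */
        int done;
    };

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

    static void *resolve_one(void *arg)
    {
        struct lookup *lu = arg;
        struct addrinfo hints;
        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_UNSPEC;       /* ask for both A and AAAA */
        hints.ai_socktype = SOCK_STREAM;

        int rc = getaddrinfo(lu->host, "443", &hints, &lu->res); /* blocks here */

        pthread_mutex_lock(&lock);
        lu->err = rc;
        lu->done = 1;
        pthread_cond_signal(&cond);        /* wake the waiter: one lookup done */
        pthread_mutex_unlock(&lock);
        return NULL;
    }

    int main(void)
    {
        struct lookup lus[NHOSTS] = {
            { "example.com" }, { "example.net" }, { "example.org" }
        };
        pthread_t tid[NHOSTS];

        for (int i = 0; i < NHOSTS; i++)
            pthread_create(&tid[i], NULL, resolve_one, &lus[i]);

        /* Handle each lookup as it completes, in completion order. */
        pthread_mutex_lock(&lock);
        for (int finished = 0; finished < NHOSTS; ) {
            int progressed = 0;
            for (int i = 0; i < NHOSTS; i++) {
                if (lus[i].done) {
                    lus[i].done = 0;       /* consume this completion */
                    finished++;
                    progressed = 1;
                    printf("%s: %s\n", lus[i].host,
                           lus[i].err ? gai_strerror(lus[i].err)
                                      : "resolved, start connecting");
                    if (!lus[i].err)
                        freeaddrinfo(lus[i].res);
                }
            }
            if (!progressed)
                pthread_cond_wait(&cond, &lock);  /* sleep until a lookup signals */
        }
        pthread_mutex_unlock(&lock);

        for (int i = 0; i < NHOSTS; i++)
            pthread_join(tid[i], NULL);
        return 0;
    }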
Keeping your local caching resolver running - if you have one - is a
separate, global issue. And things like systemd-resolved are
implemented as local resolvers (on 127.0.0.x), which resolv.conf points to.
As in another recent discussion, one wants to avoid re-inventing the wheel.
Timothe Litt
ACM Distinguished Engineer
--------------------------
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed.
Received on 2021-11-25