Re: IPv6 resolution problems for IPv4 resolve mode

From: Timothe Litt <litt_at_acm.org>
Date: Mon, 23 May 2022 16:02:45 -0400


On 23-May-22 07:21, Daniel Stenberg via curl-library wrote:
> On Fri, 20 May 2022, Dmitry Karpov wrote:
>
>> I understand the rationale for keeping DNS entry in the cache for
>> both addresses, but in my proposal, I suggest to use "dual-stack" DNS
>> queries only for dual-stack and IPv6-only modes. This will make
>> IPv4-only requests in IPv6-enabled libcurl builds behave the same way
>> as they do in IPv4-only builds.
>
> I believe that suggestion would basically revert 84d2839740ca7804, so
> it would need some careful considerations.
>
> Maybe we should rather add some variation to CURLOPT_IPRESOLVE for
> more explicit *also applies to name resolving*? We might need to do
> something about caching/connection reuse too, or at least decide and
> document exactly how those would work in these situations.
>
This seems as if the cache state is not granular enough, and a heuristic
fix was attempted.  Adding more heuristics is not the right approach; it
will make matters worse.

My take:

Only the IPRESOLVE-enabled DNS resolution(s) should be done.  The cache
should reflect what was done, and the result(s).  Subsequent requests
with a different IPRESOLVE setting may hit on the name, but if the
resolution for the new request's record type is missing, the missing
record type should be resolved, and the cache updated.  That is, the
per-hostname cache state for each protocol type is either unknown (no
resolution attempted) or known, with 0 to n addresses of that type.
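
A minimal sketch of that per-hostname state in C.  All names here
(host_entry, fam_state, MAX_ADDRS and so on) are hypothetical and are
not libcurl's actual cache structures; the point is only that "never
resolved" and "resolved, zero records" are distinct states.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is not libcurl's actual cache. */
enum fam { FAM_V4 = 0, FAM_V6 = 1, FAM_COUNT = 2 };

#define MAX_ADDRS 8

struct fam_state {
  int known;                 /* 0: no resolution attempted; 1: resolved */
  size_t naddrs;             /* meaningful only when known; may be 0 */
  char addrs[MAX_ADDRS][46]; /* textual addresses (46 fits any IPv6) */
};

struct host_entry {
  char name[256];
  struct fam_state fam[FAM_COUNT];
};

/* A fresh entry knows nothing about either family. */
static void host_entry_init(struct host_entry *e, const char *name)
{
  memset(e, 0, sizeof(*e));
  strncpy(e->name, name, sizeof(e->name) - 1);
}

/* Record the outcome of one lookup, including "resolved, zero records". */
static void host_entry_store(struct host_entry *e, enum fam f,
                             const char *const *addrs, size_t n)
{
  struct fam_state *s;
  size_t i;
  assert(f == FAM_V4 || f == FAM_V6);
  s = &e->fam[f];
  if(n > MAX_ADDRS)
    n = MAX_ADDRS;           /* the sketch simply truncates long lists */
  memset(s, 0, sizeof(*s));
  for(i = 0; i < n; i++)
    strncpy(s->addrs[i], addrs[i], sizeof(s->addrs[i]) - 1);
  s->naddrs = n;
  s->known = 1;
}
```

Storing a lookup that returned nothing still sets `known`, so a later
request for the same family does not repeat the query.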

If IPRESOLVE is restricted to V4, the DNS request should only be for A
records, and the cache entry should reflect A record(s) for the
specified host, and that only A record state is known.  E.g. hostname,
A, 0-n

If IPRESOLVE is restricted to V6, the same for AAAA.

If IPRESOLVE is unrestricted, then, and only then, the DNS request
should be for both A & AAAA, and both status values and record types cached.
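
The three cases above amount to a small mapping from the IPRESOLVE
setting to the record types that may be queried.  A sketch in C, where
the enum is a hypothetical stand-in for CURL_IPRESOLVE_WHATEVER, _V4
and _V6, and QUERY_A / QUERY_AAAA are made-up flag names:

```c
#include <assert.h>

/* Hypothetical stand-ins for CURL_IPRESOLVE_WHATEVER / _V4 / _V6. */
enum ipresolve { RESOLVE_ANY, RESOLVE_V4, RESOLVE_V6 };

#define QUERY_A    (1u << 0)
#define QUERY_AAAA (1u << 1)

/* Which record types a handle with this setting may query at all. */
static unsigned query_types(enum ipresolve mode)
{
  switch(mode) {
  case RESOLVE_V4:
    return QUERY_A;               /* A only, never AAAA */
  case RESOLVE_V6:
    return QUERY_AAAA;            /* AAAA only, never A */
  case RESOLVE_ANY:
    return QUERY_A | QUERY_AAAA;  /* unrestricted: both */
  }
  assert(!"unknown ipresolve mode");
  return 0;
}
```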

With the correct state, there is no confusion when consulting the cache
for a subsequent request that may have a different IPRESOLVE setting.

If the new request includes V4, and the host has a cache entry marked
valid for V4, the entry tells us how many A records exist. If none,
don't connect using V4.  If the entry is not valid for V4, do a new
lookup and cache the result.

Same for V6.

After consulting/updating the cache, fail the request if no address
family permitted by the handle's IPRESOLVE has at least one address,
i.e. if (resolve_allowed(4) && >0 A) || (resolve_allowed(6) && >0 AAAA)
is false.  Otherwise, try to connect using each available address.
(Using whatever preference/parallelism scheme you like - sequential,
v4-first, v6-first, Happy Eyeballs... once you have the correct state,
you can do the right things.)
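
The consult/update step and the hit test can be sketched together in
C.  Everything here is hypothetical (consult_cache, resolve_fn, the
ALLOW_* flags, and the fake_resolve stand-in); a real resolver would
return addresses, not just a count.

```c
#include <assert.h>
#include <stddef.h>

enum fam { FAM_V4 = 0, FAM_V6 = 1, FAM_COUNT = 2 };
#define ALLOW_V4 (1u << FAM_V4)
#define ALLOW_V6 (1u << FAM_V6)

struct host_entry {
  int known[FAM_COUNT];     /* has this family been resolved? */
  size_t naddrs[FAM_COUNT]; /* record count, valid only when known */
};

/* Hypothetical resolver hook: returns the number of records found. */
typedef size_t (*resolve_fn)(const char *host, enum fam f, void *ctx);

/* Resolve only the allowed families the cache has no answer for, then
   report how many usable addresses the handle may connect to.  A
   result of 0 means the request should fail. */
static size_t consult_cache(struct host_entry *e, const char *host,
                            unsigned allowed, resolve_fn resolve,
                            void *ctx)
{
  size_t usable = 0;
  int f;
  assert(e && resolve);
  for(f = 0; f < FAM_COUNT; f++) {
    if(!(allowed & (1u << f)))
      continue;             /* never look up a disallowed family */
    if(!e->known[f]) {
      e->naddrs[f] = resolve(host, (enum fam)f, ctx);
      e->known[f] = 1;      /* cache the result, even if it is 0 */
    }
    usable += e->naddrs[f];
  }
  return usable;
}

/* Stand-in resolver for illustration: one A record, no AAAA records.
   ctx points at an int that counts how many lookups were issued. */
static size_t fake_resolve(const char *host, enum fam f, void *ctx)
{
  (void)host;
  ++*(int *)ctx;
  return f == FAM_V4 ? 1 : 0;
}
```

With this stand-in, a V4-only request issues one lookup and leaves the
V6 state unknown; a later V6-only request issues exactly one more
lookup and fails, and an unrestricted request after that needs no
lookups at all.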

Similarly, for the connection cache, record the protocol type (you can
also get this from the peer's address length).  When checking the cache
for a reusable connection, only consider cached connections that match
the handle's IPRESOLVE constraints.
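
A sketch of that filter in C, again with hypothetical names
(cached_conn, find_reusable, ALLOW_*) and none of the other criteria
a real connection cache would check (port, TLS settings, and so on):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum fam { FAM_V4 = 0, FAM_V6 = 1 };
#define ALLOW_V4 (1u << FAM_V4)
#define ALLOW_V6 (1u << FAM_V6)

struct cached_conn {
  const char *host;
  enum fam family;   /* recorded when the connection was made */
  int in_use;
};

/* Return the first idle connection to host whose address family is
   permitted by the handle, or NULL if there is none. */
static struct cached_conn *find_reusable(struct cached_conn *conns,
                                         size_t n, const char *host,
                                         unsigned allowed)
{
  size_t i;
  assert(conns || n == 0);
  for(i = 0; i < n; i++) {
    struct cached_conn *c = &conns[i];
    if(c->in_use || strcmp(c->host, host) != 0)
      continue;
    if(allowed & (1u << c->family))
      return c;      /* family matches the handle's constraint */
  }
  return NULL;
}
```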

You can implement the necessary state several ways.

This solves the original problem and does no extraneous (speculative)
DNS lookups.  Thus it also prevents this reporter's issue of the
speculative lookups timing out.

Timothe Litt
ACM Distinguished Engineer
--------------------------
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed.


Received on 2022-05-23