Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CURLOPT_IPRESOLVE: preventing wrong IP version from being used #6853

Closed
wants to merge 1 commit into from

Conversation

lvella
Copy link
Contributor

@lvella lvella commented Apr 6, 2021

It is still missing documentation, but I hope it is pretty much this.

Added a new easy option CURLOPT_IP_VERSION, which can take either:

  • CURL_IP_VERSION_WHATEVER
  • CURL_IP_VERSION_V4
  • CURL_IP_VERSION_V6

Where WHATEVER is default behavior, and Vx forces the use of that IP version for the connection, by either selecting an appropriate connection from the pool, or by establishing the connection only via an IP address of the appropriate kind.

This is useful for implementing BitTorrent's BEP-7, which requires the client to announce twice, once via IPv4 and once via IPv6.

@bagder
Copy link
Member

bagder commented Apr 6, 2021

Adding a new option for IP version selection makes it hard to figure out exactly how these two separate and independent options interact and why we allow them to be set differently etc. As far as I can tell, the new functionality you want to bring here is that the selection also prevents an existing connection with the "wrong" address version to be reused and in other ways it is expected to do the same?

An alternative take could then be to perhaps extend the existing option and add these two selectors:

CURL_IPRESOLVE_V4_STRICT and CURL_IPRESOLVE_V6_STRICT (the names can be tweaked further) that would work as the previous options but also be strict about only ever reusing connections with that specific IP version.

It seems easier to document and understand for users and less risk for internal confusions for us libcurl developers.

Thoughts?

@bagder
Copy link
Member

bagder commented Apr 6, 2021

The failed CI jobs are mostly test 1119 that fails due to missing symbols in docs/libcurl/symbols-in-versions

@bagder bagder added the feature-window A merge of this requires an open feature window label Apr 6, 2021
@lvella
Copy link
Contributor Author

lvella commented Apr 6, 2021

Adding a new option for IP version selection makes it hard to figure out exactly how these two separate and independent options interact and why we allow them to be set differently etc. As far as I can tell, the new functionality you want to bring here is that the selection also prevents an existing connection with the "wrong" address version to be reused and in other ways it is expected to do the same?

An alternative take could then be to perhaps extend the existing option and add these two selectors:

CURL_IPRESOLVE_V4_STRICT and CURL_IPRESOLVE_V6_STRICT (the names can be tweaked further) that would work as the previous options but also be strict about only ever reusing connections with that specific IP version.

It seems easier to document and understand for users and less risk for internal confusions for us libcurl developers.

Thoughts?

I do agree that it might be confusing, but I see two issues with extending CURLOPT_IPRESOLVE:

  • I don't like the name, because the new settings do not affect name resolution.
  • It is a valid setting to have both CURL_IPRESOLVE_WHATEVER and CURL_IP_VERSION_V4, because I might want to force an IPv4 connection, but still want the IPv6 addresses to be added to DNS cache (that is my use case).

@lvella
Copy link
Contributor Author

lvella commented Apr 6, 2021

If the two options can be kept separate, it would be better to change the internal variable names to better reflect what option has set them, so it would be obvious for a developer reading the code.

lib/url.c Outdated Show resolved Hide resolved
include/curl/curl.h Outdated Show resolved Hide resolved
@bagder
Copy link
Member

bagder commented Apr 6, 2021

I do agree that it might be confusing, but I see two issues with extending CURLOPT_IPRESOLVE:
I don't like the name, because the new settings do not affect name resolution.

In some ways it does because it can trigger a new name resolve for cases where it previously without this option, wouldn't. So it is at least related.

It is a valid setting to have both CURL_IPRESOLVE_WHATEVER and CURL_IP_VERSION_V4, because I might want to force an IPv4 connection, but still want the IPv6 addresses to be added to DNS cache (that is my use case).

I think that's an internal decision that libcurl will not guarantee based on these options. Sure it might. But it also might not. We don't expose that behavior or knowlwedge to the outside.

To get that functionality we could instead make IPRESOLVE not affect get getaddrinfo() call but only limit what addresses to use from its result, as then it could still cache addresses for a family it won't use (right now).

I would rather argue that if you insist on having a new option for this, the new one should instead be a boolean and be something "insist-on-the-set-IP-version-for-reused-connection" so that it can't contradict the first option.

@lvella
Copy link
Contributor Author

lvella commented Apr 6, 2021

It is tricky to mimic the original behavior of CURL_IPRESOLVE_* if we actually allow both address types to be resolved. Imagine the case where the first request to a host uses CURL_IPRESOLVE_V4, then subsequent requests uses CURL_IPRESOLVE_V6. What I saw is that all connections will use IPv4 unless I also disable DNS caching. If I disable caching, a newly established connections will be of correct type, but reused connections will be random.

In my opinion, this original behavior is not really useful keeping, because it is random and unpredictable, thus unreliable.

If you also don't care to keep the original behavior, I think we should simply replace the behavior of CURL_IPRESOLVE_* with the strict behavior introduced in this PR. Maybe make the new name CURL_IP_VERSION_* an alias, to better reflect the new meaning.

If you want to keep the old behavior, I propose is to make the new options CURL_IPRESOLVE_Vx_STRICT not limit the name resolving, because the connection type will be restricted later, anyway.

What do you think, @bagder ?

@lvella lvella force-pushed the strict-ip-proto branch 2 times, most recently from 85d5256 to f69abf8 Compare April 9, 2021 00:23
@lvella
Copy link
Contributor Author

lvella commented Apr 9, 2021

In this new version pushed, the functionality was implemented by adding options CURL_IPRESOLVE_V4_STRICT and CURL_IPRESOLVE_V6_STRICT to CURLOPT_RESOLVE. The original options are the same, and restrict name resolution. The new option do not affect name resolution, but instead they restrict the connection creation and selection from the connection pool.

@lvella lvella force-pushed the strict-ip-proto branch 5 times, most recently from 96e2597 to 64f7ac7 Compare April 10, 2021 21:06
@lvella
Copy link
Contributor Author

lvella commented Apr 10, 2021

Now the new options are documented, too.

lib/doh.c Outdated Show resolved Hide resolved
lib/url.c Outdated Show resolved Hide resolved
docs/libcurl/opts/CURLOPT_IPRESOLVE.3 Outdated Show resolved Hide resolved
docs/libcurl/opts/CURLOPT_IPRESOLVE.3 Outdated Show resolved Hide resolved
lib/urldata.h Outdated Show resolved Hide resolved
@bagder
Copy link
Member

bagder commented Apr 23, 2021

I've gone back and forth on this, both here in this PR and in my head.

I think I've now (finally) arrived at the place where I agree with your stance: let's change the meaning of this option slightly so that it actually means use this IP version - including when reusing connections. When we don't need any "STRICT" options, we can just use the ones we already have.

It is a minor change in behavior but I really think very very few applications risk getting hurt by this.

My apologies for my "sick-sacking" on this.

@lvella lvella force-pushed the strict-ip-proto branch 2 times, most recently from de4704e to e96bfb5 Compare April 24, 2021 16:51
@lvella
Copy link
Contributor Author

lvella commented Apr 24, 2021

Done! CURLOPT_IPRESOLVE now means "use this IP version".

Since no new options or symbols are introduced, does the label "next-feature-window" still makes sense? Can it be considered a bug fix?

@@ -32,7 +32,7 @@ http
HTTP GET with localhost --interface
</name>
<command>
http://%HOSTIP:%HTTPPORT/%TESTNUMBER -4 --interface localhost
http://%HOSTIP:%HTTPPORT/%TESTNUMBER -4 --interface 127.0.0.1
Copy link
Contributor Author

@lvella lvella Apr 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this change, this test fails in CI. I could not reproduce the failure locally. I assumed it failed because the interface whose IPv6 address is [::1] on the test machines is different from interface of 127.0.0.1, and since localhost resolved to [::1], the socket was bound to the wrong interface.

Maybe this is a good reason to keep CURLOPT_IPRESOLVE separated from the new option CURLOPT_IP_PROTO?

@bagder bagder removed the feature-window A merge of this requires an open feature window label Apr 25, 2021
@bagder bagder self-assigned this May 7, 2021
@lvella lvella closed this May 8, 2021
@lvella lvella reopened this May 8, 2021
@bagder
Copy link
Member

bagder commented May 17, 2021

@lvella sorry for being slow on this PR. Isn't it a better take on the test 2100 to make it IPv4-only by adding -4 to the command line? Making it only run for non-ipv6 builds is basically gonna disable it and that would be unfortunate.

@lvella
Copy link
Contributor Author

lvella commented May 17, 2021

@bagder That is the original solution, but -4 no longer disables the IPv6 DNS request, it just prevents IPv6 addresses from being used. I am not sure how to fix this. Do you think I should reverse this behavior, and make DNS resolvers only request the IP version needed immediately? Or alternatively, is there a way to make the server on the test answer both DoH requests?

@bagder
Copy link
Member

bagder commented May 17, 2021

Ah right, didn't think that through! I now created #7083 as an attempt to make test 2100 work with IPv6 enabled.

@bagder
Copy link
Member

bagder commented May 20, 2021

Okay, now try to remove your test2100 commit and I think things should be fine!

@bagder
Copy link
Member

bagder commented May 20, 2021

I think you also need to rebase it on top of master (at least d9eb3e3)

In some situations, it was possible that a transfer was setup to
use an specific IP version, but due do DNS caching or connection
reuse, it ended up using a different IP version from requested.

This commit changes the effect of CURLOPT_IPRESOLVE from simply
restricting address resolution to preventing the wrong connection
type being used, when choosing a connection from the pool, and
to restricting what addresses could be used when establishing
a new connection.

It is important that all addresses versions are resolved, even if
not used in that transfer in particular, because the result is
cached, and could be useful for a different transfer with a
different CURLOPT_IPRESOLVE setting.
@lvella
Copy link
Contributor Author

lvella commented May 20, 2021

Done! In repos with linear git history, I never know if I must push a merge commit in order to facilitate review, or force push a rebase directly and mess with GitHub "Files changed" tool.

Pushed as a rebase, since you mentioned.

@bagder bagder changed the title Option for strict IP version selection CURLOPT_IPRESOLVE: preventing wrong IP version from being used May 20, 2021
@bagder
Copy link
Member

bagder commented May 20, 2021

Thank you so much for your hard work and patience with this pull request. Stellar work!

@bagder bagder closed this in 84d2839 May 20, 2021
@Appla
Copy link

Appla commented Jun 24, 2022

What if I only want to resolve IPv4?
We if we do not use IPv6 and dont want to waste another dns lookup?
this commit totally changed the original behavior which people may rely on
I think the problem described above should be solve via different path but change an option's behavior

@lvella
Copy link
Contributor Author

lvella commented Jun 24, 2022

What if I only want to resolve IPv4? We if we do not use IPv6 and dont want to waste another dns lookup? this commit totally changed the original behavior which people may rely on I think the problem described above should be solve via different path but change an option's behavior

To make everybody happy, maybe the option should indeed restrict the DNS lookup (like it was was before), and also restrict the connection type (like it is now). But to do this way, the DNS cache would have to change quite a bit (track what IP kinds were resolved or not, track the timeout independently, know if an IP version was not available or not requested, etc), and I was not inclined to do such invasive change back when I wrote this, being it my first patch and all (imagine being the guy who broke curl?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

4 participants