wrong handling of broken IPv6 #3585

Closed
buzo-ffm opened this issue Feb 18, 2019 · 17 comments

@buzo-ffm

curl now behaves differently than before when a site is reachable over IPv4 but not over IPv6. This causes NetworkManager (which uses libcurl) to loop forever and consume 100% of one CPU core. Please see https://bugs.archlinux.org/task/61688 for all the details.

That report identifies commit 4c35574 as the cause; see comment https://bugs.archlinux.org/task/61688#comment177215 .

@buzo-ffm
Author

The bug report for network-manager about this issue is here: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/issues/123

@bagder
Member

bagder commented Feb 18, 2019

I think this problem (even if this report is doing a lot of wild guessing and drawing funny conclusions) is already addressed in git with 4015fae, since it seems similar to the symptoms in #3542.

@heftig
Contributor

heftig commented Feb 19, 2019

We backported that commit and the one it fixes without success.

@bagder
Member

bagder commented Feb 19, 2019

Can you make a standalone libcurl-using example that reproduces the problem?

@bagder
Member

bagder commented Feb 19, 2019

And maybe try the git master version in its entirety first to make sure that the problem is still there.

@buzo-ffm
Author

The libcurl-using code is here within the #if WITH_CONCHECK blocks, but I don't know how to make a standalone example from it.

@heftig
Contributor

heftig commented Feb 19, 2019

The problem is still present as of commit 5908e90.

FTR, the URL in question here is http://www.archlinux.org/check_network_status.txt, so http2 is not involved.

@heftig
Contributor

heftig commented Feb 19, 2019

@buzo-ffm NM's connectivity code had lots of changes in master; we're actually running https://gitlab.freedesktop.org/NetworkManager/NetworkManager/blob/nm-1-14/src/nm-connectivity.c .

@bagder
Member

bagder commented Feb 19, 2019

The title says "wrong handling" of "broken IPv6". In what way is it broken?

@heftig
Contributor

heftig commented Feb 19, 2019

I have a working dual stack setup here; dropping all IPv6 packets from www.archlinux.org (via ip6tables -A INPUT -s 2a01:4f8:172:1d86::1/128 -j DROP) makes NM enter an infinite loop once it attempts to retrieve the URL.

@bagder
Member

bagder commented Feb 19, 2019

If you, on the same machine with that filter enabled, run curl -v http://www.archlinux.org/check_network_status.txt (built from the problematic commit), what happens then?

@heftig
Contributor

heftig commented Feb 19, 2019

It runs fine, first trying IPv6 and quickly failing, then using IPv4. With -6 it times out after two minutes.

@bagder
Member

bagder commented Feb 19, 2019

Unfortunately my dev machines don't have working IPv6 so I can't easily reproduce this setup.

With a debug build created with configure --enable-debug, the curl command line tool gets an undocumented new option called --test-event that will make it run its normal operations internally with the event-based API (that NM is using) instead of the regular easy API that it otherwise uses. Any chance you can see if that reproduces it?

$ curl --test-event -v http://www.archlinux.org/check_network_status.txt

@heftig
Contributor

heftig commented Feb 19, 2019

Hah, that does get into a loop printing * call curl_multi_socket_action(socket 4).

@heftig
Contributor

heftig commented Feb 19, 2019

* STATE: INIT => CONNECT handle 0x55590a766ea8; line 1429 (connection #-5000)
* Added connection 0. The cache now contains 1 members
* STATE: CONNECT => WAITRESOLVE handle 0x55590a766ea8; line 1470 (connection #0)
*   Trying 2a01:4f8:172:1d86::1...
* TCP_NODELAY set
* STATE: WAITRESOLVE => WAITCONNECT handle 0x55590a766ea8; line 1551 (connection #0)
* socket cb: socket 3 ADDED as OUT
*   Trying 138.201.81.199...
* TCP_NODELAY set
* socket cb: socket 4 ADDED as OUT
* call curl_multi_socket_action(socket 4)
* call curl_multi_socket_action(socket 4)
* call curl_multi_socket_action(socket 4)
[et cetera]
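
The repeated `call curl_multi_socket_action(socket 4)` lines are consistent with a level-triggered event loop that keeps watching a socket for writability although nothing more needs to be written to it. The following is only a minimal sketch of that effect in plain C, assuming a POSIX system. It does not use libcurl at all: the write end of an empty pipe stands in for the connected socket, and `count_wakeups` and `wakeups_on_writable_fd` are hypothetical helper names invented for this illustration.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: poll `fd` `rounds` times with a zero timeout and
 * count how often poll() reports at least one event for it. */
int count_wakeups(int fd, short events, int rounds)
{
    int wakeups = 0;
    assert(rounds >= 0);
    for (int i = 0; i < rounds; i++) {
        struct pollfd pfd = { .fd = fd, .events = events, .revents = 0 };
        if (poll(&pfd, 1, 0) > 0)
            wakeups++;
    }
    return wakeups;
}

/* Hypothetical helper: returns the number of wakeups seen while the write
 * end of an empty pipe (a stand-in for a connected socket) stays registered
 * with `events`, or -1 if the pipe cannot be created. */
int wakeups_on_writable_fd(short events, int rounds)
{
    int fds[2];
    int n;
    if (pipe(fds) != 0)
        return -1;
    n = count_wakeups(fds[1], events, rounds);
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Calling `wakeups_on_writable_fd(POLLOUT, 1000)` should wake up on every round, because a writable descriptor stays writable until the registration changes, while `wakeups_on_writable_fd(0, 1000)` should stay quiet. An application that is never told to change or drop its `OUT` registration would spin in the same way.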

@buzo-ffm
Author

FWIW, my IPv6 is working fine, yet I am hit by this bug. The reason could be me having more than one default route:

default via fe80::**** dev enp0s31f6 proto ra metric 100 pref medium
default via fe80::**** dev wlp58s0 proto ra metric 600 pref medium
default via fd28:**** dev buzo metric 1024 pref low

@heftig
Contributor

heftig commented Feb 19, 2019

Since curl moves on to IPv4 quite quickly if the initial attempt at IPv6 takes too long, I guess just having a slow enough connection to the server is enough to trigger this.

bagder added a commit that referenced this issue Feb 19, 2019
The variable wasn't properly reset within the loop and thus could remain
set for sockets that hadn't been set before and thus missed notifying
the app.

Detected-by: Jan Alexander Steffens
Fixes #3585
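
The class of bug the commit message describes, a flag that is not reset at the top of a loop and so leaks from one socket to the next, can be sketched as follows. This is an illustrative model only, not curl's actual code, and it does not try to reproduce the exact sequence in the log above: the function name `count_notifications`, the flag name `seen_before`, and the `reset_each_iteration` switch are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: counts how many of the `ncur` current sockets the app
 * would be notified about, given the `nold` sockets it already knows.
 * With reset_each_iteration == false the flag leaks between iterations. */
size_t count_notifications(const int *cur, size_t ncur,
                           const int *old, size_t nold,
                           bool reset_each_iteration)
{
    size_t notified = 0;
    bool seen_before = false;

    assert(cur != NULL || ncur == 0);
    for (size_t i = 0; i < ncur; i++) {
        if (reset_each_iteration)
            seen_before = false;   /* the fix: start fresh for every socket */
        for (size_t j = 0; j < nold; j++) {
            if (cur[i] == old[j]) {
                seen_before = true;
                break;
            }
        }
        if (!seen_before)
            notified++;            /* stand-in for invoking the socket callback */
    }
    return notified;
}
```

With a known socket 3 and current sockets 3 and 4, the variant without the reset never notifies the app about socket 4, because the flag set while handling socket 3 is still set. The variant with the reset reports socket 4 once.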
@bagder bagder closed this as completed in afc00e0 Feb 20, 2019
@lock lock bot locked as resolved and limited conversation to collaborators May 21, 2019