Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CURLMOPT_SOCKETFUNCTION and CURLMOPT_TIMERFUNCTION can't report errors to libcurl #8083

Closed
juchem opened this issue Dec 2, 2021 · 5 comments
Assignees

Comments

@juchem
Copy link

juchem commented Dec 2, 2021

I did this

According to the documentation, CURLMOPT_TIMERFUNCTION return value can report an error back to libcurl.

I've returned -1 from this function upon error. The application simply sat idle, cycling on the event loop without any reports of a failed transfer - which makes sense since curl_multi_socket_action never gets subsequently called.

After checking the documentation for the adjacent CURLMOPT_SOCKETFUNCTION, it has no mention of the return value's semantics.

Returning -1 from the latter seemed to have the same effect.

Further inspection of the source code indicates the return value of these callbacks is always disregarded:

  • CURLMOPT_TIMERFUNCTION: 1 2
  • CURLMOPT_SOCKETFUNCTION: 1 2 3

I expected the following

The documentation is unclear about how the return values from these functions are handled.

Arguably, a reasonable interpretation would expect libcurl to take appropriate measures - e.g.: aborting the transfer for the easy handle in question.

At a minimum, the documentation should state that the values are ignored, with possible workarounds - e.g.: how to cancel the easy handle appropriately in the case of CURLMOPT_SOCKETFUNCTION; is it safe to remove the handle from within the callback?

Ideally, the return values should be inspected and handled appropriately.

It is common for CURLMOPT_TIMERFUNCTION to start a custom timer, as well as for CURLMOPT_SOCKETFUNCTION to (possibly) allocate and start a poller. If these procedures fail, the state of the asynchronous state machine is undefined / undocumented.

Some state machines rely on appropriate behavior in order to stop / free resources (e.g.: stop and free the poller upon CURL_POLL_REMOVE).

It is unclear which events are guaranteed to be delivered, even in the presence of errors, so that appropriate resource management can take place without additional custom machinery being implemented (e.g.: garbage collect pollers from dead requests; call curl_multi_socket_action with CURL_SOCKET_TIMEOUT from the event loop as a fallback for failed timer setup).

Likewise, it is unclear what's the appropriate follow up for the failed requests, or how to even detect if the request failed due to a callback returning -1.

Should one of these callbacks fail, thus having to report -1 back to libcurl, is it ever possible for the state machine to fall into some inconsistent state?

I'd love to be able to contribute a fix, but I currently don't have the time, nor expertise in libcurl's codebase to do so appropriately. The least I can do as a token of gratitude for this great library is a detailed bug report.

curl/libcurl version

Checked sources for latest head as of this reporting.

Observed on the following build:

curl 7.78.0-DEV (Linux) libcurl/7.78.0-DEV OpenSSL/1.1.1l zlib/1.2.11 c-ares/1.17.0
Release-Date: [unreleased]
Protocols: http https 
Features: alt-svc AsynchDNS HSTS HTTPS-proxy IPv6 Largefile libz SSL UnixSockets

operating system

Seemingly not OS specific. Observed on Linux bst 5.14.0-4-amd64 #1 SMP Debian 5.14.16-1 (2021-11-03) x86_64 GNU/Linux.

@bagder bagder self-assigned this Dec 2, 2021
@bagder
Copy link
Member

bagder commented Dec 2, 2021

Thanks for this fine report!

I believe I never implemented any action in the event of an error being returned from these callbacks partly because I couldn't make up my mind on what exactly the correct course of action would be when one of those callbacks return error. That's also why the documentation doesn't say anything about it. Another reason would probably be that applications very rarely actually return error there so this omission has remained ignored/forgotten...

I guess the only sensible way to treat these errors is to immediately cancel the transfer, since a transfer done with an application that is setup to use these callbacks cannot really be handled correctly anymore from that point.

@bagder
Copy link
Member

bagder commented Dec 2, 2021

Ah right, it needs to cancel all transfers currently in progress because these callbacks are per multi-handle, not per transfer.

@bagder
Copy link
Member

bagder commented Dec 3, 2021

I think perhaps the slightly trickier part with this behavior is that it will take down all currently active transfers but once they're all closed down, it should probably be possible to start over and add new connections to the multi handle again.

I'll try to make a test case for all this...

bagder added a commit that referenced this issue Dec 3, 2021
The callbacks were partially documented to support this. Now the
behavior is documented and returning error from either of these
callbacks will effectively kill all currently ongoing transfers.

Added test 530 to verify

Reported-by: Marcelo Juchem
Fixes #8083
Closes #....
@juchem
Copy link
Author

juchem commented Dec 3, 2021

I've done some testing and had success with the following approach:

  • when the timer callback fails, the application sets an internal flag so it can later call curl_multi_socket_action with CURL_SOCKET_TIMEOUT from the event loop. It doesn't seem like curl needs to take any action other than document the workaround so that application developers can implement it;
  • when the socket callback fails, call curl_multi_socket_action for the fd in question with CURL_CSELECT_ERR. This is something that curl might be able to do internally when the callback returns -1. I observed that the callback got called with CURL_POLL_REMOVE as expected and the state machine worked just fine to free resources associated with that transfer.

I don't believe curl needs to always cancel all transfers, since:

  • the socket callback is per fd so only that fd's transfers need to be cancelled;
  • there's a workaround for timeout callback, albeit not ideal. Maybe let the application decide in this case? E.g.: if callback errors out, it can:
    • return 0 and do the flag/poll thing;
    • return -1 and the whole multi handle gets aborted.
      Does that make any sense?

@bagder
Copy link
Member

bagder commented Dec 3, 2021

when the timer callback fails

If the application wants to avoid causing all transfers to fail, it can do that by not returning error from the callback and it can then opt to handle it using a work-around as you describe. I don't think that's a way we should encourage though, as it will be complicated and error-prone. The timer is meant to be able to expire many times during a single transfer, and a work-around then needs to replace all of them artificially.

the socket callback is per fd so only that fd's transfers need to be cancelled;

The specific fd is not always unique for one or a subset of transfer, it can also be the name resolving fd for example which is used by (almost) all transfers. So, yes, while we could make libcurl not fail all transfers for socket cb failure, I don't think think it is worth the trouble to support that.

bagder added a commit that referenced this issue Dec 3, 2021
The callbacks were partially documented to support this. Now the
behavior is documented and returning error from either of these
callbacks will effectively kill all currently ongoing transfers.

Added test 530 to verify

Reported-by: Marcelo Juchem
Fixes #8083
Closes #8089
@bagder bagder closed this as completed in 2b3dd01 Dec 6, 2021
creaktive added a commit to sparky/perl-Net-Curl that referenced this issue Feb 10, 2022
Socket callback is mandatory for Net::Curl::Multi; it looks like this
was not enforced until some late version.

Likely related to curl/curl#8083
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

2 participants