curl-library
SIGSEGV in libcurl while removing stalled easy handle from multi handle
Date: Sun, 26 Oct 2008 14:36:53 +0100
rtorrent uses the multi socket interface, and normally it all works
fine. However, some users of libcurl 7.18.2 and 7.19.0 have reported
that occasionally some transfers stall permanently, despite
CURLOPT_TIMEOUT being set to 120 seconds. That is, even though
curl_multi_socket_action is called regularly, the number of running
handles that it returns doesn't change when the timeout passes, and the
app never receives a notification for this transfer from
curl_multi_info_read.
So as a workaround, I added a separate timeout in the app, which
after 125 seconds calls curl_multi_remove_handle for the transfer,
and then curl_easy_cleanup. Most of the time this seems[1] to work
fine too, but occasionally the process receives a SIGSEGV inside the
curl_multi_remove_handle function. This crash has only been reported
for 64-bit platforms, while the stall itself is observed on all kinds
of systems.
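
A minimal sketch of the app-side timeout decision described above, not
rtorrent's actual code. The struct, its fields and the function name
are hypothetical; only the 120 and 125 second figures come from this
report. It covers just the "has this transfer stalled?" test; the
calls to curl_multi_remove_handle and curl_easy_cleanup would follow
when it returns true.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-transfer record kept by the application. */
struct transfer {
    time_t started;   /* when the easy handle was added to the multi handle */
    bool   finished;  /* set once curl_multi_info_read reported completion */
};

/* 5 seconds of slack beyond the 120 second CURLOPT_TIMEOUT. */
enum { APP_TIMEOUT_SECS = 125 };

/* Returns true when the application should give up on the transfer itself. */
static bool transfer_is_stalled(const struct transfer *t, time_t now)
{
    assert(t != NULL);
    if (t->finished)
        return false;  /* already reported by libcurl, nothing to do */
    return difftime(now, t->started) >= APP_TIMEOUT_SECS;
}
```

The sketch uses wall-clock time for brevity; a monotonic clock would be
less sensitive to system clock adjustments.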
Because I haven't been able to reproduce the crash myself, I need
some hints on how best to investigate this crash. Is it possible that
libcurl has internally removed the easy handle without adjusting the
number of running transfers, or that the multi handle has become
corrupt (even though it seems[1] to still work for other transfers)?
I am positive that curl_multi_remove_handle is NEVER called twice on
an easy handle, that it is NEVER called after curl_easy_cleanup and
that curl_easy_cleanup is NEVER called twice on an easy handle. (I
set flags when each function has been called once that will prohibit
further calls.)
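
The flag scheme can be sketched roughly as follows. This is an
illustration under assumptions, not the code in the app: the type and
function names are made up, and the places where the real libcurl calls
would go are only marked by comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lifecycle states tracked by the application. */
enum handle_state { HANDLE_IN_MULTI, HANDLE_REMOVED, HANDLE_CLEANED };

struct guarded_handle {
    void *easy;               /* would hold the CURL * easy handle */
    enum handle_state state;
};

/* Succeeds exactly once, and only while the handle is still attached. */
static bool guard_remove(struct guarded_handle *g)
{
    assert(g != NULL);
    if (g->state != HANDLE_IN_MULTI)
        return false;
    /* curl_multi_remove_handle(multi, g->easy) would be called here */
    g->state = HANDLE_REMOVED;
    return true;
}

/* Succeeds exactly once, and only after a successful guard_remove. */
static bool guard_cleanup(struct guarded_handle *g)
{
    assert(g != NULL);
    if (g->state != HANDLE_REMOVED)
        return false;
    /* curl_easy_cleanup(g->easy) would be called here */
    g->easy = NULL;
    g->state = HANDLE_CLEANED;
    return true;
}
```

A single state value instead of independent flags makes the permitted
order (remove, then cleanup) explicit and rules out a double call of
either function.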
Is there a way of testing whether an easy handle is still supposed to
be valid (that doesn't crash libcurl in the process), and whether
it's still part of a multi handle? The transfer stall is not in
itself a big problem (because it could in principle be worked around
by manually removing the transfer after a while), but if that crashes
libcurl it indicates a deeper problem that needs to be fixed.
Basically, even some assert() tests for whether the handles are valid
would be useful to find out where and why the problem first surfaces.
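
As a sketch of what such an assert() could look like on the
application side: a magic value in the app's own per-transfer record,
checked before the easy handle is passed to libcurl. All names and
constants here are hypothetical. It only validates the application's
record, so it can catch a stale or overwritten record but says nothing
about libcurl's internal state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RECORD_LIVE 0x43524c4cu  /* arbitrary marker for a record in use */
#define RECORD_DEAD 0xdeadbeefu  /* written when the record is retired */

/* Hypothetical application record wrapping one easy handle. */
struct transfer_record {
    uint32_t magic;
    void *easy;  /* would hold the CURL * easy handle */
};

static void record_init(struct transfer_record *r, void *easy)
{
    r->magic = RECORD_LIVE;
    r->easy = easy;
}

/* Call right after curl_easy_cleanup so later use is detectable. */
static void record_retire(struct transfer_record *r)
{
    r->magic = RECORD_DEAD;
    r->easy = NULL;
}

static bool record_is_live(const struct transfer_record *r)
{
    return r != NULL && r->magic == RECORD_LIVE && r->easy != NULL;
}

/* Aborts if the record is stale or has been overwritten. */
static void record_check(const struct transfer_record *r)
{
    assert(record_is_live(r));
}
```

Calling record_check just before curl_multi_remove_handle would at
least show whether the crash is preceded by a bad record in the app.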
I appreciate any ideas or hints.
[1] This is something I have deduced from user reports, so it may not
actually be correct.
-- 
Josef Drexler
josef_at_ttdpatch.net
Received on 2008-10-26