curl-library
running_handles and the multi API
Date: Sat, 30 Jan 2010 23:08:09 +0100 (CET)
Hi friends,
multi interface users of libcurl, this issue may concern you so please bare
with my longish mail and I'll get to the point eventually...
I visited a company yesterday (who's name is left here since it isn't
important to this dicussion) that is using libcurl quite heavily and I got to
debug libcurl when used by an application simulating hundreds of clients, each
doing tens of requests.
These guys of course use the multi_socket API but it turns out at times it
seemed to "forget" connections (which caused a hang). It turned out to be an
existing (7.19.7) bug in libcurl and it happened like this:
app calls curl_multi_add_handle() to add a new easy handle, libcurl will then
set it to timeout in 1 millisecond so libcurl will tell the app about it.
The app's timeout fires off that there's a timeout, the app calls libcurl as
we so often document it:
do {
res = curl_multi_socket_action(... TIMEOUT ...);
} while(CURLM_CALL_MULTI_PERFORM == res);
And this is the problem number one:
When curl_multi_socket_action() is called with no specific handle, but only
a timeout-action, it will *only* perform actions within libcurl that are
marked to run at this time. In this case, the request would go from INIT to
CONNECT and return CURLM_CALL_MULTI_PERFORM. When the app then calls libcurl
again, there's no timer set for this handle so it remains in the CONNECT
state. The CONNECT state is a transitional state in libcurl so it reports no
sockets there, and thus libcurl never tells the app anything more about that
easy handle/connection.
libcurl _does_ set a 1ms timeout for the handle at the end of
multi_runsingle() if it returns CURLM_CALL_MULTI_PERFORM, but since the loop
is instant the new job is not ready to run at that point (and there's no
code that makes libcurl call the app to update the timout for this new
timeout). It will simply rely on that some other timeout will trigger later
on or that something else will update the timeout callback. This makes the
bug fairly hard to repeat.
We can fix this problem in two ways, both with side-effects attached:
A) The high performance fix is to introduce a loop in lib/multi.c (line 1988
in current CVS), around the call to multi_runsingle() for the timeout-
actions and simply check for CURLM_CALL_MULTI_PERFORM internally. This
has the added benefit that this goes in line with my long-term wishes to
get rid of the CURLM_CALL_MULTI_PERFORM all together from the public API.
The downside of this fix, is that the counter we return in
'running_handles' in several of our public functions then gets a slightly
new and possibly confusing behavior during times:
If an app adds a handle that fails to connect (very quickly) it may just
as well never appear as a 'running_handle' with this fix. Previously it
would first bump the counter only to get it decreased again at next call.
Even I have used that change in handle counter to signal "end of a
transfer". The only *good* way to find the end of a individual transfer
is calling curl_multi_info_read() to see if it returns one.
Of course, if the app previously did the looping before it checked the
counter, it really shouldn't be any new effect.
Further, curl_multi_info_read() is not as "cheap" as we'd like it to be
for being this single function to use for this purpose. It currently scans
over all added easy handles and we should improve that to instead become
a proper queue of messages that is instant to check if empty.
B) A way to keep the existing running_handles behavior is to make sure the
timeout callback gets called when the timeout has been changed for this
case, but that will delay the processing 1ms and it makes the flow a lot
uglier.
But even with such a fix I hesitate to claim that running_handles always
will get bumped when a new handle has been added. If the handle simply
dies instantly, it will/should never count as running.
Hm, now when I've spent the time to write all this, I must say I lean towards
method (A) rather strongly even though it may alter behavior slightly.
If you've read this far and have an idea of what I've tried to say, I'd say
you're qualified to have an opinion on the matter so please tell me what you
think or if I've missed some detail or whatever.
-- / daniel.haxx.se ------------------------------------------------------------------- List admin: http://cool.haxx.se/list/listinfo/curl-library Etiquette: http://curl.haxx.se/mail/etiquette.htmlReceived on 2010-01-30