cURL / Mailing Lists / curl-library / Single Mail

curl-library

running_handles and the multi API

From: Daniel Stenberg <daniel_at_haxx.se>
Date: Sat, 30 Jan 2010 23:08:09 +0100 (CET)

Hi friends,

multi interface users of libcurl, this issue may concern you so please bare
with my longish mail and I'll get to the point eventually...

I visited a company yesterday (who's name is left here since it isn't
important to this dicussion) that is using libcurl quite heavily and I got to
debug libcurl when used by an application simulating hundreds of clients, each
doing tens of requests.

These guys of course use the multi_socket API but it turns out at times it
seemed to "forget" connections (which caused a hang). It turned out to be an
existing (7.19.7) bug in libcurl and it happened like this:

  app calls curl_multi_add_handle() to add a new easy handle, libcurl will then
  set it to timeout in 1 millisecond so libcurl will tell the app about it.

  The app's timeout fires off that there's a timeout, the app calls libcurl as
  we so often document it:

   do {
    res = curl_multi_socket_action(... TIMEOUT ...);
  } while(CURLM_CALL_MULTI_PERFORM == res);

  And this is the problem number one:

  When curl_multi_socket_action() is called with no specific handle, but only
  a timeout-action, it will *only* perform actions within libcurl that are
  marked to run at this time. In this case, the request would go from INIT to
  CONNECT and return CURLM_CALL_MULTI_PERFORM. When the app then calls libcurl
  again, there's no timer set for this handle so it remains in the CONNECT
  state. The CONNECT state is a transitional state in libcurl so it reports no
  sockets there, and thus libcurl never tells the app anything more about that
  easy handle/connection.

  libcurl _does_ set a 1ms timeout for the handle at the end of
  multi_runsingle() if it returns CURLM_CALL_MULTI_PERFORM, but since the loop
  is instant the new job is not ready to run at that point (and there's no
  code that makes libcurl call the app to update the timout for this new
  timeout). It will simply rely on that some other timeout will trigger later
  on or that something else will update the timeout callback. This makes the
  bug fairly hard to repeat.

We can fix this problem in two ways, both with side-effects attached:

  A) The high performance fix is to introduce a loop in lib/multi.c (line 1988
     in current CVS), around the call to multi_runsingle() for the timeout-
     actions and simply check for CURLM_CALL_MULTI_PERFORM internally. This
     has the added benefit that this goes in line with my long-term wishes to
     get rid of the CURLM_CALL_MULTI_PERFORM all together from the public API.

     The downside of this fix, is that the counter we return in
     'running_handles' in several of our public functions then gets a slightly
     new and possibly confusing behavior during times:

     If an app adds a handle that fails to connect (very quickly) it may just
     as well never appear as a 'running_handle' with this fix. Previously it
     would first bump the counter only to get it decreased again at next call.
     Even I have used that change in handle counter to signal "end of a
     transfer". The only *good* way to find the end of a individual transfer
     is calling curl_multi_info_read() to see if it returns one.

     Of course, if the app previously did the looping before it checked the
     counter, it really shouldn't be any new effect.

     Further, curl_multi_info_read() is not as "cheap" as we'd like it to be
     for being this single function to use for this purpose. It currently scans
     over all added easy handles and we should improve that to instead become
     a proper queue of messages that is instant to check if empty.

  B) A way to keep the existing running_handles behavior is to make sure the
     timeout callback gets called when the timeout has been changed for this
     case, but that will delay the processing 1ms and it makes the flow a lot
     uglier.

     But even with such a fix I hesitate to claim that running_handles always
     will get bumped when a new handle has been added. If the handle simply
     dies instantly, it will/should never count as running.

Hm, now when I've spent the time to write all this, I must say I lean towards
method (A) rather strongly even though it may alter behavior slightly.

If you've read this far and have an idea of what I've tried to say, I'd say
you're qualified to have an opinion on the matter so please tell me what you
think or if I've missed some detail or whatever.

-- 
  / daniel.haxx.se
-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette:  http://curl.haxx.se/mail/etiquette.html
Received on 2010-01-30