curl-library

Re: Multi cURL connect bug

From: Keyur Govande <keyurgovande_at_gmail.com>
Date: Sun, 7 Jul 2013 14:36:27 -0400

On Sat, Jul 6, 2013 at 5:24 PM, Daniel Stenberg <daniel_at_haxx.se> wrote:
> On Fri, 5 Jul 2013, Keyur Govande wrote:
>
>> Looking at the code it seems like if for example, the protocol connect for
>> HTTPS didn't finish quickly enough, curl_multi_perform() will return
>> CURLM_CALL_MULTI_PERFORM.
>
>
> No, that's reading it a bit wrong. First off, curl_multi_perform() doesn't
> even return that return code since a few years back!
>
> Then, libcurl has always only returned the CURLM_CALL_MULTI_PERFORM return
> code when there was actually something more for libcurl to do. It would
> never return that simply because an operation wasn't completed.
>
>
>> I'm proposing that this edge-case is the same as not finishing a TCP
>> connect() and both should be handled similarly.
>>
>> My goal is to be able to make an asynchronous RPC with curl. If there are
>> other ways to accomplish this, please do let me know. From my point of view,
>> the library is 99.9% of the way there in supporting this behavior, except
>> for this one corner case around the TCP connection.
>
>
> I don't understand why you need to know those libcurl state transitions just
> to make asynchronous operations!
>

The use-case is: Consider a PHP script (a.php) that takes 1000ms to
finish. Part of the 1000ms (say 500ms) is gathering some data. We want
to split up this data gathering into an asynchronous RPC. So a.php
fires off a multi-curl request to b.php at the beginning of a.php.
Then a.php will continue doing its thing and when it is ready, will
check for a response from b.php, and use the response to generate the
final output. So the response time for a.php is cut down from 1000ms
to something close to 500ms.

So basically in a.php, we'd like to know if the request was
successfully sent over, so a.php can continue on.

Here's some pseudocode for a.php:

$mc_handle = curl_multi_init();
$c = curl_init();
// Set up the handle and options
curl_multi_add_handle($mc_handle, $c);
do {
    $cme = curl_multi_exec($mc_handle, $remain);
} while ($cme == CURLM_CALL_MULTI_PERFORM);
// When we arrive here, we assume the request has been flushed to the
// remote host
// Continue doing work
// etc..etc...
// Now we're ready to receive the response.
$rc = curl_multi_select($mc_handle, $timeout_sec);
// Depending on $rc read response and use as needed

When connecting to a slow remote host, the request is still stuck in
the connect() phase when curl_multi_exec() returns CURLM_OK. So a.php
needs to know that the request has not yet been flushed over, and
call select() and then curl_multi_exec() again.
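
A minimal sketch of that select-then-exec step in PHP, under stated
assumptions: pump_multi() and fetch_around_work() are hypothetical
helpers (not part of PHP or libcurl), and the 50 ms and 5 s budgets
are arbitrary. It only bounds how long a.php waits at each stage; it
does not tell a.php whether the request has actually been flushed.

```php
<?php
// Hypothetical helper: drive the multi handle for at most $budgetSec
// seconds. Returns the number of transfers still running, so 0 means
// every transfer has finished (successfully or not).
function pump_multi($mh, float $budgetSec): int
{
    $deadline = microtime(true) + $budgetSec;
    $running = 0;
    while (true) {
        // Let libcurl do whatever work is possible right now.
        do {
            $status = curl_multi_exec($mh, $running);
        } while ($status === CURLM_CALL_MULTI_PERFORM);

        $left = $deadline - microtime(true);
        if ($status !== CURLM_OK || $running === 0 || $left <= 0) {
            return $running;
        }
        // Sleep until a socket has activity or the budget runs out.
        // On -1 (select failure) back off briefly to avoid a busy loop.
        if (curl_multi_select($mh, $left) === -1) {
            usleep(1000);
        }
    }
}

// Hypothetical outline of a.php: start the request, do local work,
// then collect the response. Returns the body, or null on failure.
function fetch_around_work(string $url, callable $work): ?string
{
    $mh = curl_multi_init();
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);

    pump_multi($mh, 0.05); // up to 50 ms for connect + send (arbitrary)
    $work();               // a.php's own data gathering
    pump_multi($mh, 5.0);  // now wait for the response (arbitrary cap)

    $info = curl_multi_info_read($mh);
    $ok = is_array($info) && $info['result'] === CURLE_OK;
    $body = $ok ? curl_multi_getcontent($ch) : null;

    // Both handles are released when they go out of scope.
    curl_multi_remove_handle($mh, $ch);
    return $body;
}
```

With a localhost target the first pump_multi() call may already be
waiting on the response for part of its budget, which is exactly the
ambiguity described here.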

When connecting to localhost, though, by the time curl_multi_exec()
returns CURLM_OK the request has almost always been flushed over, and
calling select() ends up waiting on the response, which is not what we
want to do.

The goal of this email is to find out whether there's a way to
distinguish these two cases, so a.php can make the appropriate decision.

We're currently using the 7.19.7 version of libcurl.

I noticed that in a later version (7.21.5), by using
CURLOPT_OPENSOCKETFUNCTION + CURLOPT_SOCKOPTFUNCTION, we can pass an
already-connected socket to libcurl, so when curl_multi_exec() returns
CURLM_OK, we can most likely assume the request was flushed over.

> --
>
> / daniel.haxx.se
>
> -------------------------------------------------------------------
> List admin: http://cool.haxx.se/list/listinfo/curl-library
> Etiquette: http://curl.haxx.se/mail/etiquette.html
Received on 2013-07-07