
curl-library

Re: Issues in constraining number of open connections with curl multi interface

From: Mukul Kedare <kedare.mukul_at_gmail.com>
Date: Fri, 30 Dec 2011 13:37:36 +0530

Hi Daniel,

It seems I did not lay out the application logic properly in my previous
mail, so let me explain it more clearly. I will also tell you what our
assumptions were and what we understood after trying out sample code and
going through the curl documentation. :) And finally the problem
statement and a few questions.

APPLICATION LOGIC:
1. We have a set of URLs we need to connect to, say URL_1, URL_2, URL_3,
etc. Every such URL has a MAX number of connections that it can support at
any point in time. To enforce this we maintain a global pool of curl easy
handles per URL. NOTE that the pool I am talking about has nothing to do
with curl's connection pooling/caching; it is specific to our application.
Let's call this pool the "Curl Easy Pool".

2. This pool is used by multiple threads, say 100 in our app, to make
requests to some subset of the URLs. NOTE that any thread only needs to
connect to a subset of the URLs, choosing only one curl easy handle from
each URL's "Curl Easy Pool". To do so, each of these threads in the app has
its own curl multi handle. For each transaction we determine the subset of
URLs to connect to, acquire easy handles from the corresponding "Curl Easy
Pools", add them to the multi handle and call multi-perform. Once done, we
remove the easy handles from the multi handle. The multi handle gives us
asynchronous request-response with multiple URLs, thereby handling requests
and responses for multiple URLs in parallel.

3. E.g. say URL_1 has a "Curl Easy Pool" of size 10. Then only 10 threads
can acquire an easy handle from the "Curl Easy Pool" corresponding to
"URL_1" at any one time, while the other 90 threads will drop their
requests for "URL_1". This is how we try to limit the number of connections
per URL in our system.

OUR ASSUMPTION:
Restricting the number of curl easy handles per URL, with
CURLOPT_MAXCONNECTS set to 1 on each easy handle, would restrict the number
of open connections for that URL.

It seems that the above assumption was wrong: since we use one multi handle
per thread, connection caching happens at the multi handle and not at the
easy handles. So with 100 threads, even though we restrict the number of
curl easy handles for "URL_1" to 10, each with CURLOPT_MAXCONNECTS=1, we
are seeing open connections >= 10 and <= 100. I assume this is due to the
use of a multi handle per thread: "URL_1" connections are getting cached in
each thread's multi handle (number of threads = 100).

HOT FIX:
To work around this we are now using "CURLOPT_FORBID_REUSE", which closes
the connection explicitly after use. But this isn't a nice way of solving
the problem, since we are giving up the benefits of persistent connections.

PROBLEM STATEMENT:
Q1. We want persistent open connections while restricting the number of
open connections per URL. Can you suggest the correct way to use the curl
multi and easy interfaces to solve this problem?

Q2. Can we have a common curl connection cache shared between curl multi
handles?

Q3. I have also read a thread on the mailing list (
http://curl.haxx.se/mail/lib-2011-05/0017.html) where you stated that the
"Share Interface" can be used for sharing the connection cache. Is this
possible? :)

Please let me know your thoughts/suggestions on this.

Regards
Mukul

On Thu, Dec 29, 2011 at 2:10 AM, Daniel Stenberg <daniel_at_haxx.se> wrote:

> On Wed, 28 Dec 2011, Mukul Kedare wrote:
>
> All the threads, have their own multi handle to which we add the easy
>> handles from the connection pool based on the URLs to which the requests
>> are to be made.
>>
>
> Keeping easy handles around for a multi handle will not affect the
> connection pool at all. The pool is kept within the multi handle completely
> separate from the easy handles to allow applications to not have to bother
> about such stunts.
>
> Connections opened for a particular URL should not be more than their
>> respective pool size.
>> Say for URL_1 the pool size is 5; in that case there should not be more than
>> 5 curl easy handles in use at a time.
>>
>
> Are you talking about easy handles or actual connections? You can of
> course make sure you only have 5 handles for a specific URL at any one
> time, but you can't as easily know or control the amount of idle
> connections kept in all the pools that are still connected.
>
> The problem we are facing is that there are open connections more than
>> the pool size of a particular URL.
>>
>
> Only if you add more handles than what you set the size to be I think. Or
> perhaps you found a bug.
>
> To solve this, we tried using CURLMOPT_MAXCONNECTS and
>> CURLOPT_MAXCONNECTS but it is not working.
>>
>
> I think they work like they are documented and implemented to work. They
> may not match your use cases exactly.
>
> Please suggest how do we limit the number of open connections without
>> giving away persistence of connection.
>>
>
> I'll help you understand how libcurl works, then you can convert that into
> how your app should behave or possibly suggest ways we can improve.
>
> --
>
> / daniel.haxx.se
>

-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette: http://curl.haxx.se/mail/etiquette.html
Received on 2011-12-30