curl-library

Re: [SAD TRUTH] does curl_multi handle can be accessed from 2 threads?

From: Christian Grade <spam_at_tickles.de>
Date: Sat, 09 Sep 2006 19:30:06 +0200

Spawning one thread for each transfer was the way I
went for some time. Unfortunately an UltraSPARC environment
is not the platform I am targeting, and I seek perfection.

I don't know anything about the impact of an implementation
of HTTP pipelining when it comes to reacting to unexpected,
undesirable server responses. I quoted the one responsible
on "major overhaul". It sounded as if everything is going
to change, mankind being on the verge of extinction. Well,
so it's just a bit more than refactoring.

There *might* be a good reason not to take the approach
of using keys instead of handles themselves but I currently
don't see why not to make use of this abstraction. Identifying
data only by an 'easy handle' seems a shortsighted approach
since all transfer-relevant data continues to exist in an
application, whereas the association between a 'socket' and a
'currently transferring' state is short-lived.
And in a multi-threaded application, everything is modeled
around shared entities.
It's so obvious: if a transfer is enqueued and still waiting
to start, so no handle is associated with it yet, the only
thing you can grab entities with is a key.
A consequence is the need to make those entities local
to 'a' or 'the' multi-interface. This might help to
understand why I consider the maximum number of
current 'easy handles' in use pretty constant.
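
To make the key idea concrete, here is a minimal sketch in plain C of an
application-side table in which a transfer is addressed by a key from the
moment it is enqueued, and only later gets a handle bound to it. All names
(struct transfer, struct registry, registry_enqueue, registry_find,
registry_bind, MAX_TRANSFERS) are made up for illustration and are not part
of libcurl; the void pointer merely stands in for a CURL * easy handle.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TRANSFERS 8   /* arbitrary table size for this sketch */

/* Hypothetical application-side record; libcurl knows nothing about it. */
struct transfer {
    int         key;   /* stable id chosen by the application, 0 = free slot */
    const char *url;
    void       *easy;  /* stands in for a CURL *; NULL while still queued */
};

struct registry {
    struct transfer slots[MAX_TRANSFERS];
    int next_key;      /* last key handed out */
};

/* Enqueue a URL. Returns the new key, or 0 if the table is full. */
static int registry_enqueue(struct registry *r, const char *url)
{
    size_t i;

    assert(r != NULL);
    for (i = 0; i < MAX_TRANSFERS; i++) {
        if (r->slots[i].key == 0) {
            r->slots[i].key  = ++r->next_key;
            r->slots[i].url  = url;
            r->slots[i].easy = NULL;   /* no handle yet: still waiting */
            return r->slots[i].key;
        }
    }
    return 0;
}

/* Look a transfer up by key, whether or not it has a handle yet. */
static struct transfer *registry_find(struct registry *r, int key)
{
    size_t i;

    if (key == 0)
        return NULL;
    for (i = 0; i < MAX_TRANSFERS; i++)
        if (r->slots[i].key == key)
            return &r->slots[i];
    return NULL;
}

/* Bind a handle once the transfer actually starts. Returns 0 on success. */
static int registry_bind(struct registry *r, int key, void *easy)
{
    struct transfer *t = registry_find(r, key);

    if (t == NULL)
        return -1;
    t->easy = easy;
    return 0;
}
```

Going the other way, from a handle back to the record, does not need a
second table: a pointer to the record can be attached to the easy handle
with CURLOPT_PRIVATE and read back with curl_easy_getinfo() and
CURLINFO_PRIVATE.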

In my case, I don't see the scenario "I need to stream three
files off the net onto my BSD box using only one console
command instead of the usual three."

Of course libcurl can support the three scenarios I mentioned
fine ("file", "continuous buffer", "chunked buffer"). One
still needs to implement everything else (e.g. chunk collector
class for memory management).
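
As an illustration of the "everything else" for the buffer scenarios, the
following is a rough sketch of a chunk collector in plain C. The names
struct chunk_collector, collect_chunk and collector_free are hypothetical;
only the callback signature is taken from libcurl's write callback
(CURLOPT_WRITEFUNCTION). It simply grows one contiguous buffer, which is an
assumption made for brevity rather than a recommendation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical application-side collector; not part of libcurl. */
struct chunk_collector {
    char  *data;  /* heap buffer, NUL-terminated once non-NULL */
    size_t len;   /* bytes stored, excluding the terminator */
};

/* Same signature as a CURLOPT_WRITEFUNCTION callback: ptr points at
 * size * nmemb bytes and userdata is the pointer set with
 * CURLOPT_WRITEDATA. Returning a count different from the number of
 * bytes passed in tells libcurl to abort the transfer. */
static size_t collect_chunk(char *ptr, size_t size, size_t nmemb,
                            void *userdata)
{
    struct chunk_collector *c = userdata;
    size_t n = size * nmemb;
    char *grown;

    assert(c != NULL);
    grown = realloc(c->data, c->len + n + 1);
    if (grown == NULL)
        return 0;                  /* out of memory: signal an error */
    if (n > 0)
        memcpy(grown + c->len, ptr, n);
    c->data = grown;
    c->len += n;
    c->data[c->len] = '\0';
    return n;
}

/* Release the buffer and reset the collector for reuse. */
static void collector_free(struct chunk_collector *c)
{
    free(c->data);
    c->data = NULL;
    c->len = 0;
}
```

With libcurl the callback would be registered through
curl_easy_setopt(easy, CURLOPT_WRITEFUNCTION, collect_chunk) and the
collector passed through CURLOPT_WRITEDATA.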

The next natural advance of an interface includes simplification
while maintaining versatility.

Regards
Chr. Grade

ps.
"Name user interface thingies according to the perception
of the user, not the lib developer, be unambiguous and
precise." -- unknown
e.g.: CURLOPT_ASSOCIATE_DATA, void*

pps.
setopt( hSingle, CURLOPT_GIMME_REDIRECTIONS, &first_node );
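
CURLOPT_GIMME_REDIRECTIONS above is wishful thinking, not an existing
option. One way to approximate it today, sketched here under the assumption
that CURLOPT_FOLLOWLOCATION is enabled so that the header callback sees the
headers of every response in the chain, is to pick the Location: lines out
of the header stream. struct redirect_node, has_prefix_nocase and
collect_location are hypothetical names; only the callback signature comes
from libcurl's CURLOPT_HEADERFUNCTION.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list node holding one redirect target. */
struct redirect_node {
    char *url;
    struct redirect_node *next;
};

/* Case-insensitive test whether line (len bytes) starts with prefix. */
static int has_prefix_nocase(const char *line, size_t len,
                             const char *prefix)
{
    size_t plen = strlen(prefix);
    size_t i;

    if (len < plen)
        return 0;
    for (i = 0; i < plen; i++)
        if (tolower((unsigned char)line[i]) !=
            tolower((unsigned char)prefix[i]))
            return 0;
    return 1;
}

/* Same signature as a CURLOPT_HEADERFUNCTION callback. userdata is
 * assumed to be a struct redirect_node ** pointing at the list head.
 * Each Location: value is appended in the order it is seen. */
static size_t collect_location(char *buffer, size_t size, size_t nitems,
                               void *userdata)
{
    struct redirect_node **link = userdata;
    struct redirect_node *node;
    size_t n = size * nitems;
    size_t start = strlen("location:");
    size_t end = n;

    assert(link != NULL);
    if (!has_prefix_nocase(buffer, n, "location:"))
        return n;                       /* not a redirect header */

    /* Trim leading blanks and the trailing CRLF. */
    while (start < end && (buffer[start] == ' ' || buffer[start] == '\t'))
        start++;
    while (end > start && (buffer[end - 1] == '\r' ||
                           buffer[end - 1] == '\n' ||
                           buffer[end - 1] == ' '))
        end--;

    node = malloc(sizeof *node);
    if (node == NULL)
        return 0;                       /* signal an error to the caller */
    node->url = malloc(end - start + 1);
    if (node->url == NULL) {
        free(node);
        return 0;
    }
    memcpy(node->url, buffer + start, end - start);
    node->url[end - start] = '\0';
    node->next = NULL;

    while (*link != NULL)               /* walk to the tail */
        link = &(*link)->next;
    *link = node;
    return n;
}
```

The values are stored exactly as the server sent them, so relative
Location: values are not resolved against the previous URL in this sketch.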

Daniel Stenberg wrote:
> On Thu, 7 Sep 2006, Christian Grade wrote:
>
>> My requirement is/was to have a libcurl module which consumes data
>> (remote paths/urls) from [two] producing threads. At the same time,
>> the libcurl module is/was supposed to produce data (mainly files) for
>> [three] consuming threads. Modelling the logic for the *data-
>> structure-wise separation* of the *transfer status*, I found I'd
>> better adapt "multi.c" by building in some locking mechanisms and by
>> replacing the "multi.c" list with a *lock-free* alternative: one
>> thread transferring, one thread retrieving data about transfers. I
>> didn't get far with this.
>
> I think it sounds totally crazy. To me, it sounds like you had made
> up your mind how this would be done already before you ever saw
> libcurl or read its documentation. And since then you've tried to
> squeeze libcurl into working with this design.
>
> When using the multi interface to transfer multiple files, it doesn't
> make any sense to split it up into multiple threads. If using many
> threads is your game, then I suggest you instead simply use individual
> easy transfers in each thread.
>
>> When the necessity arose to even adapt "url.c" (monitored strings)
>> and as I didn't find out how to retrieve a stack of redirection urls
>> and when I heard the multi-interface will undergo a major overhaul
>> soon anyway, I took a break from fiddling with it, pondering about
>> alternatives.
>
> Well, the multi interface API is not about to change, but the
> internals are gonna be somewhat changed within a few days when I
> commit the HTTP pipelining support. Further, "a stack of redirection
> urls" is not a problem to the multi interface. Not now and not tomorrow.
>
> You will get a slight worry if the stack is more than 1000
> simultaneous transfers using the good old *perform() approach but you
> can then switch to the *socket() way and enjoy unlimited number of
> transfers supported at a speed that I don't think any other HTTP
> library in the wild can match.
>
>> I must have overlooked toUpper( "curlopt_private" ). Keyword 'private'
>> (suggesting "better keep hands off")
>
> Yes, libcurl keeps its hands off it. Private for you, the application.
>
>> Now the existence of this option fulfills the *should-at-least-have*
>> part.
>
> Thanks for your comforting words. You sure know how to take people.
>
>> I see three transfer scenarios (downstreaming):
>> [1] Module supplied with url, file name, optional offset (resumption)
>> [2] Module supplied with url, continuous buffer, salvation function
>> [3] Module supplied with url, chunked buffer, what-if/process function
>
> libcurl can of course support them all fine.
>
>> This introduces new data members which one has to care for:
>> book-keeping.
>
> Yeah, sure. You have the data and the URLs in some kind of entities.
>
>> So, one could associate these with 'easy handles' but it would be
>> more performant, less tedious to implement if these were in the
>> multi-interface already.
>
> What would be in the multi interface already?
>
>> One would have two lists for the same concept in parallel otherwise.
>
> Two lists of what?
>
>> So, why 'unique' IDs for connections?
>
> There are "unique IDs", not for connections but for transfers. They
> are called easy handles.
>
>> When the libcurl module is running, and this is practically an infinite
>> mode
>
> In your case, it seems to be yes.
>
>> there's little need to cope with 'easy handles' themselves outside a
>> universal module.
>
> The easy handle is the handle to a specific transfer. This is a very
> common approach used all over. How else would you control options and
> specifics for a single transfer?
>
>> The number of maximum simultaneous transfers is 'pretty' constant;
>> naturally subject to hardware limitations.
>
> Constant? Yes if you fill up all your RAM and when all your other apps
> have been swapped to disk, then there's likely a fixed limit on the
> amount of simultaneous transfers you can do on the single particular
> host you use. But in reality that won't appear as constant.
>
>> Why would the outside have to care which handle is used for a
>> connection?
>
> Because you want to set specific options for each single transfer.
> Because you want to know which particular transfer data comes from
> or goes to. Because the multi interface is a sister-API to the easy
> interface and makes life simple to users who want to move from
> synchronous easy transfers to asynchronous multi transfers.
>
Received on 2006-09-09