curl-library
sharing is caring
Date: Sun, 13 Jan 2002 12:37:38 +0100 (MET)
[ Warning: this mail contains numerous unfinished thoughts, brainstorming
style ramblings and lots of questions! ]
* Sharing
With Sterling's newly introduced name resolve cache code, he provided an
interesting approach to a problem that several other curl subsystems could
actually benefit from.
I'm talking about the way the DNS cache is shared between all easy-handles
that are added to a multi-handle. As all handles within a multi-handle are
used only one at a time, it is safe to do this without any mutexes or
similar.
I am of course thinking about the connection cache, SSL session cache and
cookies (more?). They are all lists of information that today are stored per
easy-handle, but that would benefit from working per multi-handle too, shared
between multiple easy-handles.
Then the question arises: how? And I'm not talking about the actual
implementation, as making the current code support this concept should be
pretty straightforward. I'm talking about how the interface that controls
this sharing of various lists/caches/pools should work.
Imagine that you want to transfer several simultaneous HTTP streams using the
multi interface. By default it'll work as today, with each easy-handle having
its own individual cache of each kind.
Somehow, you should be able to tell it (in order of increasing complexity):
A) All easy-handles added to a multi handle share all caches.
B) Specified easy-handles share all caches, the rest have their own.
C) Specified easy-handles share specifically mentioned caches.
The question is: is level C necessary? Do we gain/lose anything significant
by only allowing level B or A?
* Mutexing
When we start thinking about sharing data between easy-handles while they're
in a multi-handle, it is easy to let your thoughts drift off, and yes, then
someone will suggest being able to share the above-mentioned lists between
easy-handles that are NOT present in the same multi-handle. Multi-threaded
applications could indeed benefit a lot from having all or some libcurl
transfer threads share some or all of this information.
Then we step right into the next pile of questions. How do we deal with the
mutex problem? libcurl just cannot attempt to mutex the sensitive parts
itself, as there's no good enough standard for it. pthreads might work for
most systems, but there are just too many different systems for it to make
sense to do mutexing natively for every operating system libcurl can run on.
Instead, I suggest that we have libcurl call two application-specified
callbacks for acquiring and releasing mutexes, leaving the actual
implementation for the outside to decide.
Comments?
* Resource Owners
When several handles suddenly share one or more resources, we face a minor
dilemma: who owns the resources, and when are they removed?
I could imagine a system where we remove the resource completely when the
last handle involved in the sharing is removed. But is that the best possible
system?
Perhaps we should allow the resources to "live" outside the strict control of
the handles? I mean, so that you can create a "resource" that continues to
live without any specific handle being around... Would there be any point in
supporting that kind of thing?
It could possibly allow us to introduce a separate API for querying for
resources, like asking if we have a particular cookie set or setting a
particular cookie in the resource "pool" etc.
* Pipeline
While on the subject of sharing, I've come to think of another little feature
we could think about for a while: pipelining. Pipelined HTTP requests are
requests that are sent to the server before the previous request has been
fulfilled, to minimize the gap between multiple responses from the same
server.
It occurred to me that we could in fact offer pipelined requests using the
multi interface. We could offer an option on an easy-handle that makes it
"hook" on to an already existing connection (on another easy-handle) if one
exists.
It would be a little like two easy-handles sharing a connection cache, and if
one of them would like to use the *exact* same connection that is already in
use by the other one, the second request would get pipelined and served after
the first one is done...
Enough for now. There are many things we could do. What do you think?
-- Daniel Stenberg -- curl groks URLs -- http://curl.haxx.se/
Received on 2002-01-13