curl-library

Re: a curl_multi_fdset() replacement? (TODO-RELEASE #55)

From: Jamie Lokier <jamie_at_shareable.org>
Date: Mon, 31 Jan 2005 19:33:36 +0000

Ben Greear wrote:
> So, you're suggesting some async notification instead of select/poll?

No, it's synchronous.

> In that case, curl could just provide a callback every time it added or
> deleted a socket, and your application could deal with everything on its
> own. Curl would just need a 'go process this socket fd' method after your
> app has determined something needs to be done to the socket.

Yes, I agree. That's what I'd do. But Daniel does not seem to like
callbacks, so an array of changes is just as good algorithmically.
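To illustrate why the two are equivalent, here is a minimal sketch in C. Every name in it (fd_change, change_log, log_change, drain_changes, apply_change, app_state) is hypothetical and made up for this example; none of it is libcurl API. The library side appends to an array whenever a socket is added, removed, or changes direction, and the app replays that array through its own callback, so the cost is proportional to the number of changes either way.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical, for illustration only. */
enum { WANT_NONE = 0, WANT_READ = 1, WANT_WRITE = 2 };

struct fd_change {
    int fd;      /* socket whose interest set changed */
    int events;  /* new WANT_* bits; WANT_NONE means "stop watching" */
};

#define MAX_CHANGES 64
#define MAX_FDS     16

struct change_log {
    struct fd_change items[MAX_CHANGES];
    size_t count;
};

/* Library side: record one change. Returns -1 if the array is full. */
int log_change(struct change_log *log, int fd, int events)
{
    assert(log != NULL);
    if (log->count == MAX_CHANGES)
        return -1;
    log->items[log->count].fd = fd;
    log->items[log->count].events = events;
    log->count++;
    return 0;
}

typedef void (*fd_change_cb)(int fd, int events, void *userdata);

/* App side: replay the array through a callback, then empty it.
 * Returns the number of changes delivered. */
size_t drain_changes(struct change_log *log, fd_change_cb cb, void *userdata)
{
    size_t n = log->count;
    for (size_t i = 0; i < n; i++)
        cb(log->items[i].fd, log->items[i].events, userdata);
    log->count = 0;
    return n;
}

/* Example app state: current interest per descriptor. */
struct app_state {
    int interest[MAX_FDS];
};

/* Example callback: remember the latest interest for this descriptor. */
void apply_change(int fd, int events, void *userdata)
{
    struct app_state *app = userdata;
    if (fd >= 0 && fd < MAX_FDS)
        app->interest[fd] = events;
}
```

An app would call drain_changes() once after each call into the library and update its own poll set from the callback.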

> That said, select/poll is not so bad as you may think, and if you have 10k
> sockets serviced by a single application, you may instead want to consider
> breaking your own app into multiple processes.

select/poll is particularly bad on Linux as it happens, but yes, only
for 1k-10k socket applications.

While I agree that it's time to consider breaking the application up
at that point, that's not something libcurl should force on the
application designer. There are perfectly good apps that handle 10k
or even more connections without any problems. It's not hard.

Now that Dan's redesigning curl_multi_fdset() it seems appropriate to pick a
design that isn't limited in a known way.

After all, the whole point of this new API is to get around an earlier
known limitation... may as well do it properly.

> >After the app calls select() or whatever, and then passes curl the 20
> >ready descriptors to process, _only_ those 20 descriptors are going to
> >cause differences before the next return to the app. (Ok, a few more
> >due to timeouts, but only a few). So you just build the array to
> >return based on each descriptor that is actually processed during that
> >call into curl.
>
> If your app is still calling select or poll, you've gained almost exactly
> nothing because the kernel is still going to go through all 10k FDs in O(N)
> time.

Er, I didn't mean that it really calls select(), that was just a
metaphor or something :) I was trying to keep the explanation simple
and I guess that was a bit too much simplifying.
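To make the point concrete without select(), here is a rough sketch in C of a library that only touches the descriptors the app says are ready and reports one change per descriptor it processed. The connection table, the three-state machine, and all names (conn, interest_change, process_ready) are assumptions invented for this example, not how curl works internally.

```c
#include <stddef.h>

/* Hypothetical model: every name below is invented for illustration. */
enum conn_state { SENDING = 0, RECEIVING, DONE };
enum { INTEREST_NONE = 0, INTEREST_READ = 1, INTEREST_WRITE = 2 };

struct conn {
    enum conn_state state;   /* table index doubles as the fd, for simplicity */
};

struct interest_change {
    int fd;
    int events;              /* new INTEREST_* value for this fd */
};

/* Process only the ready descriptors. 'out' must have room for nready
 * entries. The loop never visits the rest of the table, so the work and
 * the size of the returned array are bounded by nready, not by the total
 * number of connections. */
size_t process_ready(struct conn *table, const int *ready, size_t nready,
                     struct interest_change *out)
{
    size_t nchanges = 0;

    for (size_t i = 0; i < nready; i++) {
        struct conn *c = &table[ready[i]];

        if (c->state == SENDING) {
            /* request sent: now wait for the response */
            c->state = RECEIVING;
            out[nchanges].fd = ready[i];
            out[nchanges].events = INTEREST_READ;
            nchanges++;
        } else if (c->state == RECEIVING) {
            /* response read: stop watching this descriptor */
            c->state = DONE;
            out[nchanges].fd = ready[i];
            out[nchanges].events = INTEREST_NONE;
            nchanges++;
        }
    }
    return nchanges;
}
```

With a 10k-entry table and 20 ready descriptors, the call does 20 steps of work and returns at most 20 changes.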

By the way, going through 10k or even 0.5k fds is much slower in the
kernel than in userspace on Linux. On Solaris, the kernel is much
better. Linux's select() and poll() implementations are relatively slow,
but the kernel developers won't speed them up like Solaris, precisely
because the API is inherently O(n) and Linux offers a much better one:
epoll, with epoll_wait().

I'm not suggesting that curl should use any of the better polling APIs.

Only that curl's own API should be compatible with that type of
algorithmic performance, instead of curl's API becoming the bottleneck
in a program which otherwise uses (so-called) "O(1)" polling.
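For anyone who hasn't used that style of polling, here is a small Linux-only sketch in C of how an app could consume per-socket changes with epoll. The helper names apply_interest() and wait_ready() are made up for this example; epoll_create1(), epoll_ctl(), and epoll_wait() are the real Linux calls. It assumes the app remembers the previous interest mask for each descriptor.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Apply one interest change to an epoll set (hypothetical helper).
 * old_events and new_events are EPOLLIN/EPOLLOUT masks; 0 means the
 * descriptor is not being watched. One epoll_ctl() call per change. */
int apply_interest(int epfd, int fd, uint32_t old_events, uint32_t new_events)
{
    struct epoll_event ev = { 0 };

    ev.events = new_events;
    ev.data.fd = fd;

    if (old_events == new_events)
        return 0;                                        /* nothing to do */
    if (new_events == 0)
        return epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL); /* stop watching */
    if (old_events == 0)
        return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);  /* start watching */
    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);      /* change direction */
}

/* Collect up to max_ready ready descriptors (hypothetical helper).
 * Returns the number stored in ready_fds, or -1 on error. */
int wait_ready(int epfd, int *ready_fds, int max_ready, int timeout_ms)
{
    struct epoll_event events[64];
    int n;

    if (max_ready > 64)
        max_ready = 64;
    n = epoll_wait(epfd, events, max_ready, timeout_ms);
    for (int i = 0; i < n; i++)
        ready_fds[i] = events[i].data.fd;
    return n;
}
```

The app never rebuilds the whole descriptor set before waiting. It only issues one epoll_ctl() per reported change, which is why an API that hands back just the changes fits this model and one that hands back the full set on every call does not.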

-- Jamie
Received on 2005-01-31