curl-library

RE: libcurl and async I/O

From: Andrew Barnert <abarnert_at_adobe.com>
Date: Mon, 18 Aug 2008 16:40:15 -0700

Sorry; I didn't notice all of the other replies before I sent mine.

18 Aug 2008 15:07, Cory Nelson:
[lots of snipping not reflected below]
> On Mon, Aug 18, 2008 at 2:32 PM, Daniel Stenberg <daniel_at_haxx.se>
> wrote:
> > For an ordinary application with < 100 connections, why would an app
> > particularly insist on such an asynch API for libcurl?

My particular reason is that I'm using asio for the native protocol, and
I want to use the same loop for the tunneled protocol. Yes, I could just
create another thread and put a select loop in there for the libcurl
sockets, and add all of the inter-thread synchronization stuff, but you
can see why I wouldn't want to do this.

Other reasons:

1. Your app might one day need to handle thousands of connections.

2. Even if you only need 70, select won't do that many on Windows,
   where FD_SETSIZE defaults to 64.

3. There is no better cross-platform solution than select, without
   going to hefty libraries like ACE reactor. (I'm assuming libevent
   still uses select on Windows, right?)

4. If you think in terms of asio (as die-hard Windows networking types
   do), ready-notification is as backwards and hard to wrap your head
   around as asio is for everyone else. (And remember, these are
   Windows programmers, who aren't notorious for flexibility.)

5. The boost.asio library is pretty nice, especially if your app is
   already highly boost-y; you might just prefer writing boost.asio
   code to writing select or libevent code.

> > What possibly is lacking on Windows, as Jamie Lokier possibly
> > suggests, is an event-based system that is good enough or suitable
> > to do the job in a convenient manner!

You're suggesting that Windows should have something that looks a
lot like epoll/kqueue/etc., and that's just as efficient as them? That
would be nice. Even poll would be nice. But it's not going to happen.

As I understand things, Microsoft has determined that there's no way
to add an epoll-style API to the Windows kernel that scales well to
multiple cores, while the Linux networking team has determined that
there's no way to add aio networking to the Linux kernel in a way
that's as low-overhead as epoll, so we may be stuck with these two
different interfaces forever. (And of course Sun will provide both,
and 20 others besides.)

Given that it's pretty easy and not too slow to fake asio on epoll
(as boost.asio, ACE's proactor, etc. already do), and pretty hard to
fake epoll on asio efficiently, you could argue that asio is a better
choice. But, as others in this thread have pointed out, there's no
reason to make that choice. You can define a "network filter" API
that works well with both, and libcurl's API is already pretty close
to this today.
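
To make "fake asio on epoll" concrete, here is a minimal sketch in
Python, assuming the stdlib selectors module as a stand-in for
epoll/kqueue. The names TinyProactor, async_read, run_once and demo are
made up for illustration; this is not how boost.asio or ACE actually
implement it, just the general shape: the caller hands over the buffer
up front, and the loop does the read itself once the socket is ready.

```python
import selectors
import socket


class TinyProactor:
    """Completion-style reads layered on a readiness selector (hypothetical)."""

    def __init__(self):
        # DefaultSelector picks epoll, kqueue or select depending on platform.
        self._sel = selectors.DefaultSelector()

    def async_read(self, sock, buf, on_complete):
        # Completion style: the buffer is part of the request, not the callback.
        sock.setblocking(False)
        self._sel.register(sock, selectors.EVENT_READ, (buf, on_complete))

    def run_once(self, timeout=1.0):
        for key, _ in self._sel.select(timeout):
            buf, on_complete = key.data
            sock = key.fileobj
            try:
                n = sock.recv_into(buf)  # the loop performs the read
            except BlockingIOError:
                continue  # spurious wakeup; stay registered and retry later
            # One-shot operation: stop watching, then report completion.
            self._sel.unregister(sock)
            on_complete(n)

    def close(self):
        self._sel.close()


def demo():
    a, b = socket.socketpair()
    proactor = TinyProactor()
    buf = bytearray(16)
    done = []  # receives the byte count when the read completes
    proactor.async_read(a, buf, done.append)
    b.sendall(b"hello")
    while not done:
        proactor.run_once()
    proactor.close()
    a.close()
    b.close()
    return bytes(buf[:done[0]])
```

Going the other way (readiness notifications on top of a completion
API) has no equally simple wrapper, which is the asymmetry described
above.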

> IIRC the primary difference between IOCP and epoll is that IOCP gives
> notification when an operation finishes, and epoll gives notification
> when an operation is ok to start. So while there is some difference,
> it's not really a super inconvenient one.

Exactly. As I said, the primary difference is really just a matter of
where the buffers go in the API.

But the secondary difference is that you have to replace your recv
and send calls with ReadFileEx/WriteFileEx, aio_read/aio_write, etc.,
which means that handy wrappers like SSL_read, designed to drop in as
replacements for recv and send, can't be used. And this one _is_
super inconvenient.

> >> asio does some magic to make a truly async OpenSSL socket (see
> >> asio::ssl::stream),

Yes; that's exactly why, for my own local purposes, I can break
libcurl's SSL support. But that doesn't help at all for the general
case.

> >> but OpenSSL's API, much like almost every other
> >> protocol library, is not really designed well for this kind of
> >> async.

OpenSSL can be used with async I/O, actually, and it works quite well.

But that may not be true of all of the other libraries that libcurl
wraps. And, even if it were, they may not all be async-able in
similar enough ways for libcurl to wrap them similarly. (Then again,
they may be--it's worth looking into.)
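
On the OpenSSL point, one way to see that the TLS engine can be
decoupled from the socket is to drive it through memory buffers and
move the ciphertext with whatever I/O model you like. Below is a rough
sketch using Python's ssl module (which wraps OpenSSL) and its
MemoryBIO objects. The function name client_hello_bytes and the host
name "example.invalid" are placeholders, and this is only an
illustration of the idea, not what libcurl does.

```python
import ssl


def client_hello_bytes(hostname="example.invalid"):
    """Start a TLS handshake with no socket and return the bytes to send."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming = ssl.MemoryBIO()  # ciphertext received from the peer goes in here
    outgoing = ssl.MemoryBIO()  # ciphertext destined for the peer comes out here
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the engine has queued its first flight and now needs
        # the server's reply, which the caller would feed into `incoming`.
        pass
    # The caller can pass these bytes to any async write call it likes.
    return outgoing.read()
```

An app would hand the returned bytes to its async write, and on each
read completion call incoming.write(data) and retry the handshake or
read, so the TLS code never touches recv or send directly.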

> Yes. But again, I don't think the external API of libcurl would have
> to change at all. It would all be internal changes.

I'm not sure I agree. Even forgetting about SSL/SSH/Kerb, the socket
callback is going to need the buffer to be read/written so it can make
the async read/write call. That means either exposing internal info
about how the connectdata, SingleRequest, etc. structs work, changing
things so that the app manages the buffers instead of libcurl, or
putting the buffers into the API.
Received on 2008-08-19