curl-library

Re: poll() POLLERR / POLLHUP without POLLIN / POLLOUT

From: Joshua Kwan <jkwan_at_vmware.com>
Date: Fri, 11 Sep 2009 08:49:11 -0700

Hi Yang,

On 9/11/09 05:34, "Yang Tse" <yangsita_at_gmail.com> wrote:
> IOW poll() might set POLLHUP in revents without setting POLLIN, and
> might set POLLERR without setting POLLIN and POLLOUT.

That's correct.
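
A quick way to see this behaviour outside of c-ares is to poll the read end
of a pipe after its write end has been closed. This is only a minimal sketch,
not something taken from the bug report, and the function name is made up for
illustration. On Linux I would expect POLLHUP alone to come back; other
systems may report POLLIN as well.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: returns the revents that poll() reports for the
   read end of an empty pipe whose write end was closed, or -1 on error. */
int revents_after_hangup(void)
{
    int fds[2];
    struct pollfd pfd;
    int rc;

    if (pipe(fds) != 0)
        return -1;
    close(fds[1]);              /* the writer goes away, nothing was written */

    pfd.fd = fds[0];
    pfd.events = POLLIN;        /* we only ask for POLLIN ... */
    pfd.revents = 0;
    rc = poll(&pfd, 1, 0);      /* ... but POLLHUP can be reported anyway */
    close(fds[0]);

    if (rc < 0)
        return -1;
    return pfd.revents;
}
```

A caller that only tests `revents & POLLIN` would never act on such a
descriptor, even though poll() keeps returning immediately for it.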

> Joshua, which I/O library and version is behaving in this way? Or
> which OS and version is doing this?

It was a bug reported by one of my coworkers, who is running a laptop with
some variety of Fedora.

> No, the patch does not tell c-ares that the file descriptor has a
> POLLERR or POLLHUP condition; it simply says that c-ares should
> process the file descriptor as if it had a POLLIN revent whenever it
> has a POLLERR or POLLHUP condition, and process the file descriptor as
> if it had a POLLOUT revent whenever it has a POLLERR.

Okay, yes, that is really what it does. Did you check the URL in the patch
comments? Some people had a similar problem and they rationalized doing it
this way.

http://lists.danga.com/pipermail/memcached/2003-October/000336.html
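
For reference, the mapping described above can be sketched roughly as
follows. This is not the actual patch: the flag names and the function are
hypothetical, and the real code would use the result to decide whether to
hand the descriptor to c-ares for reading, writing, or both.

```c
#include <poll.h>

/* Hypothetical flags for this sketch; not libcurl or c-ares names. */
enum { PROCESS_READ = 1, PROCESS_WRITE = 2 };

/* Translate poll() revents into "treat as readable / writable".
   POLLERR and POLLHUP count as readable, POLLERR also as writable, so
   the subsequent read or write fails and the error gets noticed. */
int actions_for_revents(short revents)
{
    int actions = 0;

    if (revents & (POLLIN | POLLERR | POLLHUP))
        actions |= PROCESS_READ;
    if (revents & (POLLOUT | POLLERR))
        actions |= PROCESS_WRITE;
    return actions;
}
```

With this, a descriptor that reports only POLLHUP is still processed for
reading instead of being skipped on every pass through the loop.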
  
> In this way, c-ares will attempt to read or write on the fd and fail.
> But this failure might very well trigger additional lookups and
> finally make c-ares mark all configured DNS servers as unavailable.
> Which isn't necessarily bad.

Certainly. In this case, it was true. My coworker only experienced this when
disconnected from any network (but he still had valid IP information).
 
> My concern is that this patch addresses the lack of POLLIN and POLLOUT
> when POLLERR or POLLHUP is reported by poll in a very specific point
> of libcurl awaiting c-ares to resolve. But if the condition arises at
> other moments of the connection/transfer I fear that you will be
> facing the same problem of not detecting POLLERR or POLLHUP when poll
> does not set POLLIN or POLLOUT, but in this case libcurl should simply
> time out somewhere.

That's true, but if it happened again we would notice the CPU spinning at
that point as well. I think this turns out to be a corner case related to
network unavailability, and DNS (via c-ares) is often the first 'port of
call' (no pun intended).

I'm happy to back this out and try a more general solution, but given that
DNS is one of the first things that happens during a transfer, I feel like my
patch will catch most of the occurrences in the wild.

But thanks for giving the patch a critical eye. I was definitely hoping for
some feedback.

-Josh
Received on 2009-09-11