curl-library

Re: debugging a crash in Curl_pgrsTime/checkPendPipeline?

From: <johansen_at_sun.com>
Date: Thu, 23 Jul 2009 12:08:46 -0700

On Wed, Jul 22, 2009 at 11:11:40PM +0200, Daniel Stenberg wrote:
> On Tue, 21 Jul 2009, johansen_at_sun.com wrote:
>
>> A colleague's machine seems adept at reproducing this problem, but I
>> haven't yet discovered how to induce the conditions myself.
>> Regardless, I was able to throw together a more instrumented version of
>> libcurl and get better debugging information out of the core.
>
> So can you now see this issue with 7.19.5 or even the current CVS
> version? I'm sorry, but until you've repeated the problem with these I
> won't be bothered to work very hard on it.

I have built a version of 7.19.5 and passed it on to the colleague whose
machine hits this problem regularly. He wasn't in the office yesterday,
and I'm working remotely for the rest of this week, so it may take a bit
longer than usual to get the experiment run.

To be clear, though, I'm not asking you to spend a lot of energy on this
problem. I believe that I've found the root cause, but I would
appreciate a second opinion of my analysis from an expert. I'm also
happy to code and test a fix, provided that someone is willing to review
the changes and offer feedback.

To paraphrase the problem: I have 10 easy handles all sharing a
connection because they're pipelining. One handle, let's call it A,
transitions from CURLM_STATE_PERFORM to CURLM_STATE_DONE in
multi_runsingle(). As part of the transition, the code in
CURLM_STATE_PERFORM removes the session handle for A from the recv
pipeline. The function returns and we go on to process another handle.
Some time later, handle F encounters a send error that requires it to
set bits.closed to True. It then calls Curl_done() on the connection.
Curl_done() calls Curl_disconnect(), which calls signalPipeClose() on
the send, recv, and pend pipelines. Part of signalPipeClose() is a call
to Curl_multi_handlePipeBreak(), which sets easy_conn to NULL in each
Curl_one_easy attached to the pipeline. The connection is then free'd.

When handle A gets run in the CURLM_STATE_DONE state, its easy_conn has
been free'd by this send error, but because its SessionHandle was
removed from the pipeline, handlePipeBreak() didn't set handle A's
easy_conn to NULL. This is what causes us to access free'd memory and
crash.
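
To make the sequence concrete, here is a toy model of it in C. This is
not libcurl code: every name in it (toy_conn, toy_easy, toy_pipe_close,
toy_count_dangling) is made up for illustration, and the "free" is only
a flag so that the sketch itself never touches released memory. It just
shows that a handle which has left the pipeline keeps its stale pointer.

```c
#include <stddef.h>
#include <stdbool.h>

/* Toy stand-ins for libcurl internals; all names here are hypothetical. */
struct toy_conn {
    bool freed;                  /* set instead of actually calling free() */
};

struct toy_easy {
    struct toy_conn *easy_conn;  /* plays the role of easy_conn */
    bool in_recv_pipe;           /* is the handle still on the recv pipeline? */
};

/* Models the pipe-close step as described above: only handles that are
   still listed on a pipeline get their easy_conn reset to NULL. */
static void toy_pipe_close(struct toy_easy *handles, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (handles[i].in_recv_pipe)
            handles[i].easy_conn = NULL;
    }
}

/* Replays the scenario and returns how many handles are left pointing
   at a connection that has been marked as freed. */
static int toy_count_dangling(void)
{
    struct toy_conn conn = { false };
    struct toy_easy h[2] = {
        { &conn, true },   /* handle A */
        { &conn, true }    /* handle F */
    };
    int dangling = 0;

    h[0].in_recv_pipe = false;  /* A: PERFORM -> DONE, leaves the recv pipeline */
    toy_pipe_close(h, 2);       /* F: send error tears the connection down */
    conn.freed = true;          /* stands in for the connection being free'd */

    for (size_t i = 0; i < 2; i++) {
        if (h[i].easy_conn != NULL && h[i].easy_conn->freed)
            dangling++;
    }
    return dangling;
}
```

In this model toy_count_dangling() reports one dangling handle, and it
is A, the one that had already been taken off the pipeline.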

I took a look at both the 7.19.4 and 7.19.5 source, and the code paths
appear to be the same. By inspection, this bug ought to be present in
both versions.

In terms of solving the problem itself, what do you think about
introducing a fourth list called the done_pipeline? This is where we can
put requests that are done, but that haven't been processed in the
CURLM_STATE_DONE state yet. If we modify signalPipeClose() to not set
pipe_broke on these requests, but make sure that their easy_conn gets
set to NULL, I believe that should be sufficient to avoid a
use-after-free. We'll just have to make sure that the processing for
CURLM_STATE_DONE checks that easy_conn isn't NULL before proceeding.
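
A rough sketch of that idea, again as a toy model in C rather than a
patch against libcurl. The types and functions below (fix_pipe,
fix_conn, fix_easy, fix_signal_pipe_close, fix_run_done_state, fix_demo)
are hypothetical names of mine; only pipe_broke and easy_conn are meant
to echo the real fields, and the real change would of course have to
live in signalPipeClose() and multi_runsingle().

```c
#include <stddef.h>
#include <stdbool.h>

/* Which list a handle is currently on; FIX_PIPE_DONE is the proposed
   fourth list. All names in this sketch are hypothetical. */
enum fix_pipe { FIX_PIPE_NONE, FIX_PIPE_RECV, FIX_PIPE_DONE };

struct fix_conn {
    bool freed;                  /* flag only; nothing is really freed */
};

struct fix_easy {
    struct fix_conn *easy_conn;
    enum fix_pipe pipe;
    bool pipe_broke;
};

/* Pipe close with the proposed behaviour: every listed handle is
   detached from the connection, but handles on the done list are not
   flagged as broken, because their transfer already completed. */
static void fix_signal_pipe_close(struct fix_easy *h, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (h[i].pipe == FIX_PIPE_NONE)
            continue;
        if (h[i].pipe != FIX_PIPE_DONE)
            h[i].pipe_broke = true;
        h[i].easy_conn = NULL;
        h[i].pipe = FIX_PIPE_NONE;
    }
}

/* DONE-state processing with the NULL check. Returns true if it used
   the connection, false if the connection was already gone. */
static bool fix_run_done_state(struct fix_easy *e)
{
    if (e->easy_conn == NULL)
        return false;            /* connection torn down earlier: skip it */
    /* the real code would finish up with the connection here */
    return true;
}

/* Replays the A/F scenario; returns 1 if A ends up safely detached. */
static int fix_demo(void)
{
    struct fix_conn conn = { false };
    struct fix_easy h[2] = {
        { &conn, FIX_PIPE_RECV, false },   /* handle A */
        { &conn, FIX_PIPE_RECV, false }    /* handle F */
    };

    h[0].pipe = FIX_PIPE_DONE;    /* A: PERFORM -> DONE, moved, not dropped */
    fix_signal_pipe_close(h, 2);  /* F: send error closes the pipelines */
    conn.freed = true;            /* stands in for the connection being free'd */

    return h[0].easy_conn == NULL && !h[0].pipe_broke &&
           h[1].pipe_broke && !fix_run_done_state(&h[0]);
}
```

The point of keeping A on some list between PERFORM and DONE is that the
close path can still find it. Whether a separate list is the cleanest
way to do that in the real code is exactly what I'd like feedback on.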

>> 08046094 libcurl.so.3.0.0`Curl_removeHandleFromPipeline+0x16(cf3b00c, deadbeef,
>> 2ec2298c, fd50b332)
>
> How come your libcurl soname is 3? We've used 4 for quite some time by now...!

I believe that this is an artifact of our build process, but I don't
know the rationale. My guess is that we didn't want to break consumers
who depended upon the so.3 version, but since I'm not involved in that
part of the process, I'm not sure.

-j
Received on 2009-07-23