curl-library

Re: valgrind shows invalid write, leads to SIGSEGV

From: Nick Gerner <nick_at_seomoz.org>
Date: Sun, 13 Jul 2008 13:15:46 -0700

On Sat, Jul 12, 2008 at 2:33 PM, Nick Gerner <nick_at_seomoz.org> wrote:
> By the way, here is a stacktrace from my segfault:
>
> #0 Curl_removeHandleFromPipeline (handle=0x8b6bcd0,
> pipeline=0x6161616161616161) at url.c:2279
> 2279 curr = pipeline->head;
> (gdb) bt
> #0 Curl_removeHandleFromPipeline (handle=0x8b6bcd0,
> pipeline=0x6161616161616161) at url.c:2279
> #1 0x0000000005d8f885 in multi_runsingle (multi=0x903db90, easy=0x9227f50)
> at multi.c:1335
> #2 0x0000000005d8fc5b in curl_multi_perform (multi_handle=0x903db90,
> running_handles=0x7feffff14) at multi.c:1461
>
> Notice that I'm having valgrind write 'a' (0x61) into memory it frees.
> This is not the stacktrace I get when I don't have valgrind do this,
> but I suspect that filling freed memory is causing things to crash
> closer to the actual error.
>
> Thanks!
>
> --Nick
>
> On Sat, Jul 12, 2008 at 2:30 PM, Nick Gerner <nick_at_seomoz.org> wrote:
>> I'm using libcurl 7.18.2 + c-ares 1.5.0
>>
>> Running my app under valgrind, I see:
>>
>>
>> ==5959== Invalid write of size 8
>> ==5959== at 0x5D8F253: multi_runsingle (multi.c:907)
>> ==5959== by 0x5D8FC5A: curl_multi_perform (multi.c:1461)
>> ...my app here...
>> ==5959== Address 0xb546a08 is 0 bytes inside a block of size 1,304 free'd
>> ==5959== at 0x4C22B2E: free (vg_replace_malloc.c:323)
>> ==5959== by 0x5D7D035: Curl_disconnect (url.c:2216)
>> ==5959== by 0x5D7D8CC: ConnectionExists (url.c:2506)
>> ==5959== by 0x5D7E14D: Curl_connect (url.c:3968)
>> ==5959== by 0x5D8F3C5: multi_runsingle (multi.c:926)
>> ==5959== by 0x5D8FC5A: curl_multi_perform (multi.c:1461)
>> ...my app here...
>>
>> And then I see other valgrind issues which I believe are related, for example:
>>
>> ==5959== Invalid read of size 8
>> ==5959== at 0x5D8F879: multi_runsingle (multi.c:1335)
>> ==5959== by 0x5D8FC5A: curl_multi_perform (multi.c:1461)
>> ...my app here...
>> ==5959== Address 0xb546c58 is 592 bytes inside a block of size 1,304 free'd
>> ==5959== at 0x4C22B2E: free (vg_replace_malloc.c:323)
>> ==5959== by 0x5D7D035: Curl_disconnect (url.c:2216)
>> ==5959== by 0x5D7D8CC: ConnectionExists (url.c:2506)
>> ==5959== by 0x5D7E14D: Curl_connect (url.c:3968)
>> ==5959== by 0x5D8F3C5: multi_runsingle (multi.c:926)
>> ==5959== by 0x5D8FC5A: curl_multi_perform (multi.c:1461)
>> ...my app here...
>>
>> And sometimes my app seg faults inside curl, but always after the
>> above sorts of errors, always starting with:
>>
>> ==5959== Invalid write of size 8
>> ==5959== at 0x5D8F253: multi_runsingle (multi.c:907)
>>
>> My original thought was that this was heap corruption from bugs in
>> my code. But I don't think this is the case any more. Maybe I'm
>> misusing CURL, but things in my app run along quite nicely for quite
>> some time, and in general work great (thanks for a great lib), and in
>> fact this error is rather rare (more rare than http error codes,
>> server connect failures, timeouts, dns resolution failures, etc.).
>>
>> So here's a question for you:
>>
>> near multi_runsingle (multi.c:926, the source of the free) I see:
>>
>> case CURLM_STATE_CONNECT:
>> /* Connect. We get a connection identifier filled in. */
>> Curl_pgrsTime(easy->easy_handle, TIMER_STARTSINGLE);
>> easy->result = Curl_connect(easy->easy_handle, &easy->easy_conn,
>> &async, &protocol_connect);
>>
>> if(CURLE_OK == easy->result) {
>> ...some other stuff...
>> }
>> break;
>>
>> What is the behavior if easy->result is not CURLE_OK? Should this be
>> doing something else?
>>
>> I look forward to hearing more!
>>
>> Thanks!
>>
>> --Nick
>>
>

Sorry about the earlier top post...

I just rewrote a little bit of my app to use curl_multi_socket_action
(with the socket callback and poll()) instead of curl_multi_perform.
Is this the preferred way to use the multi interface?

With that change I hit neither the valgrind invalid writes nor the
SIGSEGV (well, there were some at cleanup in ares, but my app exits
immediately after that, so I'm hoping that's benign).
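
For anyone reading the quoted backtrace: pipeline=0x6161616161616161 is
what a pointer looks like when it is read back from memory that was
filled with 0x61 ('a') after being freed. A minimal sketch in plain C,
where struct fake_conn and pointer_after_fill() are made-up names for
illustration and not anything from libcurl:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a freed structure that holds a pointer.
   This is NOT a libcurl type. */
struct fake_conn {
  void *pipeline;
};

/* Fill the whole structure with one byte value, the way a
   fill-on-free debugging aid would, then return the bit pattern
   that a later read of the pointer member would see. */
static uintptr_t pointer_after_fill(unsigned char fill)
{
  struct fake_conn c;
  uintptr_t bits = 0;

  memset(&c, fill, sizeof(c));
  /* Copy the bytes out rather than dereferencing anything. */
  memcpy(&bits, &c.pipeline,
         sizeof(bits) < sizeof(c.pipeline) ? sizeof(bits)
                                           : sizeof(c.pipeline));
  return bits;
}
```

On a 64-bit build, pointer_after_fill(0x61) gives eight 0x61 bytes,
which matches the address gdb printed. So the crash at url.c:2279 looks
like a read through a pointer that lived in already-freed memory.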

So it could be that my app had bugs, and rewriting this fixed them.
But the change to my app was fairly small: all I did was add the
socket and timer callbacks, storing what they report in a global data
structure, and call poll() on an array of pollfds instead of blindly
looping on multi_perform (with no select(), because of the FD_SETSIZE
issue).

Or it could be that I'm hitting a different code path in CURL.

Or I suppose this could also have been a transient error, but it was
completely consistent before.
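
To make the change described above concrete, here is a rough sketch of
the global bookkeeping only, not my actual code. The names g_fds,
watch_socket() and unwatch_socket() and the MAX_TRACKED limit are
invented for this example. It assumes a socket callback registered
with CURLMOPT_SOCKETFUNCTION would call watch_socket() for
CURL_POLL_IN, CURL_POLL_OUT and CURL_POLL_INOUT, and unwatch_socket()
for CURL_POLL_REMOVE. The libcurl calls are left out so that the
sketch compiles on its own.

```c
#include <assert.h>
#include <poll.h>

#define MAX_TRACKED 1024 /* arbitrary capacity chosen for this sketch */

/* Global table handed to poll(). Because poll() takes an array,
   descriptor values are not limited by FD_SETSIZE as with select(). */
static struct pollfd g_fds[MAX_TRACKED];
static nfds_t g_nfds = 0;

/* Return the index of fd in the table, or -1 if it is not tracked. */
static long find_slot(int fd)
{
  nfds_t i;
  for(i = 0; i < g_nfds; i++) {
    if(g_fds[i].fd == fd)
      return (long)i;
  }
  return -1;
}

/* Start watching fd, or change what it is watched for.
   events is POLLIN, POLLOUT or both.
   Returns 0 on success, -1 if the table is full. */
static int watch_socket(int fd, short events)
{
  long slot = find_slot(fd);

  assert(fd >= 0);
  if(slot < 0) {
    if(g_nfds == MAX_TRACKED)
      return -1;
    slot = (long)g_nfds++;
    g_fds[slot].fd = fd;
  }
  g_fds[slot].events = events;
  g_fds[slot].revents = 0;
  return 0;
}

/* Stop watching fd. The last entry is moved into the hole so the
   array stays dense for poll(). Does nothing if fd is not tracked. */
static void unwatch_socket(int fd)
{
  long slot = find_slot(fd);
  if(slot >= 0)
    g_fds[slot] = g_fds[--g_nfds];
}
```

The main loop would then call poll(g_fds, g_nfds, timeout) and, for
each entry with revents set, pass that socket to
curl_multi_socket_action. The timeout would come from the timer
callback set with CURLMOPT_TIMERFUNCTION.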

Anyway, I appreciate any thoughts anyone has, because I know I spent
forever trying to figure this issue out, even if I can now work around
it.

--Nick
Received on 2008-07-13