Re: Curl coredump in autobuild

From: Tor Arntsen <tor_at_spacetec.no>
Date: Tue, 2 Sep 2008 17:47:50 +0200

Daniel wrote:
>Perhaps you can set a break-point in there for the failing test and see how
>things look when Curl_connect() gets called and the following functions that
>will be used from that point onwards.

It's problematic to single-step this: because of the optimization,
most variables can't be printed (they're presumably held in registers
or optimized away by other means), and return values from e.g. malloc
appear to be NULL although the code path behaves as if they weren't
(and they probably weren't). So it's very difficult to find anything.
And with -O0 everything works ok.

Yang Tse <yangsita_at_gmail.com> wrote:
> Another idea...
>
> Trying CFLAGS="-O2 -no-ansi-alias" or CFLAGS="-O2 -fno-omit-frame-pointer" ??

Well, that's slightly interesting: with -no-ansi-alias it still
coredumps at the same point, but gdb shows just rubbish for the 'ai'
structure pointer (in Curl_freeaddrinfo).

With -fno-omit-frame-pointer it still crashes, but now in an interesting way:

Program terminated with signal 11, Segmentation fault.
#0 Curl_freeaddrinfo (ai=0x8070fd0) at hostip.c:565
565 free(ai->ai_canonname);
(gdb) where
#0 Curl_freeaddrinfo (ai=0x8070fd0) at hostip.c:565
#1 0xb7f9a394 in freednsentry (freethis=0x8071018) at hostip.c:516
#2 0xb7fb8924 in hash_element_dtor (user=0x80708b0, element=0x8071028)
    at hash.c:45
#3 0xb7fb879b in Curl_llist_remove (list=0x8070938, e=0x8071050,
    user=0x80708b0) at llist.c:116
#4 0xb7fb8737 in Curl_llist_destroy (list=0x8070938, user=0x80708b0)
    at llist.c:128
#5 0xb7fb8d12 in Curl_hash_clean (h=0x80708b0) at hash.c:235
#6 0xb7fb8cc3 in Curl_hash_destroy (h=0x80708b0) at hash.c:272
#7 0xb7fa8826 in Curl_close (data=0x8068028) at url.c:471
#8 0xb7fb6826 in curl_easy_cleanup (curl=0x8068028) at easy.c:553
#9 0x0804be60 in operate (config=0xfcba64b7, argc=134680528, argv=0x8071008)
    at main.c:5038
#10 0x0804a437 in main (argc=10, argv=0xbf89b334) at main.c:5098

(gdb) print *ai
$1 = {ai_flags = 0, ai_family = 2, ai_socktype = 1, ai_protocol = 0,
  ai_addrlen = 16, ai_canonname = 0x8071008 "127.0.0.1", ai_addr = 0x8070ff0,
  ai_next = 0x0}

Notice how ai_canonname now actually has a valid pointer, with the
expected value.
(gdb) print ai->ai_canonname
$2 = 0x8071008 "127.0.0.1"

(Without -fno-omit-frame-pointer ai_canonname was an address that
couldn't be printed.)

So the code fails with a segmentation fault when it calls free() on
that pointer. I can't see how that can happen as long as the pointer
can be printed by gdb. If it wasn't a properly malloc'ed pointer I can
see how free() could fail internally, but not how that would give a
SIGSEGV, unless the error is actually in the address of the free()
call itself. That would be similar to the earlier problems I've seen
with icc 9.1 and newer, which we discussed a bit a few months back:
apparently the call to calloc() failed with a sig11.
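
For readers following along, the kind of cleanup loop that the crashing
statement belongs to can be sketched roughly as below. This is not the
actual hostip.c code: the struct and the function names (sketch_addrinfo,
sketch_new, sketch_freeaddrinfo) are hypothetical stand-ins, and only the
two fields relevant here are kept. The point is simply that free() on a
name that really came from malloc(), followed by free() on the node,
should be harmless in a correctly built binary.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for Curl_addrinfo; only the fields used here. */
struct sketch_addrinfo {
  char *ai_canonname;              /* heap-allocated host name, or NULL */
  struct sketch_addrinfo *ai_next; /* next node in the list, or NULL */
};

/* Build one node as two separate heap blocks: the node and its name. */
static struct sketch_addrinfo *sketch_new(const char *name,
                                          struct sketch_addrinfo *next)
{
  struct sketch_addrinfo *ai = calloc(1, sizeof(*ai));
  if(!ai)
    return NULL;
  ai->ai_canonname = malloc(strlen(name) + 1);
  if(!ai->ai_canonname) {
    free(ai);
    return NULL;
  }
  strcpy(ai->ai_canonname, name);
  ai->ai_next = next;
  return ai;
}

/* Walk the list, releasing each name and node; returns the nodes freed. */
static int sketch_freeaddrinfo(struct sketch_addrinfo *ai)
{
  int freed = 0;
  while(ai) {
    struct sketch_addrinfo *next = ai->ai_next; /* save before freeing */
    free(ai->ai_canonname);                     /* free(NULL) is a no-op */
    free(ai);
    ai = next;
    freed++;
  }
  return freed;
}
```

Building this small case with the same compiler and flags as the failing
build is one way to check whether a plain free() call misbehaves outside
of libcurl too.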

I've also tried with -fno-builtin and -fno-alias and some more options,
with no difference, and I also tried options for different structure
alignments (as I noticed that alignments seemed to have changed from
icc 9.0, when I looked at assembly code output). Still, no difference.

There's a -fstack-security-check option which I have tested as well,
with no effect.

Lastly, with -O1 instead of -O2, it fails in a different way (this is
for test 1):

Program terminated with signal 11, Segmentation fault.
#0 0xb7edb39f in Curl_he2ai (he=0xbff286bc, port=8990) at hostip.c:694
694 ai = calloc(1, sizeof(Curl_addrinfo) + ss_size);
(gdb) where
#0 0xb7edb39f in Curl_he2ai (he=0xbff286bc, port=8990) at hostip.c:694
#1 0xb7edb33f in Curl_ip2addr (num=16777343, hostname=0x806ae58 "127.0.0.1",
    port=8990) at hostip.c:630
#2 0xb7efad03 in Curl_getaddrinfo (conn=0x806a9c0,
    hostname=0x806ae58 "127.0.0.1", port=8990, waitp=0xbff28734)
    at hostip4.c:142
#3 0xb7edb116 in Curl_resolv (conn=0x806a9c0, hostname=0x806ae58 "127.0.0.1",
    port=8990, entry=0xbff288a0) at hostip.c:452
#4 0xb7eeb773 in resolve_server (data=0x8062028, conn=0x806a9c0,
    addr=0xbff28af8, async=0xbff28b38 "") at url.c:3898
#5 0xb7eec012 in create_conn (data=0x8062028, in_connect=0xbff28b60,
    addr=0xbff28af8, async=0xbff28b38 "") at url.c:4383
#6 0xb7eec1e2 in Curl_connect (data=0x8062028, in_connect=0xbff28b60,
    asyncp=0xbff28b38 "",
    protocol_done=0xbff28b3c "\001qð·h\213ò¿\211=ï·( \006\b`\213ò¿|\213ò¿\230qð·( \006\b") at url.c:4512
#7 0xb7ef3c4a in connect_host (data=0x8062028, conn=0xbff28b60)
    at transfer.c:2357
#8 0xb7ef3d89 in Curl_perform (data=0x8062028) at transfer.c:2438
#9 0xb7ef44ee in curl_easy_perform (curl=0x8062028) at easy.c:538
#10 0x0804f931 in operate (config=0xbff28f70, argc=10, argv=0xbff291b4)
    at main.c:4792
#11 0x0804fef5 in main (argc=10, argv=0xbff291b4) at main.c:5098

This reminds me of the problem I spoke about earlier (which was a
while back), where the call to 'calloc' itself seemed to fail.
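
The allocation pattern on the failing line, a single calloc() sized as
the struct plus ss_size extra bytes, can be sketched as follows. The
names (sketch_ai, sketch_alloc) are hypothetical, and placing the address
bytes directly after the struct is an assumption on my part: the earlier
gdb output (ai=0x8070fd0, ai_addr=0x8070ff0, ai_addrlen=16) would be
consistent with it, but this is only an illustration, not the hostip.c
source.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for Curl_addrinfo. */
struct sketch_ai {
  int ai_addrlen;  /* number of address bytes stored after the struct */
  void *ai_addr;   /* points into the same heap block, past the struct */
};

/* One zeroed block holds both the struct and ss_size address bytes. */
static struct sketch_ai *sketch_alloc(const void *addr, size_t ss_size)
{
  struct sketch_ai *ai = calloc(1, sizeof(*ai) + ss_size);
  if(!ai)
    return NULL;
  ai->ai_addr = (char *)ai + sizeof(*ai); /* trailing storage (assumed) */
  ai->ai_addrlen = (int)ss_size;
  memcpy(ai->ai_addr, addr, ss_size);
  return ai;
}
```

With this layout a single free(ai) releases the address bytes as well,
so the only heap calls involved are the one calloc() and the matching
free().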

Quite frustrating.

-Tor
Received on 2008-09-02