curl-library

strange memory "leak" with libcurl, ssl, network errors

From: Michael Schuster <Michael.Schuster_at_sun.com>
Date: Tue, 02 Feb 2010 11:34:40 +0100

Hello,

I'm seeking assistance with a rather problematic issue I've been working on
for a while, and would welcome all/any thoughts, pointers, RTFMs ... you're
willing to share.

TIA
Michael

Executive Summary:
==================
When trying to connect or talk to an endpoint over SSL in the presence of
(network) errors, (lib)curl seems to be "losing" memory in a non-obvious
way, not detected as a "memory leak" by libumem (on Solaris) or with
openssl's/libcrypto's built-in memory debugging mechanism
(CRYPTO_malloc_debug_init(), etc).

Details:
========
Our application, the agent plugin for the MySQL Proxy, is a long-running
app that receives and sends lots of traffic over the network - one could
say doing this is its "raison d'etre".
To handle the (http) traffic, we employ curl, using plain text transfer or
SSL/TLS, depending on customers' requirements.

One customer noticed that a long-running agent process had amassed over
2GB memory size (measured with pmap); Support noticed that customer's log
contained quite a few network errors.

Tests have shown that indeed, in the presence of "network errors" we
simulated (e.g. by restarting the http server process), the agent's memory
footprint would increase steadily when we were using SSL, and remain
constant when not (when we replaced openssl with gnutls, this behaviour was
even worse).
The fact that we see this type of behaviour with both openssl and gnutls
seems to indicate that it's not the fault of one of these components alone,
but rather of CURL or of the interaction between CURL and whatever SSL
implementation we're using.

We implemented a test program (attached) that uses nothing but calls to
CURL and whatever libraries it uses, and saw the same pattern repeated,
albeit at a lower level.

As a hack, every time such a network error occurs, we call
curl_easy_cleanup() to tear down the curl state, and then re-initialise it
with curl_easy_init(), but this does not show significant improvement in
the test case.

Analyses using both libumem on OpenSolaris and valgrind on Linux have not
uncovered any memory leaks that look in any way responsible for the amount
of memory being consumed (or for the rate!).

Using openssl's/libcrypto's built-in memory debugging mechanism
(CRYPTO_malloc_debug_init(), etc) did not reveal any leaks in the test
program either.

We also used libumem without any debugging and still saw heap growth as
long as we observed the programs - we wouldn't expect a slab allocator to
suffer heap fragmentation over extended periods of time.

See the following appendices for the test scenario.

APPENDIX A: Test Setup
======================
I used this setup on OpenSolaris; something similar probably works on Linux
as well.

- server side:
    + create a simple html file "index.html"
    + run this shell script:

        while [ 1 ]; do
                openssl s_server -cert mycert.pem -WWW & SERVPID=$!
                sleep 120
                kill $SERVPID
                echo -n 'killed ... '; date;
                sleep 10;
                echo -n 'restarting ... '; date;
        done

    in the directory where "index.html" is located.

- client side:
    + run "curltest" thus:
            curltest -u https://<server>:4433/index.html -r -p
     (option -s can be used to turn off ssl use and use plain text only)

- to observe: once in a while (every 5-10 minutes), run
        $ pmap `pgrep curltest` | egrep 'total|heap'
    to observe the memory footprint.
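
Two such readings can be turned into an average growth rate with a small
helper. This is only a sketch, not part of the original test setup: the
function name growth_kb_per_hour and the sample figures are made up for
illustration, and it assumes the sizes are read off the pmap output in KB.

```shell
#!/bin/sh
# Hypothetical helper (not from the original test program).
# growth_kb_per_hour START_KB END_KB ELAPSED_SECONDS
# Prints the average growth between two readings in KB per hour,
# using integer shell arithmetic (the result is truncated).
growth_kb_per_hour() {
    echo $(( ($2 - $1) * 3600 / $3 ))
}

# Illustrative figures only: 10000 KB at the start, 10220 KB two hours later.
growth_kb_per_hour 10000 10220 7200
```

Taking the first reading some time after startup should keep one-off
startup allocations out of the rate.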

APPENDIX B: Numbers
===================
In our test setup, we see almost uniform growth numbers, whether we use
libumem or libc, and whether we employ the "cleanup after curl error" hack
or not: somewhere between 100 and 120 KB/h. (I did not invest the effort to
get measurements at precise points in a test run, so I may have captured
some startup overhead in some measurements but not in others.)
Extrapolated, this leads to over 70MB memory growth per month.
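
The extrapolation can be checked with a few lines of shell arithmetic. The
function name monthly_growth_kb is hypothetical, and the sketch assumes a
30-day month and a constant growth rate.

```shell
#!/bin/sh
# Hypothetical helper: project an hourly growth rate (KB/h) onto a
# 30-day month (assumption: the rate stays constant over that period).
monthly_growth_kb() {
    echo $(( $1 * 24 * 30 ))
}

# Lower and upper end of the measured range, in KB per month.
monthly_growth_kb 100
monthly_growth_kb 120
```

The lower bound comes to 72000 KB, which is a little over 70 MB when
divided by 1024; the upper bound comes to 86400 KB, a little over 84 MB.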

APPENDIX C: More Details
========================
To identify the calls that actually cause heap growth, I used DTrace to
show the call stack whenever "sbrk" was called, here's a relevant snippet:

                libc.so.1`sbrk
                libc.so.1`_morecore+0xfc
                libc.so.1`_malloc_unlocked+0x17f
                libc.so.1`malloc+0x35
                libc.so.1`_findbuf+0xcc
                libc.so.1`fgets+0x98
                libcrypto.so.0.9.8`file_gets+0x2c
                libcurl.so.4.1.1`ossl_connect_common+0xb1
                libcurl.so.4.1.1`Curl_ossl_connect+0x47
                libcurl.so.4.1.1`Curl_ssl_connect+0x63

These calls to fgets() seem to be the trigger for almost all cases of heap
growth, though they are not a problem as such (fgets is called much more
often than it actually extends the heap). To accommodate the data read from
a file, fgets needs storage, which it allocates once for every file; this
storage is freed when the file is closed.
The obvious conclusion for libc is "heap fragmentation", which is not
surprising given the amount of malloc/free activity in curl (and even more
so in glib).

The interesting thing is that libumem's heap extensions also happen when
fgets is called:

                libc.so.1`_brk_unlocked
                libc.so.1`_sbrk_grow_aligned+0xb0
                libumem.so.1`vmem_sbrk_alloc+0x82
                libumem.so.1`vmem_xalloc+0x4e8
                libumem.so.1`vmem_alloc+0x169
                libumem.so.1`vmem_xalloc+0x4e8
                libumem.so.1`vmem_alloc+0x169
                libumem.so.1`umem_alloc+0x7e
                libumem.so.1`malloc+0x2a
                libc.so.1`_findbuf+0xcc
                libc.so.1`fgets+0x98
                libcrypto.so.0.9.8`file_gets+0x2c
                [..]
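
For readers who want to collect similar stacks, one possible DTrace
invocation is sketched below. It is an assumption, not the script used for
the traces above: it probes the brk system call rather than the libc sbrk
function so that heap extensions made by both libc and libumem are caught,
and the names HEAP_GROWTH_D and trace_heap_growth are made up for
illustration.

```shell
#!/bin/sh
# D program (assumed, not the original): print the user-level stack on
# every brk system call made by the traced process. $target is set by
# dtrace when a process is attached with -p.
HEAP_GROWTH_D='syscall::brk:entry /pid == $target/ { ustack(); }'

# Hypothetical wrapper: attach to a running process by PID.
# Requires DTrace privileges on the system under test.
trace_heap_growth() {
    dtrace -n "$HEAP_GROWTH_D" -p "$1"
}

# Usage (not run here):
#   trace_heap_growth "`pgrep curltest`"
```

Each printed stack then shows which caller triggered the heap extension,
as with the fgets frames in the snippets above.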

-- 
Michael Schuster        http://blogs.sun.com/recursion
Recursion, n.: see 'Recursion'





Received on 2010-02-02