curl / Mailing Lists / curl-library / Single Mail
Buy commercial curl support from WolfSSL. We help you work out your issues, debug your libcurl applications, use the API, port to new platforms, add new features and more. With a team lead by the curl founder himself.

dynbuf: PR #5300

From: Daniel Stenberg via curl-library <curl-library_at_cool.haxx.se>
Date: Thu, 30 Apr 2020 15:10:32 +0200 (CEST)

Hi team,

In PR 5300 I'm introducing a library-wide change that I wanted to make you all
aware of!

# Background

1 - Looking back, several of our past security problems have been related to
reallocs and ridiculous string lengths.

2 - It is rather ineffective to feature multiple almost identical code parts
for virtually the same functionality when we can unify and use the same simple
set of functions everywhere instead.

# Introducing dynbuf

'dynbuf' is a new series of functions to create, manage and free "dynamic
buffers" in curl. This means a buffer/string that can be appended to
dynamically and that then can grow the allocation automatically as needed.

We have functions to init the buffer, appending data to it in three different
ways and to get a pointer and length to it etc.

The details is explained in the new docs/DYNBUF.md file. (check the PR for it,
as I might rebase the branch again and I don't want to provide a link that
might die soon)

The function set and API is of course only used for libcurl internals, it is
not exposed or user-visible anywhere.

# Maximum size for everyone

This brings a new things: each dynamic buffer MUST have a maximum size set
when created and the functions will return an error if the limit is reached.
No buffer using this API risks accidentally growing larger than we could
anticipate. The risk we exchange this for is instead the reversed: that we set
a too low maximum length for a particular buffer but I think that's less of a
security risk and instead becomes more of a regular bug.

# Moving things over

Within the PR I've moved over most of all previous realloc-buffer code to this
new concept. It actually turns out to be a net gain as I can remove more code
than this introduces and in general code that uses this new concept ends up
more readable (subjective of course) than it was before!

In my basic test "count number of memory allocations done by curl for
downloading an 8GB HTTPS file", the new code actually ends up doing fewer
allocations than before but not by much.

As a bonus, I also moved over the HTTP CONNECT logic that previously used a
fixed 16K buffer to use dynbuf instead, which has the fine property that a
standard 'curl *' easy handle now is 16KB smaller by default.

# Outstanding

- Within the PR, Marc Hörsken mentioned the schannel code which has its own
realloc() logic and I'm convinced that could also be moved over and benefit
from it, but that's work left to be done outside of this PR.

- Both vssh/libssh.c and vssh/libssh2.c have realloc buffers that can be moved
over, and I might have a go at that after the initial dynbuf lands.

- There are some additional uses of realloc() too that probably can be
addressed.

# Feedback?

Reply here or comment in the PR. My plan is to land this thing during next
week unless we run into issues.

https://github.com/curl/curl/pull/5300

-- 
  / daniel.haxx.se | Commercial curl support up to 24x7 is available!
                   | Private help, bug fixes, support, ports, new features
                   | https://www.wolfssl.com/contact/

-------------------------------------------------------------------
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library
Etiquette: https://curl.haxx.se/mail/etiquette.html
Received on 2020-04-30