Re: dynbuf: PR #5300
Date: Thu, 30 Apr 2020 09:41:22 -0700
On 4/30/2020 6:10 AM, Daniel Stenberg via curl-library wrote:
> Hi team,
>
> In PR 5300 I'm introducing a library-wide change that I wanted to make
> you all aware of!
>
> # Background
>
> 1 - Looking back, several of our past security problems have been
> related to reallocs and ridiculous string lengths.
>
> 2 - It is rather ineffective to feature multiple almost identical code
> parts for virtually the same functionality when we can unify and use
> the same simple set of functions everywhere instead.
>
> # Introducing dynbuf
>
> 'dynbuf' is a new series of functions to create, manage and free
> "dynamic buffers" in curl. This means a buffer/string that can be
> appended to dynamically and that then can grow the allocation
> automatically as needed.
>
> We have functions to init the buffer, appending data to it in three
> different ways and to get a pointer and length to it etc.
>
> The details is explained in the new docs/DYNBUF.md file. (check the PR
> for it, as I might rebase the branch again and I don't want to provide
> a link that might die soon)
>
> The function set and API is of course only used for libcurl internals,
> it is not exposed or user-visible anywhere.
>
> # Maximum size for everyone
>
> This brings a new things: each dynamic buffer MUST have a maximum size
> set when created and the functions will return an error if the limit
> is reached. No buffer using this API risks accidentally growing larger
> than we could anticipate. The risk we exchange this for is instead the
> reversed: that we set a too low maximum length for a particular buffer
> but I think that's less of a security risk and instead becomes more of
> a regular bug.
>
> # Moving things over
>
> Within the PR I've moved over most of all previous realloc-buffer code
> to this new concept. It actually turns out to be a net gain as I can
> remove more code than this introduces and in general code that uses
> this new concept ends up more readable (subjective of course) than it
> was before!
>
> In my basic test "count number of memory allocations done by curl for
> downloading an 8GB HTTPS file", the new code actually ends up doing
> fewer allocations than before but not by much.
>
> As a bonus, I also moved over the HTTP CONNECT logic that previously
> used a fixed 16K buffer to use dynbuf instead, which has the fine
> property that a standard 'curl *' easy handle now is 16KB smaller by
> default.
>
> # Outstanding
>
> - Within the PR, Marc Hörsken mentioned the schannel code which has
> its own realloc() logic and I'm convinced that could also be moved
> over and benefit from it, but that's work left to be done outside of
> this PR.
>
> - Both vssh/libssh.c and vssh/libssh2.c have realloc buffers that can
> be moved over, and I might have a go at that after the initial dynbuf
> lands.
>
> - There are some additional uses of realloc() too that probably can be
> addressed.
>
> # Feedback?
>
> Reply here or comment in the PR. My plan is to land this thing during
> next week unless we run into issues.
>
> https://github.com/curl/curl/pull/5300
>
Years ago I got tired of sprintf() buffer overflow problems and wrote my
own version of sprintf(), passing in a function pointer block. The
first use was to reallocate sprintf() buffers on the fly; the second use
was to sprintf() into C++ objects. Naturally I think that dynbuf is a
great idea. Won't have time to review its code, unfortunately.
-- David Chapman dcchapman_at_acm.org Chapman Consulting -- San Jose, CA EDA Software Developer, Expert Witness www.chapman-consulting-sj.com 2018-2019 Chair, IEEE Consultants' Network of Silicon Valley ------------------------------------------------------------------- Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library Etiquette: https://curl.haxx.se/mail/etiquette.htmlReceived on 2020-04-30