curl-library

Re: Let the user specify the download buffer

From: Cristian Morales Vega <cristian_at_samknows.com>
Date: Tue, 1 Apr 2014 15:30:03 +0100

On 31 March 2014 18:47, Daniel Stenberg <daniel_at_haxx.se> wrote:
> On Mon, 31 Mar 2014, Cristian Morales Vega wrote:
>
>> I would like to have a "CURLOPT_BUFFER" option, so I can instruct libcurl
>> to copy the data there directly. What are your thoughts about adding
>> it/accepting a patch?
>
>
> We've discussed such alternatives many times and I'll be willing to do it
> again.

Can you point me to those discussions? I am sure there are
problems/use cases I am unaware of.

> Since libcurl may download an infinitely large amount of data, just pointing
> out a single buffer is not enough. It would have to be the first in a series
> of buffers. The question is then how libcurl spends the buffer(s) and how
> you give it new/more buffers to fill.

I was thinking about simply changing (struct UrlState).buffer from an
array to a pointer.
I am not trying to solve the problem of downloading an arbitrarily big
file to memory for later use, which I guess is what you are thinking
about. For such a thing, at least on Linux, I guess the future
"memfd"s plus splice() could be used.
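
To make the array-to-pointer idea concrete, here is a minimal sketch in C. It is not libcurl's actual code: `struct xfer_state`, `xfer_init()`, `xfer_set_buffer()` and the 16384-byte size are all hypothetical names and values chosen for illustration, and no CURLOPT_BUFFER option exists in libcurl as far as this thread is concerned. The sketch only shows that a pointer member can default to internal storage and later be redirected to a caller-supplied buffer.

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_BUFSIZE 16384 /* illustrative size, an assumption */

/* Hypothetical, simplified stand-in for libcurl's internal transfer state. */
struct xfer_state {
    char   internal[DEFAULT_BUFSIZE]; /* storage that used to be "the" buffer */
    char  *buffer;                    /* now a pointer: where received data lands */
    size_t buffer_size;               /* usable bytes behind `buffer` */
};

/* Point at the built-in storage, so the default behaviour is unchanged. */
static void xfer_init(struct xfer_state *s)
{
    s->buffer = s->internal;
    s->buffer_size = sizeof(s->internal);
}

/* What a hypothetical CURLOPT_BUFFER-style option could do internally.
   Passing NULL restores the internal buffer. */
static void xfer_set_buffer(struct xfer_state *s, char *user_buf, size_t len)
{
    if (user_buf == NULL) {
        xfer_init(s);
        return;
    }
    assert(len > 0); /* a zero-length receive buffer would be unusable */
    s->buffer = user_buf;
    s->buffer_size = len;
}
```

With this layout the receive path keeps reading into `s->buffer` for at most `s->buffer_size` bytes and does not care who owns the memory. The application stays responsible for keeping its buffer alive for the whole transfer.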

> Also, in the attempt to avoid memcpy'ing data in your application, it is
> important to realize that it won't be easy to avoid them in libcurl -
> I'm suggesting that it easily will just make the copy get done within
> libcurl instead. Things like (abstracted) SSL and chunked encoding for
> example make it very hard to avoid that.

I am not using SSL here. I must admit I am totally unaware of the
problems it adds.
But aren't these *extra* copies? The data still ends up in (struct
UrlState).buffer and any copy from it can still be avoided, no?

> If you have any ideas, please let us know and we can go from there. It is
> easier to discuss around and improve an actual suggestion.
>
> Lastly, you say this is a performance consideration. Then I presume you
> imply that doing a memcpy in the callback is too much? By which measurement?
> And what's considered good enough?

My measurement is being able to let libavformat do all its processing
within the callback on a BCM4716A at 457 MHz. With the memcpy()s, libcurl
is stalled for too long, and by the time recv() is called the TCP
receive window is already full.
To be honest, the only thing av_malloc() does is make sure the
memory is properly aligned (perhaps letting ffmpeg use SIMD). In my
simple tests so far I have been able to make libavformat parse the
buffer provided by curl, so I don't have a "real" problem yet.
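
As a rough illustration of the two approaches compared above, the following C sketch shows two callbacks with the same signature libcurl uses for CURLOPT_WRITEFUNCTION. It does not link against libcurl or ffmpeg: `struct consumer`, `parse_chunk()`, `copying_cb()`, `inplace_cb()` and `is_aligned()` are hypothetical names, and `parse_chunk()` is only a placeholder for the libavformat parsing step. The alignment a real parser wants is build-dependent, so any value passed to `is_aligned()` is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical consumer standing in for the parsing step. */
struct consumer {
    size_t bytes_seen;    /* total bytes handed to the parser */
    size_t copies_made;   /* number of memcpy() calls performed */
    char   staging[4096]; /* private buffer used by the copying variant */
};

/* True if p is aligned to `align` bytes; align must be a power of two. */
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Placeholder parser: only counts the bytes it was given. */
static void parse_chunk(struct consumer *c, const char *data, size_t len)
{
    (void)data;
    c->bytes_seen += len;
}

/* Variant 1: copy the received data into our own buffer, then parse it. */
static size_t copying_cb(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    struct consumer *c = userdata;
    size_t len = size * nmemb;

    if (len > sizeof(c->staging))
        return 0; /* returning less than len signals an error to libcurl */
    memcpy(c->staging, ptr, len);
    c->copies_made++;
    parse_chunk(c, c->staging, len);
    return len;
}

/* Variant 2: parse directly from the buffer the library hands us. */
static size_t inplace_cb(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    struct consumer *c = userdata;
    size_t len = size * nmemb;

    parse_chunk(c, ptr, len);
    return len;
}
```

The in-place variant skips the memcpy() but depends on the parser tolerating whatever alignment the library's buffer happens to have. A check such as `is_aligned(ptr, 16)` inside the callback is one way to find out whether that holds on a given platform.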

-- 
Cristian Morales Vega
Received on 2014-04-01