curl-library

Re: Info request about the zero copy interface (2)

From: Jamie Lokier <jamie_at_shareable.org>
Date: Mon, 5 Dec 2005 16:23:22 +0000

Legolas wrote:
> >>MainLoop:
> >> received_size = recv(yoursocket, internal_buffer,
> >> internal_buffer_size, yourflags);
> >> buffer_size = forecast_size(received_size);
> >> /* forecast someway by libcurl */
> >> buffer = write_buffer(custom_data, &buffer_size);
> >> /* application may return a bigger buffer */
> >> ... (decode SSL, join chunks...)
> >> /* work on received data putting final result data into 'buffer'
> >> */
> >> write_callback(custom_data, final_size);
> >> /* previous code must set 'final_size' to the size of data
> >> written to 'buffer' */
> >>
> >>The line followed by the comment "work on received data putting final
> >>result data into 'buffer'" copies data into 'buffer'. That copy is
> >>not necessary. In what way is this doing "zero-copy"?
> >>
> Does not copy! I have used the verb _putting_, i.e. each final result
> byte or block is written directly to the 'buffer' by the algorithm
> intended to be in place of the ellipsis (...). Overall, there is no copy
> at the end of the work process (otherwise I would have expressely
> pointed out it).

Ah, but I think "each final result byte or block is written" is itself
sometimes an unnecessary copy :)

Sometimes, it's unavoidable. If we use zlib, or the openssl library,
then they will always write their data to a caller-specified buffer,
so the above does not cause any extra copying in that case.
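
As a rough illustration of that case, here is a minimal sketch in C. A toy
run-length decoder stands in for a real library call; the name
rle_decode_into and its (count, byte) input format are made up for this
example and are not part of zlib, OpenSSL or libcurl. The point is only
that the decoder's output destination is whatever pointer the caller
supplies, so passing the application's buffer adds no copy beyond the one
the decoder makes anyway.

```c
#include <stddef.h>

/* Toy stand-in for a library decoder (hypothetical, for illustration).
 * Input is a sequence of (count, byte) pairs. Output is written straight
 * into the caller-specified buffer 'out'.
 * Returns the number of bytes written, or 0 if the input is malformed
 * or 'out' is too small (also 0 for empty input). */
static size_t rle_decode_into(const unsigned char *in, size_t in_len,
                              unsigned char *out, size_t out_cap)
{
    size_t written = 0;

    if (in_len % 2 != 0)
        return 0;                       /* pairs only */

    for (size_t i = 0; i < in_len; i += 2) {
        size_t run = in[i];

        if (run > out_cap - written)
            return 0;                   /* would overflow caller's buffer */

        for (size_t k = 0; k < run; k++)
            out[written++] = in[i + 1]; /* the one unavoidable write */
    }
    return written;
}
```

If 'out' is the buffer obtained from the application, the decoded bytes
land there directly and nothing needs to be moved afterwards.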

However, for chunked decoding, putting "each final result byte" in
'buffer' means copying the bytes from 'internal_buffer'. For small
runs of bytes, that is not significant (other overheads dominate).
But for large runs, such as large HTTP chunks, it is a notable extra
copy. The same applies to recv() blocks that contain partly HTTP
headers and partly data.
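
To make the alternative concrete, here is a minimal sketch in C of handing
out a chunk payload as a pointer and length inside the receive buffer,
instead of copying it into a second buffer. The names chunk_view and
next_chunk are hypothetical, not libcurl API. The sketch assumes the whole
chunk is already in the buffer, ignores chunk extensions and trailers, and
limits the size field to 8 hex digits.

```c
#include <stddef.h>

/* Hypothetical view of one chunk payload: it points INTO the receive
 * buffer, so no payload bytes are copied. */
struct chunk_view {
    const char *data;   /* start of payload inside the receive buffer */
    size_t len;         /* payload length in bytes */
};

/* Parse one chunk "<hex-size>\r\n<payload>\r\n" at the start of buf.
 * Returns the number of bytes consumed, or 0 if the chunk is incomplete
 * or malformed. */
static size_t next_chunk(const char *buf, size_t buf_len,
                         struct chunk_view *out)
{
    size_t eol = 0, size = 0, header_len, rest;
    int found = 0;

    /* Find the CRLF that ends the size line. */
    for (size_t i = 0; i + 1 < buf_len; i++) {
        if (buf[i] == '\r' && buf[i + 1] == '\n') {
            eol = i;
            found = 1;
            break;
        }
    }
    if (!found || eol == 0 || eol > 8)
        return 0;

    /* Decode the hexadecimal chunk size. */
    for (size_t i = 0; i < eol; i++) {
        char c = buf[i];
        size_t d;

        if (c >= '0' && c <= '9')      d = (size_t)(c - '0');
        else if (c >= 'a' && c <= 'f') d = (size_t)(c - 'a') + 10;
        else if (c >= 'A' && c <= 'F') d = (size_t)(c - 'A') + 10;
        else return 0;                 /* extensions not handled here */
        size = size * 16 + d;
    }

    header_len = eol + 2;
    rest = buf_len - header_len;
    if (rest < 2 || rest - 2 < size)
        return 0;                      /* payload or final CRLF missing */
    if (buf[header_len + size] != '\r' || buf[header_len + size + 1] != '\n')
        return 0;

    out->data = buf + header_len;      /* a pointer, not a copy */
    out->len = size;
    return header_len + size + 2;
}
```

The caller would pass out->data and out->len to the application and only
reuse the receive buffer once the application is done with it, which is
where the question of who releases the buffer comes in.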

(That also applies to compression and decryption, when not limited by
the libraries used for those things. For example, there are deflate
implementations which give access directly to the 32k circular buffer
that the algorithm uses. However, curl probably will always use the
standard libraries, which force a certain amount of copying).
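
For what direct access to a circular buffer might look like, here is a
minimal sketch in C. The names span and ring_spans are invented for this
example and do not correspond to any real deflate implementation. A run of
bytes in a circular buffer is either one contiguous region or, if it wraps
past the end, two; the function reports those regions as pointers into the
ring without copying anything.

```c
#include <stddef.h>

/* One contiguous region inside the ring (hypothetical type). */
struct span {
    const unsigned char *ptr;
    size_t len;
};

/* Describe 'len' bytes starting at offset 'start' in a circular buffer
 * of 'cap' bytes as at most two contiguous spans.
 * Returns the number of spans filled in (0, 1 or 2). */
static int ring_spans(const unsigned char *ring, size_t cap,
                      size_t start, size_t len, struct span out[2])
{
    size_t first;

    if (cap == 0 || start >= cap || len == 0 || len > cap)
        return 0;

    first = cap - start;            /* bytes available before the wrap */
    if (first > len)
        first = len;

    out[0].ptr = ring + start;
    out[0].len = first;
    if (first == len)
        return 1;                   /* no wrap: one contiguous region */

    out[1].ptr = ring;              /* remainder continues at the start */
    out[1].len = len - first;
    return 2;
}
```

An application that can accept non-contiguous regions could consume both
spans in place; one that insists on a single contiguous block forces a
copy whenever the data wraps.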

I'm sure you understand, from my points a, b, c, d and e, what I mean
though, so I won't argue more about this.

> In the hypothesis that this buffer interface is adaptable to the used
> algorithms, there would be at least ONE copy less this way...however I
> am just trying to draw up a scheme. I am also giving attention to your
> previous reply: your points a,b,c,d,e are of course to be implemented,
> but I can't figure out the case (e) ...

Unfortunately, I wrote two (e)s - one should have been labelled (f). :)
Do you mean the one about contiguous vs non-contiguous regions, or the
one about negotiating the release of library-supplied buffers?

-- Jamie
Received on 2005-12-05