Buy commercial curl support from WolfSSL. We help you work
out your issues, debug your libcurl applications, use the API, port to new
platforms, add new features and more. With a team lead by the curl founder
himself.
Re: Should CURLOPT_READFUNCTION need EOF twice?
- Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]
From: Jiří Hruška via curl-library <curl-library_at_lists.haxx.se>
Date: Thu, 16 Nov 2023 04:46:13 +0100
> I don't think this behavior was done on purpose and I think it would benefit
> users if we made it not do this.I think it should be sufficient for the
> callback to signal the end of the read *once*.
Thanks for the clarification. Actually I've found even other hints that this
was the case since sending the message, for example:
tests/libtest/lib670.c:
fprintf(stderr, "Read callback called after EOF\n");
> Fixing the code probably requires a better separation between end-of-input
> and end-of-upload than what that function does right now.
Yeah. And it should probably happen already in the inner Curl_fillreadbuffer(),
in order to be able to give a strong guarantee to users of CURLOPT_READFUNCTION
in general.
What do you think about a new BIT(fread_func_eof) in RequestState and
something like this?
if(!data->req.fread_func_eof) {
Curl_set_in_callback(data, true);
nread = data->state.fread_func(data->req.upload_fromhere, 1,
buffersize, data->state.in);
Curl_set_in_callback(data, false);
/* make sure the callback is not called again after EOF */
data->req.fread_func_eof = (nread == 0);
}
else
nread = 0;
> Writing up a test case to reproduce this scenario might take some creativity.
> Do you have any special setup that makes this happen more likely than others?
I ran the whole test suite with the code above and got no hits of the `else`
branch (with some extra code to silently create a marker file for me to see).
Then I found that it is this part of readwrite_upload() which limits the
exposure of this bug:
827: if(0 != k->upload_present &&
828: k->upload_present < curl_upload_refill_watermark(data) &&
829: !k->upload_chunky &&/*(variable sized chunked header;
append not safe)*/
830: !k->upload_done && /*!(k->upload_done once k->upload_present sent)*/
831: !(k->writebytecount + k->upload_present - k->pendingheader ==
832: data->state.infilesize)) {
833: offset = k->upload_present;
834: }
So it will happen only if:
#1: uploading something with an unknown size (lines 832-832)
#2: using HTTP/2 (because otherwise no size implies TE: chunked, line 829)
#3: the buffer contains between 1..(¹/₃₂ · upload_buffer_size) at the end of
the upload, i.e. 3.125% chance for an arbitrary length (line 828)
With this knowledge, I can reproduce it at will:
$ cat tests/http/gen/apache/docs/data-10k | \
CURL_SMALLSENDS=2000 ./src/curl --upload-file - --http2 \
--insecure --url https://127.0.0.1:45779/curltest/put
...
readwrite_upload(): k->upload_present=0, offset=0
CURLOPT_READFUNCTION called, fillcount=10240
readwrite_upload(): k->upload_present=8240, offset=0
readwrite_upload(): k->upload_present=6240, offset=0
readwrite_upload(): k->upload_present=4240, offset=0
readwrite_upload(): k->upload_present=2240, offset=0
readwrite_upload(): k->upload_present=240, offset=240
CURLOPT_READFUNCTION called, fillcount=0
readwrite_upload(): k->upload_present=0, offset=0
CURLOPT_READFUNCTION called, fillcount=0
...
It has a piped `cat` and `--upload-file -` to address #1, `--http2` to address
#2, and the library must be compiled with CURLDEBUG to be able to do
CURL_SMALLSENDS=2000 to address #3. (UPLOADBUFFER_DEFAULT is 65536, divided by
32 gives 2048, so using 2000 makes sure that the condition always applies.)
Doing the same in tests is tricky, though. I got no hits for the `else` branch
in the whole test suite with CURL_SMALLSENDS=2000 either.
What tests/http/test_07_upload.py does comes the closest, but the tests upload
bigger data from files, so they do not satisfy the unknown upload size
condition #1.
When I wrote a new test that used `-T -` and a pipe instead, the source data
sizes like 100k or 10M could not hit #3 without CURL_SMALLSENDS either. That
seems to be the hardest condition to satisfy in general, because it really
depends on how the I/O behaves. If the network can take up to 64k bytes at
a time, getting the right amount of data to remain in the buffer at EOF is
very tricky.
Do you have any more creative ideas to try?
Or would a manual test with CURL_SMALLSEND be good enough?
Date: Thu, 16 Nov 2023 04:46:13 +0100
> I don't think this behavior was done on purpose and I think it would benefit
> users if we made it not do this.I think it should be sufficient for the
> callback to signal the end of the read *once*.
Thanks for the clarification. Actually I've found even other hints that this
was the case since sending the message, for example:
tests/libtest/lib670.c:
fprintf(stderr, "Read callback called after EOF\n");
> Fixing the code probably requires a better separation between end-of-input
> and end-of-upload than what that function does right now.
Yeah. And it should probably happen already in the inner Curl_fillreadbuffer(),
in order to be able to give a strong guarantee to users of CURLOPT_READFUNCTION
in general.
What do you think about a new BIT(fread_func_eof) in RequestState and
something like this?
if(!data->req.fread_func_eof) {
Curl_set_in_callback(data, true);
nread = data->state.fread_func(data->req.upload_fromhere, 1,
buffersize, data->state.in);
Curl_set_in_callback(data, false);
/* make sure the callback is not called again after EOF */
data->req.fread_func_eof = (nread == 0);
}
else
nread = 0;
> Writing up a test case to reproduce this scenario might take some creativity.
> Do you have any special setup that makes this happen more likely than others?
I ran the whole test suite with the code above and got no hits of the `else`
branch (with some extra code to silently create a marker file for me to see).
Then I found that it is this part of readwrite_upload() which limits the
exposure of this bug:
827: if(0 != k->upload_present &&
828: k->upload_present < curl_upload_refill_watermark(data) &&
829: !k->upload_chunky &&/*(variable sized chunked header;
append not safe)*/
830: !k->upload_done && /*!(k->upload_done once k->upload_present sent)*/
831: !(k->writebytecount + k->upload_present - k->pendingheader ==
832: data->state.infilesize)) {
833: offset = k->upload_present;
834: }
So it will happen only if:
#1: uploading something with an unknown size (lines 832-832)
#2: using HTTP/2 (because otherwise no size implies TE: chunked, line 829)
#3: the buffer contains between 1..(¹/₃₂ · upload_buffer_size) at the end of
the upload, i.e. 3.125% chance for an arbitrary length (line 828)
With this knowledge, I can reproduce it at will:
$ cat tests/http/gen/apache/docs/data-10k | \
CURL_SMALLSENDS=2000 ./src/curl --upload-file - --http2 \
--insecure --url https://127.0.0.1:45779/curltest/put
...
readwrite_upload(): k->upload_present=0, offset=0
CURLOPT_READFUNCTION called, fillcount=10240
readwrite_upload(): k->upload_present=8240, offset=0
readwrite_upload(): k->upload_present=6240, offset=0
readwrite_upload(): k->upload_present=4240, offset=0
readwrite_upload(): k->upload_present=2240, offset=0
readwrite_upload(): k->upload_present=240, offset=240
CURLOPT_READFUNCTION called, fillcount=0
readwrite_upload(): k->upload_present=0, offset=0
CURLOPT_READFUNCTION called, fillcount=0
...
It has a piped `cat` and `--upload-file -` to address #1, `--http2` to address
#2, and the library must be compiled with CURLDEBUG to be able to do
CURL_SMALLSENDS=2000 to address #3. (UPLOADBUFFER_DEFAULT is 65536, divided by
32 gives 2048, so using 2000 makes sure that the condition always applies.)
Doing the same in tests is tricky, though. I got no hits for the `else` branch
in the whole test suite with CURL_SMALLSENDS=2000 either.
What tests/http/test_07_upload.py does comes the closest, but the tests upload
bigger data from files, so they do not satisfy the unknown upload size
condition #1.
When I wrote a new test that used `-T -` and a pipe instead, the source data
sizes like 100k or 10M could not hit #3 without CURL_SMALLSENDS either. That
seems to be the hardest condition to satisfy in general, because it really
depends on how the I/O behaves. If the network can take up to 64k bytes at
a time, getting the right amount of data to remain in the buffer at EOF is
very tricky.
Do you have any more creative ideas to try?
Or would a manual test with CURL_SMALLSEND be good enough?
-- Unsubscribe: https://lists.haxx.se/mailman/listinfo/curl-library Etiquette: https://curl.se/mail/etiquette.htmlReceived on 2023-11-16