curl-library

Re: Curl retry without notifying application?

From: Mark Aldred <mark_at_twinstrata.com>
Date: Thu, 27 Aug 2009 15:23:34 -0400

> > My application uses CURLOPT_READFUNCTION and CURLOPT_WRITEFUNCTION to copy
> > data into and out of curl. My callback functions use structures to track how
> > much data has been copied, to maintain sane pointers as the copy progresses.
> > Fairly standard stuff. When curl returns an error to our application, my code
> > resets the pointers used in the data copy and retries the curl request (for
> > retryable errors). All this works fine.

> Well, that's one way to see it. I'd say it doesn't really "retry" requests.
> But it does have some minor ways to detect that a re-used connection dies and
> then it needs to have a way to recover from that to make the app not have to
> care about it etc.

> > Today I encountered a situation where it appears I issued a PUT request to
> > curl. Curl got the 100-continue from the host, invoked my copy function to
> > get all the data, then encountered an error. It seems that curl then retried
> > the operation without returning an error to my application.

> That sounds fishy. In what way did it fail?

This is the error that curl reported. I'll try to explain everything in
detail below.

0x14f7ae0 2009-Aug-26 23:22:24.665157 CURLINFO_TEXT 23006848 SSL read: error:00000000:lib(0):func(0):reason(0), errno 104
0x14f7ae0 2009-Aug-26 23:22:24.665243 CURLINFO_TEXT 23006848 Connection died, retrying a fresh connect
0x14f7ae0 2009-Aug-26 23:22:24.665283 CURLINFO_TEXT 23006848 Closing connection #0
0x14f7ae0 2009-Aug-26 23:22:24.668640 CURLINFO_TEXT 23006848 Issue another request to this URL: 'https://s3.amazonaws.com:443/some_url'
0x14f7ae0 2009-Aug-26 23:22:24.668719 CURLINFO_TEXT 23006848 Re-using existing connection! (#1) with host s3.amazonaws.com
0x14f7ae0 2009-Aug-26 23:22:24.668757 CURLINFO_TEXT 23006848 Connected to s3.amazonaws.com (72.21.202.97) port 443 (#1)

> > The result is that when curl invoked the copy function in my application for
> > the retry, my copy function returned zero because it thought (correctly)
> > that all the data had been copied already.

> Do you have CURLOPT_SEEKFUNCTION set? Was it then not used?

I do not use CURLOPT_SEEKFUNCTION. Although I'm using the latest curl, the
man pages I was looking at were a bit out of date and did not include this
feature.

> > Does curl retry requests without returning an error to the application so
> > that it can prepare for the retry?

> No, not really. I think it looks like you've found an error that it wrongly
> retries stuff on.

> > This comment in Curl_retry_request() indicates that curl is doing exactly
> > what I thought. Sounds like it is retrying a PUT without returning an
> > error. This behavior is fine for GETs and DELETEs, but not PUTs.

> You draw the wrong conclusions based on what I believe are misunderstandings
> of what you read in a particular piece of the code.

> libcurl supports PUT and POST and we even have test cases that rewind data
> when libcurl has to send the same data again. This is not a design flaw in
> libcurl, you're experiencing a bug of some kind.

> I think it'd be better if you would clarify in more details exactly how this
> bug appears and what the server has sent up to this point and what your read
> callback has done up to this point.

What I see happening:

I issue a PUT request of 128 KB. The server sends the 100-continue. Curl
invokes my CURLOPT_READFUNCTION 5 times: four invocations copy all the
data, and on the fifth my function returns 0, signaling there is no more
data. Curl then encounters the SSL read error shown above. As far as I can
tell curl has not yet transmitted any data to the server, but I'm not sure
if that is relevant to my issue. Curl closes the connection (#0) used for
the PUT, and reissues the PUT request on another connection (#1).

The reissued request invokes my CURLOPT_READFUNCTION to get the data. My
READFUNCTION tracks the number of bytes copied on each invocation. Because
all the data was copied in the earlier invocation of this PUT operation, my
READFUNCTION returns 0, indicating that there is no more data to transfer.
At this point the request hangs until it times out and terminates with 400
because the connection was idle for too long.
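
To illustrate, here is a minimal sketch of the kind of read callback I'm
describing. It is not my actual code: the upload_state struct and the read_cb
name are made up for this mail, and only the callback signature follows what
CURLOPT_READFUNCTION expects. Once offset reaches size, every later call
returns 0 unless something resets offset, which is what happens on the
reissued request.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical upload state: the payload and how much of it has
   already been handed to curl. */
struct upload_state {
    const char *data;   /* start of the payload */
    size_t      size;   /* total payload length in bytes */
    size_t      offset; /* bytes already copied out */
};

/* Same shape as a CURLOPT_READFUNCTION callback. */
size_t read_cb(char *buffer, size_t size, size_t nitems, void *userdata)
{
    struct upload_state *st = userdata;
    size_t room = size * nitems;            /* space curl offers us */
    size_t left = st->size - st->offset;    /* payload not yet copied */
    size_t n = left < room ? left : room;

    if (n > 0) {
        memcpy(buffer, st->data + st->offset, n);
        st->offset += n;
    }
    return n; /* 0 tells curl there is no more data */
}
```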

If I used the SEEKFUNCTION, what would it do in this case? When would curl
invoke it? I suspect the SEEKFUNCTION would be expected to reset the copy
pointers I maintain to the specified value (probably zero in this case)
prior to the READFUNCTION being invoked when the request is reissued. Is
using SEEKFUNCTION the solution?
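
To make the question concrete, this is roughly the seek callback I imagine
writing. It is only a sketch under my own assumptions: upload_state and
seek_cb are hypothetical names, and sketch_off_t and the SKETCH_SEEK_* values
are stand-ins for curl_off_t, CURL_SEEKFUNC_OK and CURL_SEEKFUNC_FAIL so the
snippet builds without the libcurl headers.

```c
#include <stddef.h>
#include <stdio.h>   /* SEEK_SET */

/* Stand-ins so the sketch builds without libcurl. Real code would include
   <curl/curl.h> and use curl_off_t and CURL_SEEKFUNC_OK / CURL_SEEKFUNC_FAIL. */
typedef long long sketch_off_t;
enum { SKETCH_SEEK_OK = 0, SKETCH_SEEK_FAIL = 1 };

/* Hypothetical upload state shared with the read callback. */
struct upload_state {
    const char *data;   /* start of the payload */
    size_t      size;   /* total payload length in bytes */
    size_t      offset; /* bytes already copied out */
};

/* Same shape as a CURLOPT_SEEKFUNCTION callback: move the read position
   so the next read callback starts copying from 'offset'. */
int seek_cb(void *userp, sketch_off_t offset, int origin)
{
    struct upload_state *st = userp;

    /* Only absolute seeks inside the payload are handled here. */
    if (origin != SEEK_SET || offset < 0 ||
        (unsigned long long)offset > st->size)
        return SKETCH_SEEK_FAIL;

    st->offset = (size_t)offset;
    return SKETCH_SEEK_OK;
}
```

It would be registered with curl_easy_setopt() using CURLOPT_SEEKFUNCTION,
with the state pointer passed via CURLOPT_SEEKDATA.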

-- 
Mark
Received on 2009-08-27