Bugs item #3048197, was opened at 2010-08-19 00:59
Message generated for change (Comment added) made by bagder
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=100976&aid=3048197&group_id=976
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: ftp
Group: bad behaviour
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: catalin (catalinr)
Assigned to: Daniel Stenberg (bagder)
Summary: Incorrect data uploaded in case of CURLE_SEND_ERROR
Initial Comment:
These are my [hopefully right] conclusions:
- When doing an FTP upload, in the progressCallback function the uploadedTillNow argument (last one) holds the correct size for the uploaded data (well, rather "sent data");
- the uploaded data is sent in chunks of at most CURL_MAX_WRITE_SIZE bytes (btw, CURL_MAX_WRITE_SIZE is not documented anywhere and it'd be useful to know that that is the maximum size attempted to be uploaded at one time);
- in case the link is broken/disconnected, the remote destination file gets appended with a chunk of the size reported by progressCallback, but the last KB are composed of NULL bytes and not the real data. So a subsequent APPE[-nd] to that file will produce a file of the same size as the original one, but somewhere in the middle it'll have the wrong NULL bytes.
The workaround for this was to accumulate the size reported by the progressCallback in a totalTillNow variable, and after any network error that would require a resume, subtract CURL_MAX_WRITE_SIZE*8 from totalTillNow and set up a REST upload with the calculated starting point. (See #3048174 for issues with CURLOPT_RESUME_FROM).
I'm not sure if my workaround is the best solution, but it works. OTOH, that behavior when uploading and getting disconnected leads to corrupted files at destination and that is IMO very wrong...
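The arithmetic of the workaround described above can be sketched as follows. This is only an illustration in Python, not code from the reporter or from libcurl: the function name resume_offset and its parameters are hypothetical, and ASSUMED_MAX_WRITE_SIZE is assumed to mirror the CURL_MAX_WRITE_SIZE define in curl.h (16384 bytes).

```python
# Assumption: mirrors CURL_MAX_WRITE_SIZE from curl.h (16384 bytes).
ASSUMED_MAX_WRITE_SIZE = 16 * 1024


def resume_offset(total_till_now: int,
                  margin_chunks: int = 8,
                  chunk_size: int = ASSUMED_MAX_WRITE_SIZE) -> int:
    """Return a conservative offset to restart an upload from.

    total_till_now is the byte count accumulated from the progress
    callback before the error. The margin (8 chunks in the report)
    backs off far enough to overwrite a possibly corrupted tail.
    """
    if total_till_now < 0:
        raise ValueError("total_till_now must not be negative")
    # Never go below the start of the file.
    return max(0, total_till_now - margin_chunks * chunk_size)
```

The result would then be used as the starting point of the resumed upload (for example via CURLOPT_RESUME_FROM_LARGE), with the local file positioned at the same offset. Whether 8 chunks is enough of a margin depends on the server and is not something this sketch can establish.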
libcurl 7.21.1
msw Vista
mingw-gcc 3.4.5
----------------------------------------------------------------------
>Comment By: Daniel Stenberg (bagder)
Date: 2010-08-23 00:24
Message:
I've tried, and I've not seen any zeroes in my broken uploads when using
vsFTPd.
The progress callback is told the amount that the system calls have
reported as successfully sent. Unless of course there's a bug somewhere,
but I've not been able to find any such. Can you?
----------------------------------------------------------------------
Comment By: catalin (catalinr)
Date: 2010-08-22 04:13
Message:
I am trying to make myself understood as best as I can already. Sorry if I
should do more but just fail at it... OTOH I'm not sure I can describe this
any better than I already have, my English is not that good.
"When the connection breaks, libcurl CANNOT [...] check anything else on
the remote site"
What I'm saying is not for libcurl to do more, but for you (a person) to
check the uploaded data after a CURLE_SEND_ERROR occurs. IOW I've asked you
to check for the problem I'm signaling, not to implement something.
Again, consider this a test case: an upload is in progress,
CURLE_SEND_ERROR occurs, transfer is aborted; desired outcome: the uploaded
data is the same as the source (partial, but identical so far); actual
outcome (at my end at least): last part of the data is incorrect (null
bits).
I feel the need to express this yet again: this is not a request for
automatically doing anything, but for reproducing what I'm experiencing.
"libcurl cannot guarantee what the server does, nor can it assume
anything"
Ok, so reading the last part I can only think that on the contrary, the
progressCallback reports the uploaded size _assuming_ it all was correctly
received at destination. But it should probably report only the data that
is confirmed to have been sent correctly so far.
If the upload consists of, let's say, 5 chunks of CURL_MAX_WRITE_SIZE bytes,
when a call to progressCallback is triggered e.g. while uploading chunk 4,
it should only report the bytes sent in the first 3 chunks, and not add
the [unconfirmed so far] bytes of the 4th chunk, which seems to be done
now.
"1. I can't see any error in libcurl's side"
I'd say the partialUpload reported by progressCallback is not always
correct, see above.
"2. It sounds like bad behavior on the server side "
It may very well be like that, but is the "good behavior" defined in a RFC
or just as a cURL concept?
_If_ cURL does assume that everything sent is also correctly received,
then this is a rather arbitrary call.
If this is an impossible to change fact, it should be better described in
the docs.
"3. you have not presented any way to repeat this problem"
I believe I have, even if not by using a piece of code. If it was not
understood from my previous post maybe this time will be luckier. If still
not, I'll probably give up...
----------------------------------------------------------------------
Comment By: Daniel Stenberg (bagder)
Date: 2010-08-21 21:20
Message:
When the connection breaks, libcurl CANNOT send anything further as the
connection is no more, nor can it check anything else on the remote site as
the connection... broke! Having libcurl try to reconnect just to check the
end of the file in case it got disconnected just previously is completely
out of the question.
Alas, the problem you see at disconnect depends on what the server does on
a disconnect. libcurl cannot guarantee what the server does, nor can it
assume anything. Some servers are likely to act differently than others on
disconnect. Appending zero-bytes to the file does sound like a case of bad
behaviour ON THE SERVER END.
1. I can't see any error in libcurl's side
2. It sounds like bad behavior on the server side
3. you have not presented any way to repeat this problem
I can't see what libcurl can do about this.
Anyone who decides to append data to an existing file because it got
aborted in a previous upload attempt may of course consider checking the
end of the file to see that the end looks OK before blindly appending more
data to it. libcurl will not do that automatically, though, but it does
provide the means to get the data etc.
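The "check the end of the file before appending" idea above could look roughly like this. It is a minimal sketch under stated assumptions, not anything libcurl does: the function safe_append_offset is hypothetical, and it assumes the application has already fetched the remote file size and the last few KB of the remote file by its own means.

```python
def safe_append_offset(source: bytes, remote_size: int,
                       remote_tail: bytes) -> int:
    """Return the offset up to which the remote tail matches the source.

    source      -- the complete local data being uploaded
    remote_size -- size of the partial file on the server
    remote_tail -- the last len(remote_tail) bytes of that remote file

    Only the tail window is compared; everything before it is assumed
    to be intact, which is an assumption of this sketch.
    """
    tail_start = remote_size - len(remote_tail)
    if tail_start < 0:
        raise ValueError("tail is longer than the remote file")
    if remote_size > len(source):
        raise ValueError("remote file is larger than the source")

    expected = source[tail_start:remote_size]
    matched = 0
    for want, got in zip(expected, remote_tail):
        if want != got:
            break
        matched += 1
    # First byte that is missing or wrong on the server side.
    return tail_start + matched
```

If the returned offset is smaller than remote_size, the tail of the remote file differs from the source, and the upload would need to resume from that offset rather than append at the end.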
----------------------------------------------------------------------
Comment By: catalin (catalinr)
Date: 2010-08-21 07:02
Message:
I'm sure I fail to see a lot more than you do, but I'm just signaling what
looks like bad behavior to me. Maybe you can find a way to try and
reproduce this as I don't think I can make a sample program that will make
a server disconnect (or an ISP to interrupt it etc), can I?
Of course that ftp server (comes with a BusyBox linux on a NAS device) may
be broken but I have some doubts about that and IMO it's worth
investigating...
The error received at my end was CURLE_SEND_ERROR and IIRC once I also got
CURLE_RECV_ERROR (although only uploads were being done, but maybe it was
about receiving some response from the server).
A long-shot interpretation would be that curl sends the size of the packet
being uploaded, but only part of the actual data gets to destination.
Again, I may be far away with my guess...
I don't think those zeroes are exactly random either... Comparing the
source and destination files, the difference is made by the zeroes in the
destination file, not some _random_ bits being there.
Maybe a shorter way would be to get an ftp upload to break with
CURLE_SEND_ERROR and then check the end of the uploaded file part?
HTH
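The comparison of source and destination files mentioned above can be sketched as a small check that reports where a partial upload starts to differ from the source and whether the differing tail consists only of zero bytes. This is a hypothetical helper for illustration, written in Python; the names first_mismatch and describe_corruption are not part of any library.

```python
def first_mismatch(source: bytes, partial: bytes):
    """Return the first offset where partial differs from source, or None."""
    for offset, (want, got) in enumerate(zip(source, partial)):
        if want != got:
            return offset
    if len(partial) > len(source):
        # Extra bytes beyond the end of the source count as a mismatch.
        return len(source)
    return None


def describe_corruption(source: bytes, partial: bytes):
    """Summarize how a partial upload deviates from the source.

    Returns None when partial is a clean prefix of source, otherwise a
    dict with the offset of the first bad byte, the length of the tail
    from there on, and whether that tail is made of zero bytes only.
    """
    start = first_mismatch(source, partial)
    if start is None:
        return None
    tail = partial[start:]
    return {"offset": start, "length": len(tail), "all_zero": not any(tail)}
```

Running this on the source file and on the partial file fetched back from the server after a CURLE_SEND_ERROR would show whether the tail really consists of zeroes, and how many bytes are affected.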
----------------------------------------------------------------------
Comment By: Daniel Stenberg (bagder)
Date: 2010-08-19 15:15
Message:
I don't see how libcurl sends any data as zeroes. Also, I fail to see how
it could send that block of zeroes if the connection disconnected?
To me it sounds like your server behaves oddly and adds random data to the
file being written at the time of the disconnect. I don't think libcurl can
do anything about it.
If this is not the case, can you please clarify your point for us?
----------------------------------------------------------------------
Received on 2010-08-23