curl-library
End of Line Handling
Date: Fri, 24 Mar 2006 00:11:03 +0000
I've broken this thread off of the EBCDIC thread since my "line end" changes
are no longer EBCDIC-only.
The changes to end of line handling are intended to comply with FTP RFC959
(http://www.faqs.org/rfcs/rfc959.html).
FTP RFC959 section 3.1.1.1 (ASCII TYPE) states:
"The sender converts the data from an internal character
representation to the standard 8-bit NVT-ASCII
representation (see the Telnet specification). The receiver
will convert the data from the standard form to his own
internal form.
In accordance with the NVT standard, the <CRLF> sequence
should be used where necessary to denote the end of a line
of text. (See the discussion of file structure at the end
of the Section on Data Representation and Storage.)"
Opening my code up to other platforms raises questions I didn't originally
have to face:
1) On uploads (puts) libcurl makes line end conversion optional based on the
data->set.crlf flag (in transfer.c's Curl_readwrite function). If that flag
is set, all LFs in the data being sent are converted to CRLFs.
Should transfer.c be changed so that ASCII-mode FTP transfers unconditionally
convert to CRLF?
If so, we'll have to identify which platforms internally use CRLF so we
leave them alone (Windows only? There are probably others).
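To make the current upload behavior concrete, here is a minimal sketch of the
kind of conversion described in 1). It is not the actual transfer.c code; the
function name lf_to_crlf and its signature are made up for illustration. It
expands every LF to CRLF without looking at what came before it.

```c
#include <stddef.h>

/* Hypothetical helper, not libcurl code: expand every LF in src to CRLF.
 * dst must have room for 2 * srclen bytes (worst case: input is all LFs).
 * Returns the number of bytes written to dst. */
size_t lf_to_crlf(const char *src, size_t srclen, char *dst)
{
  size_t i, out = 0;

  for(i = 0; i < srclen; i++) {
    if(src[i] == '\n')
      dst[out++] = '\r';   /* insert a CR in front of every LF */
    dst[out++] = src[i];
  }
  return out;
}
```

A platform whose internal line end is already CRLF would skip this step, which
is why the list of such platforms matters.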
2) Should Curl_readwrite be changed to leave existing CRLF sequences alone?
That's the friendly thing to do but it deviates from a strict interpretation
of RFC959.
Take the case of a Windows file that was originally transferred as binary to
a system like Unix.
That file will already have CRLF line ends, so transfer.c's existing code
converts the CRLFs to CRCRLFs when the file is sent (if set.crlf is on).
I've seen this quite a bit (the annoying ^M).
I tried out the scenario with various FTP servers and some leave CRLFs alone
when sending data while others change them to CRCRLFs.
What should libcurl do?
And should that be done across the board or just for FTP?
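For discussion, the "friendly" variant from 2) could look roughly like the
sketch below. The struct eol_out_state and the function lf_to_crlf_keep are
hypothetical names, not existing libcurl symbols. The only difference from a
blind conversion is that an LF already preceded by a CR is left alone, and
that fact is remembered across calls so a CRLF split over two buffers is not
turned into CRCRLF.

```c
#include <stddef.h>

/* Hypothetical per-transfer state, carried from one buffer to the next. */
struct eol_out_state {
  int prev_cr;   /* nonzero if the last byte processed was a CR */
};

/* Hypothetical helper, not libcurl code: convert bare LF to CRLF but leave
 * existing CRLF sequences untouched. dst needs room for 2 * srclen bytes.
 * Returns the number of bytes written to dst. */
size_t lf_to_crlf_keep(struct eol_out_state *st,
                       const char *src, size_t srclen, char *dst)
{
  size_t i, out = 0;

  for(i = 0; i < srclen; i++) {
    char c = src[i];

    if(c == '\n' && !st->prev_cr)
      dst[out++] = '\r';        /* bare LF: add the missing CR */
    dst[out++] = c;
    st->prev_cr = (c == '\r');  /* remember for the next byte or buffer */
  }
  return out;
}
```

The state must start zeroed for each transfer. Whether this belongs only in
the FTP path is exactly the open question above.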
3) On the inbound (get) side, my code in Curl_readwrite will do the reverse:
turn CRLFs into LFs.
Many of the same questions apply:
Do it for some or all platforms?
Do it unconditionally or conditionally based on a flag (data->set.crlf or a
new one)?
Do it for FTP only or for every protocol going through transfer.c's Curl_readwrite?
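As a point of reference for 3), the inbound conversion can be sketched as
below. This is an illustration only, not the code I have in Curl_readwrite;
struct eol_in_state and crlf_to_lf are hypothetical names. A CR at the end of
a buffer is held back until the next byte shows whether it was part of a CRLF.

```c
#include <stddef.h>

/* Hypothetical per-transfer state, carried from one buffer to the next. */
struct eol_in_state {
  int pending_cr;   /* nonzero if a CR has been seen but not yet written */
};

/* Hypothetical helper, not libcurl code: turn CRLF into LF and keep any CR
 * that is not followed by LF. dst needs room for srclen + 1 bytes, because
 * a CR held back from the previous buffer may be written here.
 * Returns the number of bytes written to dst. */
size_t crlf_to_lf(struct eol_in_state *st,
                  const char *src, size_t srclen, char *dst)
{
  size_t i, out = 0;

  for(i = 0; i < srclen; i++) {
    char c = src[i];

    if(st->pending_cr) {
      st->pending_cr = 0;
      if(c != '\n')
        dst[out++] = '\r';    /* lone CR, not part of CRLF: keep it */
    }
    if(c == '\r')
      st->pending_cr = 1;     /* wait for the next byte before deciding */
    else
      dst[out++] = c;
  }
  return out;
}
```

If pending_cr is still set when the transfer ends, the caller has to write
that final CR itself.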
-David McCreedy
Received on 2006-03-24