curl-users
Content-Disposition parser conforming to RFC 6266
Date: Sat, 27 Oct 2012 22:48:28 +0900
Hi,
I'd like to contribute a patch that adds a Content-Disposition parser
conforming to RFC 6266.
The comment in tool_cd_hdr.c says it does not support encoded
filenames (*=) right now.
The parser was originally written for another C++ project, but the
function itself is written in C and was easily ported to curl
following its style guidelines.
The question is how to handle encoded file names. Characters outside
the defined set (see BNF below) are all percent-encoded, and their
original charset must appear in the header value when encoded
filenames are used.
  filename-parm = "filename" "=" value
                | "filename*" "=" ext-value
  ext-value     = charset "'" [ language ] "'" value-chars
  charset       = "UTF-8" / "ISO-8859-1" / mime-charset
  value-chars   = *( pct-encoded / attr-char )
  pct-encoded   = "%" HEXDIG HEXDIG
  attr-char     = ALPHA / DIGIT
                / "!" / "#" / "$" / "&" / "+" / "-" / "."
                / "^" / "_" / "`" / "|" / "~"
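To make the grammar concrete, here is a minimal sketch of how an
ext-value could be split into its charset, language and value-chars
parts and checked against the BNF above. This is not the code from the
patch; the names is_attr_char, is_value_chars and split_ext_value are
hypothetical, and nothing is decoded.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: attr-char = ALPHA / DIGIT / selected punctuation.
   Explicit ASCII ranges keep the check independent of the locale. */
static int is_attr_char(unsigned char c)
{
  if((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
     (c >= '0' && c <= '9'))
    return 1;
  return c != '\0' && strchr("!#$&+-.^_`|~", c) != NULL;
}

/* Hypothetical helper: value-chars = *( pct-encoded / attr-char ).
   Returns 1 if the whole string matches, 0 otherwise. */
static int is_value_chars(const char *s)
{
  while(*s) {
    if(*s == '%') {
      /* pct-encoded = "%" HEXDIG HEXDIG */
      if(!isxdigit((unsigned char)s[1]) || !isxdigit((unsigned char)s[2]))
        return 0;
      s += 3;
    }
    else if(is_attr_char((unsigned char)*s))
      s++;
    else
      return 0;
  }
  return 1;
}

/* Hypothetical helper: splits
   ext-value = charset "'" [ language ] "'" value-chars
   without copying or decoding anything. The charset starts at 'ext' and
   is *charset_len bytes long; *lang and *value point into 'ext'.
   Returns 1 on success, 0 if the input does not match the grammar. */
static int split_ext_value(const char *ext, size_t *charset_len,
                           const char **lang, size_t *lang_len,
                           const char **value)
{
  const char *q1 = strchr(ext, '\'');
  const char *q2;

  if(!q1 || q1 == ext)
    return 0; /* the charset is mandatory */
  q2 = strchr(q1 + 1, '\'');
  if(!q2)
    return 0; /* second quote missing */

  *charset_len = (size_t)(q1 - ext);
  *lang = q1 + 1;
  *lang_len = (size_t)(q2 - q1 - 1); /* may be 0, language is optional */
  *value = q2 + 1;
  return is_value_chars(*value);
}
```

For example, given UTF-8''%e2%82%ac%20rates this reports a 5-byte
charset, an empty language, and the still-encoded value
%e2%82%ac%20rates.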
I observed that curl does not decode percent-encoded filenames in the
URL when the -O option is used.
So the safe and consistent way is probably to preserve the
percent-encoded string as is and do the usual sanitizing (i.e., use
the string after the last /). The drawback of this approach is that
the user does not know which charset the string is in when decoding
it.
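A minimal sketch of that approach, assuming the value-chars part has
already been extracted from the header: the string is left
percent-encoded and only the part after the last / is kept. The name
sanitize_encoded_name is hypothetical and not an existing curl
function.

```c
#include <string.h>

/* Hypothetical helper: returns the part of a still percent-encoded
   filename that follows the last '/'. Nothing is decoded, so the
   result points into the caller's buffer. */
static const char *sanitize_encoded_name(const char *value)
{
  const char *slash = strrchr(value, '/');
  return slash ? slash + 1 : value;
}
```

With this, dir/%E2%82%AC.txt becomes %E2%82%AC.txt. Note that a literal
/ is not an attr-char, so in a strictly valid ext-value a slash would
appear as %2F and stay untouched here; the strrchr() step only matters
if the parser is lenient about what it accepts.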
What do you think?
Best regards,
Tatsuhiro Tsujikawa
Received on 2012-10-27