
curl-library

More on POST and PUT with 100-continue

From: Jamie Lokier <jamie_at_shareable.org>
Date: Sun, 1 Aug 2004 22:34:23 +0100

On 6th April, a change was made to the processing of POST requests
with authentication. Related mailing list threads are

"IIS says: Bad Request (Invalid Verb)":
   http://curl.haxx.se/mail/lib-2004-03/0248.html
"NTLM, HTTP 100 Continue, and IIS 6 / .NET 1.1":
   http://curl.haxx.se/mail/lib-2004-03/0384.html

I'm writing this not as a libcurl user, but as someone who has studied
and discussed the HTTP/1.1 protocol extensively, and sometimes uses
curl among other clients for testing.

The first part of this mail explains intended HTTP/1.1 behaviour with
100-continue, and the second part explains an alternative change to
libcurl, which may be appropriate instead of the one of 6th April.

This is part of the change description:

    New authentication code added, particularly noticeable when doing
    POST or PUT with Digest or NTLM. libcurl will now use HEAD to
    negotiate the authentication and when done perform the requested
    POST. Previously libcurl sent the POST immediately and expected the
    server to reply with a final error status code; libcurl would then
    not send the request-body but instead send the next request in the
    sequence.

    The reason for this change is due to IIS6 barfing on libcurl when
    we attempt to POST with NTLM authentication. The reason for the
    problems is found in RFC2616 section 8.2.3 regarding how servers
    should deal with the 100 continue request-header:

          If it responds with a final status code, it MAY close the
          transport connection or it MAY continue to read and discard
          the rest of the request.

    Previous versions of IIS clearly did close the connection in this
    case, while this newer version decided it should "read and
    discard". That would've forced us to send the whole POST (or PUT)
    data only to have it discarded and then be forced to send it
    again. To avoid that huge penalty, we switch to using HEAD until
    we are authenticated and then send the POST.

That change was required due to an HTTP bug in libcurl: libcurl was
sending request headers with "Expect: 100-continue", and when a
non-100 error status was returned, it would send the next request's
headers on the same connection _without_ sending the request body.

That's an HTTP implementation bug in libcurl, nothing to do with IIS6.
It was always wrong. IIS6 was the first server where this was
noticed, but it could break other servers too.
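To see why this breaks servers that read and discard, here is a small
sketch (in Python, with invented names and a made-up 10-byte body
length; the real NTLM case involved 308 bytes) of what such a server
does with the bytes on the wire when the client omits the promised
body:

```python
def next_request_line_after_discard(stream: bytes, content_length: int) -> bytes:
    """Sketch of a read-and-discard server: after sending its final
    status, it skips Content-Length bytes of "body", then parses a new
    request line from whatever follows."""
    return stream[content_length:].split(b"\r\n", 1)[0]

# If the buggy client omits the body and sends the next request
# immediately, the bytes the server discards are really the *next*
# request's start:
wire = b"GET /next HTTP/1.1\r\nHost: example.com\r\n\r\n"
line = next_request_line_after_discard(wire, 10)
# line == b"HTTP/1.1" -- the server sees "HTTP/1.1" where a verb
# should be, hence IIS6's "Bad Request (Invalid Verb)".
```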

First, let me clarify why that was a libcurl HTTP bug, because a
posting by Daniel Stenberg illustrates some potential disagreement.
It isn't surprising, as the language of RFC 2616 is not very clear in
this area.

Daniel wrote:
> > Summary: Libcurl does not properly implement handling of POST
> > requests with Content-Length: X and Expect: 100-Continue headers,
> > which leads to the sending of unexpected and invalid data to IIS 6.
>
> Let's at least say that IIS and libcurl have different opinions
> about what "proper" handling of this means.

In my opinion, based on careful study of HTTP, and conversing with the
few folks who respond on ietf-http-wg, IIS6 implements this correctly.

I originally thought the same as Daniel, i.e. that if "100 Continue"
is not received then the client should omit the request body and send
the next request's headers, but the folks on ietf-http-wg said no it
isn't possible, and explained why, and indeed their explanation makes
sense. So I'm passing that on.

Daniel wrote:
> What specific section in the RFC are reading when you say that the client is
> expected to continue to the POST even though no 100 continue arrived? My
> interpretation of the 100-continue is that we require the 100 code reponse
> before we send anything, and if we don't get it we don't send anything.
>
> You quoted this section:
>
> - A client MUST NOT send an Expect request-header field (section
> 14.20) with the "100-continue" expectation if it does not intend
> to send a request body.

The logic of those parts of the spec provides these rules:

    1. A client mustn't send "Expect: 100-continue" if it doesn't intend to
       send a request body.
    2. If it does intend to send a request body, then "Expect: 100-continue"
       is _optional_.

>
> And I think that paragraph only confirms my view of things. The
> client *does* intend to send the body, it just needs clearance first.

No, that doesn't follow, and it turns out a "100 Continue" response
isn't a well-defined "clearance to send" but rather a hint.

That's because of a further rule for the client:

    3. If a client sends "Expect: 100-continue" and does not see any response
       at all then it must send the request body a short time later.

Rule 3 comes from this text a little later in RFC 2616:

    Because of the presence of older implementations [...] when a
    client sends this header field ["Expect: 100-continue"] to an
    origin server (possibly via a proxy) from which it has never seen
    a 100 (Continue) status, the client SHOULD NOT wait for an
    indefinite period before sending the request body.
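Rule 3 in code might look like the following sketch (the one-second
grace period is my own assumption; the RFC only says "SHOULD NOT wait
for an indefinite period"):

```python
import socket

def send_body_after_grace(sock: socket.socket, body: bytes,
                          grace: float = 1.0) -> bool:
    """Having sent headers with "Expect: 100-continue", wait only a
    short, bounded time for any response bytes; if none arrive, send
    the body anyway, because the server may be an older one that
    ignores the Expect header.  Returns True if the body was sent."""
    sock.settimeout(grace)
    try:
        peek = sock.recv(1, socket.MSG_PEEK)
    except socket.timeout:
        peek = None              # silence: no response started yet
    if peek is None:
        sock.sendall(body)       # rule 3: proceed after the grace period
        return True
    return False                 # a response (or close) arrived first
```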

That indicates this interpretation is incorrect, by the way:
> My interpretation of the 100-continue is that we require the 100
> code reponse before we send anything, and if we don't get it we
> don't send anything.

A consequence of rule 3 is that the following sequence of events can occur:

    a. Client sends request headers including "Expect: 100-continue".
    b. Server sends "401 Authentication Required" response, no "100 Continue".
    c. Before receiving the 401, client MAY have sent some of the request body.

After sending the 401 response (or any other error response), the
*server* may want to read another request. It cannot possibly know
whether the client has begun sending the request body at about the same time.

This is because "Expect: 100-continue" is not recognised by all
servers: even some HTTP/1.1 ones ignore it (see RFC2068). Therefore
100-continue cannot act as a logical constraint, but only as a
performance hint.

For this reason, the server has only two possible subsequent
behaviours: read and discard the request body, or don't process any
further input from that connection (i.e. close it, using TCP-safe
lingering close). And the client has only two possible subsequent
behaviours: send the request body to be discarded, or close the
connection after receiving the error response.
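How a client might pick between its two legal options can be sketched
like this (the choice-by-size heuristic and the 1 KiB limit are my
inventions, for illustration only):

```python
def client_choice_after_final_status(body_len: int, already_sent: int,
                                     limit: int = 1024) -> str:
    """After "Expect: 100-continue" and a final (non-100) status, a
    compliant client has exactly two options: finish sending the body,
    which the server will discard, or close the connection."""
    remaining = body_len - already_sent
    if remaining <= limit:
        return "send-rest-of-body"   # cheap; keeps the connection alive
    return "close-connection"        # cheaper than streaming a big body
```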

It is the only logical possibility. Anything else would break
interaction between some clients and some servers some of the time.

> > so the next 308 bytes it gets are "read and discarded" according
> > to the RFC.
>
> This is painfully true.
>
> Argh, I don't see the point in having this behavior defined like
> this! It makes no sense at all at it makes operation such as this slower.

Now hopefully I have explained why this behaviour makes sense. It is
all because 100-continue is a performance hint and cannot be a logical
protocol constraint: only as a hint can it preserve full compatibility
among different HTTP implementations, some of which don't know about
Expect or 100-continue at all.

It's obviously not the ideal behaviour, but it's a practical requirement.

Now, you may wonder -- why would a server ever choose to waste
bandwidth and time reading and discarding, instead of just closing the
connection in this case?

A well implemented server can choose. If it _knows_ that the request
body is short, or that only a short amount hasn't yet been received,
then it is more efficient to read and discard than to close a
connection. I have written a server with this heuristic.
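That kind of server-side heuristic can be sketched as follows (the
16 KiB discard limit is invented; a real server would tune it):

```python
def server_choice_after_final_status(content_length, received: int,
                                     discard_limit: int = 16 * 1024) -> str:
    """After sending a final status without 100 (Continue): read and
    discard the rest of the request body when the remainder is known to
    be short, otherwise do a TCP-safe lingering close."""
    if content_length is None:       # length unknown (e.g. chunked)
        return "close"
    remaining = content_length - received
    return "read-and-discard" if remaining <= discard_limit else "close"
```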

(By the way, initially I thought 100-continue meant what Daniel
thought it meant).

I would say the NTLM authentication examples which started this
investigation of libcurl, where the body is 308 bytes long, fall into
this category. Those 308 bytes are much less than the packet overhead
of closing and opening a new connection.

I hope that has explained why the IIS6 implementation of HTTP is
correct, and libcurl's was (is?) incorrect.

Now to the fix done in April, which was to use HEAD prior to POST/PUT
to check for authentication. I thought of a problem with that, and it
appears Daniel had already thought of it as well:

Daniel wrote:
> I suspect that some servers will require different authentication for POST
> than for HEAD on a specific URL.

I agree, and the same for PUT. I can also imagine that some resources
don't need any authentication for GET but do for POST/PUT.

There's also the question of performance with --anyauth. If the
resource requires no authentication, then doing HEAD first actually
increases the overhead.

All of this suggests that another solution mentioned on the mailing
list is quite a good idea:

Alan Pinstein wrote:
> One other suggestion on solving this issue generally... I got
> another note from the MS engineer. He thinks that the cleanest way to
> deal with this is to use chunked transfer-encoding. This way, during
> the negotiation of the security, libcurl can tell the server to move
> on to the next request simply by sending a zero-length chunk (which
> according to the RFC specifies the END of the chunk) if the server
> returns a 401 status, and if it returns a 100-continue then you can
> just send the data, with transfer-encoding as chunked.

This is quite nice, although it does have the problem that it's only
safe to use with servers that accept chunked requests.
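The wire format makes this easy; here is a sketch of the two cases
Alan describes (function and variable names are mine):

```python
def encode_chunk(data: bytes) -> bytes:
    """One chunk: hex length, CRLF, the data, CRLF."""
    return b"%x\r\n%s\r\n" % (len(data), data)

LAST_CHUNK = b"0\r\n\r\n"   # the zero-length chunk ends the body

# Server answered 401 before any data was sent: the client ends the
# request body at once and can move on to the next request.
aborted_body = LAST_CHUNK

# Server answered 100 (Continue): stream the real data, then terminate.
full_body = encode_chunk(b"field=value") + LAST_CHUNK
```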

All HTTP/1.1 servers MUST accept chunked requests, but unfortunately
I've seen at least one which reports itself as HTTP/1.1 and does not
support chunked requests :(. Fortunately that was just a minor one.
I hope it isn't a widespread flaw, but I don't know, not having
tried chunked requests in the wild.

It may turn out to be unnecessary, as (a) IIS6 might be smart enough
to close connections if it would have to discard a large request body
(not a small one) -- I've not tested, but someone could; (b) for the
specific problem of NTLM authentication, aren't the request bodies
always small anyway?

Finally, it is clear that whatever fix is in libcurl for NTLM
authentication to work with IIS6, there is still the general issue of
libcurl's HTTP/1.1 compliance. It is clear to me that libcurl MUST
transmit a request body if it sends a request with "Expect:
100-continue" and receives a response without the "100 Continue" -- or
alternatively close that connection. Hopefully this message has
explained why. I have the impression from the mailing list threads in
March that this remains a general bug in libcurl which was not fixed,
but the use of HEAD for authentication hid it in those cases.

Enjoy!
-- Jamie
Received on 2004-08-01