Curl_client_write, disambiguate flag semantics #11885

Closed
wants to merge 2 commits into from

Contributor

@icing icing commented Sep 19, 2023

  • use CLIENTWRITE_BODY only when data is actually body data
  • add CLIENTWRITE_INFO for meta data that is not a HEADER
  • add debug assertions that BODY, INFO and HEADER are not mixed in one write
  • move the data->set.include_header check into Curl_client_write so protocol handlers no longer have to care
  • add a special case in FTP for data->set.include_header for historic, backward compatible reasons
  • move unpausing of client writes from easy.c to sendf.c, so that the code is in one place and can forward flags correctly
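
The "not mixed" assertion from the list above could look roughly like the sketch below. The BODY and INFO values are taken from this PR's diff; the HEADER bit position and the helper names `exactly_one_type()` and `debug_check_type()` are assumptions made for illustration and are not curl's actual code.

```c
#include <assert.h>
#include <stdbool.h>

#define CLIENTWRITE_BODY   (1<<0) /* value from the PR diff */
#define CLIENTWRITE_INFO   (1<<1) /* value from the PR diff */
#define CLIENTWRITE_HEADER (1<<2) /* bit position assumed for this sketch */

/* Hypothetical helper: true when exactly one of BODY/INFO/HEADER is set.
 * Other bits in 'type' are ignored by the mask. */
bool exactly_one_type(int type)
{
  int kind = type & (CLIENTWRITE_BODY|CLIENTWRITE_INFO|CLIENTWRITE_HEADER);
  /* a non-zero value with a single bit set gives kind & (kind - 1) == 0 */
  return kind != 0 && (kind & (kind - 1)) == 0;
}

/* Hypothetical debug check a write function could run on entry. */
void debug_check_type(int type)
{
  assert(exactly_one_type(type));
}
```

The check only masks the three "kind" bits, so any additional flag bits a caller combines with HEADER would not trip it.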

* output. */
CURLcode result;
int save = data->set.include_header;
data->set.include_header = TRUE;
Member

Where within Curl_client_write() is data->set.include_header checked? I can't find it, so I'm not following why it needs to be set like this!

Member

Never mind. I see it now, this patch adds the check.

Would it not be nicer to pass in the "include_header" bool as a new argument to Curl_client_write instead of changing the variable like this?

Contributor Author

> Never mind. I see it now, this patch adds the check.
>
> Would it not be nicer to pass in the "include_header" bool as a new argument to Curl_client_write instead of changing the variable like this?

My thinking is that FTP is the sole exception here, so no other protocol should have to care about this. I'd rather spare all the others the "burden" of thinking about it than make FTP look nicer.

Member

Ah right. Sounds fair.
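
For readers following the thread, the pattern being discussed is "save the setting, force it on for one write, put it back". A minimal sketch with stand-in types follows. `struct handle`, `client_write_header()` and `ftp_write_response_line()` are hypothetical names, and the restore step is assumed, since the quoted diff only shows the save and the assignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; curl's real structs are far larger. */
struct settings { bool include_header; };
struct handle {
  struct settings set;
  int header_lines_in_body; /* counts headers mirrored into body output */
};

/* Stand-in for the central write function: it alone consults the setting. */
int client_write_header(struct handle *data, const char *line, size_t len)
{
  assert(data != NULL);
  (void)line;
  (void)len;
  if(data->set.include_header)
    data->header_lines_in_body++;
  return 0;
}

/* The FTP-style exception: force the setting on for this one write. */
int ftp_write_response_line(struct handle *data, const char *line, size_t len)
{
  bool save = data->set.include_header;
  int result;

  data->set.include_header = true;
  result = client_write_header(data, line, len);
  data->set.include_header = save; /* restore step assumed for this sketch */
  return result;
}
```

The alternative raised in the review, an extra argument on the write function, would avoid touching the setting but would add a parameter to every other caller.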

* confusion on how to interpret/format/convert the data.
*/
#define CLIENTWRITE_BODY (1<<0) /* non-meta information, BODY */
#define CLIENTWRITE_INFO (1<<1) /* meta information, not a HEADER */
Member

The only user of this (CLIENTWRITE_INFO) now is response lines in the pingpong handling. Does it really need special treatment? It has been flagged as a "header" up until now and it will be sent to the header callback...

Contributor Author

@icing icing Sep 20, 2023

Yeah, that is what I thought at first as well. But then test cases failed. There are cases where data->set.include_header is set, but pingpong never wrote with `CLIENTWRITE_BODY`. Other parts of the protocol handlers did, though. I believe IMAP is such a case, if memory serves.

So, before this PR, we had calls to Curl_client_write() that:

  1. added CLIENTWRITE_BODY when data->set.include_header was TRUE
  2. did not add CLIENTWRITE_BODY although data->set.include_header was TRUE
  3. added CLIENTWRITE_BODY regardless of data->set.include_header

Case 1 is now handled automatically in Curl_client_write(). Case 2 is now changed to CLIENTWRITE_INFO. Case 3 is handled via the special flag set in FTP (and only there, because FTP via HTTP PROXY does not want to see the CONNECT headers).

Why do all this? Because with this PR, the flags say what the written data is, and no longer which callback it shall be passed to. So CLIENTWRITE_BODY really is only used for body data.

When this holds true, it is possible to move several things into Curl_client_write():

  • content encoding writers, e.g. data->req.write_stack
  • chunked decoding
  • progress updates

making life for transfer loop and protocol handlers easier.
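
To make the "flags describe the data, the write function decides the destination" idea concrete, here is a small routing sketch based on my reading of this thread. `struct sink` and `route_write()` are hypothetical, the HEADER bit value is assumed, and byte counters stand in for the real callbacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CLIENTWRITE_BODY   (1<<0) /* value from the PR diff */
#define CLIENTWRITE_INFO   (1<<1) /* value from the PR diff */
#define CLIENTWRITE_HEADER (1<<2) /* bit position assumed for this sketch */

/* Byte counters standing in for the body and header callbacks. */
struct sink { size_t body_bytes; size_t header_bytes; };

/* Hypothetical router: callers only say what the data is. */
void route_write(struct sink *out, bool include_header, int type, size_t len)
{
  assert(out != NULL);
  if(type & CLIENTWRITE_BODY) {
    out->body_bytes += len; /* body data goes to the body output only */
    return;
  }
  /* HEADER and INFO are meta data and reach the header output */
  out->header_bytes += len;
  /* only HEADER is mirrored into the body output, and only on request */
  if((type & CLIENTWRITE_HEADER) && include_header)
    out->body_bytes += len;
}
```

In this sketch case 1 is the HEADER branch with `include_header` set, case 2 is INFO (never mirrored), and case 3 would be a caller forcing `include_header` on for its own writes.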

Contributor Author

To explain the case some more: we have this nice writer stack in content_encoding.[ch]. That is also called for Transfer-Encoding headers. But chunked is not implemented there. Why not?

Well, chunked changes the number of bytes in a response. The chunk framing is not counted against Content-Length or download progress updates. To count the bytes correctly, the transfer code needs to know which part of its buffer is really content.

So, when writing received transfer data, one needs an if() check for chunked encoding, a call to the decoder, and the updated buffer position and length it returns, which are then used for client writes and progress updates.

If chunked were a content_encoding writer, we could just write the received data through the writer stack and it would take care of all this. Add a "progress" writer after "chunked" and updates just work for everyone.
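
A rough sketch of that idea: a chain of writers where a chunked stage strips the framing and a progress stage after it counts only content bytes. This is not curl's writer API. `struct writer` and all function names are hypothetical, and the decoder is deliberately simplified: it assumes complete chunks arrive in one buffer, with no chunk extensions or trailers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical writer: each stage may transform data and pass it on. */
struct writer {
  int (*write)(struct writer *w, const char *buf, size_t len);
  struct writer *next;
  size_t counted;  /* used by the progress stage */
  char out[64];    /* used by the final sink */
  size_t outlen;
};

static int pass_on(struct writer *w, const char *buf, size_t len)
{
  return w->next ? w->next->write(w->next, buf, len) : 0;
}

static int hexval(char c)
{
  if(c >= '0' && c <= '9') return c - '0';
  if(c >= 'a' && c <= 'f') return c - 'a' + 10;
  if(c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Simplified chunked stage: forwards chunk payloads, drops the framing. */
int chunked_write(struct writer *w, const char *buf, size_t len)
{
  size_t pos = 0;
  assert(w != NULL && buf != NULL);
  while(pos < len) {
    size_t size = 0;
    int digits = 0, v;
    while(pos < len && (v = hexval(buf[pos])) >= 0) {
      size = size * 16 + (size_t)v;
      pos++;
      digits++;
    }
    /* the size line must be hex digits followed by CRLF */
    if(!digits || len - pos < 2 || buf[pos] != '\r' || buf[pos + 1] != '\n')
      return -1;
    pos += 2;
    if(size == 0)
      return 0; /* last chunk: stop, nothing more is forwarded */
    /* payload plus its trailing CRLF must be inside the buffer */
    if(size > len - pos || len - pos - size < 2)
      return -1;
    if(pass_on(w, buf + pos, size))
      return -1;
    pos += size;
    if(buf[pos] != '\r' || buf[pos + 1] != '\n')
      return -1;
    pos += 2;
  }
  return 0;
}

/* Progress stage: sees only what the stage before it forwarded. */
int progress_write(struct writer *w, const char *buf, size_t len)
{
  w->counted += len;
  return pass_on(w, buf, len);
}

/* Final sink: collects the decoded content. */
int sink_write(struct writer *w, const char *buf, size_t len)
{
  if(len > sizeof(w->out) - w->outlen)
    return -1;
  memcpy(w->out + w->outlen, buf, len);
  w->outlen += len;
  return 0;
}
```

Chaining the stages as chunked, then progress, then sink means the progress counter never sees chunk sizes or CRLFs, which is the point made above about placing a "progress" writer after "chunked".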

@bagder bagder closed this in 8898257 Sep 21, 2023
Member

bagder commented Sep 21, 2023

Thanks!

ptitSeb pushed a commit to wasix-org/curl that referenced this pull request Sep 25, 2023

Closes curl#11885