Re: WebSocket feature request: is it possible to call write function when full frame is loaded only?

From: Timothe Litt <litt_at_acm.org>
Date: Fri, 3 Feb 2023 09:35:54 -0500


On 03-Feb-23 03:27, Daniel Stenberg via curl-library wrote:
> On Fri, 3 Feb 2023, Vitalii B. Avramenko via curl-library wrote:
>
>>  Such partial data may be OK for HTTP protocol when we know for sure
>> that we have "request/response" pattern and we can detect the end of
>> data by HTTP protocol itself, for example, with `Content-Length`
>> header. But with websocket generally speaking we don't have any way
>> to know where is end of frame with `CURLOPT_WRITEFUNCTION`.
>
> Yes we do: curl_ws_meta() is provided to give you exactly that
> information!
>
>> we need a guarantee that `CURLOPT_WRITEFUNCTION` will call our
>> callback when full frame is downloaded only, or at least we need the
>> option that will allow us to request such behavior (something like
>> `CURLOPT_WEBSOCKET_FULL_FRAMES_ONLY`).
>
> I have been thinking about adding a mode for the websocket API that
> delivers full frames only, but I have hesitated a bit since frames can
> be up to 2^63 bytes big we need to decide on how to handle (too) big
> frames for such a mode.
>
> What do you think is a reasonable behavior for a full-frame mode when
> it receives (ridiculously) large frames?
>
There's always an upper bound - no one has 2^63 bytes of swap space,
memory, or disk space to store an extremely large frame. And it's not
likely in the foreseeable future.

I think it's up to the application to decide what it's willing to
handle. I don't think there's a universal answer as to how. Maybe it
calls getrlimit(RLIMIT_DATA) - or RLIMIT_FSIZE. Or it looks at free
space on its output disk. Or bases it on estimated processing time. Or ...
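
As a rough illustration of the getrlimit() idea, here is a minimal
sketch for a POSIX system. The helper name pick_max_frame_size(), the
fallback value, and the "1/divisor of the data limit" policy are all
assumptions made for this example; nothing here is part of libcurl.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/resource.h>

/* Hypothetical helper: derive an upper bound for a buffered frame from
 * the process data-segment limit. Returns at most `fallback`, and uses
 * `fallback` as-is when the limit is unlimited or cannot be read. */
static uint64_t pick_max_frame_size(uint64_t fallback, unsigned divisor)
{
  struct rlimit rl;
  uint64_t share;

  assert(divisor > 0);

  if(getrlimit(RLIMIT_DATA, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
    return fallback;

  /* Allow one frame to use only a fraction of the soft limit. */
  share = (uint64_t)rl.rlim_cur / divisor;
  return share < fallback ? share : fallback;
}
```

An application might call pick_max_frame_size(64 * 1024 * 1024, 4) at
startup and treat the result as its own ceiling. Free disk space or
estimated processing time could be folded in the same way.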

For full frames, if you can't set an upper bound, your protocol user
needs to rethink its usage. If your application really can deal with
huge (beyond practical VM sized) data, it pretty much has to handle it
as a stream - so FULL_FRAMES would be inappropriate.

So, here's a simple answer: Provide a setting for the maximum
acceptable full frame size. On a FULL_FRAMES_ONLY connection, curl
buffers any frame up to that size and provides it in the callback.
For anything bigger (or if curl can't allocate the buffer memory, times
out waiting for it, etc.), curl returns an error (FRAME_TOO_BIG), aborts
the connection, and calls writefunction with NULL in the *ptr argument
and the actual size in 'size'.

This provides the application with sufficient information to log the
failure or even retry the request.

And to simplify the API, perhaps the setting should be
"CURLOPT_WEBSOCKET_FULL_FRAMES_UPTO, <size>", and let zero be the
current incremental delivery mode.
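
Until something like that exists, the same rule can be approximated on
the application side. The sketch below is only an illustration under
stated assumptions: struct frame_buf, frame_feed() and the frame_status
values are hypothetical names, not libcurl API. The offset and
bytesleft parameters are meant to mirror the fields of the same names
that curl_ws_meta() reports for the chunk currently being delivered to
the write callback.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum frame_status {
  FRAME_PARTIAL,      /* chunk stored, more of this frame is pending */
  FRAME_COMPLETE,     /* fb->data now holds the whole frame */
  FRAME_TOO_BIG,      /* frame exceeds the configured maximum */
  FRAME_NO_MEMORY,    /* buffer allocation failed */
  FRAME_BAD_SEQUENCE  /* chunk does not continue the buffered frame */
};

struct frame_buf {
  unsigned char *data; /* buffered payload, owned by this struct */
  size_t used;         /* bytes stored so far */
  size_t cap;          /* bytes allocated for the current frame */
  size_t max_size;     /* largest frame the application accepts */
};

/* Feed one chunk. `offset` is where the chunk starts inside its frame,
 * `bytesleft` is how much of the frame is still pending after it. */
static enum frame_status frame_feed(struct frame_buf *fb, const void *chunk,
                                    size_t len, int64_t offset,
                                    int64_t bytesleft)
{
  uint64_t max = fb->max_size;
  uint64_t off, left;

  assert(offset >= 0 && bytesleft >= 0);
  off = (uint64_t)offset;
  left = (uint64_t)bytesleft;

  /* Overflow-safe check of offset + len + bytesleft against the limit. */
  if(off > max || len > max - off || left > max - off - len)
    return FRAME_TOO_BIG;

  if(off == 0) {
    /* First chunk of a frame: size the buffer for the whole frame. */
    size_t total = (size_t)(off + len + left);
    free(fb->data);
    fb->data = malloc(total ? total : 1);
    fb->used = 0;
    fb->cap = total;
    if(!fb->data)
      return FRAME_NO_MEMORY;
  }

  if(!fb->data || off != fb->used || len > fb->cap - fb->used)
    return FRAME_BAD_SEQUENCE;

  if(len)
    memcpy(fb->data + fb->used, chunk, len);
  fb->used += len;

  return left == 0 ? FRAME_COMPLETE : FRAME_PARTIAL;
}
```

A write callback would call frame_feed() for every chunk, hand
fb->data to the application on FRAME_COMPLETE, and abort the transfer
on FRAME_TOO_BIG. An application that wants the current incremental
delivery would simply not route chunks through the assembler.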

Vitalii can set <size> to a few GB if he can handle it.  Or if he is
willing to go until the OOM killer hits him, he can set size to 2^63-1
and see where fate takes him.  Having lived thru "32 bits is so big that
limits aren't necessary", I don't think that's a wise approach...

Timothe Litt

ACM Distinguished Engineer
--------------------------
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed.


Received on 2023-02-03