
Re: Total http/2 concurrency for multiplexed multi-handle

From: Jeroen Ooms via curl-library <curl-library_at_lists.haxx.se>
Date: Thu, 9 Feb 2023 12:55:56 +0100

On Thu, Feb 9, 2023 at 9:31 AM Daniel Stenberg <daniel_at_haxx.se> wrote:
>
> On Wed, 8 Feb 2023, Jeroen Ooms via curl-library wrote:
>
> > HTTP/2 stream 20139 was not closed cleanly before end of the underlying stream
> >
> > So either they introduced a server bug, or perhaps GitHub is deliberately
> > blocking abusive behavior due to high concurrency.
>
> My guess: they probably just cut off something (like the connection) instead
> of closing down the streams nicely, hence this error.
>
> > I am using a multi handle with CURLPIPE_MULTIPLEX and otherwise default
> > settings. Am I correct that this means libcurl starts 100 concurrent streams
> > (CURLMOPT_MAX_CONCURRENT_STREAMS), and still make 6 concurrent connections
> > (CURLMOPT_MAX_HOST_CONNECTIONS) per host, i.e. download 600 files in
> > parallel? I can imagine that could be considered abusive.
>
> If you have added >= 600 transfers to the multi handle, libcurl will attempt
> to do them like that.
>
> This said, we also have a bug in HTTP/2 that makes it not always "fill up" the
> connections properly after the first one is saturated. (Fixed in master)
>
> > Should I set CURLMOPT_MAX_HOST_CONNECTIONS to 1 in case of http/2
> > multiplexing? Or is CURLMOPT_MAX_HOST_CONNECTIONS ignored in case of
> > multiplexing?
>
> It is independent of multiplexing.

OK, I had expected multiplexing to replace the need for
multiple connections. Do browsers still make multiple connections to
hosts that support http/2 multiplexing? Perhaps a desirable default would
be to do one or the other, but not both? Although I can see this complicates
things, because we don't know whether a server supports multiplexing
before we have actually made a connection.

> > One other thing I noticed is that GitHub does not seem to set any
> > MAX_CONCURRENT_STREAMS, or at least I am not seeing any. For example on
> > httpbin I see this:
> >
> > curl -v 'https://httpbin.org/get' --http2
> > * Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
> >
> > However for GitHub I don't see such a thing:
> >
> > curl -v 'https://raw.githubusercontent.com/curl/curl/master/README' --http2
> >
> > So does this mean libcurl will assume 100 streams is OK?
>
> Yes I believe so.
>
> > Is there a way to debug this, and monitor how many active downloads a
> > multi-handle is making in total (summed over all connections)?
>
> I think libcurl lacks that ability. The 'running_handles' counter the
> *perform() function returns includes transfers that are queued up. You
> basically want running_handles - num_pending.
>
> Maybe we should provide a curl_multi_getinfo() for things like this?

I think so. At least some way to list requests (easy-handles) from a
multi, and their state (pending, active, done), may be useful.
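
Until something like that exists, a rough application-side workaround
might look like the sketch below. All names in it (xfer, xfer_stats,
xfer_added, xfer_started, xfer_finished) are hypothetical and not part of
libcurl. It also assumes "active" can be approximated as "has delivered
its first header or body data", so a transfer that has sent its request
but received nothing yet still counts as pending.

```c
#include <stddef.h>

/* Hypothetical per-transfer states; libcurl does not expose these. */
enum xfer_state { XFER_PENDING, XFER_ACTIVE, XFER_DONE };

/* Totals summed over all connections of one multi handle. */
struct xfer_stats {
  size_t pending;
  size_t active;
  size_t done;
};

/* One per easy handle, e.g. attached to it with CURLOPT_PRIVATE. */
struct xfer {
  enum xfer_state state;
  struct xfer_stats *stats;
};

/* Call right after curl_multi_add_handle() succeeds. */
static void xfer_added(struct xfer *x, struct xfer_stats *s)
{
  x->state = XFER_PENDING;
  x->stats = s;
  s->pending++;
}

/* Call from the transfer's header or write callback.
 * Only the first call for a transfer changes the counters. */
static void xfer_started(struct xfer *x)
{
  if(x->state != XFER_PENDING)
    return;
  x->state = XFER_ACTIVE;
  x->stats->pending--;
  x->stats->active++;
}

/* Call when curl_multi_info_read() reports CURLMSG_DONE. */
static void xfer_finished(struct xfer *x)
{
  if(x->state == XFER_DONE)
    return;
  if(x->state == XFER_PENDING)
    x->stats->pending--; /* failed before any data arrived */
  else
    x->stats->active--;
  x->state = XFER_DONE;
  x->stats->done++;
}
```

With this, stats.active approximates running_handles - num_pending,
summed over all connections.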

Thanks!
Received on 2023-02-09