Re: Total http/2 concurrency for multiplexed multi-handle
From: Jeroen Ooms via curl-library <curl-library_at_lists.haxx.se>
Date: Thu, 9 Feb 2023 12:55:56 +0100
On Thu, Feb 9, 2023 at 9:31 AM Daniel Stenberg <daniel_at_haxx.se> wrote:
>
> On Wed, 8 Feb 2023, Jeroen Ooms via curl-library wrote:
>
> > HTTP/2 stream 20139 was not closed cleanly before end of the underlying stream
> >
> > So either they introduced a server bug, or perhaps GitHub is deliberately
> > blocking abusive behavior due to high concurrency.
>
> My guess: they probably just cut off something (like the connection) instead
> of closing down the streams nicely, hence this error.
>
> > I am using a multi handle with CURLPIPE_MULTIPLEX and otherwise default
> > settings. Am I correct that this means libcurl starts 100 concurrent streams
> > (CURLMOPT_MAX_CONCURRENT_STREAMS), and still make 6 concurrent connections
> > (CURLMOPT_MAX_HOST_CONNECTIONS) per host, i.e. download 600 files in
> > parallel? I can imagine that could be considered abusive.
>
> If you have added >= 600 transfers to the multi handle, libcurl will attempt
> to do them like that.
>
> This said, we also have a bug in HTTP/2 that makes it not always "fill up" the
> connections properly after the first one is saturated. (Fixed in master)
>
> > Should I set CURLMOPT_MAX_HOST_CONNECTIONS to 1 in case of http/2
> > multiplexing? Or is CURLMOPT_MAX_HOST_CONNECTIONS ignored in case of
> > multiplexing?
>
> It is independent of multiplexing.
OK, I had expected multiplexing to replace the need for multiple
connections. Do browsers still open multiple connections to hosts that
support http/2 multiplexing? Perhaps a desirable default would be to do
one or the other, but not both? Although I can see this complicates
things, because we don't know whether a server supports multiplexing
before we have actually made a connection.
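
For now I am experimenting with capping libcurl to a single connection
per host and letting multiplexing provide all the concurrency. Here is a
minimal sketch of what I mean (the urls[] list is just a placeholder,
and I have not verified this behavior against GitHub specifically):

#include <curl/curl.h>

int main(void)
{
  const char *urls[] = {
    "https://example.com/file1",  /* placeholder URLs */
    "https://example.com/file2"
  };
  const size_t nurls = sizeof(urls) / sizeof(urls[0]);
  CURL *easy[sizeof(urls) / sizeof(urls[0])];
  int still_running = 0;

  curl_global_init(CURL_GLOBAL_DEFAULT);
  CURLM *multi = curl_multi_init();

  /* allow multiplexing, but never more than one connection per host */
  curl_multi_setopt(multi, CURLMOPT_PIPELINING, CURLPIPE_MULTIPLEX);
  curl_multi_setopt(multi, CURLMOPT_MAX_HOST_CONNECTIONS, 1L);

  for(size_t i = 0; i < nurls; i++) {
    easy[i] = curl_easy_init();
    curl_easy_setopt(easy[i], CURLOPT_URL, urls[i]);
    curl_easy_setopt(easy[i], CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_2TLS);
    /* wait for the existing connection to become available for
       multiplexing instead of opening another connection */
    curl_easy_setopt(easy[i], CURLOPT_PIPEWAIT, 1L);
    curl_multi_add_handle(multi, easy[i]);
  }

  /* responses go to stdout by default; a real program would set
     CURLOPT_WRITEFUNCTION per handle */
  do {
    curl_multi_perform(multi, &still_running);
    curl_multi_wait(multi, NULL, 0, 1000, NULL);
  } while(still_running);

  for(size_t i = 0; i < nurls; i++) {
    curl_multi_remove_handle(multi, easy[i]);
    curl_easy_cleanup(easy[i]);
  }
  curl_multi_cleanup(multi);
  curl_global_cleanup();
  return 0;
}

With CURLOPT_PIPEWAIT set, a handle prefers waiting for an existing
connection to confirm multiplexing support over opening a new
connection right away, which is closer to the behavior I was expecting.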
> > One other thing I noticed is that GitHub does not seem to set any
> > MAX_CONCURRENT_STREAMS, or at least I am not seeing any. For example on
> > httpbin I see this:
> >
> > curl -v 'https://httpbin.org/get' --http2
> > * Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
> >
> > However for GitHub I don't see such a thing:
> >
> > curl -v 'https://raw.githubusercontent.com/curl/curl/master/README' --http2
> >
> > So does this mean libcurl will assume 100 streams is OK?
>
> Yes I believe so.
>
> > Is there a way to debug this, and monitor how many active downloads a
> > multi-handle is making in total (summed over all connections)?
>
> I think libcurl lacks that ability. The 'running_handles' counter the
> *perform() function returns include transfers that are queued up. You
> basically want running_handles - num_pending.
>
> Maybe we should provide a curl_multi_getinfo() for things like this?
I think so. At least some way to list the requests (easy handles) in a
multi handle, along with their state (pending, active, done), would be useful.
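
In the meantime, the closest I can get from the application side seems
to be something like the sketch below: take running_handles from
curl_multi_perform() and count finished transfers via
curl_multi_info_read(), keeping in mind that running_handles also
counts transfers still waiting for a connection. The drive() helper and
its total_added parameter are just names I made up for the example:

#include <stdio.h>
#include <curl/curl.h>

/* Drive a multi handle and report what the application can observe:
   running_handles from curl_multi_perform() (which still includes
   transfers queued for a connection) and the number of transfers
   that have completed, drained via curl_multi_info_read(). */
static void drive(CURLM *multi, int total_added)
{
  int running = 0;
  int completed = 0;

  do {
    if(curl_multi_perform(multi, &running) != CURLM_OK)
      break;

    /* collect DONE messages for finished transfers */
    int msgs_left = 0;
    CURLMsg *msg;
    while((msg = curl_multi_info_read(multi, &msgs_left))) {
      if(msg->msg == CURLMSG_DONE) {
        completed++;
        curl_multi_remove_handle(multi, msg->easy_handle);
        curl_easy_cleanup(msg->easy_handle);
      }
    }

    fprintf(stderr, "running (incl. queued): %d, completed: %d of %d\n",
            running, completed, total_added);

    curl_multi_wait(multi, NULL, 0, 1000, NULL);
  } while(running);
}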
Thanks!