Re: feature request: expected payload size command-line flag
From: Ray Satiro via curl-users <curl-users_at_lists.haxx.se>
Date: Wed, 2 Nov 2022 01:02:33 -0400
On 11/1/2022 3:27 PM, Danny McClanahan via curl-users wrote:
> I was recently looking to download my twitter user data archive via curl since my browser was shorting out. The file size was quite large, and twitter fails to provide an exact Content-Length for some reason, except in their own custom header e.g. "x-ton-expected-size: 8274859056", which means the default curl progress output was unable to estimate the remaining time for the download. This of course looks like:
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100  9.9M    0  9.9M    0     0  1820k      0 --:--:--  0:00:05 --:--:-- 1981k
>
> As it turns out, even when the download completes successfully (in either the browser or curl), the zip file twitter provides for my account is corrupt, but that's not curl's problem. I'm mostly interested in whether someone has already considered adding a way to provide an expected Content-Length to curl in order to obtain the benefits of the progress bar, such as estimating remaining time.
>
> I have tried setting --max-filesize, but that doesn't work for my purposes for two reasons:
> 1. It doesn't affect the progress output ("Time Left" remains at "--:--:--"), so it does not solve the problem.
> 2. It would cut off the download after that many bytes, whereas my use case does not expect to know the precise number of bytes in advance, and I need to ensure I download the complete file (instead, --max-filesize would complement this proposed feature by setting an upper bound for payload size so I can avoid downloading more than I have space for).
>
> In searching archives of this mailing list, I found this issue (https://github.com/curl/curl/issues/2158), which provides an easier repro case of a download missing a Content-Length: "https://github.com/torvalds/linux/archive/v4.14-rc1.tar.gz", but wasn't immediately able to find discussion about hard-coding an expected payload length when not provided.
>
> I'd like to know whether this feature has already been considered, or whether there are likely to be any blockers. I'm not yet too familiar with how curl communicates with libcurl, but if libcurl produces the progress output, and libcurl requires a precise (instead of estimated) Content-Length to produce the progress estimate, I could see this requiring a change to libcurl. But I'm hoping this can be implemented purely in the curl command-line tool.
>
> I'm planning to take a stab at implementing this change now from my checkout of the curl git repo, but would love to hear any objections to this feature as well. I was thinking this would be a command-line flag that accepts the same type of size specification that --max-filesize does. I was also planning to print out a warning and ignore the value of this flag if the response provides its own Content-Length, in cases such as described in https://github.com/curl/curl/issues/2158 above, where the Content-Length may or may not be set.
I think an expected content length option is too niche to add to the
curl tool. I would likely vote against it. If the server chooses chunked
encoding or otherwise does not supply the length then there's no
accepted way to measure the length, so I think working off something
like x-ton-expected-size (which AFAICT is specific to twitter) is too
niche as well.
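
For anyone who wants the estimate without a change to curl, the arithmetic can be done outside the tool. The following is only a rough sketch in Python, not anything curl or libcurl provides: the function names estimate_time_left and format_hms are hypothetical, and it assumes you supply the expected size yourself (for example the value from x-ton-expected-size) along with the bytes received and average speed that the progress meter already reports.

```python
from typing import Optional


def estimate_time_left(expected_total: int, received: int,
                       avg_speed: float) -> Optional[float]:
    """Return the estimated seconds remaining, or None if it cannot be estimated.

    expected_total: caller-supplied size hint in bytes (an assumption, not a
                    value the server guaranteed via Content-Length).
    received:       bytes downloaded so far.
    avg_speed:      average download speed in bytes per second.
    """
    if avg_speed <= 0:
        return None  # no throughput yet, nothing to extrapolate from
    remaining = expected_total - received
    if remaining < 0:
        return None  # the hint was too small, so it is no longer useful
    return remaining / avg_speed


def format_hms(seconds: float) -> str:
    """Format seconds as H:MM:SS, similar to the meter's time columns."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    # Figures taken from the original post: the size from the custom header,
    # roughly 9.9M received, and an average speed of 1820k per second.
    left = estimate_time_left(8274859056, int(9.9 * 1024 * 1024), 1820 * 1024)
    print("no estimate" if left is None else format_hms(left))
```

Because the size is only a hint, the sketch returns None once more bytes have arrived than were expected, rather than reporting a negative time.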
--
Unsubscribe: https://lists.haxx.se/listinfo/curl-users
Etiquette: https://curl.se/mail/etiquette.html
Received on 2022-11-02