Re: cloudflare
From: Timothe Litt <litt_at_acm.org>
Date: Mon, 28 Feb 2022 17:35:40 -0500
Timothe Litt
ACM Distinguished Engineer
--------------------------
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed.
On 28-Feb-22 17:08, Dan Fandrich via curl-users wrote:
> On Mon, Feb 28, 2022 at 04:45:19PM -0500, Dennis Nezic wrote:
>> On Mon, 28 Feb 2022 12:37:28 -0800, Dan Fandrich via curl-users wrote:
>>> On Mon, Feb 28, 2022 at 01:29:33PM -0500, Dennis Nezic via curl-users
>>> wrote:
>>>> Any idea why I'm not able to fetch
>>>> https://grandtheftworld.com/feed/podcast/
>>>> using curl, but I am with wget?
>>>>
>>>> *Even when the http headers are exactly the same!?*
>>> In a quick test, curl puts the Host: header first whereas wget puts it
>>> second-to-last. That's a difference that could be used for browser
>>> fingerprinting.
>> I said the headers were exactly the same, byte-for-byte.
> Ok, just confirming. In both cases the results were byte-for-byte the same,
> just in a slightly different order :-)
>
> Are curl and wget using the same TLS library & version? That would be another
> source of differences.
>
>> What version of wget were you testing with?
> 1.21.1
FWIW, using wget, I get a 403 with user-agent curl or Wget.
With user-agent wget (lowercase), I get data.
Clearly, Cloudflare is blocking the curl and Wget user-agents, and giving
unknown user-agents a pass.
As for the captcha, you need a packet trace to ensure that you're
talking to the same server, and to see how TLS is negotiated. Cloudflare
is clearly trying to outsmart someone, but may not be as clever as they
think they are.
OpenSSL in both cases.
wget https://grandtheftworld.com/feed/podcast/ --user-agent curl/7.79.1 -O zz.t
--2022-02-28 17:24:41-- https://grandtheftworld.com/feed/podcast/
Resolving grandtheftworld.com (grandtheftworld.com)... 2606:4700:20::681a:702, 2606:4700:20::681a:602, 2606:4700:20::ac43:46b6, ...
Connecting to grandtheftworld.com (grandtheftworld.com)|2606:4700:20::681a:702|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2022-02-28 17:24:41 ERROR 403: Forbidden.
wget https://grandtheftworld.com/feed/podcast/ --user-agent wget/7.79.1 -O zz.t
--2022-02-28 17:25:14-- https://grandtheftworld.com/feed/podcast/
Resolving grandtheftworld.com (grandtheftworld.com)... 2606:4700:20::681a:602, 2606:4700:20::681a:702, 2606:4700:20::ac43:46b6, ...
Connecting to grandtheftworld.com (grandtheftworld.com)|2606:4700:20::681a:602|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 59388 (58K) [application/rss+xml]
wget https://grandtheftworld.com/feed/podcast/ --user-agent Wget/7.79.1 -O zz.t
--2022-02-28 17:26:25-- https://grandtheftworld.com/feed/podcast/
Resolving grandtheftworld.com (grandtheftworld.com)... 2606:4700:20::ac43:46b6, 2606:4700:20::681a:602, 2606:4700:20::681a:702, ...
Connecting to grandtheftworld.com (grandtheftworld.com)|2606:4700:20::ac43:46b6|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2022-02-28 17:26:25 ERROR 403: Forbidden.
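For anyone who wants to repeat the user-agent comparison above without retyping the wget commands, here is a minimal sketch in Python using only the standard library. The function names status_for_user_agent and compare_user_agents are hypothetical, made up for this illustration. It only varies the User-Agent header; Python's HTTP and TLS stack is not wget's, so if the server also looks at header order or the TLS handshake, the status codes may differ from the wget runs.

```python
import urllib.error
import urllib.request


def status_for_user_agent(url, user_agent, timeout=10):
    """Return the HTTP status code the server sends for this User-Agent."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is what we want.
        return err.code


def compare_user_agents(url, user_agents, timeout=10):
    """Map each User-Agent string to the status code it receives."""
    return {ua: status_for_user_agent(url, ua, timeout) for ua in user_agents}
```

Calling compare_user_agents("https://grandtheftworld.com/feed/podcast/", ["curl/7.79.1", "wget/7.79.1", "Wget/7.79.1"]) would repeat the three requests shown above; what the server returns today is not something this sketch can promise.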
Received on 2022-02-28