Re: Getting CAPTCHA response when download a webpage

From: Mah. E. via curl-library <curl-library_at_cool.haxx.se>
Date: Mon, 20 Jul 2020 12:18:09 +0200

> I set useragent to “Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36” and got 9393 bytes of output.

Those 9393 bytes are an HTML page containing the CAPTCHA challenge, not the actual web page, which should be around 130 KB.
To make sure, open the downloaded content in any editor and search for the word “captcha” or “Cloudflare”.
You will also get the same result with curl without setting any headers or user agent.
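
The check described above can also be scripted. This is a minimal sketch in Python, not part of curl; the function name and the marker list are assumptions chosen for illustration, and the test is only a heuristic, since a genuine page could also mention Cloudflare (for example in a script URL).

```python
# Heuristic check: does a downloaded body look like a CAPTCHA challenge page?
# The marker words are the two suggested in the message; they are an
# assumption, not an official way to detect a Cloudflare challenge.
CHALLENGE_MARKERS = ("captcha", "cloudflare")


def looks_like_captcha_page(body: bytes) -> bool:
    """Return True if any marker word occurs in the body (case-insensitive)."""
    text = body.decode("utf-8", errors="replace").lower()
    return any(marker in text for marker in CHALLENGE_MARKERS)
```

To use it, read the file curl saved in binary mode and pass the bytes to `looks_like_captcha_page`. Comparing the file size against the expected size (about 9 KB versus about 130 KB here) is a useful second signal.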

I tried both a user agent and a cookie file from a browser, and nothing seems to work.
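
For anyone who wants to reproduce that attempt outside curl, here is a rough sketch using only Python's standard library. It assumes the browser cookies were exported in Netscape `cookies.txt` format; the function name and the `cookie_file` parameter are hypothetical. It only shows how such a request is set up. It is not a workaround, since the equivalent curl attempt still returned the CAPTCHA page.

```python
import http.cookiejar
import urllib.request

# The user agent string quoted in this thread.
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"
)


def build_browser_like_opener(cookie_file=None):
    """Build an opener that sends USER_AGENT and, optionally, browser cookies.

    cookie_file (hypothetical parameter): path to a Netscape-format
    cookies.txt exported from a browser.
    """
    jar = http.cookiejar.MozillaCookieJar()
    if cookie_file is not None:
        # Keep session cookies and expired entries so the jar matches the export.
        jar.load(cookie_file, ignore_discard=True, ignore_expires=True)
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Replace the default "Python-urllib/x.y" user agent header.
    opener.addheaders = [("User-Agent", USER_AGENT)]
    return opener
```

Calling `build_browser_like_opener("cookies.txt").open(url)` would then send the request with both the user agent and the cookies attached.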

Regards,
Mahmoud

> On Jul 20, 2020, at 9:55 AM, Jiahao XU <Jiahao_XU_at_outlook.com> wrote:
>
> IMHO it could be user agent and cookies affecting the output.
>
> I set useragent to “Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36” and got 9393 bytes of output.
>
> However, when I tried with wget, I got 403 forbidden.
>
> From: curl-library <curl-library-bounces_at_cool.haxx.se> on behalf of Mah. E. via curl-library <curl-library_at_cool.haxx.se>
> Sent: Monday, July 20, 2020 8:38:47 AM
> To: curl-users_at_cool.haxx.se <curl-users_at_cool.haxx.se>
> Cc: Mah. E. <mahmoud.aboghazala_at_gmail.com>; curl-library_at_cool.haxx.se <curl-library_at_cool.haxx.se>
> Subject: Getting CAPTCHA response when download a webpage
>
> First, thanks for this awesome tool.
>
> Is there any way to download this web page using curl?
>
> https://www.crunchyroll.com/en-gb/blood-blockade-battlefront/episode-6-get-the-lock-out-754075
>
> I ask because I get a Cloudflare CAPTCHA HTML response (an 8 KB file),
> but if I use a different downloader such as KGet or IDM, I get the actual page (a 131 KB file).
>
> I can open this link in any browser and it never asks for a reCAPTCHA.
> I also tried using the full headers from the browser with the curl command, with no success.
>
> Any help would be appreciated.
>

Received on 2020-07-20