Re: Can server tell I'm using curl?

From: bruce via curl-users <curl-users_at_cool.haxx.se>
Date: Tue, 6 Apr 2021 19:21:43 -0400

Hi,

When you try to crawl a site that uses one of the reCAPTCHA
mechanisms, the crawl can get ugly.

However, if you break up the process, the crawl might be reasonable.

When you're doing the crawl, does it have to be totally automated?
Or can it be automated after you get past the reCAPTCHA step?

If you're only crawling a site with a few pages, then you may as well
do it manually.

However, the devil is in the details!

If the target site generates a cookie after the reCAPTCHA is handled,
then you might be able to use that cookie from the browser in your
curl process. This would allow you to continue with the crawl. If
you're lucky, the cookie will stay valid for a good portion of the
day.
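
As a rough sketch of that idea (not a tested recipe for any particular
site): solve the reCAPTCHA in the browser, copy the resulting cookie
from the browser's developer tools, and pass it to curl with --cookie.
The cookie name and value, the User-Agent string, and the URL below are
all placeholders. Sending the browser's own User-Agent is an assumption
on my part, in case the site ties the cookie to the client that earned
it.

```shell
# Placeholder values: replace with the real cookie copied from the browser
# after the reCAPTCHA was solved there. The name "session_id" is hypothetical.
COOKIE='session_id=PASTE_VALUE_HERE'

# Assumption: reuse the same User-Agent string the browser sent.
UA='Mozilla/5.0 (X11; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0'

# Fetch one page, sending the browser's cookie and User-Agent.
# --fail makes curl exit non-zero on an HTTP error (for example 403),
# which is a hint that the cookie has expired or was rejected.
fetch_page() {
    curl --silent --show-error --fail \
         --cookie "$COOKIE" \
         --user-agent "$UA" \
         "$1"
}

# Example call (example.com is a placeholder URL):
# fetch_page 'https://example.com/page1' > page1.html
```

When fetch_page starts failing, the cookie has probably timed out, and
the manual reCAPTCHA step has to be repeated to get a fresh one.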

I've found that the sites I target can usually be managed using curl
and a bit of clever thinking. You can get around a lot of the
JavaScript stuff by examining what the site does and implementing
curl steps that reproduce the sequence of requests it requires.

Your mileage may vary!

Good luck.

On Tue, Apr 6, 2021 at 5:22 PM Paul Gilmartin via curl-users
<curl-users_at_cool.haxx.se> wrote:
>
>
>
> > On 2021-04-06, at 14:57:56, Dan Fandrich wrote:
> >
> > On Tue, Apr 06, 2021 at 01:37:42PM +0200, Gilles wrote:
> >> Is the command wrong, or is the server somehow able to tell I'm using curl to
> >> forbid its use
> >
> > I've heard that some sites check things like the order of headers being sent as
> > well as various HTTP/2 options and even use TCP fingerprinting to try to ferret
> > out robots masquerading as browsers. curl can't hide itself from sites doing
> > that sort of client detection, but fortunately, that seems to be rare.
> >
> What about reCAPTCHA?:
> https://en.wikipedia.org/wiki/ReCAPTCHA#No_CAPTCHA_reCAPTCHA
>
> -- gil
>
>
> -----------------------------------------------------------
> Unsubscribe: https://cool.haxx.se/list/listinfo/curl-users
> Etiquette: https://curl.haxx.se/mail/etiquette.html
Received on 2021-04-07