Re: cURL starting questions

From: Doug McNutt <douglist_at_macnauchtan.com>
Date: Sun, 19 Apr 2009 14:29:06 -0600

At 21:25 -0700 4/18/09, Jason Todd Slack-Moehrle wrote:
>I want to start at Dmoz.org and follow links for entertainment (like
>concerts, art gallery events, etc) and examine the link to see if I
>should get data back about it and from it.

There are some Firefox extensions that might be useful, but they
expect you to understand JavaScript. Greasemonkey and iMacros are
installed on my box, but I really haven't done much with them. My
problem has been a provision in Firefox itself that, the last time I
checked, disables automated execution of a submit button in an HTML
<form>.

I have spent a fair amount of time using Perl as the main program,
with backticked shell escapes to call curl. There are Perl modules
on CPAN (the Comprehensive Perl Archive Network) that can do the same
thing without the backticks.
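
For anyone who is not a Perl type, the same "main program shells out
to curl" pattern can be sketched in Python. This is only an
illustration, not the code in my directory: the helper names
build_curl_command and fetch, the cookies.txt file name, and the form
fields are all made up for the example. The curl options themselves
(--silent, --location, --cookie, --cookie-jar, --data-urlencode) are
standard ones.

```python
import subprocess


def build_curl_command(url, cookie_jar="cookies.txt", form_fields=None):
    """Return the argument list for one curl call (nothing is executed here)."""
    cmd = [
        "curl",
        "--silent",                  # no progress meter in the output
        "--location",                # follow HTTP redirects
        "--cookie", cookie_jar,      # send cookies saved by earlier requests
        "--cookie-jar", cookie_jar,  # save cookies set by this response
    ]
    for name, value in (form_fields or {}).items():
        # Each --data-urlencode adds one URL-encoded field and turns
        # the request into a POST.
        cmd += ["--data-urlencode", f"{name}={value}"]
    cmd.append(url)
    return cmd


def fetch(url, **kwargs):
    """Run curl (it must be on the PATH) and return the response body as text."""
    result = subprocess.run(
        build_curl_command(url, **kwargs),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if curl exits non-zero
    )
    return result.stdout
```

Passing the arguments as a list avoids the shell quoting problems
that backticks tend to cause when a URL contains "&" or "?".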

<ftp://ftp.macnauchtan.com/Software/FinpMod/>

is a directory in which I once posted some Perl code that uses curl
to access my banking and brokerage sites. I needed to look into the
HTML and recover such things as one-time random numbers and cookies
set by JavaScript that don't show in the headers. It is truly a PITA
because the sites keep changing things in the interest of
"security". I am a registered user, dammit. Actually, I really think
they all want to ensure that I read the ads rather than just
download some financial data.
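
As a rough idea of what "looking into the HTML" involves, here is a
minimal Python sketch that pulls hidden form fields out of a page,
which is where one-time random numbers usually live. It assumes the
value sits in an <input type="hidden"> element; the names
HiddenFieldParser and hidden_fields and the sample page are
hypothetical. It does nothing for cookies that JavaScript sets at run
time, since those never appear in the markup.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)  # attribute names arrive lowercased
        is_hidden = (attr.get("type") or "").lower() == "hidden"
        if is_hidden and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in an HTML string."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


# Made-up login form used only to exercise the parser.
sample_page = (
    '<form action="/login" method="post">'
    '<input type="hidden" name="nonce" value="a1b2c3">'
    '<input type="text" name="user">'
    "</form>"
)
print(hidden_fields(sample_page))
```

The recovered fields then have to be sent back with the next request,
along with whatever the user types in.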

Anyway, the items in that directory can provide a start for what you
want to do. If you're not a Perl type, there will be a learning curve.

And whatever you do, it's quite likely that the links you're
looking for will not be simple text that you can pass on to curl.
AJAX-driven pages and many content management systems depend on
JavaScript, and it is almost a sure thing that clicking on a link
runs a downloaded JavaScript file that prepares a POST request for a
page that gets generated on the fly for delivery. It's also likely
that the request will generate tracking information that will put
your IP address into a database.
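
One way to see how bad the problem is on a given page is to sort its
links into those curl can follow directly and those that need
JavaScript. The Python sketch below uses a crude heuristic of my own
choosing: a link counts as scripted if it has an onclick attribute,
a "javascript:" href, or an empty or "#" href. The names LinkSorter
and sort_links are hypothetical, and real sites will have cases this
misses.

```python
from html.parser import HTMLParser


class LinkSorter(HTMLParser):
    """Split <a> links into plain URLs and ones that rely on JavaScript."""

    def __init__(self):
        super().__init__()
        self.plain = []     # hrefs that can be handed straight to curl
        self.scripted = []  # hrefs that only work with JavaScript running

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)  # attribute names arrive lowercased
        has_onclick = "onclick" in attr
        if "href" not in attr and not has_onclick:
            return  # a named anchor, not a link
        href = (attr.get("href") or "").strip()
        if has_onclick or href.lower().startswith("javascript:") or href in ("", "#"):
            self.scripted.append(href)
        else:
            self.plain.append(href)


def sort_links(html):
    """Return (plain, scripted) lists of href values from an HTML string."""
    sorter = LinkSorter()
    sorter.feed(html)
    return sorter.plain, sorter.scripted
```

If most of the links land in the scripted list, curl alone will not
get you far on that site.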

-- 
--> From the U S of A, the only socialist country that refuses to admit it. <--
Received on 2009-04-19