curl-library
Re: [Semi-OT] Any perl tools to parse the client side HTML?
Date: Mon, 26 Nov 2001 15:14:31 -0700
Nick Chirca wrote:
>>I'm writing some perl scripts to do some automated web server
>>testing. Basically, I need to read a URL, parse out the HTML
>>returned to me (specifically, a form with input fields), modify
>>some input fields according to my test script's values, and then
>>POST the resulting request back to the server. In other words,
>>I want my script to appear as a real user/browser...
>>
>
> You'll have to modify the HTTP headers, the user agent name, manage
> cookies, redirects and stuff like this. I tried to achieve that through an
> older version of libwww, also known as LWP. You can find out more about it
> on the CPAN site. I gave up using and researching LWP after I discovered and
> installed curl/libcurl. With Daniel's help I was able to do what I was
> looking for (log in to a website, manage cookies, redirects, fill in forms
> and stuff like this).
Hrm, I'm testing a very controlled system, so I may not need
to get this elaborate...but we'll see! I haven't tried
posting quite yet, as I have to parse first...
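
For reference, the browser-like behaviour mentioned in the quoted text (custom user agent name, cookies, redirects) can be set up in LWP roughly as follows. This is only a sketch: the user agent string is made up, and a controlled test system may not need any of it.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;

my $ua = LWP::UserAgent->new;

# Identify as something browser-like (hypothetical string, pick your own).
$ua->agent('Mozilla/4.0 (compatible; test-script)');

# Keep cookies in memory so a login survives across requests.
$ua->cookie_jar(HTTP::Cookies->new);

# By default LWP follows redirects only for GET and HEAD; allow POST too.
push @{ $ua->requests_redirectable }, 'POST';
```

Every request sent through $ua afterwards carries the agent string and any stored cookies.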
>
>
>>I currently have the perl HTTP package working enough to download
>>the URL, and it can post as well,
>>
>
> Can you show/send me some code/script examples? I am still a beginner in
> this Internet agent/crawler stuff and I could use any example/help.
>
>
This is using the HTTP::* stuff I found on CPAN.
use LWP::UserAgent;
use HTTP::Request;

# $my_url holds the address of the page to fetch.
my $ua = LWP::UserAgent->new;
my $req = HTTP::Request->new(GET => $my_url);
my $response = $ua->request($req);
if ($response->is_success) {
    print "RESPONSE -:" . $response->content . ":-\n";
    # This parse stuff is experimental, and I think I'll just write
    # it myself...
    my @inputs = parseHtmlForm($response->content);
    print join("\n", @inputs);
} else {
    print "RESPONSE (error) -:" . $response->error_as_HTML . ":-\n";
}
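
A hand-written parseHtmlForm along the lines mentioned in the comment above might look something like the sketch below. It is not the routine from the script. It assumes every attribute value is quoted and contains no ">" character, and it returns one "name=value" string per named input tag so the result can be printed with join. The sample page is invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical parser: returns one "name=value" string per named <input> tag.
# Assumes quoted attribute values; this is not a general HTML parser.
sub parseHtmlForm {
    my ($html) = @_;
    my @inputs;
    # Walk every <input ...> tag in the document, ignoring case.
    while ($html =~ /<input\b([^>]*)>/gi) {
        my $attrs = $1;
        my %attr;
        # Collect attr="value" or attr='value' pairs, lower-casing the names.
        while ($attrs =~ /([\w-]+)\s*=\s*(["'])(.*?)\2/g) {
            $attr{lc $1} = $3;
        }
        next unless defined $attr{name};    # skip inputs without a name
        my $value = defined $attr{value} ? $attr{value} : '';
        push @inputs, "$attr{name}=$value";
    }
    return @inputs;
}

# Made-up sample page, standing in for $response->content.
my $html = <<'HTML';
<form action="/login" method="post">
  <input type="text" name="user" value="guest">
  <input type="password" name="pass" value="">
  <input type="submit" value="Log in">
</form>
HTML

my @inputs = parseHtmlForm($html);
print join("\n", @inputs), "\n";
```

A regex approach like this breaks on unquoted attributes, on select and textarea fields, and on anything unusual in the markup. The HTML::Form module on CPAN handles those cases and may be a better fit if the pages get more complicated.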
-- 
Ben Greear <greearb_at_candelatech.com> <Ben_Greear AT excite.com>
President of Candela Technologies Inc http://www.candelatech.com
ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear

Received on 2001-11-26