curl-library

Re: PERL libcurl html into variable

From: Cris Bailiff <c.bailiff_at_awayweb.com>
Date: Thu, 10 Jul 2003 11:26:50 +1000

Maarten,

You didn't say, but I'm assuming you're using the WWW::Curl perl interface to
libcurl.

By default, libcurl sets up the body and headers of a request to be 'written'
to stdout.

If you have followed the examples and test scripts in the WWW::Curl
distribution, you would have seen that it is possible to choose a different
function for outputting the data - i.e. you can have a perl subroutine called
for each 'chunk' of data, rather than libcurl just using the perl stdio. The
'filehandle' that can be given to libcurl can actually be any perl data - a
reference, a list, a string etc. By using references, you can have the same
subroutine called for both headers and body, but have the subroutine store
the output in different places.
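
To make that last point concrete, here is a rough sketch of one subroutine
that decides where to store data from the kind of reference it is handed. It
is plain perl with no libcurl involved: the callback is called by hand, and
the name 'store_chunk' and the sample strings are made up for illustration.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical callback: same (chunk, user-data) argument order as a
# WWW::Curl write callback, but invoked by hand in this sketch.
sub store_chunk {
    my ($chunk, $target) = @_;
    if (ref($target) eq 'ARRAY') {
        push @{$target}, $chunk;      # keep each chunk as its own element
    } elsif (ref($target) eq 'SCALAR') {
        ${$target} .= $chunk;         # append to one long string
    }
    return length($chunk);            # report the whole chunk as handled
}

my @head;
my $body = "";

# Same subroutine, two different destinations
store_chunk("HTTP/1.1 200 OK\r\n", \@head);
store_chunk("Content-Type: text/html\r\n", \@head);
store_chunk("<html>", \$body);
store_chunk("</html>", \$body);

print scalar(@head), " header lines\n";
print "$body\n";
```

The array reference keeps one element per chunk, while the scalar reference
ends up holding the whole body as a single string.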

Here's a 'minimal' WWW::Curl 'get a page into variables' example. It's like
the 'basicfirst.pl' example in the distribution, but cut down as far as it
will go:

---------------------

#!/usr/bin/perl -w
use strict;

use WWW::Curl::easy;

sub write_callback {
    my ($chunk,$variable)=@_;
    # store each chunk/line separately
    # This should be faster than using $$variable .= $chunk;
    push @{$variable}, $chunk;
    return length($chunk);
}

my $curl = WWW::Curl::easy->new();

# set up the callback for headers and body
$curl->setopt(CURLOPT_HEADERFUNCTION, \&write_callback);
$curl->setopt(CURLOPT_WRITEFUNCTION, \&write_callback);

# set up the variables for headers and body; each reference is handed
# back to the callback as its second argument
my (@head,@body);
$curl->setopt(CURLOPT_WRITEHEADER, \@head);
$curl->setopt(CURLOPT_FILE, \@body);

# do the request
$curl->setopt(CURLOPT_URL, "http://example.com");

if ($curl->perform() != 0) { die "Curl perform failed"; }

# We leave @head alone - nicer to have one header per element
# but we join the body output into one long string
my $body=join("",@body);

print $body;

------------------

You should be able to see that the key to your 'problem' is to make sure you
output the headers into a different variable than the body.
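
If you do keep the headers, having one line per array element makes them easy
to pick apart afterwards. The following is only a sketch, run against a
hand-written copy of a few lines from your quoted response rather than a live
transfer; the variable names '$status' and '%header' are my own choices.

```perl
#!/usr/bin/perl -w
use strict;

# Sample header lines, one per element, standing in for what a header
# callback would have collected (values copied from the quoted response).
my @head = (
    "HTTP/1.1 200 OK\r\n",
    "Content-Length: 291\r\n",
    "Content-Type: text/html\r\n",
    "\r\n",
);

my ($status, %header);
for my $line (@head) {
    (my $text = $line) =~ s/\r?\n\z//;       # strip the line ending
    if ($text =~ m{^HTTP/\S+\s+(\d{3})}) {
        $status = $1;                        # status line
    } elsif ($text =~ /^([^:]+):\s*(.*)$/) {
        $header{lc $1} = $2;                 # header names are case-insensitive
    }
}

print "status: $status\n";
print "type:   $header{'content-type'}\n";
```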

If you don't want the headers at all, you could just install a subroutine
which 'does nothing' except return the length of the data it was given (which
is how a callback reports success) whenever a header is received. For example:

$curl->setopt(CURLOPT_HEADERFUNCTION, sub { return length(shift) });
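
The length matters because libcurl treats any return value other than the
number of bytes it passed in as a write error and aborts the transfer. The
sketch below imitates that check in plain perl. Note that 'deliver' is a
made-up stand-in for libcurl's side of the contract, not part of WWW::Curl,
and the two header strings are just sample data.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical stand-in for libcurl: hand each chunk to the callback and
# stop as soon as the callback reports a different byte count.
sub deliver {
    my ($callback, @chunks) = @_;
    my $delivered = 0;
    for my $chunk (@chunks) {
        my $taken = $callback->($chunk);
        return (0, $delivered) if $taken != length($chunk);  # treated as an error
        $delivered++;
    }
    return (1, $delivered);
}

my @chunks = (
    "Date: Wed, 09 Jul 2003 21:52:51 GMT\r\n",
    "Content-Length: 291\r\n",
);

my $discard = sub { return length(shift) };  # ignore data, report success
my $broken  = sub { return 0 };              # ignore data, report nothing handled

my ($ok1, $n1) = deliver($discard, @chunks);
my ($ok2, $n2) = deliver($broken,  @chunks);

print "discard: ok=$ok1 delivered=$n1\n";
print "broken:  ok=$ok2 delivered=$n2\n";
```

The discarding callback lets every chunk through, while the one that returns
0 stops the imitation transfer on the first chunk.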

Cris

On Thu, 10 Jul 2003 08:04 am, maarten_at_datastorm.nl wrote:
> Hey all,
>
> I have a perl script doing something like this:
> $html=`curl -K config/stap1.cfg $proxy`;
> the outcome of curl can be found in the variable $html
> No output to the screen!
>
> Now I want to replace the backticks stuff by using libcurl.
>
> But how do I get the same result, get the html data into $html.
> But without:
>
> HTTP/1.1 200 OK
> Date: Wed, 09 Jul 2003 21:52:51 GMT
> Server: Apache/1.3.22 (Unix) (Red-Hat/Linux) Chili!Soft-ASP/3.6.2 mod_perl/1.24_01 PHP/4.2.2 FrontPage/5.0.2 mod_ssl/2.8.5 OpenSSL/0.9.6b
> Last-Modified: Tue, 24 Sep 2002 14:08:07 GMT
> ETag: "3ed848-123-3d9071c7"
> Accept-Ranges: bytes
> Content-Length: 291
> Content-Type: text/html
>
> I only need the html content back into the $html.
>
> How can I do that?
>
> Maarten_at_datastorm.nl

Received on 2003-07-10