Re: PERL libcurl html into variable

From: Cris Bailiff <c.bailiff+curl_at_devsecure.com>
Date: Thu, 10 Jul 2003 11:30:21 +1000

Maarten,

You didn't say, but I'm assuming you're using the WWW::Curl perl interface to
libcurl.

By default, libcurl sets up the body and headers of a request to both be
'written' to stdout. You can change 'stdout' to an internal variable, but
then you get both headers and body, as you seem to have found.

If you have followed the examples and test scripts in the WWW::Curl
distribution, you would have seen that it is possible to choose a different
function for outputting the data - i.e. you can have a perl subroutine called
for each 'chunk' of data, rather than libcurl just using the perl stdio. The
'filehandle' that can be given to libcurl can actually be any perl data - a
reference, a list, a string etc. By using references, you can have the same
subroutine called for both headers and body, but have the subroutine store
the output in different places (variables).

Here's a 'minimal' WWW::Curl 'get a page into variables' example. It's like
the 'basicfirst.pl' example in the distribution, but cut down as far as it
will go:

---------------------

#!/usr/bin/perl -w
use strict;

use WWW::Curl::easy;

sub write_callback {
    my ($chunk,$variable)=@_;
    # store each chunk/line separately
    # This should be faster than using $$variable .= $chunk;
    push @{$variable}, $chunk;
    return length($chunk);
}

my $curl = WWW::Curl::easy->new();

# set up the callback for headers and body
$curl->setopt(CURLOPT_HEADERFUNCTION, \&write_callback);
$curl->setopt(CURLOPT_WRITEFUNCTION, \&write_callback);

# set up the variables for headers and body
my (@head,@body);
$curl->setopt(CURLOPT_WRITEHEADER, \@head);
$curl->setopt(CURLOPT_FILE, \@body);

# do the request
$curl->setopt(CURLOPT_URL, "http://example.com");

if ($curl->perform() != 0) { die "Curl perform failed"; }

# We leave @head alone - nicer to have one header line per element
# but we join the body output into one long string
my $body=join("",@body);

print $body;

------------------

You should be able to see that the key to your 'problem' is to make sure you
output the headers into a different variable than the body.

If you don't want the headers at all, you could just install a subroutine
which 'does nothing' except return the length of the chunk (which signals
success) each time a header is received. For example:

$curl->setopt(CURLOPT_HEADERFUNCTION, sub { return length(shift) });
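
Since you wanted everything in a single $html scalar, here is a rough sketch
of the same idea using a scalar reference instead of an array. To keep it
runnable without WWW::Curl or a network, fake_perform is a made-up stand-in
for $curl->perform() that feeds canned header lines and body chunks to the
callbacks; the callback names are also just illustrative. With a real handle
you would presumably pass \&body_callback via CURLOPT_WRITEFUNCTION and
\$html via CURLOPT_FILE, as in the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Body callback: append each chunk to the scalar that $target refers to.
sub body_callback {
    my ($chunk, $target) = @_;
    $$target .= $chunk;
    return length($chunk);    # report every byte as handled
}

# Header callback: throw the header line away but still report its length.
sub discard_callback {
    my ($chunk) = @_;
    return length($chunk);
}

# Hypothetical stand-in for $curl->perform() (NOT part of WWW::Curl): it
# feeds canned header lines and body chunks to the callbacks, roughly the
# way libcurl would during a transfer. Returns 0 on success, like perform().
sub fake_perform {
    my ($header_cb, $body_cb, $body_data) = @_;
    for my $line ("HTTP/1.1 200 OK\r\n", "Content-Type: text/html\r\n", "\r\n") {
        $header_cb->($line) == length($line) or return 1;
    }
    for my $chunk ("<html><body>", "Hello", "</body></html>\n") {
        $body_cb->($chunk, $body_data) == length($chunk) or return 1;
    }
    return 0;
}

my $html = "";
if (fake_perform(\&discard_callback, \&body_callback, \$html) != 0) {
    die "transfer failed";
}

print $html;
```

Run as is, this prints only the HTML; none of the header lines end up in
$html, because they went to the callback that discards them.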

Cris

On Thu, 10 Jul 2003 08:04 am, maarten_at_datastorm.nl wrote:
> Hey all,
>
> I have a perl script doing something like this:
> $html=`curl -K config/stap1.cfg $proxy`;
> the outcome of curl can be found in the variable $html
> No output to the screen!
>
> Now I want to replace the backticks stuff by using libcurl.
>
> But how do I get the same result, get the html data into $html.
> But without:
> HTTP/1.1 200 OK
> Date: Wed, 09 Jul 2003 21:52:51 GMT
> Server: Apache/1.3.22 (Unix) (Red-Hat/Linux)
>   Chili!Soft-ASP/3.6.2 mod_perl/1.24_01 PHP/4.2.2 FrontPage/5.0.2
>   mod_ssl/2.8.5 OpenSSL/0.9.6b
> Last-Modified: Tue, 24 Sep 2002 14:08:07 GMT
> ETag: "3ed848-123-3d9071c7"
> Accept-Ranges: bytes
> Content-Length: 291
> Content-Type: text/html
>
> I only need the html content back into the $html.
>
> How can I do that?
>
> Maarten_at_datastorm.nl
>
>
>
> -------------------------------------------------------
> This SF.Net email sponsored by: Parasoft
> Error proof Web apps, automate testing & more.
> Download & eval WebKing and get a free book.
> www.parasoft.com/bulletproofapps

Received on 2003-07-10