curl-and-php

Re: CURLOPT_FOLLOWLOCATION not redirecting

From: John Miedema <mail_at_johnmiedema.ca>
Date: Thu, 12 Feb 2009 22:26:45 -0500

Thanks Stephen. However, I am trying to find something a little more
generic. In some situations, the javascript redirects will be created
dynamically using constants and method calls.

I wonder if someone who knows more than I do can validate my
understanding of the problem, i.e., getting the HTML (or URL) of a page
that is reached through a JavaScript redirect.

1. It seems that cURL and other HTTP tools (like file_get_contents)
perform very well for what they are, HTTP request tools. They can handle
HTTP header redirects because the redirect is part of the HTTP exchange
itself: a 3xx response carrying a Location header.

2. These tools cannot handle JavaScript redirects because they are
initiated *outside* the HTTP exchange, in the page content. A browser
(FF, IE, whatever) has a JavaScript interpreter that runs the various
forms of a location redirect, constructs the URL (if built dynamically
in JavaScript), then issues another HTTP request.
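
To make the distinction concrete, here is a rough sketch of the regular-expression approach for the simple cases. It is written in Python purely for illustration (PHP's preg_match would do the same job), and the name find_html_redirect and both patterns are made up for this example. It assumes a meta refresh with http-equiv before content, or a location assigned from a plain string literal. A URL built from constants or method calls is not matched at all, which is the limitation described in point 2.

```python
import re
from urllib.parse import urljoin

# Hypothetical pattern: <meta http-equiv="refresh" content="0; url=test2.htm">
# (assumes http-equiv appears before content and the URL itself is unquoted).
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv\s*=\s*["\']?refresh["\']?[^>]*'
    r'content\s*=\s*["\']\s*\d+\s*;\s*url\s*=\s*([^"\'>\s]+)',
    re.IGNORECASE,
)

# Hypothetical pattern: window.location = "test2.htm" and close variants,
# but only when the right-hand side starts with a quoted string literal.
JS_LOCATION = re.compile(
    r'(?:window\.|document\.)?location(?:\.href)?\s*=\s*["\']([^"\']+)["\']',
    re.IGNORECASE,
)


def find_html_redirect(html, base_url):
    """Return the absolute redirect target found in html, or None."""
    for pattern in (META_REFRESH, JS_LOCATION):
        match = pattern.search(html)
        if match:
            # Resolve relative targets against the page that was fetched.
            return urljoin(base_url, match.group(1))
    return None


if __name__ == "__main__":
    page = '<meta http-equiv="refresh" content="0; url=test2.htm">'
    print(find_html_redirect(page, "http://example.com/dir/test1.htm"))
```

The returned URL would then be fetched with a second request. A script that concatenates a literal with a variable would be matched only up to the end of the literal, so anything beyond these static forms still needs a real JavaScript interpreter.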

To solve my problem, I need the equivalent of a browser, with HTTP
request capability and a javascript interpreter.

If I am all wrong, I'm grateful for any correction.

If I am right, I have not yet found a *scriptable*, open source browser
that also has a JavaScript interpreter. Any suggestions are welcome.

Much appreciated, John

On Mon, 2009-02-02 at 15:27 -0500, Stephen Pynenburg wrote:
> Should be pretty easy to look in the result using regular expressions
> (look on PHP docs for preg).
> Once you grab the link, you can follow it with a new curl connect.
> -Stephen
>
> On Mon, Feb 2, 2009 at 11:07 AM, John Miedema <mail_at_johnmiedema.ca> wrote:
> Thanks Daniel. I realize this is a curl forum, but is there another way
> to get contents of the HTML or javascript redirects? Perhaps using the
> PEAR HTTP_Client or file_get_contents? Much appreciated. John
>
>
> On Mon, 2009-02-02 at 16:51 +0100, Daniel Stenberg wrote:
> > On Mon, 2 Feb 2009, John Miedema wrote:
> >
> > > FILE: test1.htm - uses a meta REFRESH to redirect to test2.htm. Body
> > > contains some text: 'abc'. Redirects properly if opened in a browser.
> >
> > libcurl only follows pure HTTP redirects. It doesn't deal with HTML or
> > javascript or other kinds of redirects.
> >
>
>
>
> _______________________________________________
> http://cool.haxx.se/cgi-bin/mailman/listinfo/curl-and-php

Received on 2009-02-13