curl-users

PHPCrawler or CURL library can't get content

From: xNokia <xnokia_at_nokiagate.com>
Date: Wed, 7 Jan 2015 14:53:19 +0300

Hi There,

I'm using the PHPCrawler class to get product titles from different stores
such as eBay. The library works well with every store my application
supports except the Blink store website
(http://blink.com.kw/search-result.aspx?text=mobile&searchfor=all).
Its search page is not loaded the way the other stores' pages are: when I
traced the site's requests in the Chrome debugger, I found that the request
is initiated by a script, even though the request URL is identical to the
URL I enter in Chrome's address bar and to the URL I give the class to
crawl.

So is there any way for the crawler class to fetch the page that I'm
redirected to? I've used the setFollowRedirects method, but with no luck,
because the redirect is done on the client side through JavaScript, not in
the headers. I also found an extra POST request made after the normal GET
request. I've tried adding the POST data too, but I get the same result: an
empty result set, and when I output the fetched page it does not contain
the product listing.
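
A minimal sketch of replaying that GET-then-POST sequence with PHP cURL is
shown here. It is an illustration under assumptions, not a confirmed fix:
it assumes the POST goes back to the same URL, that the page is an ASP.NET
Web Forms page whose hidden inputs (typically __VIEWSTATE and
__EVENTVALIDATION) must be echoed back, and that the session depends on
cookies set by the first response. The function names
extract_hidden_fields and fetch_with_postback, the $extraPost parameter,
and the User-Agent string are hypothetical.

```php
<?php
/** Collect name => value for every hidden <input> found in an HTML page. */
function extract_hidden_fields(string $html): array
{
    if ($html === '') {
        return [];
    }
    $fields = [];
    $doc = new DOMDocument();
    // Real-world markup rarely validates, so parser warnings are suppressed.
    @$doc->loadHTML($html);
    foreach ($doc->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '' && strtolower($input->getAttribute('type')) === 'hidden') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

/**
 * GET the page, then POST its hidden fields back on the same handle so the
 * cookies from the first response are reused.
 * Assumption: the POST target is the same URL as the GET.
 */
function fetch_with_postback(string $url, array $extraPost = []): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follows HTTP redirects only, not JavaScript
        CURLOPT_COOKIEFILE     => '',    // empty string enables the in-memory cookie engine
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; example-crawler)',
    ]);
    $html = curl_exec($ch);
    if ($html === false) {
        throw new RuntimeException('GET failed: ' . curl_error($ch));
    }

    // Echo the server's hidden fields back, plus any caller-supplied data.
    $post = array_merge(extract_hidden_fields($html), $extraPost);
    curl_setopt_array($ch, [
        CURLOPT_POST       => true,
        CURLOPT_POSTFIELDS => http_build_query($post),
    ]);
    $result = curl_exec($ch);
    if ($result === false) {
        throw new RuntimeException('POST failed: ' . curl_error($ch));
    }
    return $result;
}
```

If the extra POST seen in the Chrome debugger actually targets a different
endpoint, or the products are filled in by a script after the page loads,
this replay will still return a page without them, because cURL does not
execute JavaScript.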

Side note: the Blink store website is an ASP.NET site. Could that be the
reason I can't crawl its pages?

UPDATE

I've tried to fetch the page using the standard PHP cURL functions and
echoed the response; the echoed page is incomplete and keeps refreshing.
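
One possible explanation, offered as an assumption rather than a diagnosis:
echoing the raw response into a browser lets the fetched page's own scripts
run there, including any script that reloads or redirects the page. The
sketch below prints the response as escaped text instead, so the markup
that cURL actually received can be inspected. The function names
render_as_text and fetch_body are hypothetical.

```php
<?php
/** Escape fetched HTML so a browser displays it as text and runs none of it. */
function render_as_text(string $html): string
{
    return '<pre>' . htmlspecialchars($html, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') . '</pre>';
}

/** Fetch a URL with cURL and return the body as a string. */
function fetch_body(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // without this, curl_exec prints the body directly
        CURLOPT_FOLLOWLOCATION => true,  // HTTP redirects only
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }
    return $body;
}

// Usage (performs a network request, so it is left commented out):
// echo render_as_text(fetch_body('http://blink.com.kw/search-result.aspx?text=mobile&searchfor=all'));
```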

-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-users
FAQ: http://curl.haxx.se/docs/faq.html
Etiquette: http://curl.haxx.se/mail/etiquette.html
Received on 2015-01-07