curl-and-php
decoding iso-8859-1 characters
Date: Mon, 20 Nov 2006 22:36:39 -0500
Hello All,
I am currently stumped. I have been looking around for an answer but
haven't found one, so I hope I'm not wasting anyone's time. I am scraping
an HTML page that has this in its header:
<meta http-equiv="Content-Type" content="text/html; Charset=iso-8859-1" />
Unfortunately, after grabbing the page data with this code:
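<?php
$url = "http://www.artbaselmiamibeach.com/ca/en/fpg/";

// Fetch the page body as a string instead of printing it directly.
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
$data = curl_exec($ch);
curl_close($ch);
if ($data === false) {
    die("curl request failed");
}
echo $data; // debug: dump the raw page

// Grab the table cell that follows the "Artists at Art Basel Miami Beach" label.
if (!preg_match("/<td valign=\"top\" nowrap=\"nowrap\">Artists at Art Basel Miami Beach(.*<\/b>)<\/td>/s", $data, $cell)) {
    die("artist cell not found");
}
// Keep only the contents of the <b>...</b> element inside that cell.
if (!preg_match("/<b>(.*)<\/b>/s", $cell[1], $artists)) {
    die("artist list not found");
}
// Turn each <br/> plus the character after it into a comma.
$pattern = '/(<br\/>.)/s';
$replacement = ',';
echo preg_replace($pattern, $replacement, $artists[1]);
echo ",";
?>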
All of the valuable foreign characters have been changed to question marks.
For example, instead of "Joan Miró" I get "Joan Mir?", which is no good
for a project based on gathering data on a name.
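One thing I have not ruled out, sketched here as a guess rather than a
known fix: if the bytes cURL returns really are ISO-8859-1 and the question
marks only show up because my output is being read as UTF-8, converting the
string before echoing it might help. The helper name latin1_to_utf8 is made
up for illustration, and the sketch assumes the mbstring extension is
available.

```php
<?php
// Hypothetical helper: re-encode an ISO-8859-1 byte string as UTF-8.
// Assumes the input really is ISO-8859-1 and that mbstring is loaded.
function latin1_to_utf8($text)
{
    return mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1');
}

// "\xF3" is the single ISO-8859-1 byte for the accented "o" in "Miró".
$name = "Joan Mir\xF3";

// Emits the same name as UTF-8, where that letter takes two bytes.
echo latin1_to_utf8($name), "\n";
```

In the script above this would be applied to $data right after curl_exec(),
before any of the preg_match() calls.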
Any suggestions?
Thanks again!
-tim
Received on 2006-11-21