curl-users

Re: How to deal with special characters / character encoding?

From: Dan Fandrich <dan_at_coneharvesters.com>
Date: Thu, 15 Nov 2007 10:20:41 -0800

On Thu, Nov 15, 2007 at 11:27:51AM +0100, Peter Wullinger wrote:
> [...]
> Content-Type: text/html; charset=utf-8
> [...]
>
> This means your terminal's character set is set to something other than
> utf-8 (it depends on the operating system and your current user's settings),
> so the terminal interprets the byte 0xe2 according to the character set it
> has been set up to display.
>
> Since you get "â", your character set is most likely ISO 8859-1 (see e.g.
> http://de.wikipedia.org/wiki/ISO_8859-1), since "â" is the correct character
> for the code 0xe2 there. But what you are being sent is a multi-byte
> utf-8 encoded character, namely ('), so it does not get displayed correctly.

If you have a recent version of iconv installed, you can use it to get the
results you expect with the command:

  curl http://www.msnbc.msn.com/id/21773401/ | iconv -f utf-8 -t iso-8859-1//TRANSLIT | grep '<title>'
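
To see the effect without fetching the page, here is a rough sketch that
reproduces both the misreading and the fix locally. It assumes the character
in question is U+2019 (right single quotation mark), whose utf-8 encoding is
the three bytes E2 80 99; the sample text "It's" and the variable names are
made up for illustration only.

```shell
# Assumption: the troublesome character is U+2019, i.e. bytes E2 80 99 in utf-8.
# Octal escapes are used because every POSIX printf understands them.
quote_bytes='\342\200\231'

# 1. Simulate an ISO 8859-1 terminal: each byte is read as its own character,
#    so 0xE2 becomes "â". Converting that misreading to utf-8 makes it
#    visible on a utf-8 terminal as well.
misread=$(printf "$quote_bytes" | iconv -f iso-8859-1 -t utf-8)
printf 'misread as iso-8859-1: %s\n' "$misread"

# 2. The fix: decode the bytes as utf-8 and transliterate into iso-8859-1.
#    "It<U+2019>s" is a hypothetical sample string, not taken from the page.
fixed=$(printf "It${quote_bytes}s" | iconv -f utf-8 -t iso-8859-1//TRANSLIT)
printf 'transliterated:        %s\n' "$fixed"
```

Which replacement //TRANSLIT picks depends on the iconv implementation;
glibc and GNU libiconv typically substitute a plain ASCII apostrophe, while
other implementations may use a different placeholder.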

>>> Dan

-- 
http://www.MoveAnnouncer.com              The web change of address service
          Let webmasters know that your web site has moved
Received on 2007-11-15