curl-users
Re: How to deal with special characters / character encoding?
Date: Thu, 15 Nov 2007 10:20:41 -0800
On Thu, Nov 15, 2007 at 11:27:51AM +0100, Peter Wullinger wrote:
> [...]
> Content-Type: text/html; charset=utf-8
> [...]
>
> Which means your terminal character set is set to something other than
> utf-8 (this depends on the operating system and your current user's
> settings), and the terminal interprets the byte "0xe2" according to the
> character set it has been set up to display.
>
> Since you get "â", your character set is most likely ISO 8859-1 (see e.g.
> http://de.wikipedia.org/wiki/ISO_8859-1), since "â" is the correct
> character for the code "0xe2" there. But since you are being sent a
> multi-byte encoded character from utf-8, namely ('), this does not get
> displayed correctly.
If you have a recent version of iconv installed, you can use it to get the
results you expect with the command:
curl http://www.msnbc.msn.com/id/21773401/ | iconv -f utf-8 -t iso-8859-1//TRANSLIT | grep '<title>'
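To see the effect without fetching the page, here is a minimal sketch. It assumes the offending character is U+2019 (the right single quotation mark, whose UTF-8 encoding starts with the byte 0xe2); the thread does not confirm which character it was. The helper name utf8_quote is made up for this illustration.

```shell
# Emit the three UTF-8 bytes of U+2019 (assumed to be the character in
# question). Octal escapes are used so that any POSIX printf handles them.
utf8_quote() { printf '\342\200\231'; }

# Dump the raw bytes in hex: e2 80 99. The first byte is the 0xe2 that an
# ISO 8859-1 terminal shows as "â".
utf8_quote | od -An -tx1

# Reinterpret the same bytes as ISO 8859-1 and re-encode them as UTF-8.
# Each byte becomes its own character: c3 a2 ("â"), then c2 80 and c2 99
# (two control characters that a terminal normally does not display).
utf8_quote | iconv -f iso-8859-1 -t utf-8 | od -An -tx1
```

Piping the same bytes through iconv -f utf-8 -t iso-8859-1//TRANSLIT instead asks iconv to substitute a close ISO 8859-1 equivalent. Which replacement it picks depends on the iconv implementation, so no output is shown for that here.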
>>> Dan
-- 
http://www.MoveAnnouncer.com The web change of address service
Let webmasters know that your web site has moved
Received on 2007-11-15