curl-library

Re: How to get HTTP charset? I wanna do charset conversion (or maybe libcurl already has this feature)

From: Daniel Stenberg <daniel_at_haxx.se>
Date: Thu, 10 Jun 2010 12:15:22 +0200 (CEST)

On Thu, 10 Jun 2010, kartwall wrote:

> I wanna convert all HTTP responses to UTF-8 because, you know, not all
> web pages are written in UTF-8. I skimmed the manual of "curl_easy_setopt";
> it seems "CURLOPT_CONV_TO_NETWORK_FUNCTION" and
> "CURLOPT_CONV_FROM_NETWORK_FUNCTION" do help.

Not really. That functionality exists for platforms that do not speak ASCII
natively. It gives them a way to convert data so that the ASCII-based
protocols we use still work fine there. It is not meant for converting the
charset of the content you download.

> But here is a big question: How can I know the charset of the HTML file I
> received?

HTML is content that libcurl may deliver. How to deal with that data is
beyond what libcurl knows or cares about. You would need to read up on how
HTML works to figure this out. Of course, in some or many cases there may be
HTTP headers that help you out, such as a charset parameter in the
Content-Type header.
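
As a rough illustration of the HTTP header route, here is a minimal sketch
that pulls the charset parameter out of a Content-Type value. The helper name
content_type_charset() is hypothetical and not part of libcurl. The input
string could come from curl_easy_getinfo() with CURLINFO_CONTENT_TYPE after a
transfer, or from your own header callback. The parser is deliberately simple:
it assumes no other parameter contains a quoted ';', and it does not look at
any charset declared inside the HTML document itself.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of the first n bytes of a against b.
 * Stops safely if a ends early. Returns 1 on match, 0 otherwise. */
static int ci_equal_n(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
        if (a[i] == '\0')
            return 1;
    }
    return 1;
}

/* Hypothetical helper (not a libcurl function): copy the charset parameter
 * of a Content-Type value such as "text/html; charset=ISO-8859-1" into out.
 * Returns 1 if a non-empty charset was found, 0 otherwise. */
int content_type_charset(const char *content_type, char *out, size_t outlen)
{
    if (!content_type || !out || outlen == 0)
        return 0;
    out[0] = '\0';

    /* Parameters follow the media type, each introduced by ';'. */
    const char *p = strchr(content_type, ';');
    while (p) {
        p++; /* skip the ';' */
        while (*p == ' ' || *p == '\t')
            p++;
        if (ci_equal_n(p, "charset=", 8)) {
            p += 8;
            int quoted = (*p == '"');
            if (quoted)
                p++;
            size_t n = 0;
            /* Copy until the closing quote, the next parameter, or the
             * end of the output buffer. */
            while (*p && n + 1 < outlen) {
                if (quoted ? (*p == '"')
                           : (*p == ';' || *p == ' ' || *p == '\t'))
                    break;
                out[n++] = *p++;
            }
            out[n] = '\0';
            return n > 0;
        }
        p = strchr(p, ';');
    }
    return 0;
}
```

If the header carries no charset, the function returns 0 and you are back to
inspecting the HTML yourself. Once you know the source charset, the actual
conversion to UTF-8 is a job for a separate library (iconv is one common
choice), not for libcurl.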

-- 
  / daniel.haxx.se
Received on 2010-06-10