
Re: Last-Modified header

From: James Read via curl-library <curl-library_at_cool.haxx.se>
Date: Thu, 21 May 2020 22:00:10 +0100

On Thu, May 21, 2020 at 8:58 PM James Read <jamesread5737_at_gmail.com> wrote:

>
>
> On Thu, May 21, 2020 at 4:18 PM Dan Fandrich via curl-library <curl-library_at_cool.haxx.se> wrote:
>
>> On Thu, May 21, 2020 at 03:46:33PM +0100, James Read via curl-library wrote:
>> > I'm implementing a simple web crawler with curl and want to retrieve the
>> > Last-Modified header so I can implement a sensible recrawl policy. I've
>> > found https://curl.haxx.se/libcurl/c/getinfo.html which is a nice easy way
>> > to retrieve the Content-Type header. Is there a similarly easy way to
>> > retrieve the Last-Modified header? Or do I need to parse the header myself?
>> >
>> > If I need to parse the header myself I found
>> > https://curl.haxx.se/libcurl/c/sepheaders.html which prints headers to a
>> > file. Is there a way of just storing the headers in memory so I can parse
>> > them there? I don't want to have to write a file just to read it again.
>>
>> You can use that example as a basis, then set CURLOPT_HEADERFUNCTION with a
>> function like WriteMemoryCallback() in the getinmemory.c example to store the
>> headers in memory instead. Or, do something more intelligent since you're only
>> interested in a single header. libcurl writes to a file by default, so by
>> setting your own header callback function you can process them however you
>> want.
>>
>>
> OK, this is as far as I got:
>
> static size_t
> write_cb(void *contents, size_t size, size_t nmemb, void *p)
> {
>     ConnInfo *conn = (ConnInfo *)p;
>     size_t realsize = size * nmemb;
>     char *grown;    /* assumes conn->data is a char * */
>
>     /* Grow via a temporary so the old buffer is not leaked if realloc fails. */
>     grown = realloc(conn->data, conn->size + realsize + 1);
>     if (grown == NULL) {
>         /* out of memory! */
>         printf("not enough memory (realloc returned NULL)\n");
>         return 0;
>     }
>     conn->data = grown;
>
>     /* Append the new chunk and keep the buffer NUL-terminated. */
>     memcpy(&(conn->data[conn->size]), contents, realsize);
>     conn->size += realsize;
>     conn->data[conn->size] = 0;
>
>     return realsize;
> }
>
> When I print out conn->data it just prints out the body. How do I get the
> header?
>

I forgot to add:

curl_easy_setopt(conn->easy, CURLOPT_HEADERDATA, conn);

It's working now.
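
For anyone who only needs the one header, as Dan suggested, here is a rough sketch of a dedicated header callback that keeps just the Last-Modified value instead of buffering everything. It is only an illustration under a few assumptions: HeaderInfo, has_prefix_nocase() and header_cb() are made-up names (not part of libcurl or of my crawler), and the 64-byte buffer is an arbitrary size. The function signature is the one libcurl documents for CURLOPT_HEADERFUNCTION.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-transfer state, used only for this sketch. */
typedef struct {
    char last_modified[64];   /* stays an empty string until the header is seen */
} HeaderInfo;

/* Case-insensitive check that buf (len bytes, not NUL-terminated)
 * starts with the NUL-terminated prefix. */
static int
has_prefix_nocase(const char *buf, size_t len, const char *prefix)
{
    size_t i;

    for (i = 0; prefix[i] != '\0'; i++) {
        if (i >= len ||
            tolower((unsigned char)buf[i]) != tolower((unsigned char)prefix[i]))
            return 0;
    }
    return 1;
}

/* Same signature as a CURLOPT_HEADERFUNCTION callback. libcurl hands
 * over one header line per call, and the data is not NUL-terminated. */
static size_t
header_cb(char *buffer, size_t size, size_t nitems, void *userdata)
{
    static const char name[] = "Last-Modified:";
    HeaderInfo *info = (HeaderInfo *)userdata;
    size_t realsize = size * nitems;

    if (has_prefix_nocase(buffer, realsize, name)) {
        const char *start = buffer + (sizeof(name) - 1);
        const char *end = buffer + realsize;
        size_t len;

        /* Skip blanks after the colon, then drop the trailing CRLF. */
        while (start < end && (*start == ' ' || *start == '\t'))
            start++;
        while (end > start && (end[-1] == '\r' || end[-1] == '\n' ||
                               end[-1] == ' ' || end[-1] == '\t'))
            end--;

        /* Copy the value, truncating if it does not fit. */
        len = (size_t)(end - start);
        if (len >= sizeof(info->last_modified))
            len = sizeof(info->last_modified) - 1;
        memcpy(info->last_modified, start, len);
        info->last_modified[len] = '\0';
    }

    /* Report the whole line as consumed so the transfer continues. */
    return realsize;
}
```

It would be registered with curl_easy_setopt(easy, CURLOPT_HEADERFUNCTION, header_cb) plus CURLOPT_HEADERDATA pointing at a HeaderInfo, which keeps headers out of the body buffer entirely. If a parsed timestamp is all that is needed, setting CURLOPT_FILETIME and reading CURLINFO_FILETIME after the transfer may be simpler still, though I have not tried that in this crawler.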

Thanks

>
>
>> Dan
>> -------------------------------------------------------------------
>> Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library
>> Etiquette: https://curl.haxx.se/mail/etiquette.html
>
>

Received on 2020-05-21