curl-library
Re: c++ curl
Date: Sun, 4 Oct 2009 15:26:34 +0000 (GMT)
Thanks Frank.
I have made the changes you suggested and added this while loop too (after the curl easy options) and it seems to have done the trick:
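while (mh != NULL) {
    curl_multi_perform(mh, &running);   // drive all transfers; running = transfers still active
    if (running == 0) {
        int msgs_left;                  // separate counter so the running count is not overwritten
        msg = curl_multi_info_read(mh, &msgs_left);
        if (msg == NULL) break;         // no completion messages left, so everything is done
    }
}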
and for cleanup I do:
for (int i = 0; i < 5; ++i) curl_multi_remove_handle(mh, h[i]);
for (int i = 0; i < 5; ++i) curl_easy_cleanup(h[i]);
curl_multi_cleanup(mh);
The verbose output says the connections are left intact afterwards. I read that curl does this so it can reuse them, but I just want to make sure that I shouldn't be closing them myself and that my cleanup isn't incorrect?
Will
PS. Here is the full verbose output:
* About to connect() to chemrefer.com port 80 (#0)
* Trying 74.86.118.20... * About to connect() to www.chemrefer.com port 80 (#1)
* Trying 74.86.118.20... * About to connect() to www.chemrefer.com port 80 (#2)
* Trying 74.86.118.20... * About to connect() to www.chemrefer.com port 80 (#3)
* Trying 74.86.118.20... * About to connect() to www.chemrefer.com port 80 (#4)
* Trying 74.86.118.20... * Connected to chemrefer.com (74.86.118.20) port 80 (#0)
> GET /index.html HTTP/1.1
Host: chemrefer.com
Accept: */*
* Connected to www.chemrefer.com (74.86.118.20) port 80 (#1)
> GET /chemistry_search.php?page_title=news&submit HTTP/1.1
Host: www.chemrefer.com
Accept: */*
* Connected to www.chemrefer.com (74.86.118.20) port 80 (#2)
> GET /chemistry_search.php?page_title=toolbar&submit HTTP/1.1
Host: www.chemrefer.com
Accept: */*
* Connected to www.chemrefer.com (74.86.118.20) port 80 (#3)
> GET /chemistry_search.php?page_title=about&submit HTTP/1.1
Host: www.chemrefer.com
Accept: */*
* Connected to www.chemrefer.com (74.86.118.20) port 80 (#4)
> GET /chemistry_search.php?page_title=contact&submit HTTP/1.1
Host: www.chemrefer.com
Accept: */*
< HTTP/1.1 200 OK
< Content-Type: text/html
< Content-Length: 8687
< Accept-Ranges: bytes
< Connection: Keep-Alive
< Date: Sun, 04 Oct 2009 14:53:51 GMT
< ETag: "43e13e3-21ef-46ab78dc68640"
< Keep-Alive: timeout=5, max=100
< Last-Modified: Mon, 25 May 2009 07:28:17 GMT
< Server: Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8e-fips-rhel5 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
<
< HTTP/1.1 200 OK
< Content-Type: text/html
< Transfer-Encoding: chunked
< Connection: Keep-Alive
< Date: Sun, 04 Oct 2009 14:53:52 GMT
< Keep-Alive: timeout=5, max=100
< Server: Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8e-fips-rhel5 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
< X-Powered-By: PHP/5.2.9
< Accept-Ranges: bytes
<
* Connection #1 to host www.chemrefer.com left intact
* Connection #0 to host chemrefer.com left intact
< HTTP/1.1 200 OK
< Content-Type: text/html
< Transfer-Encoding: chunked
< Connection: Keep-Alive
< Date: Sun, 04 Oct 2009 14:53:52 GMT
< Keep-Alive: timeout=5, max=100
< Server: Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8e-fips-rhel5 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
< X-Powered-By: PHP/5.2.9
< Accept-Ranges: bytes
<
< HTTP/1.1 200 OK
< Content-Type: text/html
< Transfer-Encoding: chunked
< Connection: Keep-Alive
< Date: Sun, 04 Oct 2009 14:53:52 GMT
< Keep-Alive: timeout=5, max=100
< Server: Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8e-fips-rhel5 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
< X-Powered-By: PHP/5.2.9
< Accept-Ranges: bytes
<
< HTTP/1.1 200 OK
< Content-Type: text/html
< Transfer-Encoding: chunked
< Connection: Keep-Alive
< Date: Sun, 04 Oct 2009 14:53:52 GMT
< Keep-Alive: timeout=5, max=100
< Server: Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8e-fips-rhel5 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
< X-Powered-By: PHP/5.2.9
< Accept-Ranges: bytes
<
* Connection #2 to host www.chemrefer.com left intact
* Connection #4 to host www.chemrefer.com left intact
* Connection #3 to host www.chemrefer.com left intact
________________________________
From: Frank McGeough <fmcgeough@mac.com>
To: libcurl development <curl-library@cool.haxx.se>
Sent: Saturday, 3 October, 2009 20:01:12
Subject: Re: c++ curl
Just looking at your code briefly in between college football on Saturday, but it looks like you are closing your file inside the loop where you add the easy handles to the multi. I think that's probably the source of your crash, since the FILE* is already invalid by the time getpage is called. Make an array of FILE*s and keep them around until you're done. Have each of them point to a different file (which looks like it's also an issue in the code). Good luck!
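The advice above could look roughly like this. It is only a sketch under stated assumptions: the helper names open_outputs and close_outputs and the "<prefix>N.txt" naming scheme are hypothetical, and the libcurl calls are left out so the file handling can be read on its own. The point is that each transfer gets its own FILE*, and nothing is closed until every transfer has finished.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Same signature as the getpage callback in the original post.
static std::size_t getpage(void *incoming, std::size_t size, std::size_t nmemb, void *page)
{
    assert(page != NULL);   // the FILE* must still be open when this runs
    return std::fwrite(incoming, size, nmemb, static_cast<std::FILE *>(page));
}

// Opens one output file per transfer: "<prefix>0.txt", "<prefix>1.txt", ...
// The naming scheme is an assumption made for this sketch.
std::vector<std::FILE *> open_outputs(const std::string &prefix, std::size_t count)
{
    std::vector<std::FILE *> files;
    for (std::size_t i = 0; i < count; ++i) {
        std::string name = prefix + std::to_string(i) + ".txt";
        files.push_back(std::fopen(name.c_str(), "wb"));   // NULL on failure
    }
    return files;
}

// Call only after every transfer has finished.
void close_outputs(std::vector<std::FILE *> &files)
{
    for (std::FILE *f : files)
        if (f != NULL)
            std::fclose(f);
    files.clear();
}
```

In the posted program, files[i] would be passed as CURLOPT_WRITEDATA for h[i] inside the setup loop, and close_outputs would run after the multi loop has reported all transfers done.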
On Oct 3, 2009, at 12:10 PM, will griffiths wrote:
>Hello,
>
>I have written a full text indexing program in C++ and am now building a spidering program and I intend to use libcurl for the download side of things.
>
>I am struggling a bit to transfer my curl multi code from PHP to C++ as I relied on the "getcontent" function quite heavily.
>
>I have included the full source code below. When I turned verbose on I got this output:
>* Closing connection #0
>* Closing connection #1
>* Closing connection #2
>followed by a crash. I probably need help understanding how the curl functions work in C++. I have read the docs for the functions "individually" and then strung them together by looking at examples in Google Code Search.
>
>Any help appreciated :)
>Will
>
>//#pragma comment(lib, "libcurl.lib") // now appending /link <DIR>\libcurl.lib to cmd args instead
>
>#include <cstdio>    // FILE, fopen, fwrite, fclose
>#include <cstring>   // strcpy
>#include <string>
>#include <iostream>
>#include <map>
>#include "curl/curl.h"
>
>using namespace std;
>
>FILE *fd;
>
>static size_t getpage(void *incoming, size_t size, size_t nmemb, void *page)
>{
>    return fwrite(incoming, size, nmemb, (FILE *)page);
>}
>
>int main()
>{
>    map<int, string> pass;
>    pass[0] = "http://chemrefer.com/index.html";
>    pass[1] = "http://www.chemrefer.com/chemistry_search.php?page_title=news&submit";
>    pass[2] = "http://www.chemrefer.com/chemistry_search.php?page_title=toolbar&submit";
>    pass[3] = "http://www.chemrefer.com/chemistry_search.php?page_title=about&submit";
>    pass[4] = "http://www.chemrefer.com/chemistry_search.php?page_title=contact&submit";
>
>    CURL *h[5];
>
>    CURLM *mh;
>    mh = curl_multi_init();
>
>    for (int i = 0; i < 5; ++i) {
>        char *urlbuf = new char[pass[i].length()];
>        strcpy(urlbuf, pass[i].c_str());
>
>        h[i] = curl_easy_init();
>        fd = fopen("download.txt", "ab");
>
>        curl_easy_setopt(h[i], CURLOPT_URL, urlbuf);
>        curl_easy_setopt(h[i], CURLOPT_WRITEFUNCTION, getpage);
>        curl_easy_setopt(h[i], CURLOPT_WRITEDATA, fd);
>        curl_easy_setopt(h[i], CURLOPT_HEADER, 0);
>
>        curl_multi_add_handle(mh, h[i]);
>
>        delete[] urlbuf;
>        fclose(fd);
>    }
>    pass.clear();
>
>    int running;
>    CURLMcode result;
>    do {
>        result = curl_multi_perform(mh, &running);
>        if (result != CURLM_CALL_MULTI_PERFORM) break;
>    } while (running > 0);
>
>    for (int i = 0; i < 5; ++i) curl_multi_remove_handle(mh, h[i]);
>    for (int i = 0; i < 5; ++i) curl_easy_cleanup(h[i]);
>    curl_multi_cleanup(mh);
>}
>
>
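One more detail in the quoted code: new char[pass[i].length()] leaves no room for the terminating NUL that strcpy writes, so the copy runs one byte past the buffer. A minimal sketch of a correctly sized copy follows; copy_url is a hypothetical helper name, not something from the original program.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Returns a NUL-terminated copy of `url`. The buffer holds length() + 1
// bytes so that strcpy has room for the terminator.
std::vector<char> copy_url(const std::string &url)
{
    std::vector<char> buf(url.length() + 1);
    std::strcpy(buf.data(), url.c_str());
    assert(buf.back() == '\0');
    return buf;
}
```

To my knowledge, libcurl 7.17.0 and later copies string options such as CURLOPT_URL internally, so on those versions passing pass[i].c_str() straight to curl_easy_setopt should make the temporary buffer unnecessary.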
>-------------------------------------------------------------------
>List admin: http://cool.haxx.se/list/listinfo/curl-library
>Etiquette: http://curl.haxx.se/mail/etiquette.html
Received on 2009-10-04