
curl-library

Re: c++ curl

From: will griffiths <willgriffiths1_at_yahoo.co.uk>
Date: Sun, 4 Oct 2009 15:26:34 +0000 (GMT)

Thanks Frank.

I have made the changes you suggested and added this while loop too (after the curl easy options) and it seems to have done the trick:

    while (mh != NULL) {
        curl_multi_perform(mh, &running);
        if (running == 0) {
            msg = curl_multi_info_read(mh, &running);
            if (msg == NULL) break;
        }
    }

and for cleanup I do:

    for (int i = 0; i < 5; ++i) curl_multi_remove_handle(mh, h[i]);
    for (int i = 0; i < 5; ++i) curl_easy_cleanup(h[i]);
    curl_multi_cleanup(mh);

The verbose output says the connections are left intact afterwards. I read that curl does this so it can reuse them, but I just want to make sure that I shouldn't be closing them myself and that my cleanup isn't incorrect?

Will

PS. Here is the full verbose:

* About to connect() to chemrefer.com port 80 (#0)
* Trying 74.86.118.20...
* About to connect() to www.chemrefer.com port 80 (#1)
* Trying 74.86.118.20...
* About to connect() to www.chemrefer.com port 80 (#2)
* Trying 74.86.118.20...
* About to connect() to www.chemrefer.com port 80 (#3)
* Trying 74.86.118.20...
* About to connect() to www.chemrefer.com port 80 (#4)
* Trying 74.86.118.20...
* Connected to chemrefer.com (74.86.118.20) port 80 (#0)
> GET /index.html HTTP/1.1
Host: chemrefer.com
Accept: */*

* Connected to www.chemrefer.com (74.86.118.20) port 80 (#1)
> GET /chemistry_search.php?page_title=news&submit HTTP/1.1
Host: www.chemrefer.com
Accept: */*

* Connected to www.chemrefer.com (74.86.118.20) port 80 (#2)
> GET /chemistry_search.php?page_title=toolbar&submit HTTP/1.1
Host: www.chemrefer.com
Accept: */*

* Connected to www.chemrefer.com (74.86.118.20) port 80 (#3)
> GET /chemistry_search.php?page_title=about&submit HTTP/1.1
Host: www.chemrefer.com
Accept: */*

* Connected to www.chemrefer.com (74.86.118.20) port 80 (#4)
> GET /chemistry_search.php?page_title=contact&submit HTTP/1.1
Host: www.chemrefer.com
Accept: */*

< HTTP/1.1 200 OK
< Content-Type: text/html
< Content-Length: 8687
< Accept-Ranges: bytes
< Connection: Keep-Alive
< Date: Sun, 04 Oct 2009 14:53:51 GMT
< ETag: "43e13e3-21ef-46ab78dc68640"
< Keep-Alive: timeout=5, max=100
< Last-Modified: Mon, 25 May 2009 07:28:17 GMT
< Server: Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8e-fips-rhel5 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
<
< HTTP/1.1 200 OK
< Content-Type: text/html
< Transfer-Encoding: chunked
< Connection: Keep-Alive
< Date: Sun, 04 Oct 2009 14:53:52 GMT
< Keep-Alive: timeout=5, max=100
< Server: Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8e-fips-rhel5 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
< X-Powered-By: PHP/5.2.9
< Accept-Ranges: bytes
<
* Connection #1 to host www.chemrefer.com left intact
* Connection #0 to host chemrefer.com left intact
< HTTP/1.1 200 OK
< Content-Type: text/html
< Transfer-Encoding: chunked
< Connection: Keep-Alive
< Date: Sun, 04 Oct 2009 14:53:52 GMT
< Keep-Alive: timeout=5, max=100
< Server: Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8e-fips-rhel5 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
< X-Powered-By: PHP/5.2.9
< Accept-Ranges: bytes
<
< HTTP/1.1 200 OK
< Content-Type: text/html
< Transfer-Encoding: chunked
< Connection: Keep-Alive
< Date: Sun, 04 Oct 2009 14:53:52 GMT
< Keep-Alive: timeout=5, max=100
< Server: Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8e-fips-rhel5 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
< X-Powered-By: PHP/5.2.9
< Accept-Ranges: bytes
<
< HTTP/1.1 200 OK
< Content-Type: text/html
< Transfer-Encoding: chunked
< Connection: Keep-Alive
< Date: Sun, 04 Oct 2009 14:53:52 GMT
< Keep-Alive: timeout=5, max=100
< Server: Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8e-fips-rhel5 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
< X-Powered-By: PHP/5.2.9
< Accept-Ranges: bytes
<
* Connection #2 to host www.chemrefer.com left intact
* Connection #4 to host www.chemrefer.com left intact
* Connection #3 to host www.chemrefer.com left intact

________________________________
From: Frank McGeough <fmcgeough@mac.com>
To: libcurl development <curl-library@cool.haxx.se>
Sent: Saturday, 3 October, 2009 20:01:12
Subject: Re: c++ curl

Just looking at your code briefly in between college football on Saturday, but it looks like you are closing your file inside the loop where you add the easy handles to the multi. I think that's probably the source of your crash, since the FILE* is invalid by the time the getpage call is made.

Make an array of FILE*'s. Keep them around until you're done. Have each of them point to a different file (which looks like it's also an issue in the code).

Good luck!

On Oct 3, 2009, at 12:10 PM, will griffiths wrote:

>Hello,
>
>I have written a full text indexing program in C++ and am now building a spidering program and I intend to use libcurl for the download side of things.
>
>I am struggling a bit to transfer my curl multi code from PHP to C++ as I relied on the "getcontent" function quite heavily.
>
>I have included the full source code below.
>When I turned verbose on I got this output:
>* Closing connection #0
>* Closing connection #1
>* Closing connection #2
>followed by a crash. I probably need help understanding how the curl functions work in C++. I have read the docs for the functions "individually" and then strung them together by looking at examples in Google Code Search.
>
>Any help appreciated :)
>Will
>
>//#pragma comment(lib, "libcurl.lib") // now appending /link <DIR>\libcurl.lib to cmd args instead
>
>#include <string>
>#include <iostream>
>#include <map>
>#include "curl/curl.h"
>
>using namespace std;
>
>FILE *fd;
>
>static size_t getpage(void *incoming, size_t size, size_t nmemb, void *page)
>{
>    return fwrite(incoming, size, nmemb, (FILE *)page);
>}
>
>int main()
>{
>
>    map<int, string> pass;
>    pass[0] = "http://chemrefer.com/index.html";
>    pass[1] = "http://www.chemrefer.com/chemistry_search.php?page_title=news&submit";
>    pass[2] = "http://www.chemrefer.com/chemistry_search.php?page_title=toolbar&submit";
>    pass[3] = "http://www.chemrefer.com/chemistry_search.php?page_title=about&submit";
>    pass[4] = "http://www.chemrefer.com/chemistry_search.php?page_title=contact&submit";
>
>    CURL *h[5];
>
>    CURLM *mh;
>    mh = curl_multi_init();
>
>    for (int i = 0; i < 5; ++i) {
>
>        char *urlbuf = new char[pass[i].length()];
>        strcpy(urlbuf, pass[i].c_str());
>
>        h[i] = curl_easy_init();
>        fd = fopen("download.txt", "ab");
>
>        curl_easy_setopt(h[i], CURLOPT_URL, urlbuf);
>        curl_easy_setopt(h[i], CURLOPT_WRITEFUNCTION, getpage);
>        curl_easy_setopt(h[i], CURLOPT_WRITEDATA, fd);
>        curl_easy_setopt(h[i], CURLOPT_HEADER, 0);
>
>        curl_multi_add_handle(mh, h[i]);
>
>        delete[] urlbuf;
>
>        fclose(fd);
>
>    }
>    pass.clear();
>
>    int running;
>    CURLMcode result;
>    do {
>        result = curl_multi_perform(mh, &running);
>        if (result != CURLM_CALL_MULTI_PERFORM) break;
>    }
>    while (running > 0);
>
>    for (int i = 0; i < 5; ++i) curl_multi_remove_handle(mh, h[i]);
>    for (int i = 0; i < 5; ++i) curl_easy_cleanup(h[i]);
>    curl_multi_cleanup(mh);
>
>}
>
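Frank's suggestion, one FILE* per transfer that stays open until the transfers are done, might look roughly like the sketch below. It is only an illustration: open_outputs, close_outputs and the "<prefix><i>.txt" naming are hypothetical helpers that do not appear in the thread, and the callback is shown without libcurl so it can be called directly. In the real program each files[i] would be handed to its easy handle through CURLOPT_WRITEDATA, and close_outputs would run only after the multi loop has finished.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Same parameter list that libcurl expects for CURLOPT_WRITEFUNCTION.
// Returns the number of BYTES written, which is what libcurl compares
// against size * nmemb.
static size_t getpage(void *incoming, size_t size, size_t nmemb, void *page)
{
    return std::fwrite(incoming, 1, size * nmemb, static_cast<FILE *>(page));
}

// Hypothetical helper: opens one output file per transfer so that no two
// transfers share a FILE*. Entries are NULL where fopen failed.
std::vector<FILE *> open_outputs(std::size_t count, const std::string &prefix)
{
    std::vector<FILE *> files;
    for (std::size_t i = 0; i < count; ++i) {
        const std::string name = prefix + std::to_string(i) + ".txt";
        files.push_back(std::fopen(name.c_str(), "wb"));
    }
    return files;
}

// Hypothetical helper: call this only after every transfer has finished,
// never inside the loop that sets up the easy handles.
void close_outputs(std::vector<FILE *> &files)
{
    for (FILE *f : files) {
        if (f != nullptr) std::fclose(f);
    }
    files.clear();
}
```

The original code closed fd inside the setup loop, so every later call to getpage wrote through a dangling FILE*. Keeping the vector alive for the whole transfer avoids that.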

-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette: http://curl.haxx.se/mail/etiquette.html
Received on 2009-10-04