Fetching More than 500 files from Server

From: Swamy Mudhbasalar <swamydbit_at_gmail.com>
Date: Thu, 14 Jun 2012 10:49:09 +0530

Hello,

>>Using which libcurl version? Using which API? On what operating system?

libcurl version: 7.21.0
Operating system: Microsoft Windows Server 2003

The following APIs are used:
main()
{
        ptCurlHandle = curl_easy_init();
        while(NoofFiles != 0)
        {
                DownloadFiles();
                NoofFiles--;
        }
        curl_easy_cleanup(ptCurlHandle);
}

void DownloadFiles()
{
        curl_easy_setopt(ptCurlHandle, CURLOPT_ERRORBUFFER, acErrorBuffer);
        curl_easy_setopt(ptCurlHandle, CURLOPT_URL, pstrURL);
        curl_easy_setopt(ptCurlHandle, CURLOPT_WRITEFUNCTION, VmwGetCB);
        curl_easy_setopt(ptCurlHandle, CURLOPT_WRITEDATA, (void *)&tContext);
        curl_easy_setopt(ptCurlHandle, CURLOPT_USERAGENT, pstrUserAgent);
        curl_easy_setopt(ptCurlHandle, CURLOPT_SSL_VERIFYPEER, 0L);
        curl_easy_setopt(ptCurlHandle, CURLOPT_SSL_VERIFYHOST, 0L);
        curl_easy_setopt(ptCurlHandle, CURLOPT_USERPWD, pstrUserPass);
        curl_easy_perform(ptCurlHandle);
}

ptCurlHandle is a global variable. curl_easy_init() is called once at startup.
DownloadFiles() is called in a loop, and the loop terminates when all
files have been downloaded.
pstrURL contains the path of the file to be downloaded.

We are trying to fetch files from a Linux machine (on which ESX is installed)
from Windows machines.

>> In which sense is this connection then still "alive"? How do you see it being
>> kept in 30 minutes?

When I run the netstat command from the command prompt, I can see many
established connections. For every file there is one connection
established. They can also be seen in the vSphere client.

For example:
>> netstat
Active Connections

  Proto Local Address Foreign Address State
  TCP dev-vcbproxy dev-vcenter-4-2 ESTABLISHED
  TCP dev-vcbproxy dev-vcenter-4-2 ESTABLISHED
  .
  .
  .
  TCP dev-vcbproxy dev-vcenter-4-2 ESTABLISHED

All these connections get closed after some time, approximately 30 minutes.
Since we are fetching more than 500 files, we can see more than 500
active connections.
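
To put a number on this instead of reading the netstat listing by eye, a small helper along these lines could count the matching lines. This is only a sketch: count_established() is a hypothetical name, and it assumes the captured output has one connection per line, as in the listing above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the lines of netstat-style output that mention `host` and are in
   the ESTABLISHED state. Purely textual: one connection per line is assumed. */
size_t count_established(const char *output, const char *host)
{
  size_t count = 0;
  const char *line = output;

  assert(output != NULL && host != NULL);

  while(*line) {
    const char *end = strchr(line, '\n');
    size_t len = end ? (size_t)(end - line) : strlen(line);
    char buf[512];

    if(len < sizeof(buf)) {            /* over-long lines are skipped */
      memcpy(buf, line, len);
      buf[len] = '\0';
      if(strstr(buf, host) && strstr(buf, "ESTABLISHED"))
        count++;
    }
    line += len;
    if(*line == '\n')
      line++;                          /* step over the line terminator */
  }
  return count;
}
```

Calling it on a capture taken before and after each download would show whether the count grows by one per file.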

Regards,
SAM

===============================================================
On 6/13/12, curl-library-request_at_cool.haxx.se
<curl-library-request_at_cool.haxx.se> wrote:
>
> Today's Topics:
>
> 1. Re: Fetching More than 500 files from Server (Daniel Stenberg)
> 2. RE: SFTP "File already completely downloaded" but the file is
> empty (NEDJARI Hafed)
> 3. Re: curl_schannel.c and realloc() (Marc Hoersken)
> 4. Re: curl_schannel.c and realloc() (Daniel Stenberg)
> 5. RE: SFTP "File already completely downloaded" but the file is
> empty (Daniel Stenberg)
> 6. RE: Windows SSPI Schannel implementation ready (Steve Holme)
> 7. Re: Windows SSPI Schannel implementation ready (Yang Tse)
> 8. Re: Windows SSPI Schannel implementation ready (Marc Hoersken)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 13 Jun 2012 12:00:32 +0200 (CEST)
> From: Daniel Stenberg <daniel_at_haxx.se>
> To: libcurl development <curl-library_at_cool.haxx.se>
> Subject: Re: Fetching More than 500 files from Server
> Message-ID: <alpine.DEB.2.00.1206131155050.14878_at_tvnag.unkk.fr>
> Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII
>
> On Wed, 13 Jun 2012, Swamy Mudhbasalar wrote:
>
>> We are trying to fetch files from the ESX server 4.0 through HTTP requests
>> using the curl library in a C program.
>
> Using which libcurl version? Using which API? On what operating system?
>
>> I am using a single easy curl handle to fetch all the files to take
>> advantage of persistent connections. But fetching more than 500
>> files from ESX 4.0 using the curl library fails.
>
> ...
>
>> Each time a file is fetched, a new connection is established. So when
>> more than 500 connections are established, fetching of files stops. The
>> established connections are not getting closed and they remain idle for 30
>> mins.
>
> Why are new ones created, and if new ones are created, how come the old
> ones are still kept around? That's not something libcurl makes happen. libcurl
> tries very hard to re-use connections as much as possible.
>
>> HTTP Client request Sent:
>>
>> Host: dev-vcenter-4-2
>> User-Agent: gSOAP/2.7
>> Content-Type: text/xml; charset=utf-8
>> Content-Length: 504
>> Connection: close
>> Cookie:
>> vmware_soap_session=C762CF10-B1DF-427A-9E0C-9EB851FFDE3D;$Domain="dev-vcenter-4-2"
>> Cookie: ;$Path="/sdk";$Domain="dev-vcenter-4-2"
>> SOAPAction: "urn:vim25/4.1"
>
> ... "Connection: close" is something you make libcurl send, as it wouldn't
> choose to send that by itself.
>
>> Connection: close
>
> In which sense is this connection then still "alive"? How do you see it being
> kept in 30 minutes?
>
> I have a hard time accepting this problem description.
>
>> 1. Why do all HTTP requests sent and received show "Connection: close"?
>
> That's caused by your program's use of libcurl.
>
>> 2. How do we send "Connection: Alive" and use the same connection to fetch
>> all the files? What needs to be added specifically in the code to do this?
>
> HTTP 1.1 doesn't need that header to be persistent, it is so by default!
>
>> 3. Why are the connections not getting closed, and why do they remain idle
>> for 30 mins?
>
> You haven't explained in enough detail for us to understand what exactly
> you're seeing and how you're using libcurl, so this really isn't possible to
> answer without a lot of guessing.
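
The rule referred to above (HTTP/1.1 is persistent unless "Connection: close" is sent, while HTTP/1.0 needs an explicit keep-alive) can be sketched as a small check on a raw header block. connection_persists() and its helper are hypothetical names, and the header matching is deliberately simple: it looks for the token anywhere in the value of a Connection header.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive search for `needle` within the first `len` bytes of `s`. */
static int contains_nocase(const char *s, size_t len, const char *needle)
{
  size_t n = strlen(needle);
  size_t i, j;

  for(i = 0; n <= len && i <= len - n; i++) {
    for(j = 0; j < n; j++)
      if(tolower((unsigned char)s[i + j]) != tolower((unsigned char)needle[j]))
        break;
    if(j == n)
      return 1;
  }
  return 0;
}

/* Returns 1 if the connection may stay open after this message, else 0.
   http_minor is 0 for HTTP/1.0 and 1 for HTTP/1.1; `headers` is the raw
   header block with lines separated by CRLF or LF. */
int connection_persists(int http_minor, const char *headers)
{
  int persists = (http_minor >= 1);    /* HTTP/1.1: persistent by default */
  const size_t name_len = strlen("Connection:");
  const char *line = headers;

  assert(headers != NULL);

  while(*line) {
    size_t len = strcspn(line, "\r\n");

    /* Only lines that START with "Connection:" are considered. */
    if(len >= name_len && contains_nocase(line, name_len, "Connection:")) {
      const char *value = line + name_len;
      size_t value_len = len - name_len;

      if(contains_nocase(value, value_len, "close"))
        persists = 0;
      else if(contains_nocase(value, value_len, "keep-alive"))
        persists = 1;
    }
    line += len;
    line += strspn(line, "\r\n");      /* skip the line terminator(s) */
  }
  return persists;
}
```

Run against the request quoted earlier in this message, the check returns 0 because of its "Connection: close" line; with that line removed it would return 1 for HTTP/1.1.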
>
> --
>
> / daniel.haxx.se
>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 13 Jun 2012 10:12:10 +0000
> From: NEDJARI Hafed <hnedjari_at_generixgroup.com>
> To: libcurl development <curl-library_at_cool.haxx.se>
> Subject: RE: SFTP "File already completely downloaded" but the file is
> empty
> Message-ID:
> <64726B4819DE6D408BC3F387173DC2FDA4C1_at_S-VM-EXCHDAG-01.generixgroup.com>
>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Many thanks for your quick reply. As the file I am trying to download is not
> empty, I have to redirect my research to libssh2.
>
> Do you think I can expect to get an answer from the libssh2 team?
>
> As far as I am concerned, you can close my request to libcurl development.
>
> Hafed
>
> -----Original Message-----
> From: curl-library-bounces_at_cool.haxx.se
> [mailto:curl-library-bounces_at_cool.haxx.se] On behalf of Daniel Stenberg
> Sent: Wednesday, June 13, 2012 00:41
> To: libcurl development
> Subject: RE: SFTP "File already completely downloaded" but the file is empty
>
> On Tue, 12 Jun 2012, NEDJARI Hafed wrote:
>
>> * File already completely downloaded
>
> ... this originates from this source code:
>
>   /* Setup the actual download */
>   if(data->req.size == 0) {
>     /* no data to transfer */
>     Curl_setup_transfer(conn, -1, -1, FALSE, NULL, -1, NULL);
>     infof(data, "File already completely downloaded\n");
>     state(conn, SSH_STOP);
>     break;
>   }
>
> It looks like libssh2_sftp_stat_ex() returned and said the file size is zero
> bytes... I suggest you research that a bit closer.
>
> --
>
> / daniel.haxx.se
> -------------------------------------------------------------------
> List admin: http://cool.haxx.se/list/listinfo/curl-library
> Etiquette: http://curl.haxx.se/mail/etiquette.html
>
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 13 Jun 2012 12:28:40 +0200
> From: Marc Hoersken <info_at_marc-hoersken.de>
> To: libcurl development <curl-library_at_cool.haxx.se>
> Subject: Re: curl_schannel.c and realloc()
> Message-ID:
> <CAO1VcVXCS25D-nKZy-Dm9_NwceqAYQffaqTCEEWaaQQCw-8szQ_at_mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> Hi Daniel,
>
> 2012/6/13 Daniel Stenberg <daniel_at_haxx.se>
>>
>> On Wed, 13 Jun 2012, Marc Hoersken wrote:
>>
>>> If this is the case, we can probably just set
>>> CURL_SCHANNEL_BUFFER_INIT_SIZE to BUFSIZE in order to reduce the number
>>> of realloc() calls. I would personally keep the code to gracefully handle
>>> the need for more buffer space.
>>
>>
>> I was first going to agree and then a second thought struck me. Why would
>> it ever need to handle more data? If it gets called asking for 20K of
>> data, there's nothing in the API that says the function must return that
>> much. We can safely just make BUFSIZE the maximum amount of data the
>> schannel_recv() function can return without it breaking any properly
>> written code!
>>
>> It would simplify the code without breaking anything...
>
> That's a good plan.
>
>>
>>
>> Then, as a follow-up improvement the code could probably use the 'buf'
>> buffer immediately instead of separately storing received data in a
>> malloc'ed buffer that is then memcpy()'ed to 'buf'. It would save one
>> malloc/free of a 16K buffer, and also skip memcpy()ing all data that is
>> received. (It will be copied at least once afterwards anyway.)
>>
>
> The problem is that we still need to buffer the received encrypted and
> unencrypted data since such data can already arrive during the initial
> SSL/TLS handshake handled by the step-functions. Therefore we need the
> buffer to have the data available for the next read. And we also need
> this buffer between reads, because there might be an incomplete
> encrypted SSL/TLS packet in the queue. Also there might be more data
> decrypted by DecryptMessage than the user wants to read, so we need to
> store that already decrypted data, too.
>
> Basically we have to do all the buffering around the DecryptMessage
> function as it is not guaranteed that it decrypts the exact amount of
> data we want to read or that all encrypted data passed into it is
> really decrypted afterwards.
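
The "store the surplus decrypted data for the next read" part of this can be sketched with a small pending-bytes buffer. This is an illustration only and not the curl_schannel.c code: struct dec_buf, dec_buf_append(), dec_buf_read() and the 16K capacity are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEC_BUF_SIZE 16384   /* assumed capacity, not taken from libcurl */

struct dec_buf {
  unsigned char data[DEC_BUF_SIZE];
  size_t used;               /* decrypted bytes not yet handed to the caller */
};

/* Store freshly decrypted bytes behind whatever is still pending.
   Returns 0 on success, -1 if they do not fit. */
int dec_buf_append(struct dec_buf *b, const void *src, size_t len)
{
  assert(b != NULL && (src != NULL || len == 0));
  if(len == 0)
    return 0;
  if(len > DEC_BUF_SIZE - b->used)
    return -1;
  memcpy(b->data + b->used, src, len);
  b->used += len;
  return 0;
}

/* Hand out at most `len` pending bytes; the surplus stays for the next read.
   Returning fewer bytes than requested is allowed, as with recv(). */
size_t dec_buf_read(struct dec_buf *b, void *dst, size_t len)
{
  size_t n;

  assert(b != NULL && dst != NULL);
  n = len < b->used ? len : b->used;
  if(n == 0)
    return 0;
  memcpy(dst, b->data, n);
  memmove(b->data, b->data + n, b->used - n);  /* shift the remainder down */
  b->used -= n;
  return n;
}
```

With 10 decrypted bytes pending, a 4-byte read returns 4 bytes and the following read returns the remaining 6, which is the behaviour described above for data decrypted beyond what the caller asked for.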
>
> Best regards,
> Marc
>
>
>
> ------------------------------
>
> Message: 4
> Date: Wed, 13 Jun 2012 13:01:52 +0200 (CEST)
> From: Daniel Stenberg <daniel_at_haxx.se>
> To: libcurl development <curl-library_at_cool.haxx.se>
> Subject: Re: curl_schannel.c and realloc()
> Message-ID: <alpine.DEB.2.00.1206131300190.14878_at_tvnag.unkk.fr>
> Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
>
> On Wed, 13 Jun 2012, Marc Hoersken wrote:
>
>>> Then, as a follow-up improvement the code could probably use the 'buf'
>>> buffer immediately instead of separately storing received data in a
>>> malloc'ed buffer that is then memcpy()'ed to 'buf'.
>>
>> The problem is that we still need to buffer the received encrypted and
>> unencrypted data since such data can already arrive during the initial
>> SSL/TLS handshake handled by the step-functions. Therefore we need the
>> buffer to have the data available for the next read. And we also need this
>> buffer between reads, because there might be an incomplete encrypted SSL/TLS
>> packet in the queue. Also there might be more data decrypted by
>> DecryptMessage than the user wants to read, so we need to store that already
>> decrypted data, too.
>
> Oh right. Sorry for suggesting such things without actually checking out the
> code carefully enough to say if it is indeed possible to do.
>
> Thanks for the details!
>
> --
>
> / daniel.haxx.se
>
>
> ------------------------------
>
> Message: 5
> Date: Wed, 13 Jun 2012 13:09:00 +0200 (CEST)
> From: Daniel Stenberg <daniel_at_haxx.se>
> To: libcurl development <curl-library_at_cool.haxx.se>
> Subject: RE: SFTP "File already completely downloaded" but the file is
> empty
> Message-ID: <alpine.DEB.2.00.1206131306360.14878_at_tvnag.unkk.fr>
> Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
>
> On Wed, 13 Jun 2012, NEDJARI Hafed wrote:
>
>> Many thanks for your quick reply. As the file I am trying to download is not
>> empty, I have to redirect my research to libssh2.
>
> Yes, and I would then recommend that you get the latest libssh2, build it with
> --enable-debug and then post a full debug trace (showing the STAT call that
> gets the zero file size) to the libssh2-devel list for analysis.
>
>> Do you think I can expect to get an answer from the libssh2 team?
>
> I'm the maintainer of libssh2 as well. I believe you'll get some attention to
> your problem, yes, but until we can repeat your problem on our end we rely on
> you to get more details and to do the actual debugging...
>
> --
>
> / daniel.haxx.se
>
>
> ------------------------------
>
> Message: 6
> Date: Wed, 13 Jun 2012 12:13:39 +0100
> From: Steve Holme <steve_holme_at_hotmail.com>
> To: "'libcurl development'" <curl-library_at_cool.haxx.se>
> Subject: RE: Windows SSPI Schannel implementation ready
> Message-ID: <BAY164-ds32B493144A369DB1991F882F50_at_phx.gbl>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi all,
>
>> Get well soon.
>
> Thank you for the well wishes - I got a good night's sleep but am still
> feeling pretty crap this morning :(
>
> I'm dosed up on Ibuprofen at the moment to try and reduce the fever but I
> might have to nip out in a bit and grab some paracetamol based cold and flu
> type remedies.
>
>> As Gün suggested "SSL-Windows-native" or "WinSSL"
>> without version number information.
>> But, only when actually using Windows SSL
>> implementation. Unless I'm wrong, that is when
>> schannel is in use not simply when SSPI is used.
>
> I'm not opposed to not including the version number - this would be
> consistent with what WinIDN displays. However, I also think including the
> version number would be consistent with all the other libraries that we
> display. I know this is a community effort but perhaps some direction from
> Daniel on whether the version number, as a rule of thumb, should be
> included for all items in the version string or not is needed here.
>
> I also think, as per the discussion I started 6 weeks ago which I thought we
> had decided to do, hence my work here, that the package name "WinSSPI",
> "Windows SSPI" or "SSPI-Windows-native" should be displayed for the other
> features that SSPI offers, not just the SChannel SSL support - again this is
> synonymous to the other Security Providers that curl uses and provides
> consistency.
>
> The inclusion of SSPI in curl has obviously changed a fair amount since it
> was first included as a feature and that was why I recommended including it
> in the version string back in April and moving to a scenario where SSPI
> isn't listed in the features list ;-) For API compatibility we have decided
> to keep it in the features list as well as listing it in the package /
> version string. If this is something that you had a strong opinion on, and
> I'm still not sure if it is or not, then why didn't you let us know 6 weeks
> ago before I spent two days doing the rework and learning curl's makefiles
> when I didn't need to.
>
>> Mostly library version number given for a system library.
>
> I don't have an issue with that, whilst others (Guenter, Marc, Gisle et al.)
> don't seem to mind either.
>
> The only possible objection I have is... that version.lib is currently being
> linked to statically and if that is a problem for running curl on 12-year
> old+ versions of Windows then we *should* consider dynamically including it
> like we do with secur32.dll for example.
>
> S.
>
>
>
>
> ------------------------------
>
> Message: 7
> Date: Wed, 13 Jun 2012 16:40:03 +0200
> From: Yang Tse <yangsita_at_gmail.com>
> To: libcurl development <curl-library_at_cool.haxx.se>
> Subject: Re: Windows SSPI Schannel implementation ready
> Message-ID:
> <CAH23gUTc7iN3R8+7EitKt=oNzkW=BKWJc7=M9AQiFP9wOhzRCw_at_mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Steve Holme <steve_holme_at_hotmail.com> wrote:
>
>> I'm not opposed to not including the version number - this would be
>> consistent to what WinIDN displays, [...]
>
> OK, then we have consensus.
>
>> I also think, as per the discussion I started 6 weeks ago which I thought we
>> had decided to do, hence my work here, was that the package name "WinSSPI",
>> "Windows SSPI" or "SSPI-Windows-native" should be displayed for the other
>> features that SSPI offers not just the SChannel SSL support - again this is
>> synonymous to the other Security Providers that curl uses and provides
>> consistency.
>
> I asked for a patch on April 23.
> http://curl.haxx.se/mail/lib-2012-04/0259.html
>
> It was not provided until June 10.
> http://curl.haxx.se/mail/lib-2012-06/0111.html
>
> One of the reasons for which I personally dislike big patches is that
> these usually hide changes which are not properly discussed, or
> discussed with such lengthy threads that no one knows finally what's
> going on, making it necessary to fully analyze resulting patch in
> order to know what changes it really introduces. This is what I'm
> doing now. I might have further objections, so don't be surprised if I
> mention or fix something else.
>
> Given that it seems we've reached consensus on that you can live
> without the numeric part of the string, and that I can live with some
> schannel specific identifier, I'm pushing right now a patch with the
> following commit message:
>
> schannel: remove version number and identify its use with 'schannel'
> literal
>
> Version number is removed in order to make this info consistent with
> how we do it with other MS and Linux system libraries for which we
> don't provide this info.
>
> Identifier changed from 'WinSSPI' to 'schannel' given that this is the
> actual provider of the SSL/TLS support. libcurl can still be built
> with SSPI and without SCHANNEL support.
>
> --
> -=[Yang]=-
>
>
> ------------------------------
>
> Message: 8
> Date: Wed, 13 Jun 2012 17:04:47 +0200
> From: Marc Hoersken <info_at_marc-hoersken.de>
> To: libcurl development <curl-library_at_cool.haxx.se>
> Subject: Re: Windows SSPI Schannel implementation ready
> Message-ID:
> <CAO1VcVWp-ag-xiRhJ8zPuVMs=QoHucjW+Lf7V1Ne29xAMkZkXg_at_mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> Hello everyone,
>
> 2012/6/13 Yang Tse <yangsita_at_gmail.com>:
>> One of the reasons for which I personally dislike big patches is that
>> these usually hide changes which are not properly discussed, or
>> discussed with such lengthy threads that no one knows finally what's
>> going on, making it necessary to fully analyze resulting patch in
>> order to know what changes it really introduces. This is what I'm
>> doing now. I might have further objections, so don't be surprised if I
>> mention or fix something else.
>
> This is why I posted the full commit history during the early
> development stages and especially separated even squashed commits to
> avoid mixing up changes to different areas. While you say that an
> updated patch wasn't provided until June, the other changes have been
> around since April and the full commit history including separate
> small patches was available from the beginning. The new changes
> introducing the updated version information were also separated and
> can be easily identified in the history.
>
>> Given that it seems we've reached consensus on that you can live
>> without the numeric part of the string, and that I can live with some
>> schannel specific identifier, I'm pushing right now a patch with the
>> following commit message:
>>
>> schannel: remove version number and identify its use with 'schannel'
>> literal
>>
>> Version number is removed in order to make this info consistent with
>> how we do it with other MS and Linux system libraries for which we
>> don't provide this info.
>>
>> Identifier changed from 'WinSSPI' to 'schannel' given that this is the
>> actual provider of the SSL/TLS support. libcurl can still be built
>> with SSPI and without SCHANNEL support.
>
> Actually "schannel" is not the correct identifier. It's either
> "Schannel" or "Secure Channel". Please take a look at the MSDN
> documentation before making such changes:
> http://msdn.microsoft.com/en-us/library/windows/desktop/ms678421.aspx
>
> I see that you changed the winbuild scripts to automatically set
> USE_SSPI to yes and USE_SCHANNEL to true if USE_SSL is false. Why did
> you do that? That makes it impossible to build a Windows version
> without SSL support. Before your changes Schannel was only enabled if
> SSPI was enabled and OpenSSL was not.
> Besides that, I liked the idea of having the low level Windows
> libraries in a separate WINLIBS variable in the makefile. Why did you
> revert this cleanup change, too?
>
> Best regards,
> Marc
>
>
> ------------------------------
>
> End of curl-library Digest, Vol 82, Issue 33
> ********************************************
>
Received on 2012-06-14