curl-library

Unnecessary delay downloading multiple files (via SFTP)?

From: Patrik Thunström <patrik.thunstrom_at_bassetlabs.com>
Date: Tue, 17 Feb 2009 15:52:10 +0100

Hi!

Having just started using libcURL for transferring files, I'm not yet
well acquainted with the source code, so please bear with me if I have
misunderstood some of its inner workings.

For now, the only protocol I'm using for transfers is SFTP, together
with the curl_easy interface, so I cannot say whether this "issue"
affects other protocols too. From reading the mailing list archive, my
impression is that it does not affect the multi interface.

What I'm suspicious about is the performance when downloading multiple
files over the same SFTP connection. Each transfer itself is nice and
fast, but I noticed what seems to be a fixed delay after each transfer
before the next file starts. Tracking it down, I found a constant
timeout value of 1000 ms at line 1856 of transfer.c.

To get a better understanding, and to verify whether this constant was
the culprit, I experimented with the value to see what effect it would
have. Lowering it does indeed shorten the delay between transfers. I'm
not sure whether the value is dictated by protocol restrictions, but in
my experiments with SFTP transfers, admittedly on a good network, I
noticed no issues when lowering it to as little as 5 ms.

From further analysis of the source code, it looks to me as if the code
unconditionally waits out the constant timeout, to ensure that the
sockets are ready before trying to do any transfer.
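
To illustrate the kind of wait I suspect is involved, here is a minimal
sketch using plain POSIX select(). It is not libcURL's actual code, and
wait_readable() is a hypothetical helper of my own. The point is only
that a select() call with a 1000 ms timeout returns at once when the
descriptor is ready, but blocks for the whole timeout when the event it
is waiting for never arrives.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper, not part of libcurl.
 * Waits until fd becomes readable or timeout_ms elapses.
 * Returns select()'s result: 1 if readable, 0 on timeout, -1 on error.
 * The elapsed wall-clock time in milliseconds is stored in *elapsed_ms. */
static int wait_readable(int fd, long timeout_ms, long *elapsed_ms)
{
    fd_set readfds;
    struct timeval tv, start, end;
    int rc;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    /* Convert the millisecond timeout into the timeval select() expects. */
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    gettimeofday(&start, NULL);
    rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    gettimeofday(&end, NULL);

    *elapsed_ms = (end.tv_sec - start.tv_sec) * 1000L +
                  (end.tv_usec - start.tv_usec) / 1000L;
    return rc;
}
```

Called on the read end of an empty pipe with a 1000 ms timeout, this
should return 0 after roughly a second. Once a byte has been written to
the pipe, the same call should return 1 almost immediately. If the
download path ends up in the first situation once per file, that would
explain a fixed per-file delay.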

What stumps me a bit more, though, is that when doing exactly the same
operation but uploading files instead, the same delay does not exist.

The mailing list archives contain a lot of information about blocking
libssh2 operations, and mention that the easy interface still has a bit
of blocking code left in it. It's hard to get a clear picture of the
current state of the code, though: the archived messages range from
2006 up to today, so it's difficult to filter out what is still correct.

Mainly, I'm curious why the large timeout is there in the first place,
and whether it would be safe to use a lower value if the usage of
libcURL is restricted to SFTP transfers, or whether lowering it is a
bad idea in general.

I don't yet know my way around the libcURL source code well enough to
take a swing at fixing this behavior, if that is even possible, but any
tips, workarounds or ideas would be very welcome. As I'll be
transferring quite a lot of small files, the delay as it is now is quite
a performance hit, even though the transfers do work.

Anyhow, I'll have to thank everyone contributing to this great library,
as I've found it very useful and easy to work with!

Best regards
Patrik Thunström
Received on 2009-02-17