curl-library
Re: Poor HTTP POST upload performance
Date: Mon, 22 Jun 2015 03:31:36 -0400
On 5/20/2015 11:35 AM, Bryan Christ wrote:
> Ray,
>
> Here is a sample program that illustrates the problem. I tested the
> performance with an 11MB file. This sample program consistently takes
> 11-17 seconds to complete. If I upload the same file through Firefox,
> it takes about 4.5 seconds.
>
> http://www.mediafire.com/view/9rb4aac4hnhma47/curl_filedrop_upload.c
>
>
> On Tue, Apr 14, 2015 at 9:14 PM, Bryan Christ
> <bryan.christ_at_gmail.com> wrote:
>
> Ray,
>
> Thanks for the reply. It would be quite difficult to create an
> isolated test case due to the inherent cost of setting up a RESTful
> POST to the server.
>
> The problem is very much reproducible. Several users have
> reported this issue. It's not hard to see the problem if you
> download and build the MediaFire FUSE client over at GitHub.
>
> I have seen the notes about the TODO item and I have seen the
> posts that seem to regard this as a SFTP only problem. I suspect
> that if I build libcurl from source and change the define,
> the performance will go up. If that be the case, would you accept a
> patch for curl_easy_setopt() to allow this to be configured at
> run-time?
>
> As for the server, it doesn't support compression or the
> user-agent header. Also, I have a direct connection to our data
> center. Those are definitely not issues.
>
>
> On 4/13/2015 10:01 PM, Bryan Christ wrote:
> > I've been trying to figure out why http POST uploads are so slow with
> > libcurl. Upload speeds continually perform at about 1/10th of the
> > expected performance (or less). Many users have reported this behavior
> > on our forum. I suspect it has a lot to do with CURL_MAX_WRITE_SIZE
> > being set to 16k. Uploads to these same servers through other means
> > (JavaScript for example) reach their expected throughput. The code in
> > question can be seen here:
> >
> > https://github.com/MediaFire/mediafire-fuse/blob/master/utils/http.c
> > (at approx line 314)
> >
> > Assuming the issue is the 16K buffer limit, are there any other
> > options? Asking users to recompile a custom libcurl with a larger
> > buffer size is not very palatable.
>
> If you want help on the list your best bet is a self contained example
> that can be used to reproduce and the details at [1]. The buffer issue
> is in the TODO [2] but from what I see there and elsewhere the
> significance is SFTP related.
>
> Continually or continuously? Is it 100% reproducible? A few ideas:
> - Maybe your uploads are compressed when they go through the browser,
>   but they are not compressed when uploaded through libcurl. There is
>   no compression built in libcurl upload (as far as I know), you would
>   have to do it manually and attach the header for content encoding gzip.
> - A different user agent (or lack of one -- the default) when you use
>   libcurl causes different treatment by the server. This applies to any
>   header, really.
> - The IP address returned to the browser is different than the IP address
>   returned to curl via DNS because the DNS request was made differently.
>   Or you just got a different IP address because they rotate.
> - I/O in your program, e.g. posting a FILE but the I/O is backed up.
> - Proxy setting is different in the browser than it is in libcurl.
>
> Once you have a way to reproduce try using the curl tool and see if you
> get the same result. Also try the latest version.
>
>
> [1]: http://curl.haxx.se/docs/bugs.html#What_to_report
> [2]: http://curl.haxx.se/docs/todo.html#Modified_buffer_size_approach
>
>
> On Tue, Apr 14, 2015 at 3:52 AM, Aleksandar Lazic
> <al-curllibrary_at_none.at> wrote:
>
> Dear Bryan
>
> On 14-04-2015 04:01, Bryan Christ wrote:
>
> I've been trying to figure out why http POST uploads are
> so slow with libcurl. Upload speeds continually perform at
> about 1/10th of the expected performance (or less). Many
> users have reported this behavior on our forum. I suspect
> it has a lot to do with CURL_MAX_WRITE_SIZE being set to
> 16k. Uploads to these same servers through other means
> (JavaScript for example) reach their expected throughput.
> The code in question can be seen here:
>
> https://github.com/MediaFire/mediafire-fuse/blob/master/utils/http.c
> [1] (at approx line 314)
>
> Assuming the issue is the 16K buffer limit, are there any
> other options? Asking users to recompile a custom libcurl
> with a larger buffer size is not very palatable.
>
>
> Which version of libcurl is used?
>
> http://curl.haxx.se/libcurl/c/curl_version.html
>
> BR Aleks
>
>
Usually I limit what I quote but it's been a while so I've left most of
it to give everyone enough context. It would be helpful in the future if
you would not top post!

Bryan, thanks for the example. I had some SSL problem with mediafire a
while ago (I may have pinged you about that) and I put this on hold. Now
that the SSL issue is solved and the release is out, I have tested the
example.

Unfortunately the TL;DR here is I cannot reproduce what you describe
with any of several versions of curl and OpenSSL on Ubuntu 14.04 LTS x64.

I tried fully updated and also went back 6 months:

Linux ubuntu1404-x64-vm 3.13.0-45-generic #74-Ubuntu SMP Tue Jan 13 19:36:28 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Linux ubuntu1404-x64-vm 3.13.0-55-generic #94-Ubuntu SMP Thu Jun 18 00:27:10 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

For the upload file I used 10MB of random data [1].

Tested 5 times each, always shared libs:
- packaged libcurl 7.35 and OpenSSL 1.0.1f (libcurl4-openssl-dev)
- libcurl 7.41.0 and OpenSSL 1.0.1f
- libcurl 7.44.0-DEV (master f44b803 2015-06-21) and OpenSSL 1.0.2c

The average was 19 seconds and the maximum deviation half a second from
that.

Tested 5 times each (upload was slightly different, more on that later):
- Firefox 31 ESR
- Firefox 35.0.1
- Firefox 38.0

The average was 33 seconds (from the time it started to the time it went
to 100% 'Waiting for notification') and the maximum deviation ~2
seconds. I also tried Windows 7 x64 for comparison, with 31 ESR and
Nightly, and the average was the same.

So as you can see Firefox was slower by a significant amount for me. I
used the debugging proxy Fiddler [2] to investigate why Firefox was
taking longer, and it looks like it's because when uploading through the
browser mediafire's js breaks the file into chunks of 1MB: upload 1MB
(takes about 2 sec), wait for confirmation, and repeat. Headers from one
such request:

X-Filename: block0.rng
X-Filesize: 10485767
X-Filehash: 117ff99fec590418e5880512895940ed86f05ac2b2cc25d1dbd888083f632ab0
X-Unit-Id: 0
X-Unit-Hash: 1fd1ad3c9c83baa7f4005edef10834dd60087588398b6844f4cb2b4e283430ba
X-Unit-Size: 1048576

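To make the unit arithmetic concrete, here is a minimal C sketch of how a file of that size splits into 1MB units. The function names are mine, and the assumption that only the last unit is short is a guess from the headers above, not something taken from mediafire's code or docs:

```c
#include <stdint.h>

/* Number of units needed to cover file_size bytes (ceiling division).
   Returns 0 if unit_size is 0. */
static uint64_t unit_count(uint64_t file_size, uint64_t unit_size)
{
    if (unit_size == 0)
        return 0;
    return file_size / unit_size + (file_size % unit_size != 0);
}

/* Length in bytes of the unit with zero-based id unit_id, assuming every
   unit except possibly the last is exactly unit_size bytes (an assumption).
   Returns 0 for an id past the end of the file. */
static uint64_t unit_length(uint64_t file_size, uint64_t unit_size,
                            uint64_t unit_id)
{
    uint64_t start;

    if (unit_id >= unit_count(file_size, unit_size))
        return 0;
    start = unit_id * unit_size;  /* cannot overflow: start < file_size */
    return (file_size - start < unit_size) ? file_size - start : unit_size;
}
```

For the 10485767-byte file above this gives 11 units: ten of 1048576 bytes and a final one of 7 bytes, so the browser goes through 11 upload-and-wait cycles.
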
Also, there is one thing I had to do differently when uploading via
Firefox. Because mediafire uses deduplication I could not upload the
random data file (block0.rng) as is. When I tried to do that it wouldn't
work because mediafire's javascript sends the hash of the file to the
server and, since the file already exists, the data isn't actually sent,
even if I delete all references to the file. Instead what I did was
append a few random bytes to the end of the file before each upload so
that I would have a unique hash for each upload:

dd if=/dev/urandom count=10 bs=1 >> block0.rng

I wonder if what you are seeing has something to do with deduplication?
I would use a debugging proxy and wireshark or something to see if all
the data is actually uploaded to the server and get a better idea what
is happening. Also I'd try the random data file at [1] for both curl and
Firefox and see if your results are different. If you still can't figure
out the problem, next I'd try what you were going to do: change the size
of the curl buffer from 16384. Also play with SO_SNDBUF.

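For SO_SNDBUF, a minimal sketch of setting and reading back the socket send buffer with plain setsockopt()/getsockopt() on POSIX. The helper names are only illustrative. With libcurl you would call something like this from a callback installed with CURLOPT_SOCKOPTFUNCTION, which hands you the socket after it is created and before it connects; that wiring is not shown here:

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Ask the kernel for a send buffer of `bytes` on an existing socket.
   Returns 0 on success, -1 on failure (errno is set by setsockopt). */
static int set_send_buffer(int fd, int bytes)
{
    return setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes));
}

/* Read back the send buffer size the kernel actually applied.
   Returns the size in bytes, or -1 on failure. */
static int get_send_buffer(int fd)
{
    int bytes = 0;
    socklen_t len = sizeof(bytes);

    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, &len) != 0)
        return -1;
    return bytes;
}
```

The kernel may round or cap the value (Linux, for example, doubles it and limits it to net.core.wmem_max), so check what get_send_buffer() reports instead of assuming the request was honored.
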
About your example, please note if you post more than 2GB, use
CURLOPT_POSTFIELDSIZE_LARGE. It takes a curl_off_t, which is 64 bits if
your platform supports it and large file support is enabled. Right now
you are using CURLOPT_POSTFIELDSIZE, which is documented to take a long
[3], not a uint64_t, and the size of long varies by platform.

Another thing is you have your read callback returning size * ret, which
is wrong even though it works (because libcurl passes size as 1, but
that's undocumented I believe so it could change). What I would return
is ret if it's valid, or CURL_READFUNC_ABORT if not. See
CURLOPT_READFUNCTION [4] for more.

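A minimal sketch of both points, written so it compiles without libcurl. The names fits_in_long and read_cb are mine, the FILE * in userdata assumes that is what was passed via CURLOPT_READDATA, and the fallback #define only mirrors the value from <curl/curl.h>; a real program should include that header instead:

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Normally provided by <curl/curl.h>; repeated here only so this sketch
   builds on its own. */
#ifndef CURL_READFUNC_ABORT
#define CURL_READFUNC_ABORT 0x10000000
#endif

/* Nonzero if nbytes can be passed as a long to CURLOPT_POSTFIELDSIZE.
   If it cannot, use CURLOPT_POSTFIELDSIZE_LARGE with a curl_off_t. */
static int fits_in_long(uint64_t nbytes)
{
    return nbytes <= (uint64_t)LONG_MAX;
}

/* Read callback with the CURLOPT_READFUNCTION signature. Assumes userdata
   is the FILE * being uploaded (hypothetical setup). */
static size_t read_cb(char *buffer, size_t size, size_t nitems, void *userdata)
{
    FILE *fp = (FILE *)userdata;
    /* Ask for bytes, not items, so the return value is a byte count
       whatever value libcurl passes as size. */
    size_t n = fread(buffer, 1, size * nitems, fp);

    if (n == 0 && ferror(fp))
        return CURL_READFUNC_ABORT;  /* read error: stop the transfer */

    return n;  /* bytes stored in buffer; 0 means end of file */
}
```

Reading with an item size of 1 means fread() already returns bytes, so there is nothing to multiply and the callback does not depend on size being 1.
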
For anyone else who read through all this and is curious enough to try
Bryan's example [5], you have to create a mediafire account (free) and
get a filedrop key. It took me a while to figure out what filedrop_key
was. I searched the API docs but never found where they say how to get
one. What you do is click the Create Folder button to the right of the
upload button and choose 'Make this folder a FileDrop'. Then there is a
popup 'This folder is not enabled as a FileDrop. Would you like to
enable the folder as a FileDrop?'; click OK for that. Then it will show
'Create Customized FileDrop'. Scroll down to 'Deploy Your FileDrop' and
the filedrop_key is in 'Hosted FileDrop' after drop=. Replace the
MF_FILEDROP key in the example with that one. Click Save and Close on
the filedrop.

[1]: http://www.rngresearch.com/download/block0.rng
[2]: http://www.telerik.com/fiddler
[3]: http://curl.haxx.se/libcurl/c/CURLOPT_POSTFIELDSIZE.html
[4]: http://curl.haxx.se/libcurl/c/CURLOPT_READFUNCTION.html
[5]: http://www.mediafire.com/view/9rb4aac4hnhma47/curl_filedrop_upload.c
-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette: http://curl.haxx.se/mail/etiquette.html
Received on 2015-06-22