curl-and-python

Re: Tricks to optimizing PUT performance?

From: Mark Seger <mjseger_at_gmail.com>
Date: Fri, 25 Jan 2013 09:30:34 -0500

dima - thanks for the reply. Sorry for not getting back to you yesterday, but I
was offline and don't want you to think this isn't important to me. I see
you're getting a pycurl run in <1 sec, so should I assume you're on the
same system as the target of the PUT? I'm going over a wire...

My issue with large data isn't so much the speed, it's the CPU load, which
is sustained at very high levels for a single upload. This is what a
200MB upload looks like when I monitor it with collectl:

#<--------CPU--------><----------Disks-----------><----------Network---------->
#cpu sys inter  ctxsw KBRead  Reads KBWrit Writes   KBIn  PktIn  KBOut  PktOut
   6   0    36     18      0      0      0      0      1      6      1       6
   3   0   129     51      0      0      0      0      4    107    790      53
  20   0   627     95      0      0      0      0     30    771   7913     158
  76   7  2557    116      0      0      0      0     67   1708  31656    1009
  98  10  6757    127      0      0      0      0    178   4564  38086    2357
  69   9  4715    107      0      0      0      0    122   3117  25116    1573
   0   0    10     14      0      0      0      0      0      1      0       1

and as you can see the load is quite high, which on a small-core system
means you can't get much else done if you want to multi-thread. What I'm
trying to figure out is where all the CPU is being spent and whether it's
possible to reduce it. It's certainly possible I'm doing something wrong
in my code. Does this look ok?

    c = pycurl.Curl()
    c.setopt(c.URL, '%s' % url)                     # target of the PUT
    c.setopt(c.HTTPHEADER, [auth_token])            # pre-built auth header
    c.setopt(c.UPLOAD, 1)                           # make this an upload (PUT)

    c.setopt(pycurl.READFUNCTION, read_callback(1).callback)  # supplies the body
    c.setopt(pycurl.INFILESIZE, objsize)            # total body size in bytes
    c.perform()

where the url and auth_token are built independently of this connection. My
read_callback simply pulls data out of a big string and returns it in
16384-byte chunks. While I don't think it would do anything to improve the
CPU load, is there a way to increase the size of the chunks? Maybe some
other setopt call?
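
For what it's worth, the callback is roughly along these lines (a
simplified sketch rather than my exact code; ReadCallback and payload are
just illustrative names). It serves slices out of one big string with a
moving offset, so it shouldn't be doing the repeated string reallocation
you warned about:

    class ReadCallback(object):
        """Serve an in-memory string to libcurl in the slices it asks for."""
        def __init__(self, data):
            self.data = data
            self.offset = 0

        def callback(self, size):
            # libcurl asks for at most 'size' bytes (16384 in my runs);
            # returning an empty string signals end-of-body.
            chunk = self.data[self.offset:self.offset + size]
            self.offset += len(chunk)
            return chunk

    c.setopt(pycurl.READFUNCTION, ReadCallback(payload).callback)
    c.setopt(pycurl.INFILESIZE, len(payload))

As far as I can tell the 16384 comes from libcurl's internal upload buffer
rather than anything I set, so I'm not sure a setopt can grow it, but I'd
be happy to be corrected.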

But my other issue is that when I run with objects as small as 1KB, the PUT
takes over a full second just to execute the perform() call, and that
doesn't sound right either. I can do many more small-object uploads per
second with other libraries, and I've gotta believe it's something wrong
with the way I've written the code.
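
To narrow down where that second goes, my plan is to pull libcurl's own
timing breakdown out of the handle after perform(), and also to try
suppressing the Expect: 100-continue handshake, since I gather curl adds
that header to PUTs with a body and pauses for the server's reply before
sending anything. An untested sketch of both, using the same handle c as
above:

    # after c.perform(), ask libcurl where the time actually went
    for label, opt in (('namelookup',    pycurl.NAMELOOKUP_TIME),
                       ('connect',       pycurl.CONNECT_TIME),
                       ('pretransfer',   pycurl.PRETRANSFER_TIME),
                       ('starttransfer', pycurl.STARTTRANSFER_TIME),
                       ('total',         pycurl.TOTAL_TIME)):
        print('%-14s %.3f' % (label, c.getinfo(opt)))

    # as an experiment, suppress 100-continue by sending an empty Expect
    # header (HTTPHEADER replaces the whole list, so the auth token has
    # to be repeated here)
    c.setopt(pycurl.HTTPHEADER, [auth_token, 'Expect:'])

If the time turns out to be in starttransfer rather than in the transfer
itself, that would at least tell me the payload handling isn't the problem.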

-mark

On Thu, Jan 24, 2013 at 5:43 AM, Dima Tisnek <dimaqq_at_gmail.com> wrote:
>
> I went ahead and tried to reproduce your workload: sent 100MB of data in 10KB
reads over http and then https (aes128/sha1) over localhost
>
> air:~ dima$ time openssl s_server -msg -debug -nocert -cipher
'ADH-AES128-SHA' -accept 8080 > somefile.ssl
> ^C
>
> real 0m5.425s
> user 0m1.316s
> sys 0m0.429s
>
> air:~ dima$ time ./test-pycurl-put.py
> [snip]
> real 0m4.078s
> user 0m1.810s
> sys 0m0.284s
>
> Well I get a spike of 100% cpu usage for individual processes, but that's
all for a good cause: according to openssl speed, aes-128-cbc crunches up
to 120MB/s and sha1 around 300MB/s, so the ~60MB/s I get is not
superb, but quite acceptable.
>
> For comparison, http pycurl time output:
> real 0m0.946s
> user 0m0.175s
> sys 0m0.177s
>
> yes it takes 1 second to push 100MB through, but it hardly taxes the
processor, using only about a tenth of a single core.
>
> If you get much lower throughput than this, perhaps it's down to how you
process the data you send in python, e.g. if you keep reallocating or
"resizing" large strings, that could lead to O(N^2) behavior.
>
> d.
>
>
>
> On 24 January 2013 01:35, Mark Seger <mjseger_at_gmail.com> wrote:
>>
>> I've managed to get to the point where I can now upload in-memory
strings of data via a REST interface. Very cool stuff. In fact the good
news is I can hit very high network rates with strings on the order of 100MB
or more. The bad news is that smaller strings upload very slowly and I have
no idea why.
>>
>> To try to figure out what's going on I surrounded the perform() call
with time.time() calls to measure the delay, and I'm finding that even with
payloads on the order of 32KB it always takes over a second to execute
the upload call, whereas other interfaces go much faster, on the order of
under 0.1 sec/upload. Has anyone else ever observed this behavior?
>>
>> Digging a little deeper I've observed a few things:
>> - when my callback is called for data, it is passed a chunk size of
16384, and I wonder if asking for bigger chunks would result in fewer calls,
which in turn could speed things up
>> - another thing I noticed is very high CPU load: not for the small
strings, but for the larger ones I'm seeing close to 100% of a single CPU
being saturated. Is this caused by encryption? Is there any way to speed
it up or choose a faster algorithm? Or is it something totally different?
>> - I'm also guessing the overhead is not caused by data compression,
because I'm intentionally sending a string of all spaces, which is highly
compressible, and I do see the full 100MB go over the network; if it were
compressed I'd expect to see far less.
>>
>> I know pycurl is very heavily used everywhere and that this could simply
be a case of operator error on my part. If anyone would like to see my
code I'd be happy to send it along, but for now I thought I'd just keep it
to a couple of simple questions in case the answer is an obvious one.
>>
>> -mark
>>

_______________________________________________
http://cool.haxx.se/cgi-bin/mailman/listinfo/curl-and-python
Received on 2013-01-25