CURLOPT_READFUNCTION performance issue

From: Michael Dowling <mtdowling_at_gmail.com>
Date: Mon, 22 Jul 2013 20:07:12 -0700

I've noticed what appears to be a significant performance hit when using
CURLOPT_READFUNCTION. The issue seems to be platform dependent: I've only
been able to reproduce the poor performance on Linux (Amazon Linux m1.large
64-bit ami-0358ce33), across multiple versions of cURL and PHP. I've not seen
any performance issues on my Mac running PHP 5.3.15 and cURL 7.21.4.

When sending PUT requests containing a 10-byte body (testing123) to a node.js
server (others have reported issues with Jetty as well) using
CURLOPT_READFUNCTION, the upload and download speeds reported by
CURLINFO_SPEED_UPLOAD and CURLINFO_SPEED_DOWNLOAD (in bytes per second) are
very poor: ~833 upload and ~1,333 download.

If I send the same request using CURLOPT_CUSTOMREQUEST => PUT and send the
body using CURLOPT_POSTFIELDS, the transfer speeds improve significantly:
~34,000 upload and ~55,000 download.

Note: In both tests, I disabled the Expect: 100-Continue header by setting an
"Expect:" header in CURLOPT_HTTPHEADER. I am also utilizing persistent HTTP
connections by using the same multi handle and computing the average upload and
download times across many different requests.
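
For readers who want to compare the two setups without PHP, here is a minimal
sketch in Python. It assumes the pycurl binding as a stand-in for the PHP cURL
extension and uses a single easy handle per request instead of my multi
handle. BodyReader, put_with_readfunction, put_with_postfields and _perform
are hypothetical names for illustration, not part of my test script.

```python
class BodyReader:
    """Serves a fixed body in the shape a cURL read callback expects."""

    def __init__(self, body: bytes):
        self._body = body
        self._offset = 0

    def read(self, size: int) -> bytes:
        # Return at most `size` bytes; an empty result signals end of body.
        chunk = self._body[self._offset:self._offset + size]
        self._offset += len(chunk)
        return chunk


def _perform(c, pycurl):
    # Assumption: mirror the test setup by suppressing "Expect: 100-continue".
    c.setopt(pycurl.HTTPHEADER, ["Expect:"])
    c.setopt(pycurl.WRITEFUNCTION, lambda data: None)  # discard the response body
    c.perform()
    speeds = (c.getinfo(pycurl.SPEED_UPLOAD), c.getinfo(pycurl.SPEED_DOWNLOAD))
    c.close()
    return speeds  # (upload, download) in bytes per second


def put_with_readfunction(url: str, body: bytes):
    import pycurl  # imported lazily so BodyReader can be used without pycurl
    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.UPLOAD, 1)  # UPLOAD makes this a PUT fed by the callback
    c.setopt(pycurl.INFILESIZE, len(body))
    c.setopt(pycurl.READFUNCTION, BodyReader(body).read)
    return _perform(c, pycurl)


def put_with_postfields(url: str, body: bytes):
    import pycurl
    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.CUSTOMREQUEST, "PUT")
    c.setopt(pycurl.POSTFIELDS, body)  # body is handed to cURL up front
    return _perform(c, pycurl)
```

Calling both functions against the same server URL with b"testing123" should
give two (upload, download) pairs that can be compared directly.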

I wrote a very simple test script that demonstrates the performance issue:
https://gist.github.com/mtdowling/6059009. You'll need to have a node.js server
running to handle the requests. I've written up a simple bash script that will
install PHP, node.js, start the test server, and run the performance test:
https://gist.github.com/anonymous/6059035.

Thinking that this might be an issue with a specific version of cURL or PHP,
I manually compiled different versions of PHP and cURL and ran the performance
tests. There was no improvement using the version combination I had success with
on my Mac, or using the latest versions of cURL (7.31) and PHP (5.5.1), so this
does not appear to be version dependent. Here are the results of that testing:
https://github.com/guzzle/guzzle/issues/349#issuecomment-21284834

I ran strace on the PHP script and found that using CURLOPT_POSTFIELDS appears
to send the headers and the entire payload in a single sendto call before
receiving anything from the server, while CURLOPT_READFUNCTION appears to send
the request headers and the body in two separate sendto calls and then waits
about 35 ms in select (1 second timeout, 0.964661 seconds left) before the
response arrives.

I've provided the strace output below. For brevity and easier comprehension, I
removed the various calls to
"clock_gettime(CLOCK_MONOTONIC, {17579, 343661534}) = 0".

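The filtering described above can be sketched as a small Python helper. This
is only an illustration of the clean-up step: strip_clock_calls and NOISE are
hypothetical names, and the optional PID prefix is an assumption about how
`strace -f` labels lines.

```python
import re

# Matches a clock_gettime call, optionally prefixed by a PID (assumed
# `strace -f` style: "[pid 1234] " or "1234 ").
NOISE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?clock_gettime\(")


def strip_clock_calls(lines):
    """Return the strace lines with clock_gettime calls removed."""
    return [line for line in lines if not NOISE.match(line)]


sample = [
    "clock_gettime(CLOCK_MONOTONIC, {17579, 343661534}) = 0\n",
    'sendto(3, "testing123", 10, MSG_NOSIGNAL, NULL, 0) = 10\n',
]
# Prints only the sendto line.
print("".join(strip_clock_calls(sample)), end="")
```
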
CURLOPT_READFUNCTION strace:

sendto(3, "PUT /guzzle-server/perf HTTP/1.1"..., 78, MSG_NOSIGNAL, NULL, 0) = 78
poll([{fd=3, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=3, events=POLLOUT|POLLWRNORM}], 2, 0) = 1 ([{fd=3, revents=POLLOUT|POLLWRNORM}])
sendto(3, "testing123", 10, MSG_NOSIGNAL, NULL, 0) = 10
select(4, [3], [], [], {1, 0}) = 1 (in [3], left {0, 964661})
poll([{fd=3, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 1 ([{fd=3, revents=POLLIN|POLLRDNORM}])
recvfrom(3, "HTTP/1.1 200 OK\r\nContent-Length:"..., 16384, 0, NULL, NULL) = 116
poll([{fd=3, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 0 (Timeout)

CURLOPT_POSTFIELDS strace:

sendto(3, "PUT /guzzle-server/perf HTTP/1.1"..., 137, MSG_NOSIGNAL, NULL, 0) = 137
poll([{fd=3, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 1 ([{fd=3, revents=POLLIN|POLLRDNORM}])
recvfrom(3, "HTTP/1.1 200 OK\r\nContent-Length:"..., 16384, 0, NULL, NULL) = 116
poll([{fd=3, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 0 (Timeout)
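
To time the two send patterns from the traces outside of cURL and PHP, a rough
Python sketch follows. It is not how cURL works internally: it only replays
"headers and body in two sends" versus "headers and body in one send" over a
persistent loopback connection. The tiny server is a stand-in for my node.js
server, and serve, recv_until and run_pattern are hypothetical names.

```python
import socket
import threading
import time

BODY = b"testing123"
HEADERS = (
    "PUT /guzzle-server/perf HTTP/1.1\r\n"
    "Host: 127.0.0.1\r\n"
    f"Content-Length: {len(BODY)}\r\n"
    "\r\n"
).encode("ascii")
RESPONSE = b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"


def recv_until(sock, marker):
    """Read until `marker` has been seen or the peer closes the socket."""
    data = b""
    while marker not in data:
        chunk = sock.recv(16384)
        if not chunk:
            break
        data += chunk
    return data


def serve(listener, received):
    """Stand-in server: answers each PUT on one persistent connection."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = recv_until(conn, b"\r\n\r\n")
            if b"\r\n\r\n" not in data:
                return  # client closed the connection
            _, _, body = data.partition(b"\r\n\r\n")
            # Simplification: the body length is known, so Content-Length
            # is not parsed here.
            while len(body) < len(BODY):
                chunk = conn.recv(16384)
                if not chunk:
                    return
                body += chunk
            received.append(body)
            conn.sendall(RESPONSE)


def run_pattern(split_body, count=20):
    """Send `count` requests; return (requests served, avg seconds each)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    received = []
    worker = threading.Thread(target=serve, args=(listener, received), daemon=True)
    worker.start()
    client = socket.create_connection(listener.getsockname(), timeout=5)
    start = time.monotonic()
    for _ in range(count):
        if split_body:
            client.sendall(HEADERS)  # like the CURLOPT_READFUNCTION trace
            client.sendall(BODY)
        else:
            client.sendall(HEADERS + BODY)  # like the CURLOPT_POSTFIELDS trace
        recv_until(client, b"\r\n\r\n")
    elapsed = time.monotonic() - start
    client.close()
    worker.join(timeout=5)
    listener.close()
    return len(received), elapsed / count


if __name__ == "__main__":
    for split in (True, False):
        served, avg = run_pattern(split)
        print(f"split_body={split}: {served} requests, {avg * 1000:.2f} ms each")
```

If the split variant is consistently slower on the Linux host but not on the
Mac, that would suggest the difference can be reproduced without cURL or PHP
in the picture.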

The loop used to execute the curl_multi handles is very simple and can be found
in the test script at
https://gist.github.com/mtdowling/6059009#file-readfuction_perf-php-L5.

Does anyone have any insight on why I'm seeing such a performance hit? Is there
some way I can get better performance, perhaps by rearranging my CURLOPT_*
options or changing my loop that executes the cURL handles? Based on the strace
output, I would assume that this is a cURL issue and not a PHP issue.

Please let me know if I can supply any additional information to help
troubleshoot.

Thanks,
Michael
Received on 2013-07-23