Re: Fewer mallocs is better, episode #47
From: James Read via curl-library <curl-library_at_lists.haxx.se>
Date: Thu, 27 Jan 2022 12:47:22 +0000
On Thu, Jan 27, 2022 at 12:06 PM Henrik Holst <henrik.holst_at_millistream.com>
wrote:
> I wonder if the results that you see from that example are due to the
> short life of each connection, i.e. most of the time is spent on the
> TCP handshake (and possibly the TLS handshake on top of it all) combined
> with a small initial TCP window that never gets large enough before the
> page has been downloaded in full.
>
> I have personally no experience pushing curl to its bw limits (I have no
> such use case) but I do use epoll for raw tcp/ip solutions where I easily
> push Gbps per connection but that is for very long lived connections (days
> and months) so a whole different ballpark.
>
> Try to download a much larger file than just the front page and see if
> that changes things.
>
I have experimented with wrk. I have modified the codebase to connect to
multiple servers rather than a single server. I am able to saturate the
bandwidth as long as connections are reused. As soon as I force the
program to use each connection only once, the throughput falls off a
cliff. I am beginning to think there is a problem with the TCP/IP stack
implementation that limits usage with many connections.
James Read
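
To put rough numbers on Henrik's hypothesis quoted above, here is a
back-of-the-envelope sketch in C, not a measurement. It assumes an
idealized slow start (no loss, the window doubles every round trip, no
receive-window limit). The function names and the parameter values used
in the note below are illustrative assumptions, not anything taken from
curl or wrk.

```c
#include <stddef.h>

/* Round trips an idealized slow start needs to deliver `bytes`,
 * starting from `init_cwnd_segments` segments of `mss` bytes and
 * doubling the window every round trip.
 * Assumptions: no loss, no delayed ACKs, no receive-window limit. */
unsigned slow_start_round_trips(size_t bytes, size_t mss,
                                size_t init_cwnd_segments)
{
    size_t cwnd = init_cwnd_segments * mss;
    size_t sent = 0;
    unsigned rtts = 0;

    if (cwnd == 0)
        return 0; /* degenerate input: nothing can be sent */

    while (sent < bytes) {
        sent += cwnd; /* one window's worth per round trip */
        cwnd *= 2;    /* slow start doubles the window */
        rtts++;
    }
    return rtts;
}

/* Round trips for a transfer on a brand-new connection:
 * 1 for the TCP handshake, `tls_rtts` for the TLS handshake
 * (0 for plain HTTP), plus the data round trips above. */
unsigned fresh_connection_round_trips(size_t bytes, size_t mss,
                                      size_t init_cwnd_segments,
                                      unsigned tls_rtts)
{
    return 1 + tls_rtts +
           slow_start_round_trips(bytes, mss, init_cwnd_segments);
}
```

Under these assumptions, a 51200-byte page with a 1460-byte MSS and an
initial window of 10 segments needs 3 data round trips. A fresh
connection with a two-round-trip TLS handshake spends 6 round trips in
total, so half the time goes to setup, while a reused connection pays at
most the 3 data round trips.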
>
> /HH
>
> Den tors 27 jan. 2022 kl 12:55 skrev James Read <jamesread5737_at_gmail.com>:
>
>>
>>
>> On Thu, Jan 27, 2022 at 11:48 AM Henrik Holst via curl-library <
>> curl-library_at_lists.haxx.se> wrote:
>>
>>> Depends on architecture. AFAIK, if you compile for 64-bit Windows then
>>> __fastcall is completely ignored, since the MS compiler uses the
>>> "Microsoft x64 calling convention" there regardless of what one types,
>>> according to https://en.wikipedia.org/wiki/X86_calling_conventions
>>>
>>> /HH
>>>
>>> Den tors 27 jan. 2022 kl 12:40 skrev Gisle Vanem via curl-library <
>>> curl-library_at_lists.haxx.se>:
>>>
>>>> Henrik Holst wrote:
>>>>
>>>> > strlen() is one clear candidate for some optimizations, often however
>>>> it is declared as __attribute_pure__ so the
>>>>
>>>> Another candidate for MSVC would be 'cl -Gr'.
>>>> (build for fastcalls internally). But that's not
>>>> possible now due to things like:
>>>> cookie.c(1433): error C2440: 'function':
>>>> cannot convert from 'int (__fastcall *)(const void *,const void *)'
>>>> to '_CoreCrtNonSecureSearchSortCompareFunction'
>>>>
>>>> It would be interesting to compare the speed of
>>>> a '__cdecl' vs '__fastcall' libcurl.dll.
>>>>
>>>>
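
The C2440 error quoted above arises because, with 'cl -Gr', functions
default to __fastcall while the C runtime's qsort() expects a __cdecl
comparator. One possible way around that is to declare the callback's
calling convention explicitly. The sketch below uses a hypothetical
CMP_CALLCONV macro and an illustrative comparator; neither is taken from
the curl source. As noted elsewhere in this thread, the distinction only
matters on 32-bit x86 builds.

```c
#include <stdlib.h>

/* Hypothetical macro: pin the callback to the convention the C runtime
 * expects, regardless of a project-wide default such as 'cl -Gr'.
 * On non-MSVC compilers it expands to nothing. */
#if defined(_MSC_VER)
#define CMP_CALLCONV __cdecl
#else
#define CMP_CALLCONV
#endif

/* Illustrative comparator (not curl's actual code): orders ints
 * ascending without the overflow risk of returning x - y. */
static int CMP_CALLCONV compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sorts `count` ints in place using the explicitly annotated callback. */
void sort_ints(int *values, size_t count)
{
    qsort(values, count, sizeof(values[0]), compare_ints);
}
```

With the annotation in place, the comparator keeps the convention
qsort() expects even when the rest of the translation unit is built
with a different default.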
>> Just my two cents' worth. But while we're talking about optimizations, it
>> seems to me that the cURL project needs to work on optimizing bandwidth
>> usage above all else. My experiments with cURL and epoll show that there
>> is little to no performance gain when using more than 1024 concurrent
>> connections. This is not strictly a cURL-only issue, however, as my
>> experiments without cURL have shown the same results.
>> https://stackoverflow.com/questions/70584121/why-doesnt-my-epoll-based-program-improve-performance-by-increasing-the-number
>>
>> It seems to me that we need to work on saturating available bandwidth
>> above all else as this is the true hardware bottleneck.
>>
>> James Read
>>
>>
>>> --
>>>> --gv
>>>> --
>>>> Unsubscribe: https://lists.haxx.se/listinfo/curl-library
>>>> Etiquette: https://curl.haxx.se/mail/etiquette.html
>>>>
--
Unsubscribe: https://lists.haxx.se/listinfo/curl-library
Etiquette: https://curl.haxx.se/mail/etiquette.html
Received on 2022-01-27