Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multi: implement waiting and wakeup using winsock events #6245

Closed
wants to merge 4 commits into from

Conversation

mback2k
Copy link
Member

@mback2k mback2k commented Nov 24, 2020

Updated PR description:

This is the 3rd attempt to implement multi-wait and wake-up based upon WinSock events instead of socketpairs.

The 2nd attempt fixed issues with FD_WRITE and FD_CLOSE not being detected on re-entry, this 3rd attempt fixes issues with FD_WRITE not being reset on re-entry.

Ref: #5397
Ref: #5631
Ref: #5634

Original PR description:

This is an attempt to restructure the code in Curl_multi_wait
in such a way that less syscalls are made by removing individual
calls to Curl_socket_check via SOCKET_READABLE/SOCKET_WRITABLE.

Bug: #6146

@mback2k
Copy link
Member Author

mback2k commented Nov 24, 2020

@rcombs @jay @bagder please take a look at my first attempt to fix/improve #6146 based upon #6146 (comment).

@mback2k mback2k self-assigned this Nov 24, 2020
@mback2k mback2k added the Windows Windows-specific label Nov 24, 2020
@mback2k
Copy link
Member Author

mback2k commented Nov 25, 2020

Seems like this currently breaks test 1564. I will work on this (hopefully) this evening.

lib/multi.c Outdated Show resolved Hide resolved
lib/multi.c Show resolved Hide resolved
@mback2k mback2k force-pushed the multi-consolidate-syscalls branch 3 times, most recently from b475b87 to 5bdfe15 Compare December 6, 2020 20:38
@mback2k
Copy link
Member Author

mback2k commented Dec 6, 2020

@tpodom would you mind helping test the performance with this PR, related to #6146?

@tpodom
Copy link

tpodom commented Dec 7, 2020

@mback2k sorry for the delay getting back to you. The timeout delay still occurs. I see the pollrc = 0 but then the WSAWaitForMultipleEvents delays for 1000 ms.

@tpodom
Copy link

tpodom commented Dec 7, 2020

I did a bit more playing around and I understand why I'm not hitting the WSAEWOULDBLOCK error. The select checks either in the multi-wait or the socket check prior to send that the socket is ready and if not, the send is not attempted. Since the send is never attempted to trigger WSAEWOULDBLOCK, the FD_WRITE is never signaled for the event so the WSAWaitForMultipleEvents just delays.

I played around with moving the polling after the transfer call and I do see a couple of times where the would block is hit on the send if I modify the send to ignore the socket check but then the timeout_ms delay kicks in and each iteration that gives it enough time to clear up to no longer block.

I think what would really be needed is to set some flag when the WSAEWOUDLBLOCK occurs and then check if that's been set before calling WSAWaitForMultipleEvents. That would require ignoring the socket readiness before send though in transfer.c:

  /* If we still have writing to do, we check if we have a writable socket. */
  if ((k->keepon & KEEP_SEND) && (select_res & CURL_CSELECT_OUT)) {

Basically, try to replicate the logic as described by MS:

"Therefore, an application can assume that sends are possible starting from the first FD_WRITE network event setting and lasting until a send returns WSAEWOULDBLOCK. After such a failure the application will find out that sends are again possible when an FD_WRITE network event is recorded and the associated event object is set."

@mback2k
Copy link
Member Author

mback2k commented Dec 7, 2020

Basically, try to replicate the logic as described by MS:

"Therefore, an application can assume that sends are possible starting from the first FD_WRITE network event setting and lasting until a send returns WSAEWOULDBLOCK. After such a failure the application will find out that sends are again possible when an FD_WRITE network event is recorded and the associated event object is set."

Makes sense. So an actual fix would require general behavior changes in curl outside of multi.c. One question would be: could this be the behavior regardless of platform or would we need to adjust the behavior just on Windows (with WinSock)? cc @bagder @jay

@tpodom
Copy link

tpodom commented Dec 7, 2020

I was thinking about this a bit more and what I'm thinking may not really work at all since I don't have a good grasp on how all of this fits together but what about something along the following:

Curl_send_plain returns CURLE_AGAIN which is handled in Curl_write. Instead of just returning 0, what if Curl_write updates the connection to indicate the last write failed with CURLE_AGAIN.

Then, in Curl_multi_wait, only perform the WSAWaitForMultipleEvents if there is no data to write (can we know that in Curl_multi_wait?) or if the write socket hit CURLE_AGAIN. Because if we have data to write and have not hit CURLE_AGAIN then Windows says we should assume we can keep writing.

Then, for the socket check in Curl_readwrite, could Curl_multi_wait set the conn->cselect_bits so the check is skipped, or could Curl_poll be updated to skip the select unless we previously hit CURLE_AGAIN?

mback2k added a commit to mback2k/curl that referenced this pull request Dec 13, 2020
Reset FD_WRITE by sending zero bytes which is permissible
and will be treated by implementations as successful.

Bug: curl#6146
Closes curl#6245
@mback2k
Copy link
Member Author

mback2k commented Dec 13, 2020

@tpodom thanks for the input. I may have found an easier way to reset FD_WRITE by just sending zero bytes before trying to wait on a socket for write readiness again. Would you mind testing the new implementation in this updated PR? cc @jay @rcombs

mback2k added a commit to mback2k/curl that referenced this pull request Dec 15, 2020
Reset FD_WRITE by sending zero bytes which is permissible
and will be treated by implementations as successful.

Bug: curl#6146
Closes curl#6245
@jay
Copy link
Member

jay commented Dec 15, 2020

Using the same test from #6146:

master (796c068 2020-12-14)
Transfer rate: 3992 KB/sec (52429310 bytes in 13 seconds)
Transfer rate: 4124 KB/sec (52429310 bytes in 12 seconds)
Transfer rate: 4148 KB/sec (52429310 bytes in 12 seconds)

mback2k:multi-consolidate-syscalls (c502f6a 2020-12-15)
Transfer rate: 4107 KB/sec (52429310 bytes in 12 seconds)
Transfer rate: 4059 KB/sec (52429310 bytes in 13 seconds)
Transfer rate: 4141 KB/sec (52429310 bytes in 12 seconds)

lib/multi.c Show resolved Hide resolved
@mback2k mback2k marked this pull request as ready for review December 15, 2020 12:15
@tpodom
Copy link

tpodom commented Dec 15, 2020

@mback2k sorry for the delay getting back to you. The latest change looks to be working great for me. I tested with auto tuning both enabled and disabled on the server and did not experience any significant delays and the time to upload was consistent. Good find on using the 0 byte send to trigger the FD_WRITE!

@mback2k
Copy link
Member Author

mback2k commented Dec 16, 2020

@tpodom no problem, thanks for testing. @jay thanks for verifying the performance, would you mind reviewing the PR?

Can I also get you @Togtja, @ngg, @rmja, @tmkk, @RadMission to test this PR regarding up- and download performance issues you experienced with past iterations?

@jay
Copy link
Member

jay commented Mar 15, 2021

Using the same postupload example to the closest test server (ny2 for me) the results are very similar.

master (f83d4ea 2021-03-14)
Transfer rate: 4123 KB/sec (52429310 bytes in 12 seconds)
Transfer rate: 3984 KB/sec (52429310 bytes in 13 seconds)
Transfer rate: 4043 KB/sec (52429310 bytes in 13 seconds)
Transfer rate: 4101 KB/sec (52429310 bytes in 12 seconds)
Transfer rate: 4206 KB/sec (52429310 bytes in 12 seconds)
Transfer rate: 3951 KB/sec (52429310 bytes in 13 seconds)

mback2k:multi-consolidate-syscalls (ca6ce77 2021-03-12)
Transfer rate: 4013 KB/sec (52429310 bytes in 13 seconds)
Transfer rate: 4045 KB/sec (52429310 bytes in 13 seconds)
Transfer rate: 4047 KB/sec (52429310 bytes in 13 seconds)
Transfer rate: 4030 KB/sec (52429310 bytes in 13 seconds)
Transfer rate: 4079 KB/sec (52429310 bytes in 13 seconds)
Transfer rate: 4018 KB/sec (52429310 bytes in 13 seconds)

master (what multi-consolidate-syscall is currently branched off of) is overall a few KB faster but that is not of significance due to network conditions.

@mback2k
Copy link
Member Author

mback2k commented Mar 15, 2021

@jay thanks a lot for testing this again! I would really appreciate your approval on this PR then if you feel like it is ready to go in during the next feature window. I will also perform some additional upload and download performance tests before merge.

@jay jay added the feature-window A merge of this requires an open feature window label Mar 15, 2021
Copy link
Member

@jay jay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok for next-feature-window. suggest squash whatever you can to make it easier in case we have to revert.

mback2k added a commit to mback2k/curl that referenced this pull request Apr 5, 2021
This reverts commit 2260e0e,
also restoring previous follow up changes which were reverted.

Authored-by: rcombs on github
Authored-by: Marc Hörsken

Restores curl#5634
Reverts curl#6281
Part of curl#6245
mback2k added a commit to mback2k/curl that referenced this pull request Apr 5, 2021
1. Consolidate pre-checks into a single Curl_poll call:

This is an attempt to restructure the code in Curl_multi_wait
in such a way that less syscalls are made by removing individual
calls to Curl_socket_check via SOCKET_READABLE/SOCKET_WRITABLE.

2. Avoid resetting the WinSock event multiple times:

We finally call WSAResetEvent anyway, so specifying it as
an optional parameter to WSAEnumNetworkEvents is redundant.

3. Wakeup directly in case no sockets are being monitoring:

Fix the WinSock based implementation to skip extra waiting by
not sleeping in case no sockets are to be waited on and just
the WinSock event is being monitored for wakeup functionality.

Assisted-by: Tommy Odom
Assisted-by: Jay Satiro

Bug: curl#6146
Part of curl#6245
mback2k added a commit to mback2k/curl that referenced this pull request Apr 5, 2021
Reset FD_WRITE by sending zero bytes which is permissible
and will be treated by implementations as successful send.

Without this we won't be notified in case a socket is still
writable if we already received such a notification and did
not send any data afterwards on the socket. This would lead
to waiting forever on a writable socket being writable again.

Assisted-by: Tommy Odom
Assisted-by: Jay Satiro
Tested-by: Tommy Odom
Tested-by: tmkk on github

Bug: curl#6146
Part of curl#6245
mback2k added a commit to mback2k/curl that referenced this pull request Apr 5, 2021
@mback2k mback2k force-pushed the multi-consolidate-syscalls branch from 2070783 to 1838649 Compare April 5, 2021 17:52
@mback2k
Copy link
Member Author

mback2k commented Apr 5, 2021

Rebased to current master in order to get ready for merge during this feature window.

@jay
Copy link
Member

jay commented Apr 6, 2021

Rebased to current master in order to get ready for merge during this feature window.

bugfixes only until the patch release on the 14th

mback2k added a commit to mback2k/curl that referenced this pull request Apr 20, 2021
This reverts commit 2260e0e,
also restoring previous follow up changes which were reverted.

Authored-by: rcombs on github
Authored-by: Marc Hörsken

Restores curl#5634
Reverts curl#6281
Part of curl#6245
mback2k added a commit to mback2k/curl that referenced this pull request Apr 20, 2021
1. Consolidate pre-checks into a single Curl_poll call:

This is an attempt to restructure the code in Curl_multi_wait
in such a way that less syscalls are made by removing individual
calls to Curl_socket_check via SOCKET_READABLE/SOCKET_WRITABLE.

2. Avoid resetting the WinSock event multiple times:

We finally call WSAResetEvent anyway, so specifying it as
an optional parameter to WSAEnumNetworkEvents is redundant.

3. Wakeup directly in case no sockets are being monitoring:

Fix the WinSock based implementation to skip extra waiting by
not sleeping in case no sockets are to be waited on and just
the WinSock event is being monitored for wakeup functionality.

Assisted-by: Tommy Odom
Assisted-by: Jay Satiro

Bug: curl#6146
Part of curl#6245
mback2k added a commit to mback2k/curl that referenced this pull request Apr 20, 2021
Reset FD_WRITE by sending zero bytes which is permissible
and will be treated by implementations as successful send.

Without this we won't be notified in case a socket is still
writable if we already received such a notification and did
not send any data afterwards on the socket. This would lead
to waiting forever on a writable socket being writable again.

Assisted-by: Tommy Odom
Assisted-by: Jay Satiro
Tested-by: Tommy Odom
Tested-by: tmkk on github

Bug: curl#6146
Part of curl#6245
mback2k added a commit to mback2k/curl that referenced this pull request Apr 20, 2021
@mback2k
Copy link
Member Author

mback2k commented Apr 20, 2021

Rebased again to current master in order to get ready for merge during this feature window.

This reverts commit 2260e0e,
also restoring previous follow up changes which were reverted.

Authored-by: rcombs on github
Authored-by: Marc Hörsken

Restores curl#5634
Reverts curl#6281
Part of curl#6245
1. Consolidate pre-checks into a single Curl_poll call:

This is an attempt to restructure the code in Curl_multi_wait
in such a way that less syscalls are made by removing individual
calls to Curl_socket_check via SOCKET_READABLE/SOCKET_WRITABLE.

2. Avoid resetting the WinSock event multiple times:

We finally call WSAResetEvent anyway, so specifying it as
an optional parameter to WSAEnumNetworkEvents is redundant.

3. Wakeup directly in case no sockets are being monitoring:

Fix the WinSock based implementation to skip extra waiting by
not sleeping in case no sockets are to be waited on and just
the WinSock event is being monitored for wakeup functionality.

Assisted-by: Tommy Odom
Assisted-by: Jay Satiro

Bug: curl#6146
Part of curl#6245
Reset FD_WRITE by sending zero bytes which is permissible
and will be treated by implementations as successful send.

Without this we won't be notified in case a socket is still
writable if we already received such a notification and did
not send any data afterwards on the socket. This would lead
to waiting forever on a writable socket being writable again.

Assisted-by: Tommy Odom
Assisted-by: Jay Satiro
Tested-by: Tommy Odom
Tested-by: tmkk on github

Bug: curl#6146
Part of curl#6245
@mback2k
Copy link
Member Author

mback2k commented Apr 21, 2021

Rebased again due to previously failing test 1660 which was fixed with 6b97f13.

mback2k added a commit that referenced this pull request Apr 21, 2021
This reverts commit 2260e0e,
also restoring previous follow up changes which were reverted.

Authored-by: rcombs on github
Authored-by: Marc Hörsken
Reviewed-by: Jay Satiro
Reviewed-by: Marcel Raad

Restores #5634
Reverts #6281
Part of #6245
@mback2k mback2k closed this in e92998a Apr 21, 2021
mback2k added a commit that referenced this pull request Apr 21, 2021
Reset FD_WRITE by sending zero bytes which is permissible
and will be treated by implementations as successful send.

Without this we won't be notified in case a socket is still
writable if we already received such a notification and did
not send any data afterwards on the socket. This would lead
to waiting forever on a writable socket being writable again.

Assisted-by: Tommy Odom
Reviewed-by: Jay Satiro
Reviewed-by: Marcel Raad
Tested-by: tmkk on github

Bug: #6146
Closes #6245
mback2k added a commit that referenced this pull request Apr 21, 2021
Suggested-by: Gergely Nagy
Reviewed-by: Jay Satiro
Reviewed-by: Marcel Raad

Closes #6245
@mback2k
Copy link
Member Author

mback2k commented Apr 21, 2021

Thanks everyone! @rcombs @tpodom @tmkk @jay @MarcelRaad

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-window A merge of this requires an open feature window Windows Windows-specific
Development

Successfully merging this pull request may close these issues.

None yet

7 participants