cURL / Mailing Lists / curl-library / Single Mail

curl-library

Re: problem with NSS backend and threads

From: K. Frank <kfrank29.c_at_gmail.com>
Date: Wed, 7 Sep 2016 21:07:25 -0400

Hi Kamil!

On Mon, Aug 22, 2016 at 9:27 AM, Kamil Dudka <kdudka_at_redhat.com> wrote:
> On Thursday, August 18, 2016 11:45:24 Peter Wang wrote:
>> ...
>> which is due to this branch in nssSlot_IsTokenPresent()
>>
>> /* avoid repeated calls to check token status within set interval */
>> if (within_token_delay_period(slot)) {
>> return ((slot->ckFlags & CKF_TOKEN_PRESENT) != 0);
>> }
>>
>> Peter
>
> Thank you for analyzing the cause of it! After having a brief look at the
> code, I believe the race condition is obvious:
>
> PK11_FindSlotByName() acquires only a "read" lock, which does not ensure
> mutual exclusion of readers. So multiple calls of PK11_FindSlotByName()
> can be executed in parallel. As there is no additional lock on the way,
> nssSlot_IsTokenPresent() can also be executed in parallel.
>
> If there is a schedule where one thread has assigned slot->lastTokenPing
> but not yet set the CKF_TOKEN_PRESENT flag in slot->ckFlags, all other
> threads will keep returning SEC_ERROR_NO_TOKEN (without actually checking
> presence of the token) until the first thread continues (or the specified
> timeout elapses).
>
> In order to fix it, we need to add a mutex to nssSlot_IsTokenPresent() or
> at least reorder mutation of slot's state such that slot->lastTokenPing is
> not updated before the CKF_TOKEN_PRESENT flag in slot->ckFlags is updated.

A quick caution:

(First, a disclaimer: I have not looked at the code, and I am making some
assumptions about the details of the code and what you mean by "reorder
mutation." That is, I am guessing what you mean, and I apologize if I have
guessed wrongly.)

Simply requiring -- in the source code -- that (pseudocode):

   slot->ckFlags.CKF_TOKEN_PRESENT = 1; // this happens first
   slot->lastTokenPing = new_value; // this happens second

will not eliminate your race condition in a multithreaded environment. This
is because a second thread is not guaranteed to see these two updates in
the desired order.

In more detail, this is because, although the sequence points at the end of
these statements "guarantee" that the first statement happens before the
second, the as-if rule permits the compiler (and hardware) to reorder or even
optimize away these statements as long as the "observable behavior" of
the program doesn't change ***when observed from the standpoint of the
single-threaded "abstract machine"***.

The language does not guarantee that a second thread will see this same
ordering, and, in the real world, many compilers will emit code (and hardware
will process that code in a way) such that a second thread will see a different
ordering of the two assignments.

The only (portable) way to fix such a race condition is to use some sort of
explicit thread synchronization, be it a mutex, finer-grained locks, memory
barriers, etc.

The simplest way to do this would be your suggestion of using a mutex.
Unless this code is in a highly efficiency-critical location, that's
what I would
do. If efficiency is a big concern, you could probably eliminate some of the
overhead of a mutex by using memory barriers, but that tends to be trickier.

I don't mean to belabor this, and I apologize if I've misunderstood what you
were saying. I just wanted to point out that merely reordering in the source
code the updates of the variables that make up the slot's state won't fix the
race condition (absent some additional thread synchronization). And, as I'm
sure you know, these kinds of race-condition bugs can lay dormant for quite
a while and be missed by even rather thorough testing, so it's worth taking
care to not let the bug sneak back in.

> ...
> Kamil

Happy Multi-Threaded Hacking!

K. Frank
-------------------------------------------------------------------
List admin: https://cool.haxx.se/list/listinfo/curl-library
Etiquette: https://curl.haxx.se/mail/etiquette.html
Received on 2016-09-08