lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090923190337.GA16983@Krystal>
Date:	Wed, 23 Sep 2009 15:03:37 -0400
From:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To:	Chris Friesen <cfriesen@...tel.com>
Cc:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC] Userspace RCU: (ab)using futexes to save cpu cycles and
	energy

* Chris Friesen (cfriesen@...tel.com) wrote:
> On 09/23/2009 11:48 AM, Mathieu Desnoyers wrote:
> 
> > Here are the primitives I've created. I'd like to have feedback on my
> > futex use, just to make sure I did not do any incorrect assumptions.
> 
> > /*
> >  * Wake-up any waiting defer thread. Called from many concurrent threads.
> >  */
> > static void wake_up_defer(void)
> > {
> >         if (unlikely(atomic_read(&defer_thread_futex) == -1))
> >                 atomic_set(&defer_thread_futex, 0);
> >                 futex(&defer_thread_futex, FUTEX_WAKE,
> >                       0, NULL, NULL, 0);
> > }
> 
> Is it a problem if multiple threads all hit the defer_thread_futex==-1
> case simultaneously?

It should not be, since what we'll have is, e.g.:

thread 1 calls wakeup
thread 2 calls wakeup
thread 3 calls wait

(thread 3 is waiting on the futex, defer_thread_futex = -1)
- thread 1 sees defer_thread_futex==-1
- thread 2 sees defer_thread_futex==-1
- thread 1 sets defer_thread_futex = 0
- thread 2 sets defer_thread_futex = 0
- thread 1 calls futex() to wake up the waiter, expect 0
- thread 2 calls futex() to wake up the waiter, expect 0

Basically, what happens in this scenario is that the first futex()
call will wake up any waiter, and the second will be a no-op.

Let's complicate this, if we have thread 3 running wait_defer()
concurrently:

- thread 3 decrements defer_thread_futex
- thread 1 sees defer_thread_futex==-1
- thread 2 sees defer_thread_futex==-1
- thread 1 sets defer_thread_futex = 0
- thread 2 sets defer_thread_futex = 0
- thread 1 calls futex() to wake up the waiter, expect 0
- thread 2 calls futex() to wake up the waiter, expect 0
- thread 3 calls futex() to wait, expect -1
  Returns immediately because defer_thread_futex == 0

Other scenario, where thread decrements defer_thread_futex a bit later:

- thread 1 sees defer_thread_futex==0
- thread 2 sees defer_thread_futex==0
- thread 3 decrements defer_thread_futex
- thread 3 tests defer_thread_futex==-1
- thread 3 calls futex() to wait, expect -1

In this scenario, we have to notice that if threads 1/2 enqueued tasks
to do before checking defer_thread_futex, these tasks would not be seen
by the waiter thread.

So correct memory ordering of:

- wake_up_defer:
  * queue callbacks to perform (1)
  * wake up (2)

- wait_defer:
  * for (;;)
    * wait for futex (3)
    * sleep 100ms (wait for more callbacks to be enqueued)
    * dequeue callbacks, execute them (4)


actually matters. I'll have to be really careful about that (unless we
just accept that tasks to perform could be queued for a while, however,
I'd like to give an upper bound to the delay between batch callback
execution).

Ensuring that 1 is written before 2, and that 4 is done before 3 seems a
bit racy. (I have to got out for lunch now, so I'll have to review the
ordering afterward)


>  If so, maybe this should use an atomic
> test-and-set operation so that only one thread actually calls futex().

It's not a matter if many threads wake up the waiter, so I don't think
the test-and-set is required. The benefit of using a simple test here is
that we don't have to bring the cache-line in exclusive mode to the
local CPU to perform the test. It can stay shared.

> 
> > /*
> >  * Defer thread waiting. Single thread.
> >  */
> > static void wait_defer(void)
> > {
> >         atomic_dec(&defer_thread_futex);
> >         if (atomic_read(&defer_thread_futex) == -1)
> >                 futex(&defer_thread_futex, FUTEX_WAIT, -1,
> >                       NULL, NULL, 0);
> > }
> 
> Is it a problem if the value of defer_thread_futex changes to zero after
> the dec but before the test?

No. That's not a problem, because this means there is a concurrent "wake
up". Seeing a value of "0" here will skip over the futex wait and go on.
The concurrent futex wakeup call will simply be a no-op in that case.

Thanks for the comments,

Mathieu

> 
> Chris

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ