Message-ID: <20180423205514.GA5876@andrea>
Date:   Mon, 23 Apr 2018 22:55:14 +0200
From:   Andrea Parri <andrea.parri@...rulasolutions.com>
To:     Waiman Long <longman@...hat.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
        Dave Chinner <david@...morbit.com>,
        Eric Sandeen <sandeen@...hat.com>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: [PATCH] locking/rwsem: Synchronize task state & waiter->task of
 readers

Hi Waiman,

On Mon, Apr 23, 2018 at 12:46:12PM -0400, Waiman Long wrote:
> On 04/10/2018 01:22 PM, Waiman Long wrote:
> > It was observed occasionally on PowerPC systems that a reader had
> > not been woken up even though its waiter->task had been cleared.

Can you provide more details about these observations?  (links to LKML
posts, traces, applications used/micro-benchmarks, ...)


> >
> > One probable cause of this missed wakeup may be the fact that the
> > waiter->task and the task state have not been properly synchronized as
> > the lock release-acquire pair of different locks in the wakeup code path
> > does not provide a full memory barrier guarantee.

I guess that by the "pair of different locks" you mean (sem->wait_lock,
p->pi_lock), right?  BTW, __rwsem_down_write_failed_common() is calling
wake_up_q() _before_ releasing the wait_lock: did you intend to exclude
this callsite? (why?)


> > So smp_store_mb()
> > is now used to set waiter->task to NULL to provide a proper memory
> > barrier for synchronization.

Mmh; the patch is not introducing an smp_store_mb()... My guess is that
you are thinking of the sequence:

	smp_store_release(&waiter->task, NULL);
	[...]
	smp_mb(); /* added with your patch */

or what am I missing?


> >
> > Signed-off-by: Waiman Long <longman@...hat.com>
> > ---
> >  kernel/locking/rwsem-xadd.c | 17 +++++++++++++++++
> >  1 file changed, 17 insertions(+)
> >
> > diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
> > index e795908..b3c588c 100644
> > --- a/kernel/locking/rwsem-xadd.c
> > +++ b/kernel/locking/rwsem-xadd.c
> > @@ -209,6 +209,23 @@ static void __rwsem_mark_wake(struct rw_semaphore *sem,
> >  		smp_store_release(&waiter->task, NULL);
> >  	}
> >  
> > +	/*
> > +	 * To avoid missed wakeup of reader, we need to make sure
> > +	 * that task state and waiter->task are properly synchronized.
> > +	 *
> > +	 *     wakeup		      sleep
> > +	 *     ------		      -----
> > +	 * __rwsem_mark_wake:	rwsem_down_read_failed*:
> > +	 *   [S] waiter->task	  [S] set_current_state(state)
> > +	 *	 MB		      MB
> > +	 * try_to_wake_up:
> > +	 *   [L] state		  [L] waiter->task
> > +	 *
> > +	 * For the wakeup path, the original lock release-acquire pair
> > +	 * does not provide enough guarantee of proper synchronization.
> > +	 */
> > +	smp_mb();
> > +
> >  	adjustment = woken * RWSEM_ACTIVE_READ_BIAS - adjustment;
> >  	if (list_empty(&sem->wait_list)) {
> >  		/* hit end of list above */
> 
> Ping!
> 
> Any thought on this patch?
> 
> I am wondering if there is a cheaper way to apply the memory barrier
> just on architectures that need it.

try_to_wake_up() does:

	raw_spin_lock_irqsave(&p->pi_lock, flags);
	smp_mb__after_spinlock();
	if (!(p->state & state))

My understanding is that this smp_mb__after_spinlock() provides us with
the guarantee you described above.  The smp_mb__after_spinlock() should
represent a 'cheaper way' to provide such a guarantee.
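
To make the pattern under discussion concrete, here is a minimal
user-space sketch of the two sides of the diagram quoted above, written
with C11 atomics.  Everything in it (struct model_waiter, model_wake(),
model_prepare_sleep(), the state constants) is a hypothetical stand-in
and not kernel code; the seq_cst fence on the wakeup side merely plays
the role that I believe smp_mb__after_spinlock() plays in
try_to_wake_up(), and the fence on the sleep side plays the role of the
barrier in set_current_state().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the task states; not the kernel's definitions. */
enum { MODEL_TASK_RUNNING = 0, MODEL_TASK_UNINTERRUPTIBLE = 2 };

/* Hypothetical model of a queued reader; not struct rwsem_waiter. */
struct model_waiter {
	_Atomic(void *) task;	/* non-NULL while the reader is still queued */
	atomic_int state;	/* models the sleeping task's ->state */
};

/*
 * Wakeup side:  [S] waiter->task ; MB ; [L] state
 * Returns true if a wakeup was issued, false if the task was not sleeping.
 */
static bool model_wake(struct model_waiter *w)
{
	atomic_store_explicit(&w->task, NULL, memory_order_release);
	atomic_thread_fence(memory_order_seq_cst);	/* the "MB" */
	if (atomic_load_explicit(&w->state, memory_order_relaxed) ==
	    MODEL_TASK_RUNNING)
		return false;
	atomic_store_explicit(&w->state, MODEL_TASK_RUNNING,
			      memory_order_relaxed);
	return true;
}

/*
 * Sleep side:  [S] state ; MB ; [L] waiter->task
 * Returns true if the caller would have to go to sleep, false if the
 * lock has already been handed over (waiter->task seen as NULL).
 */
static bool model_prepare_sleep(struct model_waiter *w)
{
	atomic_store_explicit(&w->state, MODEL_TASK_UNINTERRUPTIBLE,
			      memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* the "MB" */
	if (atomic_load_explicit(&w->task, memory_order_relaxed) == NULL) {
		atomic_store_explicit(&w->state, MODEL_TASK_RUNNING,
				      memory_order_relaxed);
		return false;
	}
	return true;
}
```

With a full fence on both sides, at least one of the two loads must
observe the other side's store: either the sleeper sees waiter->task
== NULL and does not sleep, or the waker sees the non-running state and
issues the wakeup.  Dropping either fence allows both loads to read the
old values, which is the missed wakeup described in the changelog.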

If this understanding is correct, the remaining question would be about
whether you want to rely on (and document) the smp_mb__after_spinlock()
in the callsite in question (the comment in wake_up_q()

   /*
    * wake_up_process() implies a wmb() to pair with the queueing
    * in wake_q_add() so as not to miss wakeups.
    */

does not appear to be sufficient...).

  Andrea


> 
> Cheers,
> Longman
> 
