lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210415152820.GB26439@willie-the-truck>
Date:   Thu, 15 Apr 2021 16:28:21 +0100
From:   Will Deacon <will@...nel.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Ali Saidi <alisaidi@...zon.com>, linux-kernel@...r.kernel.org,
        catalin.marinas@....com, steve.capper@....com,
        benh@...nel.crashing.org, stable@...r.kernel.org,
        Ingo Molnar <mingo@...hat.com>,
        Waiman Long <longman@...hat.com>,
        Boqun Feng <boqun.feng@...il.com>
Subject: Re: [PATCH] locking/qrwlock: Fix ordering in
 queued_write_lock_slowpath

On Thu, Apr 15, 2021 at 05:03:58PM +0200, Peter Zijlstra wrote:
> On Thu, Apr 15, 2021 at 02:25:52PM +0000, Ali Saidi wrote:
> > While this code is executed with the wait_lock held, a reader can
> > acquire the lock without holding wait_lock.  The writer side loops
> > checking the value with the atomic_cond_read_acquire(), but only truly
> > acquires the lock when the compare-and-exchange is completed
> > successfully which isn’t ordered. The other atomic operations from this
> > point are release-ordered and thus reads after the lock acquisition can
> > be completed before the lock is truly acquired which violates the
> > guarantees the lock should be making.
> 
> Should be per who? We explicitly do not order the lock acquire store.
> qspinlock has the exact same issue.
> 
> If you look in the git history surrounding spin_is_locked(), you'll find
> 'lovely' things.
> 
> Basically, if you're doing crazy things with spin_is_locked() you get to
> keep the pieces.
> 
> > Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwloc")
> > Signed-off-by: Ali Saidi <alisaidi@...zon.com>
> > Cc: stable@...r.kernel.org
> > ---
> >  kernel/locking/qrwlock.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
> > index 4786dd271b45..10770f6ac4d9 100644
> > --- a/kernel/locking/qrwlock.c
> > +++ b/kernel/locking/qrwlock.c
> > @@ -73,8 +73,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
> >  
> >  	/* When no more readers or writers, set the locked flag */
> >  	do {
> > -		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> > -	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
> > +		atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
> > +	} while (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING,
> >  					_QW_LOCKED) != _QW_WAITING);
> >  unlock:
> >  	arch_spin_unlock(&lock->wait_lock);
> 
> This doesn't make sense, there is no such thing as a store-acquire. What
> you're doing here is moving the acquire from one load to the next. A
> load we know will load the exact same value.
> 
> Also see Documentation/atomic_t.txt:
> 
>   {}_acquire: the R of the RMW (or atomic_read) is an ACQUIRE
> 
> 
> If anything this code wants to be written like so.
> 
> ---
> 
> diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
> index 4786dd271b45..22aeccc363ca 100644
> --- a/kernel/locking/qrwlock.c
> +++ b/kernel/locking/qrwlock.c
> @@ -60,6 +60,8 @@ EXPORT_SYMBOL(queued_read_lock_slowpath);
>   */
>  void queued_write_lock_slowpath(struct qrwlock *lock)
>  {
> +	u32 cnt;
> +
>  	/* Put the writer into the wait queue */
>  	arch_spin_lock(&lock->wait_lock);
>  
> @@ -73,9 +75,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
>  
>  	/* When no more readers or writers, set the locked flag */
>  	do {
> -		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> -	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
> -					_QW_LOCKED) != _QW_WAITING);
> +		cnt = atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);

I think the issue is that >here< a concurrent reader in interrupt context
can take the lock and release it again, but we could speculate reads from
the critical section up over the later release and up before the control
dependency here...

> +	} while (!atomic_try_cmpxchg_relaxed(&lock->cnts, &cnt, _QW_LOCKED));

... and then this cmpxchg() will succeed, so our speculated stale reads
could be used.

*HOWEVER*

Speculating a read should be fine in the face of a concurrent _reader_,
so for this to be an issue it implies that the reader is also doing some
(atomic?) updates.

Ali?

Will

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ