linux-kernel - Re: [PATCH] locking/qrwlock: Fix ordering in queued_write_lock

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YHhhtuVsM4FkINLM@hirez.programming.kicks-ass.net>
Date:   Thu, 15 Apr 2021 17:54:30 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Will Deacon <will@...nel.org>
Cc:     Ali Saidi <alisaidi@...zon.com>, linux-kernel@...r.kernel.org,
        catalin.marinas@....com, steve.capper@....com,
        benh@...nel.crashing.org, stable@...r.kernel.org,
        Ingo Molnar <mingo@...hat.com>,
        Waiman Long <longman@...hat.com>,
        Boqun Feng <boqun.feng@...il.com>
Subject: Re: [PATCH] locking/qrwlock: Fix ordering in
 queued_write_lock_slowpath

On Thu, Apr 15, 2021 at 04:28:21PM +0100, Will Deacon wrote:
> On Thu, Apr 15, 2021 at 05:03:58PM +0200, Peter Zijlstra wrote:
> > @@ -73,9 +75,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
> >  
> >  	/* When no more readers or writers, set the locked flag */
> >  	do {
> > -		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> > -	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
> > -					_QW_LOCKED) != _QW_WAITING);
> > +		cnt = atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
> 
> I think the issue is that >here< a concurrent reader in interrupt context
> can take the lock and release it again, but we could speculate reads from
> the critical section up over the later release and up before the control
> dependency here...

OK, so that was totally not clear from the original changelog.

> > +	} while (!atomic_try_cmpxchg_relaxed(&lock->cnts, &cnt, _QW_LOCKED));
> 
> ... and then this cmpxchg() will succeed, so our speculated stale reads
> could be used.
> 
> *HOWEVER*
> 
> Speculating a read should be fine in the face of a concurrent _reader_,
> so for this to be an issue it implies that the reader is also doing some
> (atomic?) updates.

So we're having something like:

	CPU0				CPU1

	queue_write_lock_slowpath()
	  atomic_cond_read_acquire() == _QW_WAITING

					queued_read_lock_slowpath()
					  atomic_cond_read_acquire()
					  return;
   ,--> (before X's store)
   |					X = y;
   |
   |					queued_read_unlock()
   |					  (void)atomic_sub_return_release()
   |
   |	  atomic_cmpxchg_relaxed(.old = _QW_WAITING)
   |
    `--	r = X;


Which as Will said is a cmpxchg ABA, however for it to be ABA, we have
to observe that unlock-store, and won't we then also have to observe the
whole critical section?

Ah, the issue is yet another load inside our own (CPU0)'s critical
section. which isn't ordered against the cmpxchg_relaxed() and can be
issued before.

So yes, then making the cmpxchg an acquire, will force all our own loads
to happen later.