linux-kernel - Re: [PATCH 1/1] Fix: trace sched switch start/stop racy updates

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20190821132310.GC28441@linux.ibm.com>
Date:   Wed, 21 Aug 2019 06:23:10 -0700
From:   "Paul E. McKenney" <paulmck@...ux.ibm.com>
To:     Will Deacon <will@...nel.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Joel Fernandes <joel@...lfernandes.org>,
        Alan Stern <stern@...land.harvard.edu>,
        rostedt <rostedt@...dmis.org>,
        Valentin Schneider <valentin.schneider@....com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Boqun Feng <boqun.feng@...il.com>,
        Will Deacon <will.deacon@....com>,
        David Howells <dhowells@...hat.com>
Subject: Re: [PATCH 1/1] Fix: trace sched switch start/stop racy updates

On Wed, Aug 21, 2019 at 11:32:01AM +0100, Will Deacon wrote:
> On Tue, Aug 20, 2019 at 01:29:32PM -0700, Paul E. McKenney wrote:
> > On Tue, Aug 20, 2019 at 03:56:12PM +0200, Peter Zijlstra wrote:
> > > On Sat, Aug 17, 2019 at 01:08:02AM -0700, Linus Torvalds wrote:
> > > 
> > > > The data tearing issue is almost a non-issue. We're not going to add
> > > > WRITE_ONCE() to these kinds of places for no good reason.
> > > 
> > > Paulmck actually has an example of that somewhere; ISTR that particular
> > > case actually got fixed by GCC, but I'd really _love_ for some compiler
> > > people (both GCC and LLVM) to state that their respective compilers will
> > > not do load/store tearing for machine word sized load/stores.
> > 
> > I do very much recall such an example, but I am now unable to either
> > find it or reproduce it.  :-/
> > 
> > If I cannot turn it up in a few days, I will ask the LWN editors to
> > make appropriate changes to the "Who is afraid" article.
> > 
> > > Without this written guarantee (which supposedly was in older GCC
> > > manuals but has since gone missing), I'm loathe to rely on it.
> > > 
> > > Yes, it is very rare, but it is a massive royal pain to debug if/when it
> > > does do happen.
> > 
> > But from what I can see, Linus is OK with use of WRITE_ONCE() for data
> > races on any variable for which there is at least one READ_ONCE().
> > So we can still use WRITE_ONCE() as we would like in our own code.
> > Yes, you or I might be hit by someone else's omission of WRITE_ONCE(),
> > it is better than the proverbial kick in the teeth.
> > 
> > Of course, if anyone knows of a compiler/architecture combination that
> > really does tear stores of 32-bit constants, please do not keep it
> > a secret!  After all, it would be good to get that addressed easily
> > starting now rather than after a difficult and painful series of
> > debugging sessions.
> 
> It's not quite what you asked for, but if you look at the following
> silly code:
> 
> typedef unsigned long long u64;
> 
> struct data {
> 	u64 arr[1023];
> 	u64 flag;
> };
> 
> void foo(struct data *x)
> {
> 	int i;
> 
> 	for (i = 0; i < 1023; ++i)
> 		x->arr[i] = 0;
> 
> 	x->flag = 0;
> }
> 
> void bar(u64 *x)
> {
> 	*x = 0xabcdef10abcdef10;
> }
> 
> Then arm64 clang (-O2) generates the following for foo:
> 
> foo:                                    // @foo
> 	stp	x29, x30, [sp, #-16]!   // 16-byte Folded Spill
> 	orr	w2, wzr, #0x2000
> 	mov	w1, wzr
> 	mov	x29, sp
> 	bl	memset
> 	ldp	x29, x30, [sp], #16     // 16-byte Folded Reload
> 	ret
> 
> and so the store to 'flag' has become part of the memset, which could
> easily be bytewise in terms of atomicity (and this isn't unlikely given
> we have a DC ZVA instruction which only guaratees bytewise atomicity).
> 
> GCC (also -O2) generates the following for bar:
> 
> bar:
> 	mov	w1, 61200
> 	movk	w1, 0xabcd, lsl 16
> 	stp	w1, w1, [x0]
> 	ret
> 
> and so it is using a store-pair instruction to reduce the complexity in
> the immediate generation. Thus, the 64-bit store will only have 32-bit
> atomicity. In fact, this is scary because if I change bar to:
> 
> void bar(u64 *x)
> {
> 	*(volatile u64 *)x = 0xabcdef10abcdef10;
> }
> 
> then I get:
> 
> bar:
> 	mov	w1, 61200
> 	movk	w1, 0xabcd, lsl 16
> 	str	w1, [x0]
> 	str	w1, [x0, 4]
> 	ret
> 
> so I'm not sure that WRITE_ONCE would even help :/

Well, I can have the LWN article cite your email, then.  So thank you
very much!

Is generation of this code for a 64-bit volatile store considered a bug?
Or does ARMv8 exclude the possibility of 64-bit MMIO registers?  And I
would guess that Thomas and Linus would ask a similar bugginess question
for normal stores.  ;-)

> It's worth noting that:
> 
> void baz(atomic_long *x)
> {
> 	atomic_store_explicit(x, 0xabcdef10abcdef10, memory_order_relaxed)
> }
> 
> does the right thing:
> 
> baz:
> 	mov	x1, 61200
> 	movk	x1, 0xabcd, lsl 16
> 	movk	x1, 0xef10, lsl 32
> 	movk	x1, 0xabcd, lsl 48
> 	str	x1, [x0]
> 	ret

OK, the C11 and C++11 guys should be happy with this.

> Whilst these examples may be contrived, I do thing they illustrate that
> we can't simply say "stores to aligned, word-sized pointers are atomic".

And thank you again!

							Thanx, Paul