Message-ID: <alpine.DEB.2.21.2406031953550.9248@angie.orcam.me.uk>
Date: Tue, 2 Jul 2024 00:50:42 +0100 (BST)
From: "Maciej W. Rozycki" <macro@...am.me.uk>
To: "Paul E. McKenney" <paulmck@...nel.org>
cc: John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>,
Arnd Bergmann <arnd@...nel.org>, linux-alpha@...r.kernel.org,
Arnd Bergmann <arnd@...db.de>,
Richard Henderson <richard.henderson@...aro.org>,
Ivan Kokshaysky <ink@...assic.park.msu.ru>,
Matt Turner <mattst88@...il.com>, Alexander Viro <viro@...iv.linux.org.uk>,
Marc Zyngier <maz@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-kernel@...r.kernel.org, Michael Cree <mcree@...on.net.nz>,
Frank Scheiner <frank.scheiner@....de>
Subject: Re: [PATCH 00/14] alpha: cleanups for 6.10

On Mon, 3 Jun 2024, Paul E. McKenney wrote:

> > This is a fairly recent addition, thank you for putting it all together.
> > I used to rely solely on Documentation/memory-barriers.txt. Thanks for
> > the reference.
>
> It has been in the kernel since April 2018, but OK. And a big "thank you"

When you've been around for 25+ years, 5 years back seems like yesterday.

> to all the people who made this possible and who continue contributing
> to it. And Documentation/memory-barriers.txt still matters, though the
> long-term goal is for it to be subsumed into tools/memory-model. Things
> like compiler optimizations make this difficult, but not impossible.

I realise these are tough matters and I second your gratitude.

> Another precaution is to ensure that any constraints of a non-common-case
> architecture be tested for. For example, if I add a 64-bit divide, I
> get yelled at promptly. In contrast, that long list of byte accesses
> that Arnd posted were suffered in silence. So they accumulated well
> past the point where they can reasonably be backed out.

Well, it's easy to notice and yell when you get an unresolved link-time
reference to __divdi3 or suchlike. Heisenbugs such as those caused by the
race condition from concurrent unprotected RMW accesses, on the other hand,
may all too easily be blamed on cosmic rays or some other random instability.
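
To illustrate the kind of race meant here, below is a minimal sketch in C.
It assumes a machine that can only load and store whole 32-bit words, so
that a byte store has to be carried out as a read-modify-write of the
containing word. The helper names (rmw_load, rmw_insert, rmw_store,
lost_update_demo) are hypothetical and are neither kernel nor Alpha code;
the two "CPUs" are replayed in one fixed interleaving within a single
thread, so that the outcome is deterministic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the "read" step of an emulated byte store. */
static uint32_t rmw_load(const uint32_t *word)
{
    return *word;
}

/* Hypothetical helper: the "modify" step, replacing byte idx (0..3). */
static uint32_t rmw_insert(uint32_t old, unsigned int idx, uint8_t val)
{
    unsigned int shift = idx * 8;

    assert(idx < 4);
    return (old & ~(UINT32_C(0xff) << shift)) | ((uint32_t)val << shift);
}

/* Hypothetical helper: the "write" step, storing the whole word back. */
static void rmw_store(uint32_t *word, uint32_t val)
{
    *word = val;
}

/*
 * Replay one unlucky interleaving of two CPUs writing to adjacent bytes
 * of the same word with no lock held.  Returns the final word contents.
 */
static uint32_t lost_update_demo(void)
{
    uint32_t shared = 0;
    uint32_t a, b;

    a = rmw_load(&shared);          /* CPU A reads the word */
    b = rmw_load(&shared);          /* CPU B reads the same, stale word */
    a = rmw_insert(a, 0, 0x11);     /* CPU A updates byte 0 in its copy */
    b = rmw_insert(b, 1, 0x22);     /* CPU B updates byte 1 in its copy */
    rmw_store(&shared, a);          /* byte 0 is now 0x11 */
    rmw_store(&shared, b);          /* CPU B writes back its stale byte 0 */

    return shared;                  /* CPU A's update has been lost */
}
```

Neither CPU has touched the other's byte on purpose, yet with this
interleaving the write to byte 0 vanishes. On real hardware the window is
a few instructions wide, which is why such a failure shows up rarely and
looks random when it does.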

Take for example the GCC bug I mentioned in my reply to Linus in this
thread, GCC PR rtl-optimization/115565. It took 20 years to spot, even
though it's in heavily used code and it does not depend on timing: with
the right conditions it will trigger every time.

Had I been aware of these issues, I would definitely have got to them
sooner. Anyway, as mentioned in the other reply, I've overcome system
setup issues now and will be working on the problem discussed here.

  Maciej