Message-Id: <11bd3fe9-302c-4554-886b-f4a883f8de36@app.fastmail.com>
Date: Mon, 13 May 2024 08:28:32 +0200
From: "Arnd Bergmann" <arnd@...db.de>
To: "Paul E. McKenney" <paulmck@...nel.org>,
"Akira Yokosawa" <akiyks@...il.com>
Cc: "John Paul Adrian Glaubitz" <glaubitz@...sik.fu-berlin.de>,
"Ivan Kokshaysky" <ink@...assic.park.msu.ru>, linux-alpha@...r.kernel.org,
Linux-Arch <linux-arch@...r.kernel.org>, linux-kernel@...r.kernel.org,
"Matt Turner" <mattst88@...il.com>,
"Richard Henderson" <richard.henderson@...aro.org>,
"Linus Torvalds" <torvalds@...ux-foundation.org>,
"Alexander Viro" <viro@...iv.linux.org.uk>,
"Ulrich Teichert" <krypton@...ich-teichert.org>
Subject: Re: [GIT PULL] alpha: cleanups and build fixes for 6.10

On Mon, May 13, 2024, at 06:03, Paul E. McKenney wrote:
> On Mon, May 13, 2024 at 12:50:07PM +0900, Akira Yokosawa wrote:
>> On Sun, 12 May 2024 07:44:25 -0700, Paul E. McKenney wrote:
>> > On Sun, May 12, 2024 at 08:02:59AM +0200, John Paul Adrian Glaubitz wrote:
>> > So why didn't the people running current mainline on pre-EV56 Alpha
>> > systems notice? One possibility is that they are upgrading their
>> > kernels only occasionally. Another possibility is that they are seeing
>> > the failures, but are not tracing the obtuse failure modes back to the
>> > change(s) in question. Yet another possibility is that the resulting
>> > failures are very low probability, with mean times to failure that are
>> > so long that you won't notice anything on a single system.
>>
>> Another possibility is that the Jensen system was booted into uniprocessor
>> mode. Looking at the early boot log [1] provided by Ulrich (+CCed) back in
>> Sept. 2021, I see the following by running "grep -i cpu":
>>
>> [1] https://marc.info/?l=linux-alpha&m=163265555616841&w=2
>>
>> [ 0.000000] Memory: 90256K/131072K available (8897K kernel code, 9499K rwdata, 2704K rodata, 312K init, 437K bss, 40816K reserved, 0K cma-reserved)
>> [ 0.000000] random: get_random_u64 called from __kmem_cache_create+0x54/0x600 with crng_init=0
>> [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
>>                                                        ^^^^^^
>>
>> Without any concurrent atomic updates, the "broken" atomic accesses won't
>> matter, I guess.
>
> True enough!

On the other hand, you would get the same broken behavior on
any SMP machine running a multiplatform kernel that has support
for EV5 or earlier enabled. It doesn't really matter whether the
hardware supports BWX or not, as long as the compiler doesn't
generate the BWX instructions.

If I understand it correctly, simply running rcutorture on
a large alpha machine with a 'defconfig' kernel from the
past two years should trigger some bugs even if you don't
run into them that frequently on light usage, right?

     Arnd