Message-Id: <59198081-15e2-4b02-934f-c34dd1a0ac93@app.fastmail.com>
Date: Tue, 22 Apr 2025 12:16:33 +0200
From: "Arnd Bergmann" <arnd@...db.de>
To: "kernel test robot" <oliver.sang@...el.com>
Cc: oe-lkp@...ts.linux.dev, "kernel test robot" <lkp@...el.com>,
linux-kernel@...r.kernel.org, "Ingo Molnar" <mingo@...nel.org>,
"Linus Torvalds" <torvalds@...ux-foundation.org>
Subject: Re: [linus:master] [x86/cpu] f388f60ca9:
BUG:soft_lockup-CPU##stuck_for#s![swapper:#]
On Mon, Apr 21, 2025, at 10:12, kernel test robot wrote:
> Hello,
>
> by this commit, we notice big config diff [1]
>
> then in this rcutorture tests, parent runs quite clean, f388f60ca9 shows
> various random issues.
Thanks for the report!

From my initial reading, my patch itself is correct and most likely
exposed a preexisting bug. It is worth investigating regardless: at a
minimum, we should perhaps prevent an invalid configuration from
building or from booting.
> config: i386-randconfig-r071-20250410
Generally, I would not expect 'randconfig' kernels to pass all tests.
What happened here is that removing the CONFIG_MK8 option made it
pick a completely different CPU.
> compiler: gcc-12
> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
The most relevant options here are:
-# CONFIG_M486SX is not set
+CONFIG_M486SX=y
# CONFIG_SMP is not set
CONFIG_X86_GENERIC=y
In theory, setting X86_GENERIC should make a kernel built for an
older CPU work on any newer one. In practice, I'm not surprised
that this fails: While AMD K8 is ten years older than Intel Sandy
Bridge, they are architecturally still very similar. The i486SX
is another decade older, but its design is as far removed from
both K8 and Sandy Bridge as it gets.
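
To see which options a randconfig change actually flipped, as in the
short list above, two .config files can be compared option by option.
The following is a minimal sketch, not part of the kernel tree: the
helper names are hypothetical, and the two sample strings are made-up
excerpts modelled on the options mentioned in this thread, not the
real configs from the report.

```python
import re

# Matches "CONFIG_FOO=value" and "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {option: value}; options that are "not set" map to "n"."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def changed_options(old_text, new_text):
    """Return {option: (old, new)} for options whose value differs.

    Assumption: an option missing from one file is treated as "n",
    since Kconfig omits options whose dependencies are not met.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    return {
        name: (old.get(name, "n"), new.get(name, "n"))
        for name in sorted(set(old) | set(new))
        if old.get(name, "n") != new.get(name, "n")
    }


# Hypothetical excerpts, for illustration only.
parent = (
    "CONFIG_MK8=y\n"
    "# CONFIG_M486SX is not set\n"
    "# CONFIG_SMP is not set\n"
    "CONFIG_X86_GENERIC=y\n"
    "CONFIG_X86_CMPXCHG64=y\n"
)
commit = (
    "CONFIG_M486SX=y\n"
    "# CONFIG_SMP is not set\n"
    "CONFIG_X86_GENERIC=y\n"
)

for name, (before, after) in changed_options(parent, commit).items():
    print(f"{name}: {before} -> {after}")
```

Unchanged options such as CONFIG_SMP and CONFIG_X86_GENERIC drop out,
which leaves only the CPU selection and the options that follow from it.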
It would be nice to not have to support 486sx any more.
We have discussed removing support for older CPUs without
TSC, FPU and CX8 in the past, but so far always kept them
around.
> [ 721.016779][ C0] hardirqs last disabled at (159506):
> sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1049)
> [ 721.016779][ C0] softirqs last enabled at (159174): handle_softirqs
> (kernel/softirq.c:408 kernel/softirq.c:589)
> [ 721.016779][ C0] softirqs last disabled at (159159): __do_softirq
> (kernel/softirq.c:596)
> [ 721.016779][ C0] CPU: 0 UID: 0 PID: 1 Comm: swapper Not tainted
> 6.14.0-rc3-00037-gf388f60ca904 #1
> [ 721.016779][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX,
> 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [ 721.016779][ C0] EIP: timekeeping_notify
> (kernel/time/timekeeping.c:1522)
The timekeeping code could be related: CONFIG_X86_TSC is disabled for
i486SX configurations, so even if a TSC is present in the emulated
machine, it is not being used to measure time accurately.
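
Whether the emulated CPU advertises a TSC at all can be checked from
inside the guest by looking at the "flags" line of /proc/cpuinfo. The
sketch below is only an illustration with hypothetical helper names and
a shortened, made-up sample line. It reports what the CPU advertises,
not what a kernel built without CONFIG_X86_TSC ends up using.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first "flags" line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(cpuinfo_text, wanted=("tsc", "cx8", "fpu")):
    """Return the wanted flags the CPU does not advertise, in order."""
    present = cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in present]


# Made-up sample; on a real guest, pass the contents of /proc/cpuinfo.
sample = "processor\t: 0\nflags\t\t: fpu vme de pse tsc msr pae mce cx8 apic\n"
print(missing_features(sample))
```

An empty list for the guest would mean that the hardware features are
there and that only the build-time configuration leaves them out.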
> -CONFIG_X86_CMPXCHG64=y
This could be another issue if some code relies on the cx8/cmpxchg8b
feature being available. Since this is a non-SMP kernel, it is less
likely to be the cause of the problem.
Can you check what happens when you enable these two options
(X86_TSC and X86_CX8), either by changing CONFIG_M486SX to
CONFIG_M586TSC or with a patch like the one below? Note that
CONFIG_X86_CMPXCHG64 was recently renamed to CONFIG_X86_CX8, but they
are the exact same thing.
diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
index f928cf6e3252..ac6cc69060f1 100644
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -317,7 +317,6 @@ config X86_USE_PPRO_CHECKSUM
 
 config X86_TSC
 	def_bool y
-	depends on (MWINCHIP3D || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MVIAC3_2 || MVIAC7 || MGEODEGX1 || MGEODE_LX || MATOM) || X86_64
 
 config X86_HAVE_PAE
 	def_bool y
@@ -325,7 +324,6 @@ config X86_HAVE_PAE
 
 config X86_CX8
 	def_bool y
-	depends on X86_HAVE_PAE || M586TSC || M586MMX || MK6 || MK7 || MGEODEGX1 || MGEODE_LX
 
 # this should be set for all -march=.. options where the compiler
 # generates cmov.
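
As a rough illustration of why M486SX ends up without either option
while M586TSC gets both, the two "depends on" lines removed above can
be modelled as plain set membership. This is a sketch only: the
function names are hypothetical, and X86_HAVE_PAE is passed in as a
flag because its own dependency list is not part of the hunk.

```python
# CPU options copied from the "depends on" lines removed by the patch.
X86_TSC_CPUS = frozenset({
    "MWINCHIP3D", "MCRUSOE", "MEFFICEON", "MCYRIXIII", "MK7", "MK6",
    "MPENTIUM4", "MPENTIUMM", "MPENTIUMIII", "MPENTIUMII", "M686",
    "M586MMX", "M586TSC", "MVIAC3_2", "MVIAC7", "MGEODEGX1",
    "MGEODE_LX", "MATOM",
})
X86_CX8_CPUS = frozenset({
    "M586TSC", "M586MMX", "MK6", "MK7", "MGEODEGX1", "MGEODE_LX",
})


def x86_tsc_default(cpu, x86_64=False):
    """Model of the X86_TSC dependency before the patch."""
    return x86_64 or cpu in X86_TSC_CPUS


def x86_cx8_default(cpu, have_pae=False):
    """Model of the X86_CX8 dependency before the patch.

    have_pae stands in for X86_HAVE_PAE (an assumption, see above).
    """
    return have_pae or cpu in X86_CX8_CPUS


for cpu in ("M486SX", "M586TSC"):
    print(cpu, x86_tsc_default(cpu), x86_cx8_default(cpu))
```

With the patch applied, both options are unconditionally "y", so the
CPU selection no longer matters for them.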
Arnd