Message-ID: <e791dbb534aac79805389a4b754901c24991de89.camel@physik.fu-berlin.de>
Date: Sun, 07 Sep 2025 20:33:14 +0200
From: John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>
To: Michael Karcher <kernel@...rcher.dialup.fu-berlin.de>, Andreas Larsson
<andreas@...sler.com>
Cc: sparclinux@...r.kernel.org, linux-kernel@...r.kernel.org, Anthony Yznaga
<anthony.yznaga@...cle.com>, René Rebe
<rene@...ctcode.com>
Subject: Re: [PATCH v4 2/5] sparc: fix accurate exception reporting in
copy_{from_to}_user for UltraSPARC III
Hi,
On Sun, 2025-09-07 at 19:49 +0200, John Paul Adrian Glaubitz wrote:
> Michael suggested switching to the generic copy_{to,from}_user code offlist
> to verify this:
>
> diff --git a/arch/sparc/kernel/head_64.S b/arch/sparc/kernel/head_64.S
> index c305486501dc..cd1a96a918b3 100644
> --- a/arch/sparc/kernel/head_64.S
> +++ b/arch/sparc/kernel/head_64.S
> @@ -687,7 +687,7 @@ cheetah_tlb_fixup:
> stw %g2, [%g1 + %lo(tlb_type)]
>
> /* Patch copy/page operations to cheetah optimized versions. */
> - call cheetah_patch_copyops
> + call generic_patch_copyops
> nop
> call cheetah_patch_copy_page
> nop
>
> The kernel still crashes, even when using the generic code.
>
> So, this particular issue is not rooted in the U3_copy_{to,from}_user code.
Replacing "call cheetah_patch_copy_page" with a nop doesn't help either:
diff --git a/arch/sparc/kernel/head_64.S b/arch/sparc/kernel/head_64.S
index c305486501dc..ed859bae5175 100644
--- a/arch/sparc/kernel/head_64.S
+++ b/arch/sparc/kernel/head_64.S
@@ -689,7 +689,7 @@ cheetah_tlb_fixup:
/* Patch copy/page operations to cheetah optimized versions. */
call cheetah_patch_copyops
nop
- call cheetah_patch_copy_page
+ nop
nop
call cheetah_patch_cachetlbops
nop
[ 140.207051] systemd-sysv-generator[1037]: SysV service '/etc/init.d/buildd' lacks a native systemd unit file, automatically generating a unit file for compatibility.
[ 140.401791] systemd-sysv-generator[1037]: Please update package to include a native systemd unit file.
[ 140.525028] systemd-sysv-generator[1037]: ⚠ This compatibility logic is deprecated, expect removal soon. ⚠
[ 147.718747] systemd-sysv-generator[1093]: SysV service '/etc/init.d/buildd' lacks a native systemd unit file, automatically generating a unit file for compatibility.
[ 147.913402] systemd-sysv-generator[1093]: Please update package to include a native systemd unit file.
[ 148.036530] systemd-sysv-generator[1093]: ⚠ This compatibility logic is deprecated, expect removal soon. ⚠
[ 149.208409] Unable to handle kernel NULL pointer dereference
[ 149.282820] tsk->{mm,active_mm}->context = 00000000000000ab
[ 149.356117] tsk->{mm,active_mm}->pgd = fff0000008830000
[ 149.424819] \|/ ____ \|/
[ 149.424819] "@'/ .. \`@"
[ 149.424819] /_| \__/ |_\
[ 149.424819] \__U_/
[ 149.618139] systemd(1): Oops [#1]
[ 149.661684] CPU: 0 UID: 0 PID: 1 Comm: systemd Not tainted 6.17.0-rc4+ #16 NONE
[ 149.758917] TSTATE: 0000004411001602 TPC: 00000000006260a4 TNPC: 00000000006260a8 Y: ffffffff Not tainted
[ 149.888258] TPC: <bpf_patch_insn_data+0x204/0x2e0>
[ 149.951255] g0: 0000000000000000 g1: 0000000000000000 g2: 0000000000000036 g3: fff0000012178b28
[ 150.065638] g4: fff0000000236300 g5: fff000023e336000 g6: fff000000026c000 g7: 0000000000000001
[ 150.180010] o0: 0000000100880000 o1: 0000000000000000 o2: 0000000000000001 o3: 0000000000000001
[ 150.294387] o4: fff00000046f42a0 o5: 0000000000000001 sp: fff000000026efb1 ret_pc: 0000000000626058
[ 150.413336] RPC: <bpf_patch_insn_data+0x1b8/0x2e0>
[ 150.476236] l0: fff0000012178000 l1: 0000000100874048 l2: 0000000000000001 l3: 0000000100880000
[ 150.590616] l4: 0000000100874068 l5: 0000000000000005 l6: 0000000000000000 l7: fff000001217e128
[ 150.704994] i0: 0000000100874000 i1: 0000000000000004 i2: 0000000000000005 i3: 0000000000000002
[ 150.819434] i4: 0000000100888000 i5: fff0000012178ae8 i6: fff000000026f061 i7: 000000000064b0e8
[ 150.933878] I7: <bpf_check+0x1988/0x34a0>
[ 150.986575] Call Trace:
[ 151.018687] [<000000000064b0e8>] bpf_check+0x1988/0x34a0
[ 151.088456] [<000000000061bf2c>] bpf_prog_load+0x8ec/0xc80
[ 151.160510] [<000000000061db44>] __sys_bpf+0xd04/0x25a0
[ 151.229138] [<000000000061f9f8>] sys_bpf+0x18/0x60
[ 151.292041] [<0000000000406274>] linux_sparc_syscall+0x34/0x44
[ 151.368678] Disabling lock debugging due to kernel taint
[ 151.438440] Caller[000000000064b0e8]: bpf_check+0x1988/0x34a0
[ 151.513936] Caller[000000000061bf2c]: bpf_prog_load+0x8ec/0xc80
[ 151.591704] Caller[000000000061db44]: __sys_bpf+0xd04/0x25a0
[ 151.666051] Caller[000000000061f9f8]: sys_bpf+0x18/0x60
[ 151.734676] Caller[0000000000406274]: linux_sparc_syscall+0x34/0x44
[ 151.817025] Caller[fff000010099b80c]: 0xfff000010099b80c
[ 151.886791] Instruction DUMP:
[ 151.886795] 326ffffa
[ 151.925677] c4004000
[ 151.956558] c25e2038
[ 151.987440] <c4006118>
[ 152.018326] 80a0a000
[ 152.049204] 04400014
[ 152.080083] c2586100
[ 152.110960] 8400bfff
[ 152.141845] 8e00606c
[ 152.172726]
[ 152.223054] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[ 152.323706] Press Stop-A (L1-A) from sun keyboard or send break
[ 152.323706] twice on console to return to the boot prom
[ 152.470098] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 ]---
Replacing all three calls with nops makes the kernel crash even earlier, during boot:
diff --git a/arch/sparc/kernel/head_64.S b/arch/sparc/kernel/head_64.S
index c305486501dc..1e2737649d46 100644
--- a/arch/sparc/kernel/head_64.S
+++ b/arch/sparc/kernel/head_64.S
@@ -687,11 +687,11 @@ cheetah_tlb_fixup:
stw %g2, [%g1 + %lo(tlb_type)]
/* Patch copy/page operations to cheetah optimized versions. */
- call cheetah_patch_copyops
nop
- call cheetah_patch_copy_page
nop
- call cheetah_patch_cachetlbops
+ nop
+ nop
+ nop
nop
ba,a,pt %xcc, tlb_fixup_done
[ 42.061355] decompression failed with status 7
[ 42.172976] SCSI subsystem initialized
[ 42.254511] decompression failed with status 7
[ 42.462903] Unable to handle kernel NULL pointer dereference
[ 42.537392] tsk->{mm,active_mm}->context = 000000000000004d
[ 42.610625] tsk->{mm,active_mm}->pgd = fff0000008954000
[ 42.679246] \|/ ____ \|/
[ 42.679246] "@'/ .. \`@"
[ 42.679246] /_| \__/ |_\
[ 42.679246] \__U_/
[ 42.872571] (udev-worker)(96): Oops [#1]
[ 42.924111] CPU: 0 UID: 0 PID: 96 Comm: (udev-worker) Not tainted 6.17.0-rc4+ #14 NONE
[ 43.029343] TSTATE: 0000000011001601 TPC: 0000000000f6875c TNPC: 0000000000f68760 Y: 00000000 Not tainted
[ 43.158584] TPC: <strcmp+0x1c/0x60>
[ 43.204430] g0: 0000000000000000 g1: 0000000000000000 g2: 000000000000006f g3: 000000001001a130
[ 43.318825] g4: fff000000873cd00 g5: fff000023e336000 g6: fff00000088c4000 g7: 000000001001a058
[ 43.433291] o0: 00000009e2fc2857 o1: 0000000000000000 o2: 0000000000000001 o3: 0000000000000000
[ 43.547667] o4: 0000000000000dc0 o5: 0000000000000dc0 sp: fff00000088c6f21 ret_pc: 00000000005a1fe4
[ 43.666617] RPC: <trace_clock_local+0x4/0x20>
[ 43.723797] l0: fff000000004c798 l1: 0000000000000001 l2: fff000000004c510 l3: 0000000000000000
[ 43.838177] l4: 0000000000000000 l5: 00000000014da748 l6: 00000000012a7ef8 l7: 0000000000000000
[ 43.952553] i0: 0000000010076e97 i1: 0000000000000000 i2: 00000000015370f8 i3: 0000000000000000
[ 44.066928] i4: 0000000000000000 i5: 0000000000000dc0 i6: fff00000088c6fd1 i7: 000000000053a2d0
[ 44.181303] I7: <cmp_name+0x10/0x20>
[ 44.228190] Call Trace:
[ 44.260219] [<000000000053a2d0>] cmp_name+0x10/0x20
[ 44.324268] [<0000000000a20dc0>] bsearch+0x20/0x60
[ 44.387173] [<000000000053a45c>] find_exported_symbol_in_section+0x5c/0xc0
[ 44.477532] [<000000000053ba50>] find_symbol+0xd0/0x160
[ 44.546153] [<000000000053e76c>] load_module+0x1acc/0x22c0
[ 44.618211] [<000000000053f16c>] init_module_from_file+0x6c/0xc0
[ 44.697130] [<000000000053f3cc>] sys_finit_module+0x1ac/0x300
[ 44.772618] [<0000000000406274>] linux_sparc_syscall+0x34/0x44
[ 44.849248] Disabling lock debugging due to kernel taint
[ 44.919018] Caller[000000000053a2d0]: cmp_name+0x10/0x20
[ 44.988784] Caller[0000000000a20dc0]: bsearch+0x20/0x60
[ 45.057412] Caller[000000000053a45c]: find_exported_symbol_in_section+0x5c/0xc0
[ 45.153487] Caller[000000000053ba50]: find_symbol+0xd0/0x160
[ 45.227831] Caller[000000000053e76c]: load_module+0x1acc/0x22c0
[ 45.305604] Caller[000000000053f16c]: init_module_from_file+0x6c/0xc0
[ 45.390244] Caller[000000000053f3cc]: sys_finit_module+0x1ac/0x300
[ 45.471447] Caller[0000000000406274]: linux_sparc_syscall+0x34/0x44
[ 45.553799] Caller[fff000010470e2fc]: 0xfff000010470e2fc
[ 45.623566] Instruction DUMP:
[ 45.623569] 2240000b
[ 45.662452] b0102000
[ 45.693333] c40e0001
[ 45.724211] <c60e4001>
[ 45.755093] 80a08003
[ 45.785978] 024ffffa
[ 45.816857] 82006001
[ 45.847737] b0102001
[ 45.878620] b16567ff
[ 45.909502]
[ 63.467354] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 63.545187] rcu: (detected by 0, t=5252 jiffies, g=-87, q=21 ncpus=1)
[ 63.630966] rcu: All QSes seen, last rcu_sched kthread activity 5252 (4294906056-4294900804), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 63.792241] rcu: rcu_sched kthread starved for 5252 jiffies! g-87 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 63.922625] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[ 64.040431] rcu: RCU grace-period kthread stack dump:
[ 64.106766] task:rcu_sched state:R running task stack:0 pid:15 tgid:15 ppid:2 task_flags:0x208040 flags:0x07000000
[ 64.272615] Call Trace:
[ 64.304632] [<0000000000f8857c>] schedule+0x1c/0x180
[ 64.369827] [<0000000000f8f61c>] schedule_timeout+0x5c/0xe0
[ 64.443027] [<0000000000529550>] rcu_gp_fqs_loop+0x130/0x540
[ 64.517372] [<000000000052e6f4>] rcu_gp_kthread+0x174/0x200
[ 64.590571] [<00000000004aa700>] kthread+0xe0/0x280
[ 64.654620] [<00000000004060c8>] ret_from_fork+0x1c/0x2c
[ 64.724391] [<0000000000000000>] 0x0
[ 64.771284] rcu: Stack dump where RCU GP kthread last ran:
[ 64.843343] CPU: 0 UID: 0 PID: 96 Comm: (udev-worker) Tainted: G D 6.17.0-rc4+ #14 NONE
[ 64.969158] Tainted: [D]=DIE
[ 65.006896] TSTATE: 0000008080001606 TPC: 00000000007a0fa0 TNPC: 00000000007a0fa4 Y: 00000000 Tainted: G D
[ 65.156733] TPC: <count_memcg_events+0x100/0x200>
[ 65.218489] g0: fff000023f804f78 g1: 00000000014e1f00 g2: 00000000014e6340 g3: 00000000014e2100
[ 65.332869] g4: fff000000873cd00 g5: fff000023e336000 g6: fff00000088c4000 g7: fff000023f81c350
[ 65.447243] o0: fff000000025a880 o1: 0000000000000000 o2: fff000000825a0c8 o3: 80000002026d6fb2
[ 65.561617] o4: 0000000000000000 o5: 0000000000000000 sp: fff00000088c6491 ret_pc: 00000000007a0f94
[ 65.680568] RPC: <count_memcg_events+0xf4/0x200>
[ 65.741182] l0: 0000000000100073 l1: fff000023f804f38 l2: fff000023f804f78 l3: fff000023f804fb8
[ 65.855563] l4: 0000000000000000 l5: 0000000000000005 l6: 0000000000000000 l7: 0000000000000008
[ 65.969937] i0: fff000000025a880 i1: 000000000000000e i2: 0000000000000001 i3: fff0000008256820
[ 66.084313] i4: 0000000000000001 i5: 00000000014f9000 i6: fff00000088c6541 i7: 0000000000722890
[ 66.198686] I7: <handle_mm_fault+0x190/0x2e0>
[ 66.255870] Call Trace:
[ 66.287895] [<0000000000722890>] handle_mm_fault+0x190/0x2e0
[ 66.362241] [<0000000000f92e00>] do_sparc64_fault+0x6c0/0xb20
[ 66.437727] [<0000000000407714>] sparc64_realfault_common+0x10/0x20
[ 66.520077] [<0000000000562070>] exit_robust_list+0x10/0x120
[ 66.594422] [<0000000000562710>] futex_exit_release+0x70/0xc0
[ 66.669910] [<000000000047b48c>] exit_mm_release+0xc/0x40
[ 66.740821] [<0000000000483ab8>] do_exit+0x198/0xb80
[ 66.806014] [<0000000000484528>] make_task_dead+0x88/0x160
[ 66.878070] [<0000000000428374>] die_if_kernel+0x260/0x26c
[ 66.950126] [<0000000000f9271c>] unhandled_fault+0x88/0xac
[ 67.022184] [<0000000000f92af0>] do_sparc64_fault+0x3b0/0xb20
[ 67.097670] [<0000000000407714>] sparc64_realfault_common+0x10/0x20
[ 67.180022] [<0000000000f6875c>] strcmp+0x1c/0x60
[ 67.241784] [<000000000053a2d0>] cmp_name+0x10/0x20
[ 67.305833] [<0000000000a20dc0>] bsearch+0x20/0x60
[ 67.368740] [<000000000053a45c>] find_exported_symbol_in_section+0x5c/0xc0
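As a cross-check of the two oopses above, the words marked with <...> in the instruction dumps can be decoded by hand. The following is only a minimal Python sketch, not part of the kernel or of any existing tool: decode_load, REGS and LOAD_OP3 are names made up for this illustration, and only a handful of SPARC V9 integer load opcodes are handled.

```python
# Sketch: decode a SPARC V9 format-3 load instruction taken from the
# "Instruction DUMP" section of an oops. All names here are hypothetical
# helpers for illustration; only a few integer load opcodes are covered.

# Integer register names r0..r31 in SPARC order: globals, outs, locals, ins.
REGS = [f"%{bank}{n}" for bank in "goli" for n in range(8)]

# op3 values for the loads handled by this sketch.
LOAD_OP3 = {0x00: "lduw", 0x01: "ldub", 0x02: "lduh", 0x08: "ldsw", 0x0B: "ldx"}


def decode_load(word):
    """Return an assembly-like string for a load word, or a raw .word line."""
    op = word >> 30              # bits 31..30, 3 means load/store
    rd = (word >> 25) & 0x1F     # destination register
    op3 = (word >> 19) & 0x3F    # load/store opcode
    rs1 = (word >> 14) & 0x1F    # base register
    imm = (word >> 13) & 1       # 1: simm13 offset, 0: register offset
    if op != 3 or op3 not in LOAD_OP3:
        return f".word 0x{word:08x}"
    if imm:
        simm13 = word & 0x1FFF
        if simm13 & 0x1000:      # sign-extend the 13-bit immediate
            simm13 -= 0x2000
        sign = "-" if simm13 < 0 else "+"
        addr = f"{REGS[rs1]} {sign} {abs(simm13):#x}"
    else:
        addr = f"{REGS[rs1]} + {REGS[word & 0x1F]}"
    return f"{LOAD_OP3[op3]} [{addr}], {REGS[rd]}"


# Faulting words from the two instruction dumps above.
print(decode_load(0xC4006118))
print(decode_load(0xC60E4001))
```

If this decoding is right, both faulting instructions are loads whose address registers (g1 in the first dump, i1 and g1 in the second) are zero in the corresponding register dumps, which would match the reported NULL pointer dereferences.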
I assume that cheetah_patch_cachetlbops has to be invoked on UltraSPARC III
because other code depends on it. On the other hand, the TLB code for
UltraSPARC III was heavily overhauled in 2016 [1], and that overhaul was
followed by a bug fix [2].
Chances are there are still bugs in the code introduced in [1].
Adrian
> [1] https://github.com/torvalds/linux/commit/a74ad5e660a9ee1d071665e7e8ad822784a2dc7f
> [2] https://github.com/torvalds/linux/commit/d3c976c14ad8af421134c428b0a89ff8dd3bd8f8
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913