Message-ID: <7e4dfc01e132196d3ff10df18622252a8455d1b8.camel@intel.com>
Date: Tue, 12 Aug 2025 18:49:43 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "glaubitz@...sik.fu-berlin.de" <glaubitz@...sik.fu-berlin.de>,
	"peterz@...radead.org" <peterz@...radead.org>, "mingo@...hat.com"
	<mingo@...hat.com>, "luto@...nel.org" <luto@...nel.org>, "bp@...en8.de"
	<bp@...en8.de>
CC: "sam@...too.org" <sam@...too.org>, "andreas@...sler.com"
	<andreas@...sler.com>, "nadav.amit@...il.com" <nadav.amit@...il.com>,
	"anthony.yznaga@...cle.com" <anthony.yznaga@...cle.com>,
	"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux_dti@...oud.com" <linux_dti@...oud.com>, "will.deacon@....com"
	<will.deacon@....com>, "deneen.t.dock@...el.com" <deneen.t.dock@...el.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>, "tglx@...utronix.de"
	<tglx@...utronix.de>, "linux-security-module@...r.kernel.org"
	<linux-security-module@...r.kernel.org>, "sparclinux@...r.kernel.org"
	<sparclinux@...r.kernel.org>, "hpa@...or.com" <hpa@...or.com>,
	"linux-integrity@...r.kernel.org" <linux-integrity@...r.kernel.org>,
	"daniel@...earbox.net" <daniel@...earbox.net>,
	"kernel-hardening@...ts.openwall.com" <kernel-hardening@...ts.openwall.com>,
	"ast@...nel.org" <ast@...nel.org>, "x86@...nel.org" <x86@...nel.org>,
	"kristen@...ux.intel.com" <kristen@...ux.intel.com>
Subject: Re: [PATCH v5 18/23] bpf: Use vmalloc special flag

On Tue, 2025-08-12 at 20:37 +0200, John Paul Adrian Glaubitz wrote:
> That could be true. I knew about the patch in [1] but I didn't think of applying it.
> 
> FWIW, the crashes we're seeing on recent kernel versions look like this:
> 
> [   40.992851]               \|/ ____ \|/
> [   40.992851]               "@'/ .. \`@"
> [   40.992851]               /_| \__/ |_\
> [   40.992851]                  \__U_/
> [   41.186220] (udev-worker)(88): Kernel illegal instruction [#1]

Possibly re-using a stale executable TLB entry for a VA whose page now has other
data in it.

> [   41.262910] CPU: 0 UID: 0 PID: 88 Comm: (udev-worker) Tainted: G        W          6.12.0+ #25
> [   41.376151] Tainted: [W]=WARN
> [   41.415025] TSTATE: 0000004411001607 TPC: 00000000101c21c0 TNPC: 00000000101c21c4 Y: 00000000    Tainted: G        W         
> [   41.563717] TPC: <ehci_init_driver+0x0/0x160 [ehci_hcd]>
> [   41.633584] g0: 00000000012005b8 g1: 00000000100a1800 g2: 0000000010206000 g3: 00000000101de000
> [   41.747962] g4: fff000000a5af380 g5: 0000000000000000 g6: fff000000aac8000 g7: 0000000000000e7b
> [   41.862338] o0: 0000000010060118 o1: 000000001020a000 o2: fff000000aa30ce0 o3: 0000000000000e7a
> [   41.976728] o4: 00000000ff000000 o5: 00ff000000000000 sp: fff000000aacb091 ret_pc: 00000000101de028
> [   42.095768] RPC: <ehci_pci_init+0x28/0x2000 [ehci_pci]>
> [   42.164394] l0: 0000000000000000 l1: 0000000100043fff l2: ffffffffff800000 l3: 0000000000800000
> [   42.278768] l4: fff00000001c8008 l5: 0000000000000000 l6: 00000000013358e0 l7: 0000000001002800
> [   42.393143] i0: ffffffffffffffed i1: 00000000004db8d8 i2: 0000000000000000 i3: fff000000aa304e0
> [   42.507517] i4: 0000000001127250 i5: 0000000010060000 i6: fff000000aacb141 i7: 0000000000427d90
> [   42.621893] I7: <do_one_initcall+0x30/0x200>
> [   42.677931] Call Trace:
> [   42.709953] [<0000000000427d90>] do_one_initcall+0x30/0x200
> [   42.783158] [<00000000004db908>] do_init_module+0x48/0x240
> [   42.855214] [<00000000004dd82c>] load_module+0x19cc/0x1f20
> [   42.927270] [<00000000004ddf8c>] init_module_from_file+0x6c/0xa0
> [   43.006189] [<00000000004de1e4>] sys_finit_module+0x1c4/0x2c0
> [   43.081677] [<0000000000406174>] linux_sparc_syscall+0x34/0x44
> [   43.158307] Disabling lock debugging due to kernel taint
> [   43.228077] Caller[0000000000427d90]: do_one_initcall+0x30/0x200
> [   43.306995] Caller[00000000004db908]: do_init_module+0x48/0x240
> [   43.384772] Caller[00000000004dd82c]: load_module+0x19cc/0x1f20
> [   43.462544] Caller[00000000004ddf8c]: init_module_from_file+0x6c/0xa0
> [   43.547184] Caller[00000000004de1e4]: sys_finit_module+0x1c4/0x2c0
> [   43.628389] Caller[0000000000406174]: linux_sparc_syscall+0x34/0x44
> [   43.710741] Caller[fff000010480e2fc]: 0xfff000010480e2fc
> [   43.780508] Instruction DUMP:
> [   43.780511]  00000000 
> [   43.819394]  00000000 
> [   43.850273]  00000000 
> [   43.881153] <00000000>
> [   43.912036]  00000000 
> [   43.942917]  00000000 
> [   43.973797]  00000000 
> [   44.004678]  00000000 
> [   44.035561]  00000000 
> [   44.066443]
> 
> Do you have any suggestion what to bisect?

This does look related to kernel range TLB flushes. Not sure how it's related to
userspace huge pages. Perhaps the userspace range TLB flush has issues too? Or
the TLB flush asm needs to be fixed for this other sparc variant?

So far two issues were found with that patch, and both were on rare
architectures with broken kernel TLB flushes. A kernel TLB flush may not
actually be needed for a long time, so the bug probably used to show up as
unexplained crashes after days. VM_FLUSH_RESET_PERMS just made them show up
earlier, in a bisectable way.

