Message-ID: <SJ1PR11MB6129E62E3B372932C6B7477FB9BD2@SJ1PR11MB6129.namprd11.prod.outlook.com>
Date: Wed, 16 Apr 2025 18:09:13 +0000
From: "Borah, Chaitanya Kumar" <chaitanya.kumar.borah@...el.com>
To: "luto@...nel.org" <luto@...nel.org>
CC: "intel-gfx@...ts.freedesktop.org" <intel-gfx@...ts.freedesktop.org>,
	"intel-xe@...ts.freedesktop.org" <intel-xe@...ts.freedesktop.org>, "Kurmi,
 Suresh Kumar" <suresh.kumar.kurmi@...el.com>, "Saarinen, Jani"
	<jani.saarinen@...el.com>, "De Marchi, Lucas" <lucas.demarchi@...el.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Regression on linux-next (next-20250414)

Hello Andy,

I hope you are doing well. I am Chaitanya from the Linux graphics team at Intel.

This mail is regarding a regression we are seeing in our CI runs [1] on the linux-next repository.

Since next-20250414 [2], we are seeing the following warning during early boot:

`````````````````````````````````````````````````````````````````````````````````
<4>[    0.203154] WARNING: CPU: 0 PID: 0 at arch/x86/mm/tlb.c:795 switch_mm_irqs_off+0x389/0x410
<5>[    0.203173] Modules linked in:
<5>[    0.203184] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.15.0-rc2-next-20250414-next-20250414-gb425262c07a6+ #1 PREEMPT(voluntary) 
<5>[    0.203207] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X220.B00.2103302221 03/30/2021
<5>[    0.203229] RIP: 0010:switch_mm_irqs_off+0x389/0x410
<5>[    0.203241] Code: e9 4d fd ff ff be 00 01 00 00 31 ff e8 60 ba f9 ff e9 29 fe ff ff 48 c7 c7 60 25 a1 82 e8 bf 73 a2 00 84 c0 0f 85 d4 fc ff ff <0f> 0b e9 cd fc ff ff bf 0b 01 00 00 be 01 00 00 00 31 d2 e8 1f e9
<5>[    0.203271] RSP: 0000:ffffffff83403d90 EFLAGS: 00010246
<5>[    0.203283] RAX: 0000000000000000 RBX: ffffffff8389f080 RCX: 0000000100a8c000
<5>[    0.203296] RDX: ffffffff83414200 RSI: 0000000000000000 RDI: 0000000000000000
<5>[    0.203309] RBP: ffffffff83403dc8 R08: 000000008d3ea018 R09: 0000000000000000
<5>[    0.203322] R10: 0000000000000000 R11: 0000000003f55067 R12: 0000000000000000
<5>[    0.203335] R13: ffffffff836d0b40 R14: ffffffff83414200 R15: 0000000000000000
<5>[    0.203348] FS:  0000000000000000(0000) GS:ffff8884d94f6000(0000) knlGS:0000000000000000
<5>[    0.203363] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<5>[    0.203374] CR2: ffff88846dfff000 CR3: 000000000344a001 CR4: 00000000003706f0
<5>[    0.203387] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<5>[    0.203400] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<5>[    0.203412] Call Trace:
<5>[    0.203418]  <TASK>
<5>[    0.203428]  use_temporary_mm+0x5b/0x130
<5>[    0.203439]  efi_set_virtual_address_map+0x4c/0x250
<5>[    0.203452]  ? efi_sync_low_kernel_mappings+0x10a/0x220
<5>[    0.203467]  efi_enter_virtual_mode+0x205/0x5b0
<5>[    0.203482]  start_kernel+0xa38/0xc60
<5>[    0.203492]  ? sme_unmap_bootdata+0x14/0x80
<5>[    0.203504]  x86_64_start_reservations+0x18/0x30
<5>[    0.203516]  x86_64_start_kernel+0xbf/0x110
<5>[    0.203526]  ? soft_restart_cpu+0x14/0x14
<5>[    0.203536]  common_startup_64+0x13e/0x141
<5>[    0.203555]  </TASK>
`````````````````````````````````````````````````````````````````````````````````
The detailed log can be found in [3].
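
For anyone scripting over boot logs like the one above, here is a minimal sketch in Python of pulling the warning site out of the first line of the splat. The function name `parse_warning` and the regular expression are illustrative assumptions, not part of any tooling referenced in this mail, and the pattern is only checked against the line format shown above.

```python
import re

# Header line of a kernel WARNING splat as it appears in the log above, e.g.
# "WARNING: CPU: 0 PID: 0 at arch/x86/mm/tlb.c:795 switch_mm_irqs_off+0x389/0x410"
# (hypothetical pattern, written for this one format only)
WARN_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) "
    r"(?P<func>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def parse_warning(log_line):
    """Return a dict describing the warning site, or None if the line does not match."""
    match = WARN_RE.search(log_line)
    if match is None:
        return None
    info = match.groupdict()
    # Convert the purely numeric fields; offset and size stay as hex strings.
    for key in ("cpu", "pid", "line"):
        info[key] = int(info[key])
    return info


sample = (
    "<4>[    0.203154] WARNING: CPU: 0 PID: 0 at arch/x86/mm/tlb.c:795 "
    "switch_mm_irqs_off+0x389/0x410"
)
site = parse_warning(sample)
print(site["file"], site["line"], site["func"])
```

Lines that are not a WARNING header (register dumps, call trace entries) simply return `None`, so the function can be applied to every line of a log.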

After bisecting the tree, the following patch [4] appears to be the first "bad"
commit:

`````````````````````````````````````````````````````````````````````````````````````````````````````````
commit e7021e2fe0b4335523d3f6e2221000bdfc633b62
Author: Andy Lutomirski <luto@...nel.org>
Date:   Wed Apr 2 11:45:39 2025 +0200

    x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery

`````````````````````````````````````````````````````````````````````````````````````````````````````````

We also verified that the issue is no longer seen when the patch is reverted.

Could you please check why the patch causes this regression and provide a fix if necessary?

Thank you.

Regards

Chaitanya

[1] https://intel-gfx-ci.01.org/tree/linux-next/combined-alt.html?
[2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20250414
[3] https://intel-gfx-ci.01.org/tree/linux-next/next-20250414/bat-dg2-8/boot0.txt 
[4] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20250414&id=e7021e2fe0b4335523d3f6e2221000bdfc633b62

