linux-kernel - Re: possible deadlock in __perf_event_task_sched

Message-ID: <20240424094305.GT40213@noisy.programming.kicks-ass.net>
Date: Wed, 24 Apr 2024 11:43:05 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Marius Fleischer <fleischermarius@...il.com>
Cc: Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Alexei Starovoitov <ast@...nel.org>,
	Daniel Borkmann <daniel@...earbox.net>,
	Andrii Nakryiko <andrii@...nel.org>,
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
	syzkaller@...glegroups.com, harrisonmichaelgreen@...il.com
Subject: Re: possible deadlock in __perf_event_task_sched_in

On Mon, Apr 22, 2024 at 11:44:27AM -0700, Marius Fleischer wrote:
> Hi,
> 
> We would like to report the following bug which has been found by our
> modified version of syzkaller.
> 
> We found this report (https://lkml.org/lkml/2021/9/12/333) that seems
> to have a similar but different stack trace. We are unable to tell,
> though, whether it is the same cause. We’d be grateful for your
> advice.

This is just the printk thing sucking again. Some WARN/printk got tripped
in an unsuitable context.


>  _printk+0xba/0xed kernel/printk/printk.c:2299
>  ex_handler_msr.cold+0xb7/0x147 arch/x86/mm/extable.c:90
>  fixup_exception+0x973/0xbb0 arch/x86/mm/extable.c:187
>  __exc_general_protection arch/x86/kernel/traps.c:601 [inline]
>  exc_general_protection+0xed/0x2f0 arch/x86/kernel/traps.c:562
>  asm_exc_general_protection+0x22/0x30 arch/x86/include/asm/idtentry.h:562
> RIP: 0010:__wrmsr arch/x86/include/asm/msr.h:103 [inline]
> RIP: 0010:native_write_msr arch/x86/include/asm/msr.h:154 [inline]
> RIP: 0010:wrmsrl arch/x86/include/asm/msr.h:271 [inline]
> RIP: 0010:__x86_pmu_enable_event arch/x86/events/intel/../perf_event.h:1120 [inline]
> RIP: 0010:intel_pmu_enable_event+0x2d9/0xff0 arch/x86/events/intel/core.c:2694
> Code: ea 03 49 81 cc 00 00 40 00 4d 21 f4 80 3c 02 00 0f 85 5b 0c 00 00 44 8b ab 70 01 00 00 4c 89 e2 44 89 e0 48 c1 ea 20 44 89 e9 <0f> 30 0f 1f 44 00 00 e8 1b 32 75 00 48 83 c4 20 5b 5d 41 5c 41 5d
> RSP: 0018:ffffc900115af348 EFLAGS: 00010002
> RAX: 0000000000530000 RBX: ffff888019dd6a50 RCX: 0000000000000188
> RDX: 0000000000000002 RSI: ffffffff81029464 RDI: ffff888019dd6bc0
> RBP: 0000000000000000 R08: 0000000000000001 R09: ffff888063e22ab7
> R10: 0000000000000000 R11: 0000000000000001 R12: 0000000200530000
> R13: 0000000000000188 R14: ffffffffffffffff R15: ffff888019dd6bb0
>  x86_pmu_start+0x1cc/0x270 arch/x86/events/core.c:1520
>  x86_pmu_enable+0x481/0xdf0 arch/x86/events/core.c:1337
>  perf_pmu_enable kernel/events/core.c:1243 [inline]
>  perf_pmu_enable kernel/events/core.c:1239 [inline]

Most likely your VM is wonky and perf tries to poke an MSR that either
doesn't exist or isn't emulated properly, who knows.
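
To see which MSR perf was poking, the quoted register dump can be decoded by hand. The sketch below is only an illustration, not part of the original report: it assumes the RAX/RCX/RDX values in the dump are the ones live at the faulting wrmsr (the `<0f> 30` in the Code: line), and it uses the architectural IA32_PERFEVTSELx layout (base 0x186, flag bits 16-23). The helper names are made up for this example. What any bit above 31 means is model-specific and is not stated in the thread, so the sketch only lists those bits.

```python
# Register values copied from the quoted dump (assumed live at the wrmsr).
RAX = 0x0000000000530000
RCX = 0x0000000000000188
RDX = 0x0000000000000002

IA32_PERFEVTSEL0 = 0x186  # architectural base of the event-select MSRs

# Architectural IA32_PERFEVTSELx flag bits; bits 0-15 hold event select and umask.
FLAG_BITS = {16: "USR", 17: "OS", 18: "EDGE", 19: "PC",
             20: "INT", 21: "ANY", 22: "EN", 23: "INV"}


def wrmsr_operands(rax, rcx, rdx):
    """Return (msr_index, value): WRMSR writes EDX:EAX to the MSR selected by ECX."""
    msr = rcx & 0xFFFFFFFF
    value = ((rdx & 0xFFFFFFFF) << 32) | (rax & 0xFFFFFFFF)
    return msr, value


def describe_evtsel(value):
    """Split an event-select value into its architectural fields (hypothetical helper)."""
    return {
        "event": value & 0xFF,
        "umask": (value >> 8) & 0xFF,
        "flags": [name for bit, name in sorted(FLAG_BITS.items())
                  if (value >> bit) & 1],
        "cmask": (value >> 24) & 0xFF,
        # Bits 32 and up are model-specific; report them without interpreting.
        "high_bits": [bit for bit in range(32, 64) if (value >> bit) & 1],
    }


if __name__ == "__main__":
    msr, value = wrmsr_operands(RAX, RCX, RDX)
    print(f"WRMSR to {msr:#x} (IA32_PERFEVTSEL{msr - IA32_PERFEVTSEL0}), "
          f"value {value:#018x}")
    print(describe_evtsel(value))
```

Under those assumptions the write targets MSR 0x188, the third event-select register, with the value 0x0000000200530000, which matches R12 in the dump. The low half is an ordinary enable (USR, OS, INT, EN); the only unusual part is bit 33, so a hypervisor that does not emulate that bit for this counter would be one plausible way to end up in ex_handler_msr.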