linux-kernel - Re: possible deadlock in __perf_event_task_sched_in


Message-ID: <CAJg=8jxS+omJP-HeBUNEgh-avEGQuCisPcX2knRiucppQTNAdw@mail.gmail.com>
Date: Mon, 29 Apr 2024 09:38:52 -0700
From: Marius Fleischer <fleischermarius@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>, 
	Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, 
	Andrii Nakryiko <andrii@...nel.org>, linux-perf-users@...r.kernel.org, 
	linux-kernel@...r.kernel.org, syzkaller@...glegroups.com, 
	harrisonmichaelgreen@...il.com
Subject: Re: possible deadlock in __perf_event_task_sched_in

Hi Peter,

Thanks for taking the time to explain this issue!

Wishing you a nice day!

Best,
Marius

On Wed, 24 Apr 2024 at 02:43, Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Mon, Apr 22, 2024 at 11:44:27AM -0700, Marius Fleischer wrote:
> > Hi,
> >
> > We would like to report the following bug which has been found by our
> > modified version of syzkaller.
> >
> > We found this report (https://lkml.org/lkml/2021/9/12/333) that seems
> > to have a similar but different stack trace. We are unable to tell,
> > though, whether it is the same cause. We’d be grateful for your
> > advice.
>
> This is just the printk thing sucks again. Some WARN/printk got tripped
> in a non-suitable context.
>
>
> >  _printk+0xba/0xed kernel/printk/printk.c:2299
> >  ex_handler_msr.cold+0xb7/0x147 arch/x86/mm/extable.c:90
> >  fixup_exception+0x973/0xbb0 arch/x86/mm/extable.c:187
> >  __exc_general_protection arch/x86/kernel/traps.c:601 [inline]
> >  exc_general_protection+0xed/0x2f0 arch/x86/kernel/traps.c:562
> >  asm_exc_general_protection+0x22/0x30 arch/x86/include/asm/idtentry.h:562
> > RIP: 0010:__wrmsr arch/x86/include/asm/msr.h:103 [inline]
> > RIP: 0010:native_write_msr arch/x86/include/asm/msr.h:154 [inline]
> > RIP: 0010:wrmsrl arch/x86/include/asm/msr.h:271 [inline]
> > RIP: 0010:__x86_pmu_enable_event arch/x86/events/intel/../perf_event.h:1120 [inline]
> > RIP: 0010:intel_pmu_enable_event+0x2d9/0xff0 arch/x86/events/intel/core.c:2694
> > Code: ea 03 49 81 cc 00 00 40 00 4d 21 f4 80 3c 02 00 0f 85 5b 0c 00
> > 00 44 8b ab 70 01 00 00 4c 89 e2 44 89 e0 48 c1 ea 20 44 89 e9 <0f> 30
> > 0f 1f 44 00 00 e8 1b 32 75 00 48 83 c4 20 5b 5d 41 5c 41 5d
> > RSP: 0018:ffffc900115af348 EFLAGS: 00010002
> > RAX: 0000000000530000 RBX: ffff888019dd6a50 RCX: 0000000000000188
> > RDX: 0000000000000002 RSI: ffffffff81029464 RDI: ffff888019dd6bc0
> > RBP: 0000000000000000 R08: 0000000000000001 R09: ffff888063e22ab7
> > R10: 0000000000000000 R11: 0000000000000001 R12: 0000000200530000
> > R13: 0000000000000188 R14: ffffffffffffffff R15: ffff888019dd6bb0
> >  x86_pmu_start+0x1cc/0x270 arch/x86/events/core.c:1520
> >  x86_pmu_enable+0x481/0xdf0 arch/x86/events/core.c:1337
> >  perf_pmu_enable kernel/events/core.c:1243 [inline]
> >  perf_pmu_enable kernel/events/core.c:1239 [inline]
>
> Most likely your VM is wonky and perf tries to poke an MSR that either
> doesn't exist or isn't emulated properly, who knows.
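
For readers who want to see which MSR write faulted, the register dump in
the trace is enough to work it out. WRMSR takes the MSR index in ECX and
the 64-bit value in EDX:EAX. The sketch below only recombines the RAX, RCX
and RDX values quoted above; the helper name wrmsr_operands is made up for
illustration and is not a kernel or library function.

```python
def wrmsr_operands(rax: int, rcx: int, rdx: int) -> tuple[int, int]:
    """Return (msr_index, value) as the WRMSR instruction would see them.

    WRMSR uses ECX as the MSR index and EDX:EAX as the 64-bit value,
    so only the low 32 bits of each register matter.
    """
    msr_index = rcx & 0xFFFFFFFF
    value = ((rdx & 0xFFFFFFFF) << 32) | (rax & 0xFFFFFFFF)
    return msr_index, value


# Register values copied from the RAX/RCX/RDX lines of the trace.
msr, value = wrmsr_operands(rax=0x530000, rcx=0x188, rdx=0x2)
print(f"MSR {msr:#x} <- {value:#018x}")
```

The combined value matches R12 in the same dump (0000000200530000), which
suggests the decoding is right. Whether the hypervisor rejects this MSR
index or some bit in the value is not something the trace alone can tell.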