Message-ID: <CANpmjNPrHv56Wvc_NbwhoGEU1ZnOepWXT2AmDVVjuY=R8n2XQA@mail.gmail.com>
Date: Mon, 29 Jul 2024 14:45:19 +0200
From: Marco Elver <elver@...gle.com>
To: Radoslaw Zielonek <radoslaw.zielonek@...il.com>
Cc: Peter Zijlstra <peterz@...radead.org>, rostedt@...dmis.org, mhiramat@...nel.org, 
	mathieu.desnoyers@...icios.com, mingo@...hat.com, juri.lelli@...hat.com, 
	vincent.guittot@...aro.org, dietmar.eggemann@....com, bsegall@...gle.com, 
	mgorman@...e.de, vschneid@...hat.com, song@...nel.org, jolsa@...nel.org, 
	ast@...nel.org, daniel@...earbox.net, andrii@...nel.org, martin.lau@...ux.dev, 
	eddyz87@...il.com, yonghong.song@...ux.dev, john.fastabend@...il.com, 
	kpsingh@...nel.org, sdf@...ichev.me, haoluo@...gle.com, 
	mattbobrowski@...gle.com, qyousef@...alina.io, tiozhang@...iglobal.com, 
	linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org, 
	bpf@...r.kernel.org
Subject: Re: [RFC] Printk deadlock in bpf trace called from scheduler context

On Mon, 29 Jul 2024 at 14:27, Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Mon, Jul 29, 2024 at 01:46:09PM +0200, Radoslaw Zielonek wrote:
> > I am currently working on a syzbot-reported bug where bpf
> > is called from trace_sched_switch. In this scenario, we are still within
> > the scheduler context, and calling printk can create a deadlock.
> >
> > I am uncertain about the best approach to fix this issue.
>
> It's been like this forever, it doesn't need fixing, because tracepoints
> shouldn't be doing printk() in the first place.
>
> > Should we simply forbid such calls, or perhaps we should replace printk
> > with printk_deferred in the bpf where we are still in scheduler context?
>
> Not doing printk() is best.

And teaching more debugging tools to behave.

This particular case originates from fault injection:

> [   60.265518][ T8343]  should_fail_ex+0x383/0x4d0
> [   60.265547][ T8343]  strncpy_from_user+0x36/0x2d0
> [   60.265601][ T8343]  strncpy_from_user_nofault+0x70/0x140
> [   60.265637][ T8343]  bpf_probe_read_user_str+0x2a/0x70

The fail_dump() function in lib/fault-inject.c is probably a little
too verbose in this case.

Radoslaw, the fix should be in lib/fault-inject.c. As with other
debugging tools (like KFENCE, which you discovered), adding
lockdep_off()/lockdep_on(), using printk_deferred(), or being less
verbose in this context may be more appropriate. Fault injection does
not need to print a message to inject a fault; the message is only for
debugging purposes. A reasonable compromise is probably to use
printk_deferred() in fail_dump() when in this context, so that it still
helps with debugging on a best-effort basis. You also need to take care
to avoid dumping the stack in fail_dump().
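
To make the suggested compromise concrete, here is a rough userspace
model of it, not kernel code and not a patch. Every name in it
(in_unsafe_context, model_printk(), model_printk_deferred(),
model_flush_deferred(), model_fail_dump()) is hypothetical and only
stands in for the real printk(), printk_deferred() and fail_dump(). How
the kernel would actually detect "this context" is left open here and
is modelled by a plain flag.

```c
#include <stdbool.h>
#include <stdio.h>

#define DEFERRED_SLOTS 8
#define MSG_LEN 128

/* Assumption: stands in for "printk() could deadlock here". */
static bool in_unsafe_context;

static char deferred_buf[DEFERRED_SLOTS][MSG_LEN];
static int deferred_count;  /* messages queued but not yet printed */
static int printed_count;   /* messages actually written out */

/* Models printk(): writes the message immediately. */
static void model_printk(const char *msg)
{
	fputs(msg, stdout);
	printed_count++;
}

/* Models printk_deferred(): only records the message for later. */
static void model_printk_deferred(const char *msg)
{
	if (deferred_count < DEFERRED_SLOTS) {
		snprintf(deferred_buf[deferred_count], MSG_LEN, "%s", msg);
		deferred_count++;
	}
	/* Best effort: messages beyond the buffer size are dropped. */
}

/* Models the later, safe point at which queued messages are printed. */
static void model_flush_deferred(void)
{
	for (int i = 0; i < deferred_count; i++)
		model_printk(deferred_buf[i]);
	deferred_count = 0;
}

/* Models fail_dump(): defer and skip the stack dump when unsafe. */
static void model_fail_dump(const char *name)
{
	char msg[MSG_LEN];

	snprintf(msg, sizeof(msg), "fault injected: %s\n", name);

	if (in_unsafe_context) {
		model_printk_deferred(msg);
		return; /* no stack dump in this context */
	}

	model_printk(msg);
	model_printk("(stack dump would go here)\n");
}
```

With in_unsafe_context set, model_fail_dump() prints nothing and only
queues the message; it shows up once model_flush_deferred() runs. With
the flag clear, the message and the stack dump placeholder are printed
right away. The fault is injected in both cases, only the reporting
differs.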
