Message-ID: <CAD8CoPBYfAyb6FtQ8KsqO-f4jfsYXoqe9heWcQkFprX=TQ50PA@mail.gmail.com>
Date: Sat, 13 May 2023 17:19:32 +0800
From: Ze Gao <zegao2021@...il.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Yonghong Song <yhs@...a.com>, Jiri Olsa <olsajiri@...il.com>,
Song Liu <song@...nel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <martin.lau@...ux.dev>,
Yonghong Song <yhs@...com>,
John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...nel.org>,
Stanislav Fomichev <sdf@...gle.com>,
Hao Luo <haoluo@...gle.com>,
Masami Hiramatsu <mhiramat@...nel.org>,
Ze Gao <zegao@...cent.com>, bpf@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org
Subject: Re: [PATCH] bpf: reject blacklisted symbols in kprobe_multi to avoid
recursive trap
Exactly, and rethook_trampoline_handler suffers from the same problem.
I've posted two patches for kprobe and rethook that use the notrace
versions of preempt_{disable,enable} to fix fprobe+rethook:
[1] https://lore.kernel.org/all/20230513081656.375846-1-zegao@tencent.com/T/#u
[2] https://lore.kernel.org/all/20230513090548.376522-1-zegao@tencent.com/T/#u
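
To illustrate why the notrace variants help, here is a small userspace
model. It is only a sketch, not kernel code: every model_* name, the
MAX_DEPTH cap, and the global flag are made up for illustration. The
"traced" helper fires the probe handler on entry, the way an ftrace hook
on preempt_count_add() would, while the notrace variant does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; all names below are hypothetical. */
#define MAX_DEPTH 64              /* stand-in for the kernel stack running out */

static int handler_depth;         /* current nesting of the probe handler */
static int max_depth_seen;        /* deepest nesting observed in one run */
static int preempt_count;         /* modelled preemption counter */
static bool handler_uses_notrace; /* which variant the handler calls */

static void model_probe_handler(void);

/* Models a traceable function: entering it fires the probe handler. */
static void model_preempt_count_add(void)
{
    model_probe_handler();        /* hook at function entry */
    preempt_count++;
}

/* Models preempt_disable(): goes through the traceable helper. */
static void model_preempt_disable(void)
{
    model_preempt_count_add();
}

/* Models preempt_disable_notrace(): bumps the counter, no hook fires. */
static void model_preempt_disable_notrace(void)
{
    preempt_count++;
}

/* Models the probe handler, which disables preemption before its work. */
static void model_probe_handler(void)
{
    if (handler_depth >= MAX_DEPTH)
        return;                   /* the real kernel would have crashed here */
    handler_depth++;
    if (handler_depth > max_depth_seen)
        max_depth_seen = handler_depth;
    if (handler_uses_notrace)
        model_preempt_disable_notrace();
    else
        model_preempt_disable();
    handler_depth--;
}

/* Fires the probe once and reports how deep the handler nested. */
static int run_model(bool use_notrace)
{
    handler_uses_notrace = use_notrace;
    handler_depth = 0;
    max_depth_seen = 0;
    preempt_count = 0;
    model_probe_handler();
    return max_depth_seen;
}
```

In this model run_model(false) nests until it hits the artificial cap,
while run_model(true) enters the handler exactly once.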
Even worse, bpf callbacks introduce more such cases. They are typically
organized as follows to guard the lifetime of bpf-related resources
(per-cpu access or trampoline):
    migrate_disable()
    rcu_read_lock()
    ...
    bpf_prog_run()
    ...
    rcu_read_unlock()
    migrate_enable()
Solving such bugs for good may require introducing fprobe_blacklist and
bpf_kprobe_blacklist, as Jiri and Yonghong suggested. bpf kprobe works
at a different (higher and more constrained) level than fprobe and
ftrace, and we cannot blindly mark functions (migrate_disable,
__rcu_read_lock, etc.) that are used in tracer callbacks from external
subsystems as notrace, because that risks semantic breakage.
I will try to implement these ideas later.
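
A rough userspace sketch of what such a blacklist check could look like
is below. It is not the proposed kernel implementation: the list
contents are just the functions named in this thread, and the
model_blacklist, model_symbol_blacklisted and model_check_attach names
are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical blacklist for illustration only. The real contents and
 * data structure would be decided in the kernel patches.
 */
static const char *const model_blacklist[] = {
    "migrate_disable",
    "__rcu_read_lock",
    "preempt_count_add",
};

/* Returns 1 if sym is on the blacklist, 0 otherwise. */
static int model_symbol_blacklisted(const char *sym)
{
    size_t n = sizeof(model_blacklist) / sizeof(model_blacklist[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(sym, model_blacklist[i]) == 0)
            return 1;
    }
    return 0;
}

/*
 * Models the attach-time check: reject the whole request with -EINVAL
 * if any requested symbol is blacklisted, otherwise return 0.
 */
static int model_check_attach(const char *const *syms, size_t cnt)
{
    for (size_t i = 0; i < cnt; i++) {
        if (model_symbol_blacklisted(syms[i]))
            return -EINVAL;
    }
    return 0;
}
```

Rejecting at attach time keeps the callback path itself unchanged, so
functions such as migrate_disable stay traceable for other users.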
Thanks,
Ze
On Sat, May 13, 2023 at 12:18 PM Steven Rostedt <rostedt@...dmis.org> wrote:
>
> On Fri, 12 May 2023 07:29:02 -0700
> Yonghong Song <yhs@...a.com> wrote:
>
> > A fprobe_blacklist might make sense indeed as fprobe and kprobe are
> > quite different... Thanks for working on this.
>
> Hmm, I think I see the problem:
>
> fprobe_kprobe_handler() {
>    kprobe_busy_begin() {
>       preempt_disable() {
>          preempt_count_add() {  <-- trace
>             fprobe_kprobe_handler() {
>                [ wash, rinse, repeat, CRASH!!! ]
>
> Either the kprobe_busy_begin() needs to use preempt_disable_notrace()
> versions, or fprobe_kprobe_handler() needs a
> ftrace_test_recursion_trylock() call.
>
> -- Steve
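
The second option above, a recursion guard in the handler, can be
sketched in userspace as follows. This only models the idea behind
ftrace_test_recursion_trylock(): the kernel tracks recursion per
context, whereas this sketch uses a single flag, and all model_* names
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; all names below are hypothetical. */
static bool in_handler;   /* models the recursion bit for one context */
static int handler_runs;  /* times the handler body actually ran */
static int rejected;      /* re-entries turned away by the guard */

/* Returns 0 on success, -1 if the handler is already running. */
static int model_recursion_trylock(void)
{
    if (in_handler)
        return -1;
    in_handler = true;
    return 0;
}

/* Releases the guard taken by model_recursion_trylock(). */
static void model_recursion_unlock(int bit)
{
    (void)bit;            /* single flag here, so the bit value is unused */
    in_handler = false;
}

static void model_guarded_handler(void);

/* Models a traced helper such as preempt_count_add(): it re-fires the probe. */
static void model_traced_helper(void)
{
    model_guarded_handler();
}

/* Models a probe handler that bails out when it detects re-entry. */
static void model_guarded_handler(void)
{
    int bit = model_recursion_trylock();

    if (bit < 0) {
        rejected++;       /* nested invocation: do nothing and return */
        return;
    }
    handler_runs++;
    model_traced_helper();
    model_recursion_unlock(bit);
}
```

Each top-level call to model_guarded_handler() runs the body once and
rejects the single nested re-entry, so the recursion stops at depth one
even though the helper it calls is still traced.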