[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87tt4nzjbq.fsf@posteo.net>
Date: Tue, 10 Jun 2025 15:51:05 +0000
From: Charalampos Mitrodimas <charmitro@...teo.net>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Daniel Borkmann <daniel@...earbox.net>, "David S. Miller"
<davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Jakub
Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Simon
Horman <horms@...nel.org>, Martin KaFai Lau <martin.lau@...ux.dev>, John
Fastabend <john.fastabend@...il.com>, Alexei Starovoitov
<ast@...nel.org>, Andrii Nakryiko <andrii@...nel.org>, Eduard Zingerman
<eddyz87@...il.com>, Song Liu <song@...nel.org>, Yonghong Song
<yonghong.song@...ux.dev>, KP Singh <kpsingh@...nel.org>, Stanislav
Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>, Jiri Olsa
<jolsa@...nel.org>, Feng Yang <yangfeng@...inos.cn>, Tejun Heo
<tj@...nel.org>, Network Development <netdev@...r.kernel.org>, LKML
<linux-kernel@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
syzbot+b4169a1cfb945d2ed0ec@...kaller.appspotmail.com
Subject: Re: [PATCH bpf-next] bpf: Fix RCU usage in
bpf_get_cgroup_classid_curr helper
Alexei Starovoitov <alexei.starovoitov@...il.com> writes:
> On Tue, Jun 10, 2025 at 8:23 AM Charalampos Mitrodimas
> <charmitro@...teo.net> wrote:
>>
>> Alexei Starovoitov <alexei.starovoitov@...il.com> writes:
>>
>> > On Tue, Jun 10, 2025 at 5:58 AM Daniel Borkmann <daniel@...earbox.net> wrote:
>> >>
>> >> On 6/9/25 5:51 PM, Alexei Starovoitov wrote:
>> >> > On Sun, Jun 8, 2025 at 8:35 AM Charalampos Mitrodimas
>> >> > <charmitro@...teo.net> wrote:
>> >> >>
>> >> >> The commit ee971630f20f ("bpf: Allow some trace helpers for all prog
>> >> >> types") made bpf_get_cgroup_classid_curr helper available to all BPF
>> >> >> program types. This helper used __task_get_classid() which calls
>> >> >> task_cls_state() that requires rcu_read_lock_bh_held().
>> >> >>
>> >> >> This triggers an RCU warning when called from BPF syscall programs
>> >> >> which run under rcu_read_lock_trace():
>> >> >>
>> >> >> WARNING: suspicious RCU usage
>> >> >> 6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted
>> >> >> -----------------------------
>> >> >> net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage!
>> >> >>
>> >> >> Fix this by replacing __task_get_classid() with task_cls_classid()
>> >> >> which handles RCU locking internally using regular rcu_read_lock() and
>> >> >> is safe to call from any context.
>> >> >>
>> >> >> Reported-by: syzbot+b4169a1cfb945d2ed0ec@...kaller.appspotmail.com
>> >> >> Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec
>> >> >> Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types")
>> >> >> Signed-off-by: Charalampos Mitrodimas <charmitro@...teo.net>
>> >> >> ---
>> >> >> net/core/filter.c | 2 +-
>> >> >> 1 file changed, 1 insertion(+), 1 deletion(-)
>> >> >>
>> >> >> diff --git a/net/core/filter.c b/net/core/filter.c
>> >> >> index 30e7d36790883b29174654315738e93237e21dd0..3b3f81cf674dde7d2bd83488450edad4e129bdac 100644
>> >> >> --- a/net/core/filter.c
>> >> >> +++ b/net/core/filter.c
>> >> >> @@ -3083,7 +3083,7 @@ static const struct bpf_func_proto bpf_msg_pop_data_proto = {
>> >> >> #ifdef CONFIG_CGROUP_NET_CLASSID
>> >> >> BPF_CALL_0(bpf_get_cgroup_classid_curr)
>> >> >> {
>> >> >> - return __task_get_classid(current);
>> >> >> + return task_cls_classid(current);
>> >> >> }
>> >> >
>> >> > Daniel added this helper in
>> >> > commit 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks")
>> >> > with intention to use it from networking hooks.
>> >> >
>> >> > But task_cls_classid() has
>> >> > if (in_interrupt())
>> >> > return 0;
>> >> >
>> >> > which will trigger in softirq and tc hooks.
>> >> > So this might break Daniel's use case.
>> >>
>> >> Yeap, we cannot break tc(x) BPF programs. It probably makes sense to have
>> >> a new helper implementation for the more generic, non-networking case which
>> >> then internally uses task_cls_classid().
>> >
>> > Instead of forking the helper I think we can :
>> > rcu_read_lock_bh_held() || rcu_read_lock_held()
>> > in task_cls_state().
>>
>> I tested your suggestion with,
>>
>> rcu_read_lock_bh_held() || rcu_read_lock_held()
>>
>> but it still triggers the RCU warning because BPF syscall programs use
>> rcu_read_lock_trace().
>>
>> Adding rcu_read_lock_trace_held() fixes it functionally but triggers a
>> checkpatch warning:
>>
>> WARNING: use of RCU tasks trace is incorrect outside BPF or core RCU code
>
> It's safe to ignore checkpatch in this case.
If that is the case I'll move forward with this. It was my initial fix
for this[1] anyway, but checkpatch made me change it.
[1]: https://github.com/charmitro/linux/commit/e5c42d49bfb967c3c35f536971f397492d2f46bf
>
>> I think the best solution here would be to add
>> local_bh_disable()/enable() protection directly in the BPF helper. This
>> keeps the fix localized to where the problem exists, and avoids
>> modifying core cgroup RCU.
>
> That works, but it will add runtime overhead.
> I doubt the helper is in a critical path, so I don't mind.
Powered by blists - more mailing lists