[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAADnVQJYVLEs8zr414j1xRZ_DAAwcxiCC-1YqDOt8oF13Wf6zw@mail.gmail.com>
Date: Thu, 9 Jan 2025 11:23:59 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Juntong Deng <juntong.deng@...look.com>, Tejun Heo <tj@...nel.org>
Cc: Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
John Fastabend <john.fastabend@...il.com>, Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <martin.lau@...ux.dev>, Eddy Z <eddyz87@...il.com>, Song Liu <song@...nel.org>,
Yonghong Song <yonghong.song@...ux.dev>, KP Singh <kpsingh@...nel.org>,
Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
Kumar Kartikeya Dwivedi <memxor@...il.com>, snorcht@...il.com,
Christian Brauner <brauner@...nel.org>, bpf <bpf@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Linux-Fsdevel <linux-fsdevel@...r.kernel.org>
Subject: per st_ops kfunc allow/deny mask. Was: [PATCH bpf-next v6 4/5] bpf:
Make fs kfuncs available for SYSCALL program type
On Mon, Dec 23, 2024 at 4:51 PM Juntong Deng <juntong.deng@...look.com> wrote:
>
> >
> > The main goal is to get rid of run-time mask check in SCX_CALL_OP() and
> > make it static by the verifier. To make that happen scx_kf_mask flags
> > would need to become KF_* flags while each struct-ops callback will
> > specify the expected mask.
> > Then at struct-ops prog attach time the verifier will see the expected mask
> > and can check that all kfuncs calls of this particular program
> > satisfy the mask. Then all of the runtime overhead of
> > current->scx.kf_mask and scx_kf_allowed() will go away.
>
> Thanks for pointing this out.
>
> Yes, I am interested in working on it.
>
> I will try to solve this problem in a separate patch series.
>
>
> The following are my thoughts:
>
> Should we really use KF_* to do this? I think KF_* is currently more
> like declaring that a kfunc has some kind of attribute, e.g.
> KF_TRUSTED_ARGS means that the kfunc only accepts trusted arguments,
> rather than being used to categorise kfuncs.
>
> It is not sustainable to restrict the kfuncs that can be used based on
> program types, which are coarse-grained. This problem will get worse
> as kfuncs increase.
>
> In my opinion, managing the kfuncs available to bpf programs should be
> implemented as capabilities. Capabilities are a mature permission model.
> We can treat a set of kfuncs as a capability (like the various current
> kfunc_sets, but the current kfunc_sets did not carefully divide
> permissions).
>
> We should use separate BPF_CAP_XXX flags to manage these capabilities.
> For example, SCX may define BPF_CAP_SCX_DISPATCH.
>
> For program types, we should divide them into two levels, types and
> subtypes. Types are used to register common capabilities and subtypes
> are used to register specific capabilities. The verifier can check if
> the used kfuncs are allowed based on the type and subtype of the bpf
> program.
>
> I understand that we need to maintain backward compatibility to
> userspace, but capabilities are internal changes in the kernel.
> Perhaps we can make the current program types as subtypes and
> add 'types' that are only used internally, and more subtypes
> (program types) can be added in the future.
Sorry for the delay.
imo CAP* approach doesn't fit.
caps are security bits exposed to user space.
Here there is no need to expose anything to user space.
But you're also correct that we cannot extend kfunc KF_* flags
that easily. KF_* flags are limited to 32-bit and we're already
using 12 bits.
enum scx_kf_mask needs 5 bits, so we can squeeze them into
the current 32-bit field _for now_,
but eventually we'd need to refactor kfunc definition into a wider set:
BTF_ID_FLAGS(func, .. KF_*)
so that different struct_ops consumers can define their own bits.
Right now SCX is the only st_ops consumer who needs this feature,
so let's squeeze into the existing KF facility.
First step is to remap scx_kf_mask bits into unused bits in KF_
and annotate corresponding sched-ext kfuncs with it.
For example:
SCX_KF_DISPATCH will become
KF_DISPATCH (1 << 13)
and all kfuncs that are allowed to be called from ->dispatch() callback
will be annotated like:
- BTF_KFUNCS_START(scx_kfunc_ids_dispatch)
- BTF_ID_FLAGS(func, scx_bpf_dispatch_nr_slots)
- BTF_ID_FLAGS(func, scx_bpf_dispatch_cancel)
+ BTF_KFUNCS_START(scx_kfunc_ids_dispatch)
+ BTF_ID_FLAGS(func, scx_bpf_dispatch_nr_slots, KF_DISPATCH)
+ BTF_ID_FLAGS(func, scx_bpf_dispatch_cancel, KF_DISPATCH)
For sched_ext_ops callback annotations, I think,
the simplest approach is to add special
BTF_SET8_START(st_ops_flags)
BTF_ID_FLAGS(func, sched_ext_ops__dispatch, KF_DISPATCH)
and so on for other ops stubs.
sched_ext_ops__dispatch() is an empty function that
exists in the vmlinux, and though it's not a kfunc
we can use it to annotate
(struct sched_ext_ops *)->dispatch() callback
with a particular KF_ flag
(or a set of flags for SCX_KF_RQ_LOCKED case).
Then the verifier (while analyzing the program that is targeted
to be attach to this ->dispatch() hook)
will check this extra KF flag in st_ops
and will only allow to call kfuncs with matching flags:
if (st_ops->kf_mask & kfunc->kf_mask) // ok to call kfunc from this callback
The end result current->scx.kf_mask will be removed
and instead of run-time check it will become static verifier check.
Powered by blists - more mailing lists