linux-kernel - Re: [PATCH bpf-next v6 4/5] bpf: Make fs kfuncs available for SYSCALL program type

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:
 <AM6PR03MB50806221B121CB56E0D4958E99032@AM6PR03MB5080.eurprd03.prod.outlook.com>
Date: Tue, 24 Dec 2024 12:10:56 +0000
From: Juntong Deng <juntong.deng@...look.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Alexei Starovoitov <ast@...nel.org>,
 Daniel Borkmann <daniel@...earbox.net>,
 John Fastabend <john.fastabend@...il.com>,
 Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau
 <martin.lau@...ux.dev>, Eddy Z <eddyz87@...il.com>,
 Song Liu <song@...nel.org>, Yonghong Song <yonghong.song@...ux.dev>,
 KP Singh <kpsingh@...nel.org>, Stanislav Fomichev <sdf@...ichev.me>,
 Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
 Kumar Kartikeya Dwivedi <memxor@...il.com>, snorcht@...il.com,
 Christian Brauner <brauner@...nel.org>, bpf <bpf@...r.kernel.org>,
 LKML <linux-kernel@...r.kernel.org>,
 Linux-Fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [PATCH bpf-next v6 4/5] bpf: Make fs kfuncs available for SYSCALL
 program type

On 2024/12/24 00:51, Juntong Deng wrote:
> On 2024/12/19 16:41, Alexei Starovoitov wrote:
>> On Tue, Dec 17, 2024 at 3:45 PM Juntong Deng 
>> <juntong.deng@...look.com> wrote:
>>>
>>> -static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 
>>> kfunc_id)
>>> -{
>>> -       if (!btf_id_set8_contains(&bpf_fs_kfunc_set_ids, kfunc_id) ||
>>> -           prog->type == BPF_PROG_TYPE_LSM)
>>> -               return 0;
>>> -       return -EACCES;
>>> -}
>>> -
>>>   static const struct btf_kfunc_id_set bpf_fs_kfunc_set = {
>>>          .owner = THIS_MODULE,
>>>          .set = &bpf_fs_kfunc_set_ids,
>>> -       .filter = bpf_fs_kfuncs_filter,
>>>   };
>>>
>>>   static int __init bpf_fs_kfuncs_init(void)
>>>   {
>>> -       return register_btf_kfunc_id_set(BPF_PROG_TYPE_LSM, 
>>> &bpf_fs_kfunc_set);
>>> +       int ret;
>>> +
>>> +       ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_LSM, 
>>> &bpf_fs_kfunc_set);
>>> +       return ret ?: 
>>> register_btf_kfunc_id_set(BPF_PROG_TYPE_SYSCALL, &bpf_fs_kfunc_set);
>>>   }
>>>
>>>   late_initcall(bpf_fs_kfuncs_init);
>>> diff --git a/tools/testing/selftests/bpf/progs/verifier_vfs_reject.c 
>>> b/tools/testing/selftests/bpf/progs/verifier_vfs_reject.c
>>> index d6d3f4fcb24c..5aab75fd2fa5 100644
>>> --- a/tools/testing/selftests/bpf/progs/verifier_vfs_reject.c
>>> +++ b/tools/testing/selftests/bpf/progs/verifier_vfs_reject.c
>>> @@ -148,14 +148,4 @@ int BPF_PROG(path_d_path_kfunc_invalid_buf_sz, 
>>> struct file *file)
>>>          return 0;
>>>   }
>>>
>>> -SEC("fentry/vfs_open")
>>> -__failure __msg("calling kernel function bpf_path_d_path is not 
>>> allowed")
>>
>> This is incorrect.
>> You have to keep bpf_fs_kfuncs_filter() and prog->type == 
>> BPF_PROG_TYPE_LSM
>> check because bpf_prog_type_to_kfunc_hook() aliases LSM and fentry
>> into BTF_KFUNC_HOOK_TRACING category. It's been an annoying quirk.
>> We're figuring out details for significant refactoring of
>> register_btf_kfunc_id_set() and the whole registration process.
>>
>> Maybe you would be interested in working on it?
>>
>> The main goal is to get rid of run-time mask check in SCX_CALL_OP() and
>> make it static by the verifier. To make that happen scx_kf_mask flags
>> would need to become KF_* flags while each struct-ops callback will
>> specify the expected mask.
>> Then at struct-ops prog attach time the verifier will see the expected 
>> mask
>> and can check that all kfuncs calls of this particular program
>> satisfy the mask. Then all of the runtime overhead of
>> current->scx.kf_mask and scx_kf_allowed() will go away.
> 
> Thanks for pointing this out.
> 
> Yes, I am interested in working on it.
> 
> I will try to solve this problem in a separate patch series.
> 
> 
> The following are my thoughts:
> 
> Should we really use KF_* to do this? I think KF_* is currently more
> like declaring that a kfunc has some kind of attribute, e.g.
> KF_TRUSTED_ARGS means that the kfunc only accepts trusted arguments,
> rather than being used to categorise kfuncs.
> 
> It is not sustainable to restrict the kfuncs that can be used based on
> program types, which are coarse-grained. This problem will get worse
> as kfuncs increase.
> 
> In my opinion, managing the kfuncs available to bpf programs should be
> implemented as capabilities. Capabilities are a mature permission model.
> We can treat a set of kfuncs as a capability (like the various current
> kfunc_sets, but the current kfunc_sets did not carefully divide
> permissions).
> 
> We should use separate BPF_CAP_XXX flags to manage these capabilities.
> For example, SCX may define BPF_CAP_SCX_DISPATCH.
> 
> For program types, we should divide them into two levels, types and
> subtypes. Types are used to register common capabilities and subtypes
> are used to register specific capabilities. The verifier can check if
> the used kfuncs are allowed based on the type and subtype of the bpf
> program.
> 
> I understand that we need to maintain backward compatibility to
> userspace, but capabilities are internal changes in the kernel.
> Perhaps we can make the current program types as subtypes and
> add 'types' that are only used internally, and more subtypes
> (program types) can be added in the future.

Sorry, this email was sent at midnight before I went to bed, and yes,
it looks a bit radical.

But in the long run, as ebpf is used in more and more scenarios
(better kernel module), and we have more and more kfuncs
(better EXPORT_SYMBOL_GPL).

Managing (restricting) kfuncs that can be used in different contexts
will become more and more complex.

Therefore it might be better for ebpf to transition to fine-grained
permission management (capabilities).

Maybe we can have more discussion.

Many thanks.