Message-ID: <e693a6f3-7058-0bc8-d5a6-f1572ebd75f8@fb.com>
Date: Mon, 23 May 2022 08:41:38 -0700
From: Yonghong Song <yhs@...com>
To: Stanislav Fomichev <sdf@...gle.com>
Cc: netdev@...r.kernel.org, bpf@...r.kernel.org, ast@...nel.org,
daniel@...earbox.net, andrii@...nel.org
Subject: Re: [PATCH bpf-next v7 03/11] bpf: per-cgroup lsm flavor
On 5/20/22 5:03 PM, Stanislav Fomichev wrote:
> On Thu, May 19, 2022 at 6:01 PM Yonghong Song <yhs@...com> wrote:
>>
>>
>>
>> On 5/18/22 3:55 PM, Stanislav Fomichev wrote:
>>> Allow attaching to lsm hooks in the cgroup context.
>>>
>>> Attaching to per-cgroup LSM works exactly like attaching
>>> to other per-cgroup hooks. New BPF_LSM_CGROUP is added
>>> to trigger new mode; the actual lsm hook we attach to is
>>> signaled via existing attach_btf_id.
>>>
>>> For the hooks that have 'struct socket' or 'struct sock' as their first
>>> argument, we use the cgroup associated with that socket. For the rest,
>>> we use the 'current' cgroup (this is all on default hierarchy == v2 only).
>>> Note that for some hooks that work on 'struct sock' we still
>>> take the cgroup from 'current' because some of them work on the socket
>>> that hasn't been properly initialized yet.
>>>
>>> Behind the scenes, we allocate a shim program that is attached
>>> to the trampoline and runs cgroup effective BPF programs array.
>>> This shim has some rudimentary ref counting and can be shared
>>> between several programs attaching to the same per-cgroup lsm hook.
>>>
>>> Note that this patch bloats the cgroup size because we add 211
>>> cgroup_bpf_attach_type(s) for simplicity's sake. This will be
>>> addressed in a subsequent patch.
>>>
>>> Also note that we only add non-sleepable flavor for now. To enable
>>> sleepable use-cases, bpf_prog_run_array_cg has to grab trace rcu,
>>> shim programs have to be freed via trace rcu, cgroup_bpf.effective
>>> should be also trace-rcu-managed + maybe some other changes that
>>> I'm not aware of.
>>>
>>> Signed-off-by: Stanislav Fomichev <sdf@...gle.com>
>>> ---
>>> arch/x86/net/bpf_jit_comp.c | 24 +++--
>>> include/linux/bpf-cgroup-defs.h | 6 ++
>>> include/linux/bpf-cgroup.h | 7 ++
>>> include/linux/bpf.h | 25 +++++
>>> include/linux/bpf_lsm.h | 14 +++
>>> include/linux/btf_ids.h | 3 +-
>>> include/uapi/linux/bpf.h | 1 +
>>> kernel/bpf/bpf_lsm.c | 50 +++++++++
>>> kernel/bpf/btf.c | 11 ++
>>> kernel/bpf/cgroup.c | 181 ++++++++++++++++++++++++++++---
>>> kernel/bpf/core.c | 2 +
>>> kernel/bpf/syscall.c | 10 ++
>>> kernel/bpf/trampoline.c | 184 ++++++++++++++++++++++++++++++++
>>> kernel/bpf/verifier.c | 28 +++++
>>> tools/include/linux/btf_ids.h | 4 +-
>>> tools/include/uapi/linux/bpf.h | 1 +
>>> 16 files changed, 527 insertions(+), 24 deletions(-)
>>
>> A few nits below.
>>
>>>
>>> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
>>> index a2b6d197c226..5cdebf4312da 100644
>>> --- a/arch/x86/net/bpf_jit_comp.c
>>> +++ b/arch/x86/net/bpf_jit_comp.c
>>> @@ -1765,6 +1765,10 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
>>> struct bpf_tramp_link *l, int stack_size,
>>> int run_ctx_off, bool save_ret)
>>> {
>>> + void (*exit)(struct bpf_prog *prog, u64 start,
>>> + struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_exit;
>>> + u64 (*enter)(struct bpf_prog *prog,
>>> + struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_enter;
>>> u8 *prog = *pprog;
>>> u8 *jmp_insn;
>>> int ctx_cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
[...]
>>> return bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func) != NULL;
>>> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
>>> index 134785ab487c..2c356a38f4cf 100644
>>> --- a/kernel/bpf/cgroup.c
>>> +++ b/kernel/bpf/cgroup.c
>>> @@ -14,6 +14,9 @@
>>> #include <linux/string.h>
>>> #include <linux/bpf.h>
>>> #include <linux/bpf-cgroup.h>
>>> +#include <linux/btf_ids.h>
>>> +#include <linux/bpf_lsm.h>
>>> +#include <linux/bpf_verifier.h>
>>> #include <net/sock.h>
>>> #include <net/bpf_sk_storage.h>
>>>
>>> @@ -61,6 +64,85 @@ bpf_prog_run_array_cg(const struct cgroup_bpf *cgrp,
>>> return run_ctx.retval;
>>> }
>>>
>>> +unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
>>> + const struct bpf_insn *insn)
>>> +{
>>> + const struct bpf_prog *shim_prog;
>>> + struct sock *sk;
>>> + struct cgroup *cgrp;
>>> + int ret = 0;
>>> + u64 *regs;
>>> +
>>> + regs = (u64 *)ctx;
>>> + sk = (void *)(unsigned long)regs[BPF_REG_0];
>>
>> Maybe it's just my own opinion, but using BPF_REG_0 as the index is a
>> little bit confusing. Maybe just use '0' to indicate the first
>> parameter. Changing 'regs' to 'params' might also be a better choice?
>> In reality, the trampoline just passes an array of parameters to
>> the program. The same applies in a few places below.
>
> Sure, let's rename it and use 0. I'll do args instead of params maybe?
'args' works for me too.
>
>>> + /*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/
>>> + shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
>>
>> I didn't experiment, but why container_of won't work?
>
> There is a type check in container_of that doesn't seem to work for flex arrays:
>
> kernel/bpf/cgroup.c:78:14: error: static_assert failed due to
> requirement '__builtin_types_compatible_p(const struct bpf_insn,
> struct bpf_insn []"
> shim_prog = container_of(insn, struct bpf_prog, insnsi);
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ./include/linux/container_of.h:19:2: note: expanded from macro 'container_of'
> static_assert(__same_type(*(ptr), ((type *)0)->member) || \
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ./include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
> #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ./include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
> #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
> ^ ~~~~
> 1 error generated.
You are right. Thanks for the explanation.
[...]