Message-ID: <CAMB2axN7o0pHca_u2HnbMb+pEOJubRR8Y8JewExzwxaRWtKUmQ@mail.gmail.com>
Date: Wed, 15 Oct 2025 15:35:02 -0700
From: Amery Hung <ameryhung@...il.com>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org, alexei.starovoitov@...il.com,
andrii@...nel.org, daniel@...earbox.net, tj@...nel.org, martin.lau@...nel.org,
kernel-team@...a.com
Subject: Re: [RFC PATCH v1 bpf-next 2/4] bpf: Support associating BPF program
with struct_ops
On Mon, Oct 13, 2025 at 5:10 PM Andrii Nakryiko
<andrii.nakryiko@...il.com> wrote:
>
> On Fri, Oct 10, 2025 at 10:49 AM Amery Hung <ameryhung@...il.com> wrote:
> >
> > Add a new BPF command BPF_STRUCT_OPS_ASSOCIATE_PROG to allow associating
> > a BPF program with a struct_ops. This command takes a file descriptor of
> > a struct_ops map and a BPF program and sets prog->aux->st_ops_assoc to
> > the kdata of the struct_ops map.
> >
> > The command does not accept a struct_ops program or a non-struct_ops
> > map. Programs of a struct_ops map are automatically associated with the
> > map during map update. If a program is shared between two struct_ops
> > maps, the first one will be the map associated with the program. The
> > associated struct_ops map, once set, cannot be changed later. This
> > restriction may be lifted in the future if there is a use case.
> >
> > Each associated program, except struct_ops programs of the map, will take
> > a refcount on the map to pin it so that prog->aux->st_ops_assoc, if set,
> > is always valid. However, there is no guarantee that the map members
> > are fully updated or that the map is attached. For example, a BPF program
> > can be associated with a struct_ops map before map_update. The
> > struct_ops implementer will be responsible for maintaining and checking
> > the state of the associated struct_ops map before accessing it.
> >
> > Signed-off-by: Amery Hung <ameryhung@...il.com>
> > ---
> > include/linux/bpf.h | 11 ++++++++++
> > include/uapi/linux/bpf.h | 16 ++++++++++++++
> > kernel/bpf/bpf_struct_ops.c | 32 ++++++++++++++++++++++++++++
> > kernel/bpf/core.c | 6 ++++++
> > kernel/bpf/syscall.c | 38 ++++++++++++++++++++++++++++++++++
> > tools/include/uapi/linux/bpf.h | 16 ++++++++++++++
> > 6 files changed, 119 insertions(+)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index a98c83346134..d5052745ffc6 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -1710,6 +1710,8 @@ struct bpf_prog_aux {
> > struct rcu_head rcu;
> > };
> > struct bpf_stream stream[2];
> > + struct mutex st_ops_assoc_mutex;
>
> do we need a mutex at all? cmpxchg() should work just fine. We'll also
> potentially need to access st_ops_assoc from kprobes/fentry anyways,
> and we can't just take mutex there
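
A minimal userspace sketch of the lock-free, set-once association suggested above. It uses C11 atomics as a stand-in for the kernel's cmpxchg(); struct prog_aux, assoc_once() and assoc_get() are hypothetical names for illustration, not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the st_ops_assoc field of struct bpf_prog_aux. */
struct prog_aux {
	_Atomic(void *) st_ops_assoc;
};

/*
 * Set the association once without taking a lock.
 * Returns 0 if kdata is now (or already was) the associated pointer,
 * -EBUSY if a different pointer was set first.
 */
static int assoc_once(struct prog_aux *aux, void *kdata)
{
	void *expected = NULL;

	assert(kdata != NULL);

	/* Succeeds only if the slot still holds NULL. */
	if (atomic_compare_exchange_strong(&aux->st_ops_assoc, &expected, kdata))
		return 0;

	/* On failure, expected holds the value that was found in the slot. */
	return expected == kdata ? 0 : -EBUSY;
}

/* Lock-free read, usable from contexts that cannot sleep. */
static void *assoc_get(struct prog_aux *aux)
{
	return atomic_load(&aux->st_ops_assoc);
}
```

Since the pointer can only go from NULL to a single value, readers need no lock. How the reference on the map is taken and released around the exchange is left out of this sketch.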
>
> > + void *st_ops_assoc;
> > };
> >
> > struct bpf_prog {
>
> [...]
>
> >
> > @@ -1890,6 +1901,11 @@ union bpf_attr {
> > __u32 prog_fd;
> > } prog_stream_read;
> >
> > + struct {
> > + __u32 map_fd;
> > + __u32 prog_fd;
>
> let's add flags, we normally have some sort of flags for commands for
> extensibility
I will add a flags field.
>
> > + } struct_ops_assoc_prog;
> > +
> > } __attribute__((aligned(8)));
> >
> > /* The description below is an attempt at providing documentation to eBPF
> > diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
> > index a41e6730edcf..e57428e1653b 100644
> > --- a/kernel/bpf/bpf_struct_ops.c
> > +++ b/kernel/bpf/bpf_struct_ops.c
> > @@ -528,6 +528,7 @@ static void bpf_struct_ops_map_put_progs(struct bpf_struct_ops_map *st_map)
> > for (i = 0; i < st_map->funcs_cnt; i++) {
> > if (!st_map->links[i])
> > break;
> > + bpf_struct_ops_disassoc_prog(st_map->links[i]->prog);
> > bpf_link_put(st_map->links[i]);
> > st_map->links[i] = NULL;
> > }
> > @@ -801,6 +802,11 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> > goto reset_unlock;
> > }
> >
> > + /* Don't stop a program from being reused. prog->aux->st_ops_assoc
>
> nit: comment style, we are converging onto /* on separate line
Got it, so I assume it applies to kernel/bpf/* even though existing comments
in the file are netdev style. Is it also the case for
net/core/filter.c?
>
> > + * will point to the first struct_ops kdata.
> > + */
> > + bpf_struct_ops_assoc_prog(&st_map->map, prog);
>
> ignoring error? we should do something better here... poisoning this
> association altogether if program is used in multiple struct_ops seems
> like the only thing we can reasonably do, no?
>
> > +
> > link = kzalloc(sizeof(*link), GFP_USER);
> > if (!link) {
> > bpf_prog_put(prog);
>
> [...]
>
> >
> > +#define BPF_STRUCT_OPS_ASSOCIATE_PROG_LAST_FIELD struct_ops_assoc_prog.prog_fd
> > +
>
> looking at libbpf side, it's quite a mouthful to write out
> bpf_struct_ops_associate_prog()... maybe let's shorten this to
> BPF_STRUCT_OPS_ASSOC or BPF_ASSOC_STRUCT_OPS (with the idea that we
> associate struct_ops with a program). The latter is actually a bit
> more preferable, because then we can have a meaningful high-level
> bpf_program__assoc_struct_ops(struct bpf_program *prog, struct bpf_map
> *map), where map has to be struct_ops. Having bpf_map__assoc_prog() is
> a bit too generic, as this works only for struct_ops maps.
>
> It's all not major, but I think that makes for somewhat better naming and
> more logical usage throughout.
Will change the naming.
>
> > +static int struct_ops_assoc_prog(union bpf_attr *attr)
> > +{
> > + struct bpf_prog *prog;
> > + struct bpf_map *map;
> > + int ret;
> > +
> > + if (CHECK_ATTR(BPF_STRUCT_OPS_ASSOCIATE_PROG))
> > + return -EINVAL;
> > +
> > + prog = bpf_prog_get(attr->struct_ops_assoc_prog.prog_fd);
> > + if (IS_ERR(prog))
> > + return PTR_ERR(prog);
> > +
> > + map = bpf_map_get(attr->struct_ops_assoc_prog.map_fd);
> > + if (IS_ERR(map)) {
> > + ret = PTR_ERR(map);
> > + goto out;
> > + }
> > +
> > + if (map->map_type != BPF_MAP_TYPE_STRUCT_OPS ||
> > + prog->type == BPF_PROG_TYPE_STRUCT_OPS) {
>
> you can check prog->type earlier, before getting map itself
Got it. I will make it a separate check right after getting prog.
>
> > + ret = -EINVAL;
> > + goto out;
> > + }
> > +
> > + ret = bpf_struct_ops_assoc_prog(map, prog);
> > +out:
> > + if (ret && !IS_ERR(map))
>
> nit: purely stylistic preference, but I'd rather have a clear
> error-only clean up path, and success with explicit return 0, instead
> of checking ret or IS_ERR(map)
>
> ...
>
> /* goto to put_{map,prog}, depending on how far we've got */
>
> err = bpf_struct_ops_assoc_prog(map, prog);
> if (err)
> goto put_map;
>
> return 0;
>
> put_map:
> bpf_map_put(map);
> put_prog:
> bpf_prog_put(prog);
> return err;
I will separate the error path out.
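
A rough userspace sketch of that shape, with the prog type check moved ahead of the map lookup. struct obj, obj_get(), obj_put(), do_assoc() and assoc_cmd() are hypothetical stand-ins rather than kernel APIs. The success path keeps the map reference and drops only the temporary prog reference, which mirrors the quoted v1 code and is an assumption about the intended ownership.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for bpf_prog and bpf_map. */
struct obj {
	int refcnt;
	int is_struct_ops;	/* prog type or map type is struct_ops */
};

static struct obj *obj_get(struct obj *o)
{
	if (!o)
		return NULL;
	o->refcnt++;
	return o;
}

static void obj_put(struct obj *o)
{
	assert(o->refcnt > 0);	/* catch refcount underflow */
	o->refcnt--;
}

/* Placeholder for the real association step; fails on request. */
static int do_assoc(struct obj *map, struct obj *prog, int fail)
{
	(void)map;
	(void)prog;
	return fail ? -EBUSY : 0;
}

static int assoc_cmd(struct obj *prog_in, struct obj *map_in, int fail)
{
	struct obj *prog, *map;
	int err;

	prog = obj_get(prog_in);
	if (!prog)
		return -EBADF;

	/* Reject struct_ops programs before touching the map. */
	if (prog->is_struct_ops) {
		err = -EINVAL;
		goto put_prog;
	}

	map = obj_get(map_in);
	if (!map) {
		err = -EBADF;
		goto put_prog;
	}

	if (!map->is_struct_ops) {
		err = -EINVAL;
		goto put_map;
	}

	err = do_assoc(map, prog, fail);
	if (err)
		goto put_map;

	/* Success: the map stays pinned, only the temporary prog ref is dropped. */
	obj_put(prog);
	return 0;

put_map:
	obj_put(map);
put_prog:
	obj_put(prog);
	return err;
}
```

Each label undoes exactly what was acquired before the jump, so no cleanup step has to test err or IS_ERR().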
>
>
> > + bpf_map_put(map);
> > + bpf_prog_put(prog);
> > + return ret;
> > +}
> > +
>
> [...]