Message-ID: <20210605045218.jnkfhu7iys7zbt64@apollo>
Date: Sat, 5 Jun 2021 10:22:18 +0530
From: Kumar Kartikeya Dwivedi <memxor@...il.com>
To: Yonghong Song <yhs@...com>
Cc: bpf@...r.kernel.org,
Toke Høiland-Jørgensen <toke@...hat.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Jamal Hadi Salim <jhs@...atatu.com>,
Vlad Buslov <vladbu@...dia.com>,
Cong Wang <xiyou.wangcong@...il.com>,
Jesper Dangaard Brouer <brouer@...hat.com>,
netdev@...r.kernel.org
Subject: Re: [PATCH bpf-next v2 3/7] net: sched: add bpf_link API for bpf classifier
On Sat, Jun 05, 2021 at 08:38:17AM IST, Yonghong Song wrote:
>
>
> On 6/3/21 11:31 PM, Kumar Kartikeya Dwivedi wrote:
> > This commit introduces a bpf_link based kernel API for creating tc
> > filters and using the cls_bpf classifier. Only a subset of what the
> > netlink API offers is supported; things like TCA_BPF_POLICE, TCA_RATE
> > and embedded actions are unsupported.
> >
> > The kernel API and the libbpf wrapper added in a subsequent patch are
> > more opinionated and mirror the semantics of low level netlink based
> > TC-BPF API, i.e. always setting direct action mode, always setting
> > protocol to ETH_P_ALL, and only exposing handle and priority as the
> > variables the user can control. We add an additional gen_flags parameter
> > though to allow for offloading use cases. It would be trivial to extend
> > the current API to support specifying other attributes in the future,
> > but for now I'm sticking to how we want to push usage.
> >
> > The semantics around bpf_link support are as follows:
> >
> > A user can create a classifier attached to a filter using the bpf_link
> > API, after which changing it and deleting it only happens through the
> > bpf_link API. It is not possible to bind the bpf_link to an existing
> > filter, and any such attempt will fail with EEXIST. Hence EEXIST can be
> > returned in two cases: when a bpf_link owned filter already exists, or
> > when a netlink owned filter already exists.
> >
> > Removing a bpf_link owned filter from netlink returns EPERM, denoting that
> > netlink is locked out from filter manipulation when bpf_link is
> > involved.
> >
> > Whenever a filter is detached due to chain removal, or qdisc tear down,
> > or net_device shutdown, the bpf_link becomes automatically detached.
> >
> > In this way, the netlink API and bpf_link creation path are exclusive
> > and don't stomp over one another. Filters created using bpf_link API
> > cannot be replaced by netlink API, and filters created by netlink API are
> > never replaced by bpf_link. Netlink also cannot detach bpf_link filters.
> >
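To make the ownership rules above a bit more concrete, here is a small
user-space model of them. This is not kernel code and nothing in it comes
from the patch: the names (model_filter, model_attach, model_netlink_delete,
model_teardown) are made up for illustration, and returning ENOENT for a
missing filter is my assumption. Only the EEXIST and EPERM cases mirror what
the commit message describes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; none of these names exist in the patch. */
enum model_owner { OWNER_NONE, OWNER_NETLINK, OWNER_BPF_LINK };

struct model_filter {
	enum model_owner owner;	/* who created the filter in this slot */
	bool link_attached;	/* whether a bpf_link still points at it */
};

/* Creating a filter fails with EEXIST if the slot is taken by either owner. */
static int model_attach(struct model_filter *f, enum model_owner who)
{
	assert(f != NULL);
	if (f->owner != OWNER_NONE)
		return -EEXIST;
	f->owner = who;
	f->link_attached = (who == OWNER_BPF_LINK);
	return 0;
}

/* Netlink may only delete filters it owns; bpf_link owned ones give EPERM. */
static int model_netlink_delete(struct model_filter *f)
{
	assert(f != NULL);
	if (f->owner == OWNER_NONE)
		return -ENOENT;	/* assumption, not stated in the commit message */
	if (f->owner == OWNER_BPF_LINK)
		return -EPERM;
	f->owner = OWNER_NONE;
	return 0;
}

/* Chain removal, qdisc tear down or device shutdown: the link auto-detaches. */
static void model_teardown(struct model_filter *f)
{
	assert(f != NULL);
	f->owner = OWNER_NONE;
	f->link_attached = false;
}
```

The point of the model is that a slot has exactly one owner at a time, so
neither path can replace or remove what the other one created.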
> > We serialize all changes dover rtnl_lock as cls_bpf API doesn't support the
>
> dover => over?
>
Thanks, will fix.
> > unlocked classifier API.
> >
> > Reviewed-by: Toke Høiland-Jørgensen <toke@...hat.com>
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@...il.com>
> > ---
> >  include/linux/bpf_types.h |   3 +
> >  include/net/pkt_cls.h     |  13 ++
> >  include/net/sch_generic.h |   6 +-
> >  include/uapi/linux/bpf.h  |  15 +++
> >  kernel/bpf/syscall.c      |  10 +-
> >  net/sched/cls_api.c       | 139 ++++++++++++++++++++-
> >  net/sched/cls_bpf.c       | 250 +++++++++++++++++++++++++++++++++++++-
> >  7 files changed, 430 insertions(+), 6 deletions(-)
> >
> [...]
> > subsys_initcall(tc_filter_init);
> > +
> > +#if IS_ENABLED(CONFIG_NET_CLS_BPF)
> > +
> > +int bpf_tc_link_attach(union bpf_attr *attr, struct bpf_prog *prog)
> > +{
> > +	struct net *net = current->nsproxy->net_ns;
> > +	struct tcf_chain_info chain_info;
> > +	u32 chain_index, prio, parent;
> > +	struct tcf_block *block;
> > +	struct tcf_chain *chain;
> > +	struct tcf_proto *tp;
> > +	int err, tp_created;
> > +	unsigned long cl;
> > +	struct Qdisc *q;
> > +	__be16 protocol;
> > +	void *fh;
> > +
> > +	/* Caller already checks bpf_capable */
> > +	if (!ns_capable(current->nsproxy->net_ns->user_ns, CAP_NET_ADMIN))
>
> net->user_ns?
>
True, will fix.
> > +		return -EPERM;
> > +
> > +	if (attr->link_create.flags ||
> > +	    !attr->link_create.target_ifindex ||
> > +	    !tc_flags_valid(attr->link_create.tc.gen_flags))
> > +		return -EINVAL;
> > +
> > +
> [...]
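For anyone following along, the order of the entry checks quoted above can
be sketched in user space as follows. This is only a model under stated
assumptions: has_net_admin stands in for the result of
ns_capable(net->user_ns, CAP_NET_ADMIN), struct model_link_create is a
made-up stand-in for the relevant link_create fields, and the flag values
and the body of model_tc_flags_valid() are my recollection of what
tc_flags_valid() does, not a copy of the kernel source.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values of the TCA_CLS_FLAGS_* bits; check the uapi header. */
#define MODEL_SKIP_HW	(1U << 0)
#define MODEL_SKIP_SW	(1U << 1)
#define MODEL_VERBOSE	(1U << 4)

/* Hypothetical stand-in for the link_create fields used by the checks. */
struct model_link_create {
	uint32_t flags;
	uint32_t target_ifindex;
	uint32_t gen_flags;
};

/* Accept only known bits, and reject skip_hw combined with skip_sw. */
static bool model_tc_flags_valid(uint32_t flags)
{
	const uint32_t skip = MODEL_SKIP_HW | MODEL_SKIP_SW;

	if (flags & ~(skip | MODEL_VERBOSE))
		return false;
	return (flags & skip) != skip;
}

/* Permission check first (EPERM), then argument validation (EINVAL). */
static int model_link_attach_checks(bool has_net_admin,
				    const struct model_link_create *lc)
{
	assert(lc != NULL);
	if (!has_net_admin)
		return -EPERM;
	if (lc->flags || !lc->target_ifindex ||
	    !model_tc_flags_valid(lc->gen_flags))
		return -EINVAL;
	return 0;
}
```

So a caller without CAP_NET_ADMIN sees EPERM regardless of the arguments,
and a zero ifindex, any link_create flags, or an invalid gen_flags
combination gives EINVAL.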
--
Kartikeya