lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-ID: <a14a83e9-e159-3ee0-782b-c4caf7c25428@linux.dev> Date: Thu, 26 Oct 2023 11:46:21 -0700 From: Martin KaFai Lau <martin.lau@...ux.dev> To: Kui-Feng Lee <sinquersw@...il.com> Cc: netdev@...r.kernel.org, razor@...ckwall.org, ast@...nel.org, andrii@...nel.org, john.fastabend@...il.com, sdf@...gle.com, toke@...nel.org, kuba@...nel.org, andrew@...n.ch, Toke Høiland-Jørgensen <toke@...hat.com>, bpf@...r.kernel.org, Daniel Borkmann <daniel@...earbox.net> Subject: Re: [PATCH bpf-next v4 1/7] netkit, bpf: Add bpf programmable net device On 10/26/23 10:47 AM, Kui-Feng Lee wrote: > > > On 10/25/23 23:20, Daniel Borkmann wrote: >> Hi Kui-Feng, >> >> On 10/26/23 3:18 AM, Kui-Feng Lee wrote: >>> On 10/25/23 18:15, Kui-Feng Lee wrote: >>>> On 10/25/23 15:09, Martin KaFai Lau wrote: >>>>> On 10/25/23 2:24 PM, Kui-Feng Lee wrote: >>>>>> On 10/24/23 14:48, Daniel Borkmann wrote: >>>>>>> This work adds a new, minimal BPF-programmable device called "netkit" >>>>>>> (former PoC code-name "meta") we recently presented at LSF/MM/BPF. The >>>>>>> core idea is that BPF programs are executed within the drivers xmit routine >>>>>>> and therefore e.g. in case of containers/Pods moving BPF processing closer >>>>>>> to the source. >>>>>> >>>>>> Sorry for intruding into this discussion! Although it is too late to >>>>>> mentioned this since this patchset have been v4 already. >>>>>> >>>>>> I notice netkit has introduced a new attach type. I wonder if it >>>>>> possible to implement it as a new struct_ops type. >>>>> >>>>> Could your elaborate more about what does this struct_ops type do and how >>>>> is it different from the SCHED_CLS bpf prog that the netkit is running? >>>> >>>> I found the code has been landed. >>>> Basing on the landed code and >>>> the patchset of registering bpf struct_ops from modules that I >>>> am working on, it will looks like what is done in following patch. >>>> No changes on syscall, uapi and libbpf are required. >>>> >>>> diff --git a/drivers/net/netkit.c b/drivers/net/netkit.c >>>> index 7e484f9fd3ae..e4eafaf397bf 100644 >>>> --- a/drivers/net/netkit.c >>>> +++ b/drivers/net/netkit.c >>>> @@ -20,6 +20,7 @@ struct netkit { >>>> struct bpf_mprog_entry __rcu *active; >>>> enum netkit_action policy; >>>> struct bpf_mprog_bundle bundle; >>>> + struct hlist_head ops_list; >>>> >>>> /* Needed in slow-path */ >>>> enum netkit_mode mode; >>>> @@ -27,6 +28,13 @@ struct netkit { >>>> u32 headroom; >>>> }; >>>> >>>> +struct netkit_ops { >>>> + struct hlist_node node; >>>> + int ifindex; >>>> + >>>> + int (*xmit)(struct sk_buff *skb); >>>> +}; >>>> + >>>> struct netkit_link { >>>> struct bpf_link link; >>>> struct net_device *dev; >>>> @@ -46,6 +54,22 @@ netkit_run(const struct bpf_mprog_entry *entry, struct >>>> sk_buff *skb, >>>> if (ret != NETKIT_NEXT) >>>> break; >>>> } >>>> + >>>> + return ret; >>>> +} >>>> + >>>> +static __always_inline int >>>> +netkit_run_st_ops(const struct netkit *nk, struct sk_buff *skb, >>>> + enum netkit_action ret) >>>> +{ >>>> + struct netkit_ops *ops; >>>> + >>>> + hlist_for_each_entry_rcu(ops, &nk->ops_list, node) { >>>> + ret = ops->xmit(skb); >>>> + if (ret != NETKIT_NEXT) >>>> + break; >>>> + } >>>> + >>>> return ret; >>>> } >>>> >>>> @@ -80,6 +104,8 @@ static netdev_tx_t netkit_xmit(struct sk_buff *skb, >>>> struct net_device *dev) >>>> entry = rcu_dereference(nk->active); >>>> if (entry) >>>> ret = netkit_run(entry, skb, ret); >>>> + if (ret == NETKIT_NEXT) >>>> + ret = netkit_run_st_ops(nk, skb, ret); >>>> switch (ret) { >>>> case NETKIT_NEXT: >>>> case NETKIT_PASS: >> >> I don't think it makes sense to cramp struct ops in here for what has been >> solved already with the bpf_mprog interface in a more efficient way and with >> control dependencies for the insertion (before/after relative programs/links). >> The latter is in particular crucial for a multi-user interface when dealing >> with network traffic (think for example: policy, forwarder, observability >> prog, etc). >> > > I don't mean to cramp two implementations together > and don't notice this patchset is already landed at beginning. There are a few ways to track this. patchwork bot will send a landing message to the list. There is a few mins lag time but I don't think this lags matter here. You may want to check your inbox and ensure it gets through. git always has the source of true also. > This patch is just for explanation of how it likes if it is implemented > with just struct_ops (without bpf_mprog). Thanks for sharing a struct_ops code snippet. It is an interesting idea to embed ifindex and other details in the struct. Leaving it still needs verifier changes to make the PTR_TO_BTF_ID skb in struct_ops to work like tc __sk_buff such that all existing tc-bpf prog will work as is. Daniel has already mentioned the ordering API (bpf_mprog) that has been discussed for a year and has already been used in tc-link which I hope it will be extended to solve the xdp ordering also. I am also not convinced saving two attach types (note the prog type is the same here) deserve to re-create something in-parallel to tc-link and then require the same "skb" bpf dataplane program to be administrated (attach/introspect...etc) differently.
Powered by blists - more mailing lists