[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87plzc239o.fsf@toke.dk>
Date: Mon, 11 Dec 2023 16:00:03 +0100
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Martin KaFai Lau <martin.lau@...ux.dev>
Cc: deb.chatterjee@...el.com, anjali.singhai@...el.com,
namrata.limaye@...el.com, mleitner@...hat.com, Mahesh.Shirshyad@....com,
tomasz.osinski@...el.com, jiri@...nulli.us, xiyou.wangcong@...il.com,
davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, vladbu@...dia.com, horms@...nel.org,
khalidm@...dia.com, daniel@...earbox.net, bpf@...r.kernel.org,
netdev@...r.kernel.org, Jamal Hadi Salim <jhs@...atatu.com>
Subject: Re: [PATCH net-next v9 14/15] p4tc: add set of P4TC table kfuncs
Martin KaFai Lau <martin.lau@...ux.dev> writes:
> On 12/8/23 2:15 AM, Toke Høiland-Jørgensen wrote:
>> Martin KaFai Lau <martin.lau@...ux.dev> writes:
>>
>>> On 12/1/23 10:29 AM, Jamal Hadi Salim wrote:
>>>> We add an initial set of kfuncs to allow interactions from eBPF programs
>>>> to the P4TC domain.
>>>>
>>>> - bpf_p4tc_tbl_read: Used to lookup a table entry from a BPF
>>>> program installed in TC. To find the table entry we take in an skb, the
>>>> pipeline ID, the table ID, a key and a key size.
>>>> We use the skb to get the network namespace structure where all the
>>>> pipelines are stored. After that we use the pipeline ID and the table
>>>> ID, to find the table. We then use the key to search for the entry.
>>>> We return an entry on success and NULL on failure.
>>>>
>>>> - xdp_p4tc_tbl_read: Used to lookup a table entry from a BPF
>>>> program installed in XDP. To find the table entry we take in an xdp_md,
>>>> the pipeline ID, the table ID, a key and a key size.
>>>> We use struct xdp_md to get the network namespace structure where all
>>>> the pipelines are stored. After that we use the pipeline ID and the table
>>>> ID, to find the table. We then use the key to search for the entry.
>>>> We return an entry on success and NULL on failure.
>>>>
>>>> - bpf_p4tc_entry_create: Used to create a table entry from a BPF
>>>> program installed in TC. To create the table entry we take an skb, the
>>>> pipeline ID, the table ID, a key and its size, and an action which will
>>>> be associated with the new entry.
>>>> We return 0 on success and a negative errno on failure
>>>>
>>>> - xdp_p4tc_entry_create: Used to create a table entry from a BPF
>>>> program installed in XDP. To create the table entry we take an xdp_md, the
>>>> pipeline ID, the table ID, a key and its size, and an action which will
>>>> be associated with the new entry.
>>>> We return 0 on success and a negative errno on failure
>>>>
>>>> - bpf_p4tc_entry_create_on_miss: conforms to PNA "add on miss".
>>>> First does a lookup using the passed key and upon a miss will add the entry
>>>> to the table.
>>>> We return 0 on success and a negative errno on failure
>>>>
>>>> - xdp_p4tc_entry_create_on_miss: conforms to PNA "add on miss".
>>>> First does a lookup using the passed key and upon a miss will add the entry
>>>> to the table.
>>>> We return 0 on success and a negative errno on failure
>>>>
>>>> - bpf_p4tc_entry_update: Used to update a table entry from a BPF
>>>> program installed in TC. To update the table entry we take an skb, the
>>>> pipeline ID, the table ID, a key and its size, and an action which will
>>>> be associated with the new entry.
>>>> We return 0 on success and a negative errno on failure
>>>>
>>>> - xdp_p4tc_entry_update: Used to update a table entry from a BPF
>>>> program installed in XDP. To update the table entry we take an xdp_md, the
>>>> pipeline ID, the table ID, a key and its size, and an action which will
>>>> be associated with the new entry.
>>>> We return 0 on success and a negative errno on failure
>>>>
>>>> - bpf_p4tc_entry_delete: Used to delete a table entry from a BPF
>>>> program installed in TC. To delete the table entry we take an skb, the
>>>> pipeline ID, the table ID, a key and a key size.
>>>> We return 0 on success and a negative errno on failure
>>>>
>>>> - xdp_p4tc_entry_delete: Used to delete a table entry from a BPF
>>>> program installed in XDP. To delete the table entry we take an xdp_md, the
>>>> pipeline ID, the table ID, a key and a key size.
>>>> We return 0 on success and a negative errno on failure
>>>
>>> [ ... ]
>>>
>>>> +BTF_SET8_START(p4tc_kfunc_check_tbl_set_skb)
>>>> +BTF_ID_FLAGS(func, bpf_p4tc_tbl_read, KF_RET_NULL);
>>>> +BTF_ID_FLAGS(func, bpf_p4tc_entry_create);
>>>> +BTF_ID_FLAGS(func, bpf_p4tc_entry_create_on_miss);
>>>> +BTF_ID_FLAGS(func, bpf_p4tc_entry_update);
>>>> +BTF_ID_FLAGS(func, bpf_p4tc_entry_delete);
>>>> +BTF_SET8_END(p4tc_kfunc_check_tbl_set_skb)
>>>
>>> These create/read/update/delete kfuncs are like defining a new hidden bpf map
>>> type in the kernel. bpf prog can now create its own link-list and rbtree.
>>> sched_ext has already been using it. This is the way the bpf prog should use
>>> instead of creating a new map type.
>>
>> I don't really think this is an accurate assessment, given Jamal's use
>> case. These kfuncs are more akin to the FIB lookup helper, or the
>> netfilter kfuncs: they provide lookup into a kernel-internal data
>> structure, so that BPF can access that data structure while staying in
>> sync with the rest of the kernel.
>>
>> If this was a BPF-only implementation you'd be right, but given the
>> constraint of having the P4 objects represented in the kernel[0], I
>> think this is a perfectly reasonable use of kfuncs, even though they
>> happen to look like the map API.
>>
>> -Toke
>>
>> [0] Whether having those objects represented at all is reasonable is a
>> separate discussion, which I believe John et al are having with Jamal in
>> a separate subthread. I don't personally have any strong objections to
>> doing that.
>
> I might not be clear. It was my question on why it has to be in the kernel
> instead of in the bpf map, so the earlier bpf link-list and rbtree example just
> in case this recent bpf capability has not been considered.
A bit tangential, but it came to mind while thinking about this: how
would one go about updating a bpf rbtree-based data structure from
userspace? Is there a way to get bpf_map_update()-semantics that inserts
things into the rbtree somehow?
> If it is an existing kernel infra-structure, kfunc is a reasonable use.
>
> The P4 objects are newly added to this set with bpf program as its user. It can
> be represented in the bpf map as well instead of in the kernel.
>
> or is it fair to say that bpf prog is not the primary consumer of the P4
> objects. Instead kernel is the primary user of the p4 objects such that p4tc can
> work independently without the bpf piece to begin with and bpf could be
> considered as an extension later?
That's a good question, actually. I think that conceptually, if viewed
purely as a control plane, it could be merged separately and the BPF
support added later. But with this series, that would make it a control
plane that doesn't really control anything; so there would need to be a
second consumer (hardware offload?) added for that to make sense, I
suppose.
Or to put it another way, the way this series is designed, there is an
implicit "these are kernel objects that we want to use for other things"
assumption in there; it's just that those "other things" are not part
of this series (because hardware offload doesn't exist yet - I think?
I'll let Jamal answer that). I can see the point of asking for that
second user, though, as that would make it clear why the control plane
needs to be in the kernel.
-Toke
Powered by blists - more mailing lists