Message-ID: <Y9kLeGs0JmIJV0Ch@nanopsycho>
Date: Tue, 31 Jan 2023 13:37:12 +0100
From: Jiri Pirko <jiri@...nulli.us>
To: Toke Høiland-Jørgensen <toke@...hat.com>
Cc: Jamal Hadi Salim <jhs@...atatu.com>,
John Fastabend <john.fastabend@...il.com>,
Jamal Hadi Salim <hadi@...atatu.com>,
Willem de Bruijn <willemb@...gle.com>,
Stanislav Fomichev <sdf@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
kernel@...atatu.com, deb.chatterjee@...el.com,
anjali.singhai@...el.com, namrata.limaye@...el.com,
khalidm@...dia.com, tom@...anda.io, pratyush@...anda.io,
xiyou.wangcong@...il.com, davem@...emloft.net, edumazet@...gle.com,
pabeni@...hat.com, vladbu@...dia.com, simon.horman@...igine.com,
stefanc@...vell.com, seong.kim@....com, mattyk@...dia.com,
dan.daly@...el.com, john.andy.fingerhut@...el.com
Subject: Re: [PATCH net-next RFC 00/20] Introducing P4TC
Tue, Jan 31, 2023 at 01:17:14PM CET, toke@...hat.com wrote:
>Jamal Hadi Salim <jhs@...atatu.com> writes:
>
>> Toke, I don't think I have managed to get across that there is an
>> "autonomous" control built into the kernel. It is not just things that
>> come across netlink. It's about the whole infra.
>
>I'm not disputing the need for the TC infra to configure the pipelines
>and their relationship in the hardware. I'm saying that your
>implementation *of the SW path* is the wrong approach and it would be
>better done by using BPF (not talking about the existing TC-BPF,
>either).
>
>It's a bit hard to know your thinking for sure here, since your patch
>series doesn't include any of the offload control bits. But from the
>slides and your hints in this series, AFAICT, the flow goes something
>like:
>
>hw_pipeline_id = devlink_program_hardware(dev, p4_compiled_blob);
I don't think that devlink is the correct iface for this. If you want to
tie it together with the SW pipeline configurable by TC, use TC for it as
well, as you do for the BPF binary in this example. If you have a TC-block
shared among many netdevs, the HW needs to know about that in order to
bind the P4 input. Btw, you can have multiple netdevs from different
vendors sharing the same TC-block, in which case you need to upload
multiple HW binary blobs here.
What this might eventually result in is userspace uploading a list of
binaries with an indication of what the target is:
"BPF" -> xxx.o
"DRIVERNAMEX" -> aaa.bin
"DRIVERNAMEY" -> bbb.bin
In theory, there might even be HW that accepts the BPF binary :) My point
is, userspace provides a list of binaries and the individual kernel parts
take what they like.
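Just to illustrate what I mean by "take what they like", a plain C sketch
of the list and the per-consumer lookup (all names and structs here are
made up purely for illustration, not a proposed API):

#include <stdio.h>
#include <string.h>

struct p4_target_binary {
	const char *target;	/* "BPF", "DRIVERNAMEX", "DRIVERNAMEY", ... */
	const void *data;	/* binary blob as uploaded by userspace */
	size_t len;
};

static const struct p4_target_binary *
pick_binary(const struct p4_target_binary *list, size_t count,
	    const char *target)
{
	size_t i;

	/* Each consumer walks the uploaded list and takes only the
	 * entry matching its own target name, ignoring the rest.
	 */
	for (i = 0; i < count; i++)
		if (!strcmp(list[i].target, target))
			return &list[i];
	return NULL;
}

int main(void)
{
	static const char xxx[] = "contents of xxx.o";
	static const char aaa[] = "contents of aaa.bin";
	static const char bbb[] = "contents of bbb.bin";
	const struct p4_target_binary list[] = {
		{ "BPF",         xxx, sizeof(xxx) },
		{ "DRIVERNAMEX", aaa, sizeof(aaa) },
		{ "DRIVERNAMEY", bbb, sizeof(bbb) },
	};
	const struct p4_target_binary *bin;

	/* e.g. the driver for vendor X asks for its blob */
	bin = pick_binary(list, sizeof(list) / sizeof(list[0]),
			  "DRIVERNAMEX");
	if (bin)
		printf("DRIVERNAMEX gets %zu bytes\n", bin->len);

	return 0;
}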
>sw_pipeline_id = `tc p4template create ...` (etc, this is generated by P4C)
>
>tc_act = tc_act_create(hw_pipeline_id, sw_pipeline_id)
>
>which will turn into something like:
>
>struct p4_cls_offload ofl = {
> .classid = classid,
> .pipeline_id = hw_pipeline_id
>};
>
>if (check_sw_and_hw_equivalence(hw_pipeline_id, sw_pipeline_id)) /* some magic check here */
Ha! I would like to see this magic here :)
> return -EINVAL;
>
>netdev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_P4, &ofl);
>
>
>I.e., all that's being passed to the hardware is the ID of the
>pre-programmed pipeline, because that programming is going to be
>out-of-band via devlink anyway.
>
>In which case, you could just as well replace the above:
>
>sw_pipeline_id = `tc p4template create ...` (etc, this is generated by P4C)
>
>with
>
>sw_pipeline_id = bpf_prog_load(BPF_PROG_TYPE_P4TC, "my_obj_file.o"); /* my_obj_file is created by P4c */
>
>and achieve exactly the same.
>
>Having all the P4 data types and concepts exist inside the kernel
>*might* make sense if the kernel could then translate those into the
>hardware representations and manage their lifecycle in a uniform way.
>But as far as I can tell from the slides and what you've been saying in
>this thread that's not going to be possible anyway, so why do you need
>anything more granular than the pipeline ID?
>
>-Toke
>