Message-ID: <CO1PR11MB4993CA55EDF590EF66FF3D4893D39@CO1PR11MB4993.namprd11.prod.outlook.com>
Date:   Mon, 30 Jan 2023 23:24:34 +0000
From:   "Singhai, Anjali" <anjali.singhai@...el.com>
To:     Jamal Hadi Salim <jhs@...atatu.com>,
        Toke Høiland-Jørgensen <toke@...hat.com>
CC:     John Fastabend <john.fastabend@...il.com>,
        Jamal Hadi Salim <hadi@...atatu.com>,
        Jiri Pirko <jiri@...nulli.us>,
        Willem de Bruijn <willemb@...gle.com>,
        Stanislav Fomichev <sdf@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "kernel@...atatu.com" <kernel@...atatu.com>,
        "Chatterjee, Deb" <deb.chatterjee@...el.com>,
        "Limaye, Namrata" <namrata.limaye@...el.com>,
        "khalidm@...dia.com" <khalidm@...dia.com>,
        "tom@...anda.io" <tom@...anda.io>,
        "pratyush@...anda.io" <pratyush@...anda.io>,
        "xiyou.wangcong@...il.com" <xiyou.wangcong@...il.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "edumazet@...gle.com" <edumazet@...gle.com>,
        "pabeni@...hat.com" <pabeni@...hat.com>,
        "vladbu@...dia.com" <vladbu@...dia.com>,
        "simon.horman@...igine.com" <simon.horman@...igine.com>,
        "stefanc@...vell.com" <stefanc@...vell.com>,
        "seong.kim@....com" <seong.kim@....com>,
        "mattyk@...dia.com" <mattyk@...dia.com>,
        "Daly, Dan" <dan.daly@...el.com>,
        "Fingerhut, John Andy" <john.andy.fingerhut@...el.com>
Subject: RE: [PATCH net-next RFC 00/20] Introducing P4TC

Devlink is only for downloading the vendor-specific compiler output for a P4 program and for teaching the driver the names of the runtime P4 objects and how they map onto the HW. This helps with the initial definition of the dataplane.

Devlink is NOT for runtime programming of the dataplane. That has to go through the P4TC block, so that anybody can deploy a programmable dataplane split between the HW and the SW, with some flows in HW and some in SW, or some processing in HW and some in SW. The ndo_setup_tc framework and its support in the drivers will give us the hooks to program the HW match-action entries.
Please explain how, in the eBPF model, I would program the HW at runtime.

Thanks
Anjali
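
To make the HW/SW split described in this message concrete, here is a minimal Python sketch of a match-action table whose entries can live in hardware, in software, or in both. It is a toy model only: ToyTable, hw_capacity and the returned strings are made up for illustration, and the skip_sw/skip_hw arguments are loosely modelled on the tc flower flags of the same name. It does not reflect the P4TC patches or any driver's ndo_setup_tc handling.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyTable:
    """Toy match-action table: a small 'hardware' table plus a 'software' one."""
    hw_capacity: int                          # hypothetical HW table size
    hw: dict = field(default_factory=dict)    # entries installed in hardware
    sw: dict = field(default_factory=dict)    # entries installed in software

    def add(self, key: str, action: str,
            skip_sw: bool = False, skip_hw: bool = False) -> str:
        """Install one entry and report where it ended up."""
        if skip_sw and skip_hw:
            raise ValueError("entry must live somewhere")
        in_hw = False
        # Offload opportunistically while the hardware table has room.
        if not skip_hw and (key in self.hw or len(self.hw) < self.hw_capacity):
            self.hw[key] = action
            in_hw = True
        if skip_sw:
            if not in_hw:
                raise RuntimeError("hardware table full and skip_sw requested")
            return "hw"
        self.sw[key] = action
        return "hw+sw" if in_hw else "sw"

    def lookup(self, key: str) -> Optional[tuple]:
        """Hardware sees the packet first; misses fall through to software."""
        if key in self.hw:
            return ("hw", self.hw[key])
        if key in self.sw:
            return ("sw", self.sw[key])
        return None


table = ToyTable(hw_capacity=1)
print(table.add("10.0.0.1", "drop"))       # fits in the hardware table
print(table.add("10.0.0.2", "forward"))    # hardware is full, software only
print(table.lookup("10.0.0.2"))
```

With a hardware capacity of one, the first entry is offloaded and the second is served from software only, which is the "some flows in HW and some in SW" situation in miniature.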


-----Original Message-----
From: Jamal Hadi Salim <jhs@...atatu.com> 
Sent: Monday, January 30, 2023 2:54 PM
To: Toke Høiland-Jørgensen <toke@...hat.com>
Cc: John Fastabend <john.fastabend@...il.com>; Jamal Hadi Salim <hadi@...atatu.com>; Jiri Pirko <jiri@...nulli.us>; Willem de Bruijn <willemb@...gle.com>; Stanislav Fomichev <sdf@...gle.com>; Jakub Kicinski <kuba@...nel.org>; netdev@...r.kernel.org; kernel@...atatu.com; Chatterjee, Deb <deb.chatterjee@...el.com>; Singhai, Anjali <anjali.singhai@...el.com>; Limaye, Namrata <namrata.limaye@...el.com>; khalidm@...dia.com; tom@...anda.io; pratyush@...anda.io; xiyou.wangcong@...il.com; davem@...emloft.net; edumazet@...gle.com; pabeni@...hat.com; vladbu@...dia.com; simon.horman@...igine.com; stefanc@...vell.com; seong.kim@....com; mattyk@...dia.com; Daly, Dan <dan.daly@...el.com>; Fingerhut, John Andy <john.andy.fingerhut@...el.com>
Subject: Re: [PATCH net-next RFC 00/20] Introducing P4TC

I think we are going in circles. John, I asked you earlier and I think you answered my question: you are actually pitching an out-of-band runtime using some vendor SDK via devlink (why even bother with the devlink interface, I am not sure). P4TC is saying the runtime API goes via the kernel to the drivers.

Toke, I don't think I have managed to get across that there is an "autonomous" control built into the kernel. It is not just things that come across netlink; it's about the whole infra. Without that clarity we are going to talk past each other, and it's a frustrating discussion which could get emotional. You can't just displace, for example, flower and say "use eBPF because it works on tc"; there's a lot of tribal knowledge gluing together the relationship between hardware and software.
Maybe take a look at the patchset below for an example of how an action graph can run partly in hardware and partly in software under certain conditions:
https://www.spinics.net/lists/netdev/msg877507.html
Then we can have a better discussion.

cheers,
jamal
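
As a very rough illustration of an action graph running partly in hardware and partly in software, the Python sketch below splits a linear chain of actions into the longest prefix a device can execute and the remainder left for software. This is a simplification under stated assumptions, not the mechanism used in the linked patchset: the HW_CAPABLE set, the action names and split_action_chain are all hypothetical, and a real action graph is not necessarily linear.

```python
from typing import List, Tuple

# Hypothetical set of actions this toy "NIC" is able to execute.
HW_CAPABLE = {"decap", "set_vlan", "count"}


def split_action_chain(actions: List[str]) -> Tuple[List[str], List[str]]:
    """Return (hw_part, sw_part) for an ordered chain of actions.

    The hardware part is the longest prefix of offloadable actions.
    Everything from the first non-offloadable action onwards stays in
    software so that the original ordering is preserved.
    """
    for i, action in enumerate(actions):
        if action not in HW_CAPABLE:
            return actions[:i], actions[i:]
    return list(actions), []


hw_part, sw_part = split_action_chain(["decap", "count", "nat", "set_vlan"])
print("hardware:", hw_part)
print("software:", sw_part)
```

In this toy, set_vlan ends up in software even though the device could run it, because it follows an action that cannot be offloaded.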


On Mon, Jan 30, 2023 at 4:21 PM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
>
> John Fastabend <john.fastabend@...il.com> writes:
>
> > Toke Høiland-Jørgensen wrote:
> >> Jamal Hadi Salim <hadi@...atatu.com> writes:
> >>
> >> > On Mon, Jan 30, 2023 at 12:04 PM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
> >> >>
> >> >> Jamal Hadi Salim <jhs@...atatu.com> writes:
> >> >>
> >> >> > So I don't have to respond to each email individually, I will
> >> >> > respond here in no particular order. First let me provide some
> >> >> > context; if that was already clear, please skip it. Hopefully
> >> >> > providing the context will help us focus, otherwise that
> >> >> > bikeshed's color and shape will take forever to settle on.
> >> >> >
> >> >> > __Context__
> >> >> >
> >> >> > I hope we all agree that when you have a 2x100G NIC (and I have
> >> >> > seen people asking for 2x800G NICs) no XDP or DPDK is going to
> >> >> > save you. To visualize: one 25G port is 35Mpps unidirectional.
> >> >> > So "software stack" is not the answer. You need to offload.
> >> >>
> >> >> I'm not disputing the need to offload, and I'm personally
> >> >> delighted that P4 is breaking open the vendor black boxes to
> >> >> provide a standardised interface for this.
> >> >>
> >> >> However, while it's true that software can't keep up at the high 
> >> >> end, not everything runs at the high end, and today's high end 
> >> >> is tomorrow's mid end, in which XDP can very much play a role. 
> >> >> So being able to move smoothly between the two, and even 
> >> >> implement functions that split processing between them, is an 
> >> >> essential feature of a programmable networking path in Linux. 
> >> >> Which is why I'm objecting to implementing the P4 bits as
> >> >> something that's hanging off the side of the stack in its own
> >> >> thing and is not integrated with the rest of the stack. You were
> >> >> touting this as a feature ("being self-contained"). I consider
> >> >> it a bug.
> >> >>
> >> >> > Scriptability is not a new idea in TC (see u32 and pedit and 
> >> >> > others in TC).
> >> >>
> >> >> u32 is notoriously hard to use. The others are neat, but 
> >> >> obviously limited to particular use cases.
> >> >
> >> > Despite my love for u32, I admit its user interface is cryptic. I
> >> > just wanted to point to existing examples of scriptable and
> >> > offloadable TC objects.
> >> >
> >> >> Do you actually expect anyone to use P4 by manually entering TC 
> >> >> commands to build a pipeline? I really find that hard to 
> >> >> believe...
> >> >
> >> > You don't have to hand-code anything; it's the compiler's job.
> >>
> >> Right, that was kinda my point: in that case the compiler could 
> >> just as well generate a (set of) BPF program(s) instead of this TC script thing.
> >>
> >> >> > IOW, we are reusing and plugging into a proven and deployed 
> >> >> > mechanism with a built-in policy driven, transparent symbiosis 
> >> >> > between hardware offload and software that has matured over 
> >> >> > time. You can take a pipeline or a table or actions and split 
> >> >> > them between hardware and software transparently, etc.
> >> >>
> >> >> That's a control plane feature though, it's not an argument for 
> >> >> adding another interpreter to the kernel.
> >> >
> >> > I am not sure what you mean by control, but what I described is
> >> > built into the kernel. Of course I could do more complex things from
> >> > user space (if that is what you mean by control).
> >>
> >> "Control plane" as in SDN parlance. I.e., the bits that keep track
> >> of the flow/pipeline/table configuration.
> >>
> >> There's no reason you can't have all that infrastructure and use 
> >> BPF as the datapath language. I.e., instead of:
> >>
> >> tc p4template create pipeline/aP4proggie numtables 1 ... + all the 
> >> other stuff to populate it
> >>
> >> you could just do:
> >>
> >> tc p4 create pipeline/aP4proggie obj_file aP4proggie.bpf.o
> >>
> >> and still have all the management infrastructure without the new 
> >> interpreter and associated complexity in the kernel.
> >>
> >> >> > This hammer already meets our goals.
> >> >>
> >> >> That 60k+ line patch submission of yours says otherwise...
> >> >
> >> > This is pretty much covered in the cover letter and a few 
> >> > responses in the thread since.
> >>
> >> The only argument for why your current approach makes sense I've 
> >> seen you make is "I don't want to rewrite it in BPF". Which is not 
> >> a technical argument.
> >>
> >> I'm not trying to be disingenuous here, BTW: I really don't see the 
> >> technical argument for why the P4 data plane has to be implemented 
> >> as its own interpreter instead of integrating with what we have 
> >> already (i.e., BPF).
> >>
> >> -Toke
> >>
> >
> > I'll just take this here because I think it's mostly related.
> >
> > Still not convinced that P4TC has any value for sw. From the slide
> > you say vendors prefer, you have roughly this picture.
> >
> >
> >    [ P4 compiler ] ------ [ P4TC backend ] ----> TC API
> >         |
> >         |
> >    [ P4 Vendor backend ]
> >         |
> >         |
> >         V
> >    [ Devlink ]
> >
> >
> > Now just replace the P4TC backend with P4C, and your only work is to
> > replace devlink with the current hw specific bits, and you have both
> > sw and hw components. Then you get XDP-BPF pretty easily from P4XDP
> > backend if you like. The compat piece is handled by compiler where 
> > it should be. My CPU is not a MAT so pretending it is seems not 
> > ideal to me, I don't have a TCAM on my cores.
> >
> > For runtime, get those vendors to write their SDKs over Devlink and
> > there is no need for this software thing. The runtime for P4C should
> > already work over BPF, giving this picture:
> >
> >    [ P4 compiler ] ------ [ P4C backend ] ----> BPF
> >         |
> >         |
> >    [ P4 Vendor backend ]
> >         |
> >         |
> >         V
> >    [ Devlink ]
> >
> > And much less work for us to maintain.
>
> Yes, this was basically my point as well. Thank you for putting it 
> into ASCII diagrams! :)
>
> There's still the control plane bit: some kernel component that 
> configures the pieces (pipelines?) created in the top-right and 
> bottom-left corners of your diagram(s), keeping track of which 
> pipelines are in HW/SW, maybe updating some match tables dynamically 
> and extracting statistics. I'm totally OK with having that bit be in 
> the kernel, but that can be added on top of your second diagram just 
> as well as on top of the first one...
>
> -Toke
>
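
As a rough cross-check of the packet rates in the quoted context, the Python sketch below computes the theoretical maximum packets per second on an Ethernet link at minimum frame size. It assumes 64-byte frames plus 20 bytes of preamble and inter-frame gap per packet; that is standard Ethernet framing, but it is an assumption here because the thread does not state a packet size. The name max_pps is made up for this example.

```python
# Back-of-the-envelope Ethernet packet rate at minimum frame size.
MIN_FRAME_BYTES = 64        # minimum Ethernet frame, including FCS
PREAMBLE_SFD_BYTES = 8      # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # minimum gap between frames


def max_pps(link_bps: float, frame_bytes: int = MIN_FRAME_BYTES) -> float:
    """Maximum packets per second in one direction for a given link speed."""
    wire_bits = (frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / wire_bits


for gbps in (25, 100, 800):
    print(f"{gbps}G: {max_pps(gbps * 1e9) / 1e6:.1f} Mpps per direction")
```

Under these assumptions a single 25G port comes out at about 37 Mpps in one direction, the same order as the 35Mpps figure quoted in the thread, and each port of a 2x100G NIC at about 149 Mpps.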
