[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20181217173929.024ab742@cakuba.netronome.com>
Date: Mon, 17 Dec 2018 17:39:29 -0800
From: Jakub Kicinski <jakub.kicinski@...ronome.com>
To: Pablo Neira Ayuso <pablo@...filter.org>
Cc: netdev@...r.kernel.org, davem@...emloft.net,
thomas.lendacky@....com, f.fainelli@...il.com,
ariel.elior@...ium.com, michael.chan@...adcom.com,
santosh@...lsio.com, madalin.bucur@....com,
yisen.zhuang@...wei.com, salil.mehta@...wei.com,
jeffrey.t.kirsher@...el.com, tariqt@...lanox.com,
saeedm@...lanox.com, jiri@...lanox.com, idosch@...lanox.com,
peppe.cavallaro@...com, grygorii.strashko@...com, andrew@...n.ch,
vivien.didelot@...oirfairelinux.com, alexandre.torgue@...com,
joabreu@...opsys.com, linux-net-drivers@...arflare.com,
ganeshgr@...lsio.com, ogerlitz@...lanox.com,
Manish.Chopra@...ium.com, marcelo.leitner@...il.com,
mkubecek@...e.cz, venkatkumar.duvvuru@...adcom.com,
julia.lawall@...6.fr, john.fastabend@...il.com, jhs@...atatu.com,
gerlitz.or@...il.com
Subject: Re: [PATCH net-next,v6 00/12] add flow_rule infrastructure
On Fri, 14 Dec 2018 19:11:53 +0100, Pablo Neira Ayuso wrote:
> Hi,
>
> This patchset introduces a kernel intermediate representation (IR) to
> express ACL hardware offloads, this is heavily based on the existing
> flow dissector infrastructure and the TC actions. This IR can be used by
> different frontend ACL interfaces such as ethtool_rxnfc, tc and
> netfilter to represent ACL hardware offloads.
>
> The main goals of this patchset are:
>
> * Provide an unified representation that can be passed to the driver
> for translation to HW IR. This consolidates the code to be maintained
> in the driver and it also simplifies the development of ACL hardware
> offloads. Driver developers do not need to add one specific parser for
> each supported ACL frontend, instead each frontend just generates
> this flow_rule IR and pass it to drivers to populate the hardware IR.
Ack.
> * Do not expose TC software frontend details to the drivers anymore,
> such as structure layouts. Hence, if TC needs to be updated to support
> a new software feature, no changes to the drivers will be required.
This one you may need to clarify. TC already transforms the rules and
for the most part provides abstract enough accessor helpers to protect
from changes in internal implementation of the SW path.
You could make an argument that if one subsystem already offloads
something and the IR can express it another subsystem can add the same
functionality and drivers don't have to change. But why do we have
multiple subsystem offloading the same thing in the first place?..
> In handcrafted ascii art, the idea is the following:
>
> . ethtool_rxnfc tc netfilter
> | (ioctl) (netlink) (netlink)
> | | | | translate native
> Frontend | | | | interface representation
> | | | | to flow_rule IR
> | | | |
> . \/ \/ \/
> . ----- flow_rule IR ------
> | |
> Drivers | | parsing of flow_rule IR
> | | to populate hardware IR
> | \/
> . hardware IR (driver)
>
> This patchset only converts tc and ethtool_rxnfc to use this
> infrastructure. Therefore. this patchset comes with no netfilter
> changes at this stage.
Let's try to differentiate between cls_flower and TC as a subsystem.
[...]
> P.S: Moving forward, my rough plan is to propose and discuss the
> following changes to use this infrastructure from netfilter for ACL
> hardware offload:
>
> * Rename tc_setup ndo to flow_offload (or similar), so it can be used
> from netfilter too. Otherwise, I'm open for alternatives.
That is the part which really makes me uneasy. Ordering rules, tables
and offloads in HW is already a hard question, which keeps coming back
(see Jesse's talk from LPC for example). (And if you think you know
the answer you probably forgot about legacy SR-IOV :))
The TC offloads themselves today are still quite flaky. I don't think
anyone respects rule ordering/priorities at all. Even if we did order
cls_flower's rules in HW - what if someone added a cls_basic rule before
them. SW path will no longer match HW path. And we are about to mix
another subsystem into that soup.
I don't really see how you could rename ndo_setup_tc to flow_offload.
It offloads block notifications, and Qdiscs. You must realize that
actual rules are offloaded via block callbacks..? Are you planning to
create fake blocks for netfilter? Create yet another "give me all
rules for device X" registration construct?
What level of abstraction are we really going to achieve here? The
devices still need to know netfilter is a separate subsystem with its
quirks and rules. Say someone adds a PASS rule to flower, and a DROP
rule to netfilter. Packet has to be dropped, but the device will most
likely PASS because flower's rule will hit first. n-tuple filters
don't have pass so the problem isn't obvious from this patch set.
I'm starting to like Or's idea of "expressing nft rules in cls_flower".
I don't mean pretending its flower rules just for offload, but rather
have a TC classifier module that can take nft rules (in SW path) and
which would semantically fit very clearly into the existing TC
framework. Mirroring SW behaviour in HW becomes easier, we don't have
to teach drivers about another subsystem.
> * Add a conversion function from native netfilter representation to
> flow_rule, including extra new glue code from the two-phase commit
> protocol to integrate this infrastructure.
Powered by blists - more mailing lists