Message-ID: <541fde6d-01ce-edf3-84e4-153756aba00f@mellanox.com>
Date: Thu, 26 Sep 2019 07:30:40 +0000
From: Paul Blakey <paulb@...lanox.com>
To: Edward Cree <ecree@...arflare.com>,
Jakub Kicinski <jakub.kicinski@...ronome.com>
CC: Pravin Shelar <pshelar@....org>,
Daniel Borkmann <daniel@...earbox.net>,
Vlad Buslov <vladbu@...lanox.com>,
David Miller <davem@...emloft.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Jiri Pirko <jiri@...nulli.us>,
Cong Wang <xiyou.wangcong@...il.com>,
Jamal Hadi Salim <jhs@...atatu.com>,
Simon Horman <simon.horman@...ronome.com>,
Or Gerlitz <gerlitz.or@...il.com>
Subject: Re: CONFIG_NET_TC_SKB_EXT
On 9/25/2019 8:01 PM, Edward Cree wrote:
> On 24/09/2019 12:48, Paul Blakey wrote:
>> The 'miss' for all or nothing is easy, but the hard part is combining
>> all the paths a packet can take in software to a single 'all or nothing'
>> rule in hardware.
> But you don't combine them to a single rule in hardware, because you
> have multiple sequential tables. (I just spent the last few weeks
> telling our hardware guys that no, they can't just give us one big
> table and expect the driver to do all that combining, because as you
> say, it's 'the hard part'.)
>
>> What if you 'miss' on the match for the tuple? You already did some
>> processing in hardware, so either you revert those, or you continue in
>> software where you left off (the action ct).
> But the only processing you did was to match stuff and generate metadata
> in the form of lookup keys (e.g. a ct_zone) for the next round of
> matching. There's nothing to "revert" unless you've actually modified
> the packet before sending it to CT, and as I said I don't believe that's
> worth supporting.
>
>> The all or nothing approach will require changing the software model to
>> allow merging the ct zone table matches into the hardware rules
> I don't know how much more clearly I can say this: all-or-nothing does not
> require merging. It just requires any actions that come before a matching
> stage (and that the hw doesn't have the capability to revert) to put a
> rule straight in the 'nothing' bucket.
> So if you write
> chain 0 dst_mac aa:bb:cc:dd:ee:ff ct_state -trk action vlan push blah action ct action goto chain X
> the driver can say -EOPNOTSUPP because you pushed a VLAN and might still
> miss in chain X. But if you write
> chain 0 dst_mac aa:bb:cc:dd:ee:ff ct_state -trk action ct action goto chain X
> then the driver will happily offload that because if you miss in the later
> lookups you've not altered the packet — the chain0-rule is *idempotent* so
> it doesn't matter if HW and SW both perform it. (Or even all three of HW,
> tc and OvS.)
OK, I thought you meant merging the rules, because we do want to support
those modification use cases.
In NAT scenarios the packet will be modified, and then there can be a miss:
-trk .... CT(zone X, Restore NAT),goto chain 1
+trk+est, match on ipv4, CT(zone Y), goto chain 2
+trk+est, output..
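
To make the problem concrete, here is a rough Python sketch of the all-or-nothing check as Edward describes it, applied to both his examples and the NAT rules above. This is not driver code: the Rule type, the action names, and the set of actions treated as packet-modifying are all hypothetical, and "ct_nat" stands in for CT(zone X, Restore NAT).

```python
from dataclasses import dataclass

# Assumption: these (hypothetical) action names are the ones that alter the
# packet and that the hardware cannot revert after a later miss.
MODIFYING = {"vlan_push", "ct_nat"}


@dataclass
class Rule:
    chain: int
    actions: list  # action names in execution order, e.g. ["ct", "goto"]


def can_offload_all_or_nothing(rule):
    """Return False if a packet-modifying action precedes a goto.

    A goto leads to another lookup that may miss; if the packet was
    already modified, the rule goes in the 'nothing' bucket.
    """
    modified = False
    for act in rule.actions:
        if act in MODIFYING:
            modified = True
        elif act == "goto" and modified:
            return False
    return True


# Edward's two examples
vlan_rule = Rule(0, ["vlan_push", "ct", "goto"])
plain_ct_rule = Rule(0, ["ct", "goto"])

# The NAT scenario above
nat_rule = Rule(0, ["ct_nat", "goto"])
est_rule = Rule(1, ["ct", "goto"])
out_rule = Rule(2, ["output"])

for r in (vlan_rule, plain_ct_rule, nat_rule, est_rule, out_rule):
    print(r.chain, r.actions, can_offload_all_or_nothing(r))
```

Under these assumptions the first NAT rule is rejected for the same reason as the VLAN push example, which is why all-or-nothing alone does not cover the NAT case.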
In tunneling scenarios, the tunnel device decapsulates the packet before
it even reaches OvS/tc, which is another modification.
Also, there are stats issues if we already accounted for some actions in
hardware.
Paul.