netdev - Re: CONFIG_NET_TC_SKB

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <3aff5059-c28c-f194-72f0-69edddf89f84@solarflare.com>
Date:   Thu, 26 Sep 2019 15:26:08 +0100
From:   Edward Cree <ecree@...arflare.com>
To:     Paul Blakey <paulb@...lanox.com>,
        Jakub Kicinski <jakub.kicinski@...ronome.com>
CC:     Pravin Shelar <pshelar@....org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Vlad Buslov <vladbu@...lanox.com>,
        David Miller <davem@...emloft.net>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Jiri Pirko <jiri@...nulli.us>,
        Cong Wang <xiyou.wangcong@...il.com>,
        Jamal Hadi Salim <jhs@...atatu.com>,
        Simon Horman <simon.horman@...ronome.com>,
        Or Gerlitz <gerlitz.or@...il.com>
Subject: Re: CONFIG_NET_TC_SKB_EXT

On 26/09/2019 14:56, Paul Blakey wrote:
>>> In nat scenarios the packet will be modified, and then there can be a miss:
>>>
>>>              -trk .... CT(zone X, Restore NAT),goto chain 1
>>>
>>>              +trk+est, match on ipv4, CT(zone Y), goto chain 2
>>>
>>>              +trk+est, output..
>> I'm confused, I thought the usual nat scenario looked more like
>>      0: -trk ... action ct(zone x), goto chain 1
>>      1: +trk+new ... action ct(commit, nat=foo) # sw only
>>      1: +trk+est ... action ct(nat), mirred eth1
>> i.e. the NAT only happens after conntrack has matched (and thus provided
>>   the saved NAT metadata), at the end of the pipe.  I don't see how you
>>   can NAT a -trk packet.
> Both are valid, Nat in the first hop, executes the nat stored on the 
> connection if available (configured by commit).
This still isn't making sense to me.
Until you've done a conntrack lookup and found the connection, you can't
 use NAT information that's stored in the connection.
So the NAT can only happen after a conntrack match is found.

And all the rest of your stuff (like doing conntrack twice, in different
 zones X and Y) is 'weird' inasmuch as it's beyond the basic minimum
 functionality for a useful offload, and inherently doesn't map to a
 fixed-layout (non-loopy) HW pipeline.  You may want to support it in
 your driver, you may be able to support it in your hardware, but it's
 not true that "even nat needs that" (the nat scenario I described above
 is entirely reasonable and is perfectly workable in an all-or-nothing
 offload world), so if your changes are causing problems, they should be
 reverted for this cycle.

>> AFAICT only 'deliverish' actions (i.e. mirred and drop) in TC have stats.
>> So stats are unlikely to be a problem unless you've got (say) a mirred
>>   mirror before you send to ct and goto chain, in which case the extra
>>   copy of the packet is a rather bigger problem for idempotency than mere
>>   stats ;-)
> All tc actions have software stats, and at least one (goto, mirred, 
> drop) per OvS generated rule will have hardware stats.
Ooh, goto has hardware stats?  That's something I hadn't spotted *re-
 draws hardware design slightly*

> All OvS datapath rules have stats, and in turn the translated TC rules 
> all have stats. OvS ages each rule independently.
TC rules do not have stats.  Only TC actions have stats.  (The offload
 API currently gets confused about this and it's really annoyed me...)