netdev - Re: [PATCH net-next 0/4] net/sched: cls_api: Support hardware miss to tc action

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d7404d79-1d3b-9392-7911-1851f1c37cbf@nvidia.com>
Date:   Wed, 18 Jan 2023 23:27:21 +0200
From:   Paul Blakey <paulb@...dia.com>
To:     Jamal Hadi Salim <jhs@...atatu.com>
Cc:     netdev@...r.kernel.org, Saeed Mahameed <saeedm@...dia.com>,
        Paolo Abeni <pabeni@...hat.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Eric Dumazet <edumazet@...gle.com>,
        Cong Wang <xiyou.wangcong@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Oz Shlomo <ozsh@...dia.com>, Jiri Pirko <jiri@...dia.com>,
        Roi Dayan <roid@...dia.com>, Vlad Buslov <vladbu@...dia.com>
Subject: Re: [PATCH net-next 0/4] net/sched: cls_api: Support hardware miss to
 tc action



On 18/01/2023 14:54, Jamal Hadi Salim wrote:
> On Tue, Jan 17, 2023 at 9:48 AM Paul Blakey <paulb@...dia.com> wrote:
>>
>> On 17/01/2023 15:40, Jamal Hadi Salim wrote:
> 
> [..]
> 
>>> Question: How does someone adding these rules tell whether some of
>>> the actions are offloaded and some are not? If i am debugging this because
>>> something was wrong I would like to know.
>>
>> Currently by looking at the per action hw stats, and if they are
>> advancing. This is the same now with filters and the CT action for
>> new connections (driver reports offload, but it means that only for
>> established connections).
> 
> I think that may be sufficient given we use the same technique for
> filter offload.
> Can you maybe post an example of such a working example in your commit message
> with stats?
> You showed a candidate scenario that could be sorted but not a running example.
> 

Sure Ill give it as full example in v3.

>>> It will be an action continue for a scenario where (on ingress) you have
>>> action A from A,B,C being offloaded and B,C is in s/w - the fw filter
>>> will have the
>>> B,C and flower can have A offloaded.
>>> Yes, someone/thing programming these will have to know that only A can
>>> be offloaded
>>> in that graph.
>>
>> I meant using continue to go to next tc priority "as in "action A action
>> continue" but I'm not sure about the actual details of fully supporting
>> this as its not the purpose of this patch, but maybe will lead there.
> 
> Yeah, that was initially confusing when i read the commit log. It sounded
> like action continue == action pipe (because it continues to the next action
> in the action graph).
> Maybe fix the commit to be clearer.

I don't think I mentioned it in the cover letter/commits, or did I miss 
it ?

> 
>>> Ok, so would this work for the scenario I described above? i.e A,B, C where
>>> A is offloaded but not B, C?
>>
>> You mean the reorder? we reorder the CT action first if all other
>> actions are supported, so we do all or nothing.
> 
> Let me give a longer explanation.
> The key i believe is understanding the action dependency. In my mind
> there are 3 levels of
> complexity for assumed ordering of actions A, B, C:
> 
> 1) The simplest thing is to assume all-or-nothing (which is what we
> have done so far in tc);
> IOW if not all of A, B, C can be offloaded then we dont offload. >
> 2) next level of complexity is assuming that A MUST occur before B
> which MUST occur before C.
> Therefore on ingress you can offload part of that graph depending on
> your hardware capability.
> Example: On ingress A, B offloaded and then "continue" to C in s/w if
> your hardware supports
> only offloading A and B but not C. You do the reverse of that graph
> for egress offload.

This is actually the case we want support in this patchset.
Assuming a tc filter has action A , action CT, action B.
If hardware finds that it can't do CT action in hardware (for example 
for a new connection), but we already did action A, we want to continue
executing to "action CT, action B" in sw.

We can use it for partial offload of the action list, but for now it 
will be used for supporting tuple rewrite action followed by action ct 
such as in the example in the cover letter.

> 
> 3) And your case is even more complex because you have a lot more
> knowledge that infact
> there is no action dependency and you can offload something in the
> middle like B.
> So i believe you are solving a harder problem than #2 which is what
> was referring to in
> my earlier email.
> 


This is something we currently do but is "transparent" to the user.
If we have action A, action CT, action B, where A/B != CT and A doesn't 
affect CT result (no dependency), we reorder it internally as action CT, 
action A, action B. Then if we can't do CT in hardware, we didn't do A, 
and can continue in sw to re-execute the filter and actions.

If there is no action dependency then let driver take care of the 
details as we currently do we mlx5.


> The way these things are typically solved is to have a "dependency"
> graph that can be
> programmed depending on h/w offload capability and then you can make a decision
> whether (even in s/w) to allow A,B,C vs C,A,B for example.
> 
> Note: I am not asking for the change - but would be nice to have and I
> think over time
> generalize. I am not sure how other vendors would implement this today.
> 
> Note: if i have something smart in user space - which is what i was
> referring to earlier
> (with mention of skbmark) I can achieve these goals without any kernel
> code changes
> but like i said i understand the operational simplicity by putting
> things in the kernel.

Yeah that would work.


Thanks for the comments,
Paul.

>  > cheers,
> jamal