Message-ID: <a2928082-3764-3765-13cb-68be519f88f2@nvidia.com>
Date: Wed, 28 Sep 2022 18:19:56 +0300
From: Oz Shlomo <ozsh@...dia.com>
To: Edward Cree <ecree.xilinx@...il.com>, netdev@...r.kernel.org
Cc: Jiri Pirko <jiri@...dia.com>, Jamal Hadi Salim <jhs@...atatu.com>,
Simon Horman <simon.horman@...igine.com>,
Baowen Zheng <baowen.zheng@...igine.com>,
Vlad Buslov <vladbu@...dia.com>,
Ido Schimmel <idosch@...dia.com>, Roi Dayan <roid@...dia.com>
Subject: Re: [ RFC net-next 2/3] net: flow_offload: add action stats api
Hi Edward,
On 8/17/2022 5:43 PM, Oz Shlomo wrote:
> Hi Edward,
>
> On 8/16/2022 4:42 PM, Edward Cree wrote:
>> On 16/08/2022 10:23, Oz Shlomo wrote:
>>> The current offload api provides visibility to flow hw stats.
>>> This works as long as the flow stats values apply to all the flow's
>>> actions. However, this assumption breaks when an action, such as police,
>>> decides to drop or jump over other actions.
>>>
>>> Extend the flow_offload api to return stat record per action instance.
>>> Use the per action stats value, if available, when updating the action
>>> instance counters.
>>>
>>> Signed-off-by: Oz Shlomo <ozsh@...dia.com>
>>
>> When I worked on this before I tried with a similar "array of action
>> stats" API [1], but after some discussion it seemed cleaner to have
>> a "get stats for one single action" callback [2] which then could
>> be called in a loop for filter dumps but also called singly for
>> action dumps (RTM_GETACTION). I recommend this approach to your
>> consideration.
>>
>> [1]:
>> https://lore.kernel.org/all/9804a392-c9fd-8d03-7900-e01848044fea@solarflare.com/
>>
>> [2]:
>> https://lore.kernel.org/all/a3f0a79a-7e2c-4cdc-8c97-dfebe959ab1f@solarflare.com/
>>
>>
>
> The recent hw_actions infrastructure provides the platform for updating
> stats per action.
> However, the platform does introduce performance penalties as it invokes
> a driver api method call per action (compared to the current single api
> call). It also requires the driver to look up the specific action counter
> - requiring more processing compared to the current flow cookie lookup.
> Furthermore, the current single stats per filter (rather than per
> action) design only breaks when using branching actions (e.g. police),
> which probably applies to a small subset of the rules.
>
> This series proposes two apis:
> 1. High performance api for filter dump update (ovs triggers a dump per
> rule per second) - extending the current api providing the driver an
> option to update stats per action, if required.
> 2. Re-use the hw_actions api for tc action list update (see patch #3)
>
I tried implementing per action stats using the hw_action api, and
functionally the api worked well.
However, allocating a hardware counter per action is extremely
inefficient, so several actions end up sharing one hw counter. The driver
then has to look up the action's counter (hashtable lookup) and also
update all the other action stats hanging on this hw counter (requiring
list iteration and locks).
This makes for quite a complex design with performance overheads.
Stats update is performance sensitive, as ovs queries the filters' stats
every second.
Supporting the tc action stats api in this way would degrade performance
for existing use cases.
Extending the existing flow_offload api will preserve the current
functionality (single flow stat which applies to all the actions) and
performance while providing the ability to specify per action stats for
use cases involving branching actions.
In the future we could add driver support for returning per action
stats using the current hw_action api.
WDYT?
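
To make the intended behaviour concrete, here is a rough userspace sketch
of the update path I have in mind. It is not the kernel code: struct
stats_rec and update_action_counters() are hypothetical stand-ins, and I
am assuming one stats record per action here, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a hardware stats record. */
struct stats_rec {
	uint64_t pkts;
	uint64_t bytes;
};

/*
 * Fold one dump result into the software counters of each action.
 *
 * act_stats may be NULL. In this sketch the driver only fills it when
 * the rule contains a branching action, so the common case stays a
 * single flow-level record that is applied to every action.
 */
static void update_action_counters(struct stats_rec *sw, size_t n_actions,
				   const struct stats_rec *flow_stats,
				   const struct stats_rec *act_stats)
{
	assert(sw != NULL && flow_stats != NULL);

	for (size_t i = 0; i < n_actions; i++) {
		/* Prefer the per action record, else fall back to the flow record. */
		const struct stats_rec *src = act_stats ? &act_stats[i] : flow_stats;

		sw[i].pkts += src->pkts;
		sw[i].bytes += src->bytes;
	}
}
```

The point of the fallback is that rules without branching actions pay
nothing extra: a single driver call, a single record, no per action lookup.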
>>> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>>> index 7da3337c4356..7dc8a62796b5 100644
>>> --- a/net/sched/cls_flower.c
>>> +++ b/net/sched/cls_flower.c
>>> @@ -499,7 +499,9 @@ static void fl_hw_update_stats(struct tcf_proto
>>> *tp, struct cls_fl_filter *f,
>>> tc_setup_cb_call(block, TC_SETUP_CLSFLOWER, &cls_flower, false,
>>> rtnl_held);
>>> - tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats);
>>> + tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats,
>>> cls_flower.act_stats);
>>> +
>>> + kfree(cls_flower.act_stats);
>>> }
>>
>> Perhaps I'm being dumb, but I don't see this being allocated
>> anywhere. Is the driver supposed to be responsible for doing so?
>> That seems inelegant.
>
> You are right, the intention is for the driver to allocate the array and
> for the calling method to free it.
>
> While the proposed design is indeed inelegant, it is efficient compared
> to the possible other alternatives:
> 1. Dynamically allocated stats array - this will introduce alloc/free
> calls per stats query (1 per filter per second), even if per action stats
> are not required.
> 2. Static action stats array - this has size issues, as this api is
> shared for both tc and nft. Perhaps we can use a hard coded size and
> return an error if the actual counter array size is larger.
>
>
I realized that we cannot assume a 1:1 mapping between a tc action and its
corresponding offload action, as a tc pedit action can create an array of
flow offload actions.
I will fix this in v2.
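
As an illustration of the mapping problem only (this is not what v2 will
necessarily look like), assume a hypothetical entries_per_action[] array
recording how many offload entries each tc action expanded into. The
stats index for a tc action is then a running offset rather than the
action's own position:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the first offload entry that belongs to tc
 * action 'act'.
 *
 * entries_per_action[i] is a hypothetical count of offload entries
 * created for tc action i (1 for most actions, possibly more for pedit).
 */
static size_t first_offload_index(const size_t *entries_per_action,
				  size_t n_actions, size_t act)
{
	size_t idx = 0;

	assert(entries_per_action != NULL && act < n_actions);

	/* Skip over the entries created by every earlier tc action. */
	for (size_t i = 0; i < act; i++)
		idx += entries_per_action[i];

	return idx;
}
```

For example, with three tc actions where the second one is a pedit that
expanded into three offload entries, the third tc action starts at
offload index 4, not 2.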
>>
>> -ed