[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <59e7eba0-c26b-6833-8d92-1a4e04ac9e9b@nvidia.com>
Date: Sun, 14 Mar 2021 11:50:14 +0200
From: Roi Dayan <roid@...dia.com>
To: Saeed Mahameed <saeed@...nel.org>,
Jia-Ju Bai <baijiaju1990@...il.com>, <leon@...nel.org>,
<davem@...emloft.net>, <kuba@...nel.org>
CC: <netdev@...r.kernel.org>, <linux-rdma@...r.kernel.org>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] net: mellanox: mlx5: fix error return code of
mlx5e_stats_flower()
On 2021-03-12 12:47 AM, Saeed Mahameed wrote:
> On Tue, 2021-03-09 at 11:44 +0200, Roi Dayan wrote:
>>
>>
>> On 2021-03-09 10:32 AM, Jia-Ju Bai wrote:
>>>
>>>
>>> On 2021/3/9 16:24, Roi Dayan wrote:
>>>>
>>>>
>>>> On 2021-03-09 10:20 AM, Roi Dayan wrote:
>>>>>
>>>>>
>>>>> On 2021-03-06 3:47 PM, Jia-Ju Bai wrote:
>>>>>> When mlx5e_tc_get_counter() returns NULL to counter or
>>>>>> mlx5_devcom_get_peer_data() returns NULL to peer_esw, no
>>>>>> error return
>>>>>> code of mlx5e_stats_flower() is assigned.
>>>>>> To fix this bug, err is assigned with -EINVAL in these cases.
>>>>>>
>>>>>> Reported-by: TOTE Robot <oslab@...nghua.edu.cn>
>
> Hey Jia-Ju, What are the conditions for this robot to raise a flag?
> sometimes it is totally normal to abort a function and return 0.. i am
> just curious to know ?
>
>
>>>>>> Signed-off-by: Jia-Ju Bai <baijiaju1990@...il.com>
>>>>>> ---
>>>>>> drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 12
>>>>>> +++++++++---
>>>>>> 1 file changed, 9 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>>>>>> b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>>>>>> index 0da69b98f38f..1f2c9da7bd35 100644
>>>>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>>>>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>>>>>> @@ -4380,8 +4380,10 @@ int mlx5e_stats_flower(struct
>>>>>> net_device
>>>>>> *dev, struct mlx5e_priv *priv,
>>>>>> if (mlx5e_is_offloaded_flow(flow) ||
>>>>>> flow_flag_test(flow, CT)) {
>>>>>> counter = mlx5e_tc_get_counter(flow);
>>>>>> - if (!counter)
>>>>>> + if (!counter) {
>>>>>> + err = -EINVAL;
>>>>>> goto errout;
>>>>>> + }
>>>>>> mlx5_fc_query_cached(counter, &bytes, &packets,
>>>>>> &lastuse);
>>>>>> }
>>>>>> @@ -4390,8 +4392,10 @@ int mlx5e_stats_flower(struct
>>>>>> net_device
>>>>>> *dev, struct mlx5e_priv *priv,
>>>>>> * un-offloaded while the other rule is offloaded.
>>>>>> */
>>>>>> peer_esw = mlx5_devcom_get_peer_data(devcom,
>>>>>> MLX5_DEVCOM_ESW_OFFLOADS);
>>>>>> - if (!peer_esw)
>>>>>> + if (!peer_esw) {
>>>>>> + err = -EINVAL;
>>>>>
>
> This is not an error flow, i am curious what are the thresholds of this
> robot ?
>
>>>>> note here it's not an error. it could be there is no peer esw
>>>>> so just continue with the stats update.
>>>>>
>>>>>> goto out;
>>>>>> + }
>>>>>> if (flow_flag_test(flow, DUP) &&
>>>>>> flow_flag_test(flow->peer_flow, OFFLOADED)) {
>>>>>> @@ -4400,8 +4404,10 @@ int mlx5e_stats_flower(struct
>>>>>> net_device
>>>>>> *dev, struct mlx5e_priv *priv,
>>>>>> u64 lastuse2;
>>>>>> counter = mlx5e_tc_get_counter(flow->peer_flow);
>>>>>> - if (!counter)
>>>>>> + if (!counter) {
>>>>>> + err = -EINVAL;
>>>>
>>>> this change is problematic. the current goto is to do stats
>>>> update with
>>>> the first counter stats we got but if you now want to return an
>>>> error
>>>> then you probably should not do any update at all.
>>>
>>> Thanks for your reply :)
>>> I am not sure whether an error code should be returned here?
>>> If so, flow_stats_update(...) should not be called here?
>>>
>>>
>>> Best wishes,
>>> Jia-Ju Bai
>>>
>>
>> basically flow and peer_flow should be valid and protected from
>> changes,
>> and counter should be valid.
>> it looks like the check here is more of a sanity check if something
>> goes
>> wrong but shouldn't. you can just let it be, update the stats from
>> the
>> first queried counter.
>>
>
> Roi, let's consider returning an error code here, we shouldn't be
> silently returning if we are not expecting these errors,
>
> why would mlx5e_stats_flower() be called if stats are not offloaded ?
>
> Thanks,
> Saeed.
>
>
yes we can return an error if peer counter missing.
I just pointed out we should probably not call flow_stats_update() if
we do return an error.
the other option, as today, is updating the stats with first counter
stats and because of that we didn't return an error.
Powered by blists - more mailing lists