netdev - RE: [PATCH] net/mlx5e: Fix zero table prio set by user.


Message-ID: <AM4PR05MB3411EB711FA8B14D8F8B75ACCFC20@AM4PR05MB3411.eurprd05.prod.outlook.com>
Date:   Sun, 28 Jul 2019 10:04:40 +0000
From:   Paul Blakey <paulb@...lanox.com>
To:     Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
        wenxu <wenxu@...oud.cn>
CC:     Or Gerlitz <gerlitz.or@...il.com>,
        Saeed Mahameed <saeedm@...lanox.com>,
        Roi Dayan <roid@...lanox.com>, Mark Bloch <markb@...lanox.com>,
        "pablo@...filter.org" <pablo@...filter.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH] net/mlx5e: Fix zero table prio set by user.


On 7/26/2019 5:01 PM, Marcelo Ricardo Leitner wrote:
> On Fri, Jul 26, 2019 at 08:39:43PM +0800, wenxu wrote:
>>
>> On 2019/7/26 20:19, Or Gerlitz wrote:
>>> On Fri, Jul 26, 2019 at 12:24 AM Saeed Mahameed <saeedm@...lanox.com> wrote:
>>>> On Thu, 2019-07-25 at 19:24 +0800, wenxu@...oud.cn wrote:
>>>>> From: wenxu <wenxu@...oud.cn>
>>>>>
>>>>> The flow_cls_common_offload prio is zero
>>>>>
>>>>> It leads to an invalid table prio in hw.
>>>>>
>>>>> Error: Could not process rule: Invalid argument
>>>>>
>>>>> kernel log:
>>>>> mlx5_core 0000:81:00.0: E-Switch: Failed to create FDB Table err -22
>>>>> (table prio: 65535, level: 0, size: 4194304)
>>>>>
>>>>> table_prio = (chain * FDB_MAX_PRIO) + prio - 1;
>>>>> should check (chain * FDB_MAX_PRIO) + prio is not 0
>>>>>
>>>>> Signed-off-by: wenxu <wenxu@...oud.cn>
>>>>> ---
>>>>>  drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 4 +++-
>>>>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git
>>>>> a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
>>>>> b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
>>>>> index 089ae4d..64ca90f 100644
>>>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
>>>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
>>>>> @@ -970,7 +970,9 @@ static int esw_add_fdb_miss_rule(struct
>>>> this piece of code isn't in this function, weird how it got to the
>>>> diff, patch applies correctly though !
>>>>
>>>>> mlx5_eswitch *esw)
>>>>>               flags |= (MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT |
>>>>>                         MLX5_FLOW_TABLE_TUNNEL_EN_DECAP);
>>>>>
>>>>> -     table_prio = (chain * FDB_MAX_PRIO) + prio - 1;
>>>>> +     table_prio = (chain * FDB_MAX_PRIO) + prio;
>>>>> +     if (table_prio)
>>>>> +             table_prio = table_prio - 1;
>>>>>
>>>> This is black magic, even before this fix.
>>>> this -1 seems to be needed in order to call
>>>> create_next_size_table(table_prio) with the previous "table prio" ?
>>>> (table_prio - 1)  ?
>>>>
>>>> The whole thing looks wrong to me since when prio is 0 and chain is 0,
>>>> there is not such thing table_prio - 1.
>>>>
>>>> mlnx eswitch guys in the cc, please advise.
>>> basically, prio 0 is not something we ever get in the driver, since if
>>> user space
>>> specifies 0, the kernel generates some random non-zero prio, and we support
>>> only prios 1-16 -- Wenxu -- what do you run to get this error?
>>>
>>>
>> I run offload with nftables (but not tc); there is no prio for each rule.
>>
>> The prio of flow_cls_common_offload is initialized to 0.
>>
>> static void nft_flow_offload_common_init(struct flow_cls_common_offload *common,
>>                                          __be16 proto,
>>                                          struct netlink_ext_ack *extack)
>> {
>>     common->protocol = proto;
>>     common->extack = extack;
>> }
>>
>>
>> flow_cls_common_offload
>
> Note that on
> [PATCH net-next] netfilter: nf_table_offload: Fix zero prio of flow_cls_common_offload
> I asked Pablo on how nftables should behave on this situation.
>
> It's the same issue as in the patch above but being fixed at a
> different level.

That's better. The original code relied on prio 0 never being valid, so
the suggested fix (net/mlx5e: Fix zero table prio set by user) maps NFT
offload prio 0 and tc prio 1 to the same hardware table. This is wrong
and can cause issues.
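
To illustrate the collision, here is a minimal standalone sketch of the two
mappings. It is not the driver code: the function names are hypothetical,
FDB_MAX_PRIO is assumed to be 16 (based on the "prios 1-16" remark above),
and the 16-bit result type is an assumption chosen only because it
reproduces the "table prio: 65535" value from the kernel log.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 16 priorities per chain, per the "prios 1-16" remark. */
#define FDB_MAX_PRIO 16

/*
 * Original mapping. The uint16_t result is an assumption made to match
 * the logged "table prio: 65535" when chain == 0 and prio == 0.
 */
static uint16_t table_prio_original(uint32_t chain, uint32_t prio)
{
    return (uint16_t)(chain * FDB_MAX_PRIO + prio - 1);
}

/* Mapping from the proposed patch: skip the -1 when the sum is zero. */
static uint16_t table_prio_patched(uint32_t chain, uint32_t prio)
{
    uint32_t table_prio = chain * FDB_MAX_PRIO + prio;

    if (table_prio)
        table_prio = table_prio - 1;
    return (uint16_t)table_prio;
}

/* Hypothetical helper that checks the two observations from the thread. */
static void demo_collision(void)
{
    /* prio 0 on chain 0 underflows in the original mapping. */
    assert(table_prio_original(0, 0) == 65535);
    /* With the patch, prio 0 and prio 1 land on the same table index. */
    assert(table_prio_patched(0, 0) == table_prio_patched(0, 1));
    assert(table_prio_patched(0, 1) == 0);
}
```

Under these assumptions, calling demo_collision() passes: the patch removes
the underflow but makes prio 0 indistinguishable from prio 1 on chain 0,
which is why assigning a non-zero prio at the nftables level looks like the
safer place to fix it.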