Message-ID: <77fbf905-77f1-96fb-921a-515ee71ece10@mellanox.com>
Date: Fri, 29 Dec 2017 19:09:21 +0200
From: Arkadi Sharshevsky <arkadis@...lanox.com>
To: David Ahern <dsa@...ulusnetworks.com>,
Jiri Pirko <jiri@...nulli.us>
Cc: Yuval Mintz <yuvalm@...lanox.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
mlxsw <mlxsw@...lanox.com>, "andrew@...n.ch" <andrew@...n.ch>,
"vivien.didelot@...oirfairelinux.com"
<vivien.didelot@...oirfairelinux.com>,
"f.fainelli@...il.com" <f.fainelli@...il.com>,
"michael.chan@...adcom.com" <michael.chan@...adcom.com>,
"ganeshgr@...lsio.com" <ganeshgr@...lsio.com>,
Saeed Mahameed <saeedm@...lanox.com>,
Matan Barak <matanb@...lanox.com>,
Leon Romanovsky <leonro@...lanox.com>,
Ido Schimmel <idosch@...lanox.com>,
"jakub.kicinski@...ronome.com" <jakub.kicinski@...ronome.com>,
"ast@...nel.org" <ast@...nel.org>,
"daniel@...earbox.net" <daniel@...earbox.net>,
"simon.horman@...ronome.com" <simon.horman@...ronome.com>,
"pieter.jansenvanvuuren@...ronome.com"
<pieter.jansenvanvuuren@...ronome.com>,
"john.hurley@...ronome.com" <john.hurley@...ronome.com>,
"alexander.h.duyck@...el.com" <alexander.h.duyck@...el.com>,
"linville@...driver.com" <linville@...driver.com>,
"gospo@...adcom.com" <gospo@...adcom.com>,
"steven.lin1@...adcom.com" <steven.lin1@...adcom.com>,
Or Gerlitz <ogerlitz@...lanox.com>,
"roopa@...ulusnetworks.com" <roopa@...ulusnetworks.com>,
Shrijeet Mukherjee <shm@...ulusnetworks.com>
Subject: Re: [patch net-next v2 00/10] Add support for resource abstraction
On 12/28/2017 06:33 PM, David Ahern wrote:
> On 12/28/17 10:23 AM, Jiri Pirko wrote:
>>> So there are 4 tables exported to userspace:
>>>
>>> 1. mlxsw_erif table which is not in any of the kvd regions (no
>>> resource path is given) and it has a size of 1000. Does
>>> mlxsw_erif mean a rif as in Router Interfaces? So the switch
>>> supports up to 1000 router interfaces.
>>>
>>> 2. mlxsw_host4 in /kvd/hash_single with a size of 62. Based on
>>> the
>> Size tells you the actual size. It cannot give you max size. The
>> reason is simple. The resources are shared among multiple tables.
>> That is exactly what this resource patch makes visible.
>>
>>
>
> In the erif table, the 1000 is the max not current usage. I do not
> have 1000 interfaces:
>
> $ ip -br li sh | wc -l
> 597
>
>
> $ devlink dpipe table dump pci/0000:03:00.0 name mlxsw_erif
> ...
> index 503
>   match_value:
>     type field_exact header mlxsw_meta field erif_port mapping ifindex mapping_value 601 value 503
>   action_value:
>     type field_modify header mlxsw_meta field l3_forward value 1
>
>
> The host4 table it is current size with no maximum.
>
> The meaning of table size needs to be consistent across tables.
>
You are right, the egress RIF table size is not correct and I will
definitely fix it, but the correct value is not what you think it should
be. To clarify this point, just a reminder:
1. Both dpipe and devlink resource are abstraction models for
hardware entities, and as a result they try to provide generic objects.
Each driver/ASIC should register its own objects, and these are an
absolutely proprietary implementation. There is absolutely NO industry
standard here; the only thing that resembles a standard is that dpipe
looks a bit like P4, only because it has proved to be useful for
describing packet forwarding pipelines. The host4 table is just a
hardware process in the Mellanox Spectrum ASIC pipeline and it should
not be part of the ABI. Sorry, I clearly don't understand how this even
came up.
2. A dpipe table is a single hardware process; most of the time it uses
some resources (for example, the LPM algorithm uses hash memory).
3. The ERIF table is located at the end of the L3 pipeline.
The current dpipe description is not complete, and that is why it caused
confusion. The table matches on rif index and packet type (UC/MC/BC) and
makes a forward/drop decision. As you can see, for each rif the table
can have several entries, which provide different statistics for
different traffic types per rif. Currently only the UC entry is exposed,
with the forward action.
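To illustrate point 3, here is a minimal sketch in Python (not driver code) of a table keyed by rif index and packet type. The class and method names are hypothetical and are not part of dpipe or mlxsw; the rif indexes are made up. It only shows why the number of table entries is not the number of rifs in use.

```python
from dataclasses import dataclass
from enum import Enum


class PacketType(Enum):
    UC = "unicast"
    MC = "multicast"
    BC = "broadcast"


class Action(Enum):
    FORWARD = "forward"
    DROP = "drop"


@dataclass
class ErifEntry:
    action: Action
    packets: int = 0  # per-entry counter, one per (rif, packet type)


class ErifTable:
    """Toy model: entries are keyed by (rif index, packet type)."""

    def __init__(self):
        self.entries = {}

    def add(self, rif, ptype, action):
        self.entries[(rif, ptype)] = ErifEntry(action)

    def lookup(self, rif, ptype):
        # Match on rif index and packet type, count the hit, return the action.
        entry = self.entries[(rif, ptype)]
        entry.packets += 1
        return entry.action

    def rifs_in_use(self):
        return {rif for rif, _ in self.entries}


table = ErifTable()
# One rif with an entry per traffic type, another with only the UC entry.
for ptype in PacketType:
    table.add(503, ptype, Action.FORWARD)
table.add(504, PacketType.UC, Action.FORWARD)

table.lookup(503, PacketType.UC)
print(len(table.entries), len(table.rifs_in_use()))
```

In this toy model two rifs produce four entries, so the entry count alone says nothing about how many rifs are allocated.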
4. ASICs use shared resources for many processes; this is exactly the
behavior we want to expose!
Again, the size of the ERIF table should NOT provide the number of
rifs which are in use, simply because dpipe tables do not describe
hardware resources.
In the future the RIF bank will be exported as a resource object with a
size of 1000, and in order to observe how many are in use you should
check its occupancy. This is the whole reason for this interface.
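The split between tables and resources can be sketched roughly as follows. This is a Python model under stated assumptions, not the devlink API: the `Resource` and `Table` classes, the `/kvd/hash_single` size of 100, the second table and all entry counts other than the 62 host4 entries quoted above are made up for illustration.

```python
class Resource:
    """Toy hardware resource: a fixed size and a current occupancy."""

    def __init__(self, name, size):
        self.name = name
        self.size = size        # maximum capacity of the resource
        self.occupancy = 0      # units currently in use

    def allocate(self, units=1):
        if self.occupancy + units > self.size:
            raise MemoryError(f"{self.name}: resource exhausted")
        self.occupancy += units


class Table:
    """Toy table: a process that draws its entries from a shared resource."""

    def __init__(self, name, resource):
        self.name = name
        self.resource = resource
        self.entries = 0        # the table's actual size, not a maximum

    def insert(self):
        self.resource.allocate()
        self.entries += 1


# Two tables sharing one resource (size of 100 is an assumption).
hash_single = Resource("/kvd/hash_single", size=100)
host4 = Table("mlxsw_host4", hash_single)
host6 = Table("mlxsw_host6", hash_single)  # hypothetical second user

for _ in range(62):
    host4.insert()
for _ in range(10):
    host6.insert()

# A resource that stands on its own, like the future RIF bank.
rif_bank = Resource("rif_bank", size=1000)
rif_bank.allocate(3)  # made-up number of rifs in use

print(host4.entries, hash_single.occupancy, hash_single.size)
print(rif_bank.occupancy, rif_bank.size)
```

In this model a table can only report how many entries it holds. The maximum belongs to the resource, and how much of it is left depends on every table that shares it.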