[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <69a6ec3e-b892-ef71-1d5c-5789fc014d04@cumulusnetworks.com>
Date: Thu, 4 Jan 2018 09:44:29 -0700
From: David Ahern <dsa@...ulusnetworks.com>
To: Arkadi Sharshevsky <arkadis@...lanox.com>,
Jiri Pirko <jiri@...nulli.us>, netdev@...r.kernel.org,
roopa@...ulusnetworks.com
Cc: davem@...emloft.net, mlxsw@...lanox.com, andrew@...n.ch,
vivien.didelot@...oirfairelinux.com, f.fainelli@...il.com,
michael.chan@...adcom.com, ganeshgr@...lsio.com,
saeedm@...lanox.com, matanb@...lanox.com, leonro@...lanox.com,
idosch@...lanox.com, jakub.kicinski@...ronome.com, ast@...nel.org,
daniel@...earbox.net, simon.horman@...ronome.com,
pieter.jansenvanvuuren@...ronome.com, john.hurley@...ronome.com,
alexander.h.duyck@...el.com, linville@...driver.com,
gospo@...adcom.com, steven.lin1@...adcom.com, yuvalm@...lanox.com,
ogerlitz@...lanox.com
Subject: Re: [patch net-next v2 00/10] Add support for resource abstraction
On 1/4/18 9:13 AM, Arkadi Sharshevsky wrote:
>
>
> On 01/04/2018 05:58 PM, David Ahern wrote:
>> On 1/4/18 2:24 AM, Arkadi Sharshevsky wrote:
>>>> Again, my comments all stem from user experience.
>>>>
>>>> Can you explain what "double_word" means for a unit? I would expect a
>>>> units to be kB or count (or items or entries).
>>>>
>>>
>>> Double word is 64 bit, dont understand why this is confusing.
>>
>> As Andrew pointed out, double word can have a range of sizes.
>>
>> To my point here, a 'unit' for a number should not be the number of bits
>> it is represented by. I believe all of the kvd sizes are in entries
>> (ie., a linear size of 98304 means I can have 98,304 entries in that
>> resource).
>>
>>>
>>>> $ devlink resource show pci/0000:03:00.0
>>>> pci/0000:03:00.0:
>>>> name kvd size 245760 unit double_word size_valid true
>>>> resources:
>>>> name linear size 98304 occ 0 unit double_word
>>>> name hash_double size 60416 unit double_word
>>>> name hash_single size 87040 unit double_word
>>>>
>>>> While that is confusing here from the userspace command it goes hand in
>>>> hand with patch 2 and:
>>>>
>>>> +enum devlink_resource_unit {
>>>> + DEVLINK_RESOURCE_UNIT_DOUBLE_WORD,
>>>> +};
>>>>
>>>>
>>>> Also, it seems like the occ of 0 is wrong since we know from past
>>>> responses that if I set linear to 0 all of networking breaks.
>>>
>>> You are clearly misunderstanding what is occupancy of the resource
>>> and what kvd linear is good for. As I mentioned in the last response
>>> kvd linear is mapped to adjacency table. So in case its 0 no nexthop
>>> routes could be configured, this information is provided by the
>>> dpipe<-->resource.
>>>
>>> Occupancy means how much of the resource is used right now, why is
>>> this wrong? and how its related to the size 0 exactly?
>>
>> The summary line above shows the current kvd/linear occupancy is '0'.
>> That suggests my L3 only deployment is not using any kvd/linear which
>> means I can set its allocation to 0 and devote more kvd resources to the
>> hash regions.
>>
>> But, as I pointed out in previous responses I can not set linear to 0.
>> If I do all of networking breaks. That suggests that kvd/linear is used
>> by more networking entities than you are currently reporting. Hence,
>> telling me the kvd/linear occupancy is 0 is wrong.
>>
>> I believe the stems from the how you are determining occupancy and the
>> fact that not all tables have been implemented. Showing the current
>> occupancy of a resource is very helpful, so it should be part of the API.
>>
>> However, until mlxsw shows a proper value (either by implementing all
>> dpipe tables or changing how it is calculated) it should not be
>> returning that attribute. As it stands you are giving a user invalid data.
>>
>
> Wait, the occupancy is very precise it goes directly to the linear
> allocator inside the driver and gives exact current usage. You assumed
> it goes to the dpipe tables which is incorrect.
I am trying *guess* based on this discussion as to why it is 0. In past
responses you push that the tables expose how the resource is used.
>
> The linear kvd memory is used by:
> 1. The adjacency table is responsible for the ECMP and nexthop
> resolution.
> 2. ACL flexible actions blocks (dpipe will contain table for this).
>
> I dont know what you configured but in case you got some remote
> routes with a nexthop the occ should not be zero.
>
> If your router contains only local routes and dont use ACL you can
> shrink it to zero, no problem.
My current setup is an L3 only deployment, no ACLs and no bridges.
We are meeting in 15 minutes to discuss this design; let's use some of
the time for debugging this problem.
>
>>>
>>>>
>>>>
>>>>
>>>> How does a user learn the granularity of a resource:
>>>>
>>>> $ devlink resource set pci/0000:03:00.0 path /kvd/hash_double size 50000
>>>> Error: mlxsw_spectrum: resource set with wrong granularity.
>>>>
>>>> Try again with 51000 and then 52000 and ... Why not export the
>>>> granularity read-only? I don't see it in the proposed UAPI.
>>>>
>>>
>>> I would like more adding the granularity size to the extack string
>>> instead of adding this to UAPI. The user will try to update once
>>> and will get the required granularity in the error message.
>>
>> A user should not have to run a command to get an error message to know
>> essential data to running a command with the right value. Further, do
>> not assume 'user' is a person or the devlink command.
>>
>> Since granularity is essential to running a valid command, it should be
>> an attribute for each resource.
>>
>
> This is also your approach towards the min/max for resource sizes?
> Am I correct?
yes.
Powered by blists - more mailing lists