[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5a294d33-acf6-7855-524f-bc9b3f9a5005@mellanox.com>
Date: Thu, 4 Jan 2018 18:13:22 +0200
From: Arkadi Sharshevsky <arkadis@...lanox.com>
To: David Ahern <dsa@...ulusnetworks.com>,
Jiri Pirko <jiri@...nulli.us>, netdev@...r.kernel.org,
roopa@...ulusnetworks.com
Cc: davem@...emloft.net, mlxsw@...lanox.com, andrew@...n.ch,
vivien.didelot@...oirfairelinux.com, f.fainelli@...il.com,
michael.chan@...adcom.com, ganeshgr@...lsio.com,
saeedm@...lanox.com, matanb@...lanox.com, leonro@...lanox.com,
idosch@...lanox.com, jakub.kicinski@...ronome.com, ast@...nel.org,
daniel@...earbox.net, simon.horman@...ronome.com,
pieter.jansenvanvuuren@...ronome.com, john.hurley@...ronome.com,
alexander.h.duyck@...el.com, linville@...driver.com,
gospo@...adcom.com, steven.lin1@...adcom.com, yuvalm@...lanox.com,
ogerlitz@...lanox.com
Subject: Re: [patch net-next v2 00/10] Add support for resource abstraction
On 01/04/2018 05:58 PM, David Ahern wrote:
> On 1/4/18 2:24 AM, Arkadi Sharshevsky wrote:
>>> Again, my comments all stem from user experience.
>>>
>>> Can you explain what "double_word" means for a unit? I would expect a
>>> units to be kB or count (or items or entries).
>>>
>>
>> Double word is 64 bit, dont understand why this is confusing.
>
> As Andrew pointed out, double word can have a range of sizes.
>
> To my point here, a 'unit' for a number should not be the number of bits
> it is represented by. I believe all of the kvd sizes are in entries
> (ie., a linear size of 98304 means I can have 98,304 entries in that
> resource).
>
>>
>>> $ devlink resource show pci/0000:03:00.0
>>> pci/0000:03:00.0:
>>> name kvd size 245760 unit double_word size_valid true
>>> resources:
>>> name linear size 98304 occ 0 unit double_word
>>> name hash_double size 60416 unit double_word
>>> name hash_single size 87040 unit double_word
>>>
>>> While that is confusing here from the userspace command it goes hand in
>>> hand with patch 2 and:
>>>
>>> +enum devlink_resource_unit {
>>> + DEVLINK_RESOURCE_UNIT_DOUBLE_WORD,
>>> +};
>>>
>>>
>>> Also, it seems like the occ of 0 is wrong since we know from past
>>> responses that if I set linear to 0 all of networking breaks.
>>
>> You are clearly misunderstanding what is occupancy of the resource
>> and what kvd linear is good for. As I mentioned in the last response
>> kvd linear is mapped to adjacency table. So in case its 0 no nexthop
>> routes could be configured, this information is provided by the
>> dpipe<-->resource.
>>
>> Occupancy means how much of the resource is used right now, why is
>> this wrong? and how its related to the size 0 exactly?
>
> The summary line above shows the current kvd/linear occupancy is '0'.
> That suggests my L3 only deployment is not using any kvd/linear which
> means I can set its allocation to 0 and devote more kvd resources to the
> hash regions.
>
> But, as I pointed out in previous responses I can not set linear to 0.
> If I do all of networking breaks. That suggests that kvd/linear is used
> by more networking entities than you are currently reporting. Hence,
> telling me the kvd/linear occupancy is 0 is wrong.
>
> I believe the stems from the how you are determining occupancy and the
> fact that not all tables have been implemented. Showing the current
> occupancy of a resource is very helpful, so it should be part of the API.
>
> However, until mlxsw shows a proper value (either by implementing all
> dpipe tables or changing how it is calculated) it should not be
> returning that attribute. As it stands you are giving a user invalid data.
>
Wait, the occupancy is very precise it goes directly to the linear
allocator inside the driver and gives exact current usage. You assumed
it goes to the dpipe tables which is incorrect.
The linear kvd memory is used by:
1. The adjacency table is responsible for the ECMP and nexthop
resolution.
2. ACL flexible actions blocks (dpipe will contain table for this).
I dont know what you configured but in case you got some remote
routes with a nexthop the occ should not be zero.
If your router contains only local routes and dont use ACL you can
shrink it to zero, no problem.
>>
>>>
>>>
>>>
>>> How does a user learn the granularity of a resource:
>>>
>>> $ devlink resource set pci/0000:03:00.0 path /kvd/hash_double size 50000
>>> Error: mlxsw_spectrum: resource set with wrong granularity.
>>>
>>> Try again with 51000 and then 52000 and ... Why not export the
>>> granularity read-only? I don't see it in the proposed UAPI.
>>>
>>
>> I would like more adding the granularity size to the extack string
>> instead of adding this to UAPI. The user will try to update once
>> and will get the required granularity in the error message.
>
> A user should not have to run a command to get an error message to know
> essential data to running a command with the right value. Further, do
> not assume 'user' is a person or the devlink command.
>
> Since granularity is essential to running a valid command, it should be
> an attribute for each resource.
>
This is also your approach towards the min/max for resource sizes?
Am I correct?
>
>>
>>>
>>> And then on the reload:
>>>
>>> $ devlink reload pci/0000:03:00.0
>>> Error: devlink: resources size validation failed.
>>>
>>> Since the reload is not doing any resource sizing that error message is
>>> confusing. Maybe something like "Sum of the resource components exceeds
>>> total size."
>>>
>>
>> No problem, sounds better.
>>
>>>
>>> Finally, I still contend a 1-line description of each of the resources
>>> goes a long way to improving the user friendliness of this set.
>>>
>>
>> Strongly against it.
>>
>
Powered by blists - more mailing lists