Message-ID: <97442589-c504-d997-52fb-edc0bdf1cbe5@nvidia.com>
Date: Wed, 21 Apr 2021 15:08:07 +0300
From: Dmytro Linkin <dlinkin@...dia.com>
To: Jakub Kicinski <kuba@...nel.org>
CC: <netdev@...r.kernel.org>, <davem@...emloft.net>, <jiri@...dia.com>
Subject: Re: [PATCH net-next 00/18] devlink: rate objects API

On 4/20/21 11:35 PM, Jakub Kicinski wrote:
> On Tue, 20 Apr 2021 14:33:36 +0300 dlinkin@...dia.com wrote:
>> From: Dmytro Linkin <dlinkin@...dia.com>
>>
>> Currently the kernel provides a way to change the tx rate of a single VF
>> in switchdev mode via the tc-police action. When lots of VFs are
>> configured, managing their rates becomes a non-trivial task and some
>> grouping mechanism is required. Implementing such grouping in tc-police
>> would bring flow-related limitations and unwanted complications, like:
>> - flows require a net device to be placed on
>
> Meaning they are only usable in "switchdev mode"?
Meaning that "groups" wouldn't have corresponding net devices, and that
would have to be dealt with somehow. I'll rephrase this line.
>
>> - effect of limiting depends on the position of tc-police action in the
>> pipeline
>
> Could you expand? tc-police is usually expected to be first.
Ok
>
>> - etc.
>
> Please expand.
Ok
>
>> Given that, devlink is the most appropriate place.
>>
>> This series introduces a devlink API for managing the tx rate of a single
>> devlink port or of a group by invoking callbacks (see below) of the
>> corresponding driver. A devlink port or a group can also be added to a
>> parent group, where the driver is responsible for handling the rates of
>> the group's elements. To achieve all of that, a new rate object is added.
>> It can be one of two types:
>> - leaf - represents a single devlink port; created/destroyed by the
>> driver and bound to the devlink port. As an example, a driver may
>> create a leaf rate object for every devlink port associated with a VF.
>> Since a leaf has a 1-to-1 mapping to its devlink port, in user space it
>> is referred to as pci/<bus_addr>/<port_index>;
>> - node - represents a group of rate objects; created/deleted by request
>> from userspace; initially empty (no rate objects added). In userspace
>> it is referred to as pci/<bus_addr>/<node_name>, where the node name can
>> be anything except a decimal number, to avoid collisions with leafs.
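
To illustrate the naming rule above, here is a minimal sketch, in Python, of
how a user-space tool might tell leafs from nodes by the last component of
the handle. The function name and the example bus address are hypothetical
and not part of this series; the only rule taken from the cover letter is
that a node name must not be a decimal number.

```python
import re


def classify_rate_handle(handle: str) -> str:
    """Return 'leaf' or 'node' for a handle of the form <bus>/<bus_addr>/<name>.

    Assumption (illustrative only): a purely decimal last component is a
    devlink port index, so it names a leaf; any other name is a node.
    """
    parts = handle.split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected <bus>/<bus_addr>/<name>, got {handle!r}")
    name = parts[2]
    # Only ASCII digits count as a port index in this sketch.
    return "leaf" if re.fullmatch(r"[0-9]+", name) else "node"


if __name__ == "__main__":
    # "0000:03:00.0" is a made-up PCI address used only for illustration.
    for h in ("pci/0000:03:00.0/1", "pci/0000:03:00.0/1st_group"):
        print(h, "->", classify_rate_handle(h))
```

Because node names may not be decimal numbers, the check is unambiguous:
a name such as "1st_group" is still a node, while "1" can only be a leaf.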
>>
>> devlink_ops is extended with the following callbacks:
>> - rate_{leaf|node}_tx_{share|max}_set
>> - rate_node_{new|del}
>> - rate_{leaf|node}_parent_set
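
Spelled out, the three brace patterns above expand to eight callback names.
A small Python sketch of that expansion follows; the helper `expand` is
purely illustrative and is not part of devlink.

```python
import itertools
import re
from typing import List


def expand(pattern: str) -> List[str]:
    """Expand every {a|b} group in pattern into all combinations."""
    # re.split with a capture group alternates literal text (even indices)
    # with the contents of each {...} group (odd indices).
    parts = re.split(r"\{([^{}]*)\}", pattern)
    choices = [[p] if i % 2 == 0 else p.split("|") for i, p in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]


# The patterns exactly as listed in the cover letter.
PATTERNS = [
    "rate_{leaf|node}_tx_{share|max}_set",
    "rate_node_{new|del}",
    "rate_{leaf|node}_parent_set",
]

if __name__ == "__main__":
    for pattern in PATTERNS:
        for name in expand(pattern):
            print(name)
```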
>
> Tx is incorrect. You're setting an admission rate limiter on the port.
>
>> KAPI provides:
>> - creation/destruction of the leaf rate object associated with a devlink
>> port
>> - storing/retrieving driver-specific data in a rate object
>>
>> UAPI provides:
>> - dumping all rate objects or a single one
>> - setting tx_{share|max} of a rate object of any type
>> - creating/deleting a node rate object
>> - setting/unsetting the parent of any rate object
>
>> Add devlink rate object support for the netdevsim driver.
>> To support devlink rate objects, implement VF ports and an eswitch mode
>> selector for the netdevsim driver.
>>
>> Issues/open questions:
>> - Does the user need a DEVLINK_CMD_RATE_DEL_ALL_CHILD command to clean all
>> children of a particular parent node? For example:
>> $ devlink port func rate flush netdevsim/netdevsim10/group
>
> Is this an RFC? There is no real user in this set.
Yes. I'll resend the patches anyway because of an issue with the SMTP server.
>