Message-ID: <20210420133529.4904f08b@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date: Tue, 20 Apr 2021 13:35:29 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: dlinkin@...dia.com
Cc: netdev@...r.kernel.org, davem@...emloft.net, jiri@...dia.com
Subject: Re: [PATCH net-next 00/18] devlink: rate objects API
On Tue, 20 Apr 2021 14:33:36 +0300 dlinkin@...dia.com wrote:
> From: Dmytro Linkin <dlinkin@...dia.com>
>
> Currently the kernel provides a way to change the tx rate of a single VF
> in switchdev mode via the tc-police action. When lots of VFs are
> configured, managing their rates becomes a non-trivial task and some
> grouping mechanism is required. Implementing such grouping in tc-police
> would bring flow-related limitations and unwanted complications, like:
> - flows require net device to be placed on
Meaning they are only usable in "switchdev mode"?
> - effect of limiting depends on the position of tc-police action in the
> pipeline
Could you expand? tc-police is usually expected to be first.
> - etc.
Please expand.
> Given that, devlink is the most appropriate place.
>
> This series introduces a devlink API for managing the tx rate of a single
> devlink port or of a group by invoking callbacks (see below) of the
> corresponding driver. A devlink port or a group can also be added to a
> parent group, where the driver is responsible for handling the rates of
> the group's elements. To achieve all of that, a new rate object is added.
> It can be one of two types:
> - leaf - represents a single devlink port; created/destroyed by the
> driver and bound to the devlink port. As an example, a driver may
> create a leaf rate object for every devlink port associated with a VF.
> Since a leaf has a 1:1 mapping to its devlink port, in userspace it is
> referred to as pci/<bus_addr>/<port_index>;
> - node - represents a group of rate objects; created/deleted by request
> from userspace; initially empty (no rate objects added). In
> userspace it is referred to as pci/<bus_addr>/<node_name>, where the node
> name can be anything except a decimal number, to avoid collisions with
> leaf objects.
>
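To make the naming rule quoted above concrete, here is a minimal sketch of how a handle could be classified as leaf or node from its last component alone. The names `rate_kind` and `rate_handle_kind` are hypothetical and are not taken from the series; the sketch assumes "decimal number" means a non-empty string of ASCII digits and does not validate the bus part of the handle.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification, for illustration only. */
enum rate_kind { RATE_KIND_INVALID, RATE_KIND_LEAF, RATE_KIND_NODE };

/* True if s is non-empty and consists only of decimal digits. */
static bool is_decimal(const char *s)
{
	if (*s == '\0')
		return false;
	for (; *s != '\0'; s++)
		if (!isdigit((unsigned char)*s))
			return false;
	return true;
}

/*
 * Classify a handle by the component after its last '/':
 * all digits -> leaf (port index), anything else -> node (group name).
 * Only the last component is examined (assumption).
 */
static enum rate_kind rate_handle_kind(const char *handle)
{
	const char *last = strrchr(handle, '/');

	if (last == NULL || last[1] == '\0')
		return RATE_KIND_INVALID;
	return is_decimal(last + 1) ? RATE_KIND_LEAF : RATE_KIND_NODE;
}
```

Under these assumptions a handle ending in `/1` would be treated as a leaf and one ending in `/group1` as a node, which is why a purely numeric node name would be ambiguous.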
> devlink_ops is extended with the following callbacks:
> - rate_{leaf|node}_tx_{share|max}_set
> - rate_node_{new|del}
> - rate_{leaf|node}_parent_set
Tx is incorrect. You're setting an admission rate limiter on the port.
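For reference, the brace patterns quoted above expand to eight callback names. The following is a rough userspace sketch of such an ops table. The member names come from expanding the patterns, but the struct names, the signatures, the `-1` return for a missing callback and the helper `sketch_set_tx_max` are all hypothetical and are not taken from the patches.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a rate object (leaf or node). */
struct rate_obj_sketch {
	const char *name;
	uint64_t tx_share;
	uint64_t tx_max;
	struct rate_obj_sketch *parent;
};

/* Callback names expanded from the quoted patterns; signatures assumed. */
struct rate_ops_sketch {
	int (*rate_leaf_tx_share_set)(struct rate_obj_sketch *leaf, uint64_t tx_share);
	int (*rate_leaf_tx_max_set)(struct rate_obj_sketch *leaf, uint64_t tx_max);
	int (*rate_node_tx_share_set)(struct rate_obj_sketch *node, uint64_t tx_share);
	int (*rate_node_tx_max_set)(struct rate_obj_sketch *node, uint64_t tx_max);
	int (*rate_node_new)(struct rate_obj_sketch *node);
	int (*rate_node_del)(struct rate_obj_sketch *node);
	int (*rate_leaf_parent_set)(struct rate_obj_sketch *leaf, struct rate_obj_sketch *parent);
	int (*rate_node_parent_set)(struct rate_obj_sketch *node, struct rate_obj_sketch *parent);
};

/* Toy driver callback that accepts any value. */
static int toy_accept(struct rate_obj_sketch *obj, uint64_t value)
{
	(void)obj;
	(void)value;
	return 0;
}

/*
 * Pick the leaf or node variant of the tx_max callback, call it, and
 * store the new value only if the driver accepted it. Returns -1 when
 * the driver does not provide the callback (assumed convention).
 */
static int sketch_set_tx_max(const struct rate_ops_sketch *ops,
			     struct rate_obj_sketch *obj, bool is_leaf,
			     uint64_t tx_max)
{
	int (*cb)(struct rate_obj_sketch *, uint64_t) =
		is_leaf ? ops->rate_leaf_tx_max_set : ops->rate_node_tx_max_set;
	int err;

	if (cb == NULL)
		return -1;
	err = cb(obj, tx_max);
	if (err == 0)
		obj->tx_max = tx_max;
	return err;
}
```

The sketch keeps the "tx" naming only because that is what the quoted list uses.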
> KAPI provides:
> - creation/destruction of the leaf rate object associated with devlink
> port
> - storing/retrieving driver specific data in rate object
>
> UAPI provides:
> - dumping all or single rate objects
> - setting tx_{share|max} of rate object of any type
> - creating/deleting node rate object
> - setting/unsetting parent of any rate object
> Add devlink rate object support for the netdevsim driver.
> To support devlink rate objects, implement VF ports and an eswitch mode
> selector for the netdevsim driver.
>
> Issues/open questions:
> - Does the user need a DEVLINK_CMD_RATE_DEL_ALL_CHILD command to remove
> all children of a particular parent node? For example:
> $ devlink port func rate flush netdevsim/netdevsim10/group
Is this an RFC? There is no real user in this set.