netdev - Re: [RFC] HW TX Rate Limiting Driver API

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAM0EoMnEWbLJXNChpDrnKSsu6gXjaPwCX9jRqKv0UagPUuo1tA@mail.gmail.com>
Date: Fri, 5 Apr 2024 09:33:50 -0400
From: Jamal Hadi Salim <jhs@...atatu.com>
To: Simon Horman <horms@...nel.org>
Cc: netdev@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>, Jiri Pirko <jiri@...nulli.us>, 
	Madhu Chittim <madhu.chittim@...el.com>, Sridhar Samudrala <sridhar.samudrala@...el.com>, 
	Paolo Abeni <pabeni@...hat.com>
Subject: Re: [RFC] HW TX Rate Limiting Driver API

On Fri, Apr 5, 2024 at 6:25 AM Simon Horman <horms@...nel.org> wrote:
>
> Hi,
>
> This is follow-up to the ongoing discussion started by Intel to extend the
> support for TX shaping H/W offload [1].
>
> The goal is allowing the user-space to configure TX shaping offload on a
> per-queue basis with min guaranteed B/W, max B/W limit and burst size on a
> VF device.
>
>
> In the past few months several different solutions were attempted and
> discussed, without finding a perfect fit:
>
> - devlink_rate APIs are not appropriate for to control TX shaping on netdevs
> - No existing TC qdisc offload covers the required feature set
> - HTB does not allow direct queue configuration
> - MQPRIO imposes constraint on the maximum number of TX queues
> - TBF does not support max B/W limit
> - ndo_set_tx_maxrate() only controls the max B/W limit
>
> A new H/W offload API is needed, but offload API proliferation should be
> avoided.
>
> The following proposal intends to cover the above specified requirement and
> provide a possible base to unify all the shaping offload APIs mentioned above.
>
> The following only defines the in-kernel interface between the core and
> drivers. The intention is to expose the feature to user-space via Netlink.
> Hopefully the latter part should be straight-forward after agreement
> on the in-kernel interface.
>
> All feedback and comment is more then welcome!
>
> [1] https://lore.kernel.org/netdev/20230808015734.1060525-1-wenjun1.wu@intel.com/
>

My 2 cents:
I did peruse the lore quoted thread but i am likely to have missed something.
It sounds like the requirement is for egress-from-host (which to a
device internal looks like ingress-from-host on the device). Doesn't
existing HTB offload already support this? I didnt see this being
discussed in the thread. Also, IIUC, there is no hierarchy
requirement. That is something you can teach HTB but there's probably
something i missed because i didnt understand the context of "HTB does
not allow direct queue configuration". If HTB is confusing from a
config pov then it seems what Paolo was describing in the thread on
TBF is a reasonable approach too. I couldnt grok why that TBF
extension for max bw was considered a bad idea.
On config:
Could we not introduce skip_sw/hw semantics for qdiscs? IOW, skip_sw
means the config is only subjected to hw and you have DIRECT
semantics, etc.
I understand the mlnx implementation of HTB does a lot of things in
the driver but the one nice thing they had was ability to use classid
X:Y to select a egress h/w queue. The driver resolution of all the
hierarchy is not needed at all here if i understood the requirement
above.
You still need to have a classifier in s/w (which could be attached to
clsact egress) to select the queue. That is something the mlnx
implementation allowed. So there is no "double queueing"

If this is about totally bypassing s/w config then its a different ballgame..

cheers,
jamal

> Regards,
> Simon with much assistance from Paolo
>
> ---
> /* SPDX-License-Identifier: GPL-2.0-or-later */
>
> #ifndef _NET_SHAPER_H_
> #define _NET_SHAPER_H_
>
> /**
>  * enum shaper_metric - the metric of the shaper
>  * @SHAPER_METRIC_PPS: Shaper operates on a packets per second basis
>  * @SHAPER_METRIC_BPS: Shaper operates on a bits per second basis
>  */
> enum shaper_metric {
>         SHAPER_METRIC_PPS;
>         SHAPER_METRIC_BPS;
> };
>
> #define SHAPER_ROOT_ID 0
> #define SHAPER_NONE_ID UINT_MAX
>
> /**
>  * struct shaper_info - represent a node of the shaper hierarchy
>  * @id: Unique identifier inside the shaper tree.
>  * @parent_id: ID of parent shaper, or SHAPER_NONE_ID if the shaper has
>  *             no parent. Only the root shaper has no parent.
>  * @metric: Specify if the bw limits refers to PPS or BPS
>  * @bw_min: Minimum guaranteed rate for this shaper
>  * @bw_max: Maximum peak bw allowed for this shaper
>  * @burst: Maximum burst for the peek rate of this shaper
>  * @priority: Scheduling priority for this shaper
>  * @weight: Scheduling weight for this shaper
>  *
>  * The full shaper hierarchy is maintained only by the
>  * NIC driver (or firmware), possibly in a NIC-specific format
>  * and/or in H/W tables.
>  * The kernel uses this representation and the shaper_ops to
>  * access, traverse, and update it.
>  */
> struct shaper_info {
>         /* The following fields allow the full traversal of the whole
>          * hierarchy.
>          */
>         u32 id;
>         u32 parent_id;
>
>         /* The following fields define the behavior of the shaper. */
>         enum shaper_metric metric;
>         u64 bw_min;
>         u64 bw_max;
>         u32 burst;
>         u32 priority;
>         u32 weight;
> };
>
> /**
>  * enum shaper_lookup_mode - Lookup method used to access a shaper
>  * @SHAPER_LOOKUP_BY_PORT: The root shaper for the whole H/W, @id is unused
>  * @SHAPER_LOOKUP_BY_NETDEV: The main shaper for the given network device,
>  *                           @id is unused
>  * @SHAPER_LOOKUP_BY_VF: @id is a virtual function number.
>  * @SHAPER_LOOKUP_BY_QUEUE: @id is a queue identifier.
>  * @SHAPER_LOOKUP_BY_TREE_ID: @id is the unique shaper identifier inside the
>  *                            shaper hierarchy as in shaper_info.id
>  *
>  * SHAPER_LOOKUP_BY_PORT and SHAPER_LOOKUP_BY_VF, SHAPER_LOOKUP_BY_TREE_ID are
>  * only available on PF devices, usually inside the host/hypervisor.
>  * SHAPER_LOOKUP_BY_NETDEV is available on both PFs and VFs devices, but
>  * only if the latter are privileged ones.
>  * The same shaper can be reached with different lookup mode/id pairs,
>  * mapping network visible objects (devices, VFs, queues) to the scheduler
>  * hierarchy and vice-versa.
>  */
> enum shaper_lookup_mode {
>     SHAPER_LOOKUP_BY_PORT,
>     SHAPER_LOOKUP_BY_NETDEV,
>     SHAPER_LOOKUP_BY_VF,
>     SHAPER_LOOKUP_BY_QUEUE,
>     SHAPER_LOOKUP_BY_TREE_ID,
> };
>
>
> /**
>  * struct shaper_ops - Operations on shaper hierarchy
>  * @get: Access the specified shaper.
>  * @set: Modify the specifier shaper.
>  * @move: Move the specifier shaper inside the hierarchy.
>  * @add: Add a shaper inside the shaper hierarchy.
>  * @delete: Delete the specified shaper .
>  *
>  * The netdevice exposes a pointer to these ops.
>  *
>  * It’s up to the driver or firmware to create the default shapers hierarchy,
>  * according to the H/W capabilities.
>  */
> struct shaper_ops {
>         /* get - Fetch the specified shaper, if it exists
>          * @dev: Netdevice to operate on.
>          * @lookup_mode: How to perform the shaper lookup
>          * @id: ID of the specified shaper,
>          *      relative to the specified @lookup_mode.
>          * @shaper: Object to return shaper.
>          * @extack: Netlink extended ACK for reporting errors.
>          *
>          * Multiple placement domain/id pairs can refer to the same shaper.
>          * And multiple entities (e.g. VF and PF) can try to access the same
>          * shaper concurrently.
>          *
>          * Values of @id depend on the @access_type:
>          * * If @access_type is SHAPER_LOOKUP_BY_PORT or
>          *   SHAPER_LOOKUP_BY_NETDEV, then @placement_id is unused.
>          * * If @access_type is SHAPER_LOOKUP_BY_VF,
>          *   then @id is a virtual function number, relative to @dev
>          *   which should be phisical function
>          * * If @access_type is SHAPER_LOOKUP_BY_QUEUE,
>          *   Then @id represents the queue number, relative to @dev
>          * * If @access_type is SHAPER_LOOKUP_BY_TREE_ID,
>          *   then @id is a @shaper_info.id and any shaper inside the
>          *   hierarcy can be accessed directly.
>          *
>          * Return:
>          * * %0 - Success
>          * * %-EOPNOTSUPP - Operation is not supported by hardware, driver,
>          *                  or core for any reason. @extack should be set
>          *                  to text describing the reason.
>          * * Other negative error value on failure.
>          */
>         int (*get)(struct net_device *dev,
>                    enum shaper_lookup_mode lookup_mode, u32 id,
>                    struct shaper_info *shaper, struct netlink_ext_ack *extack);
>
>         /* set - Update the specified shaper, if it exists
>          * @dev: Netdevice to operate on.
>          * @lookup_mode: How to perform the shaper lookup
>          * @id: ID of the specified shaper,
>          *      relative to the specified @access_type.
>          * @shaper: Configuration of shaper.
>          * @extack: Netlink extended ACK for reporting errors.
>          *
>          * Configure the parameters of @shaper according to values supplied
>          * in the following fields:
>          * * @shaper.metric
>          * * @shaper.bw_min
>          * * @shaper.bw_max
>          * * @shaper.burst
>          * * @shaper.priority
>          * * @shaper.weight
>          * Values supplied in other fields of @shaper must be zero and,
>          * other than verifying that, are ignored.
>          *
>          * Return:
>          * * %0 - Success
>          * * %-EOPNOTSUPP - Operation is not supported by hardware, driver,
>          *                  or core for any reason. @extack should be set
>          *                  to text describing the reason.
>          * * Other negative error values on failure.
>          */
>         int (*set)(struct net_device *dev,
>                    enum shaper_lookup_mode lookup_mode, u32 id,
>                    const struct shaper_info *shaper,
>                    struct netlink_ext_ack *extack);
>
>         /* Move - change the parent id of the specified shaper
>          * @dev: netdevice to operate on.
>          * @lookup_mode: how to perform the shaper lookup
>          * @id: ID of the specified shaper,
>          *                      relative to the specified @access_mode.
>          * @new_parent_id: new ID of the parent shapers,
>          *                      always relative to the SHAPER_LOOKUP_BY_TREE_ID
>          *                      lookup mode
>          * @extack: Netlink extended ACK for reporting errors.
>          *
>          * Move the specified shaper in the hierarchy replacing its
>          * current parent shaper with @new_parent_id
>          *
>          * Return:
>          * * %0 - Success
>          * * %-EOPNOTSUPP - Operation is not supported by hardware, driver,
>          *                  or core for any reason. @extack should be set
>          *                  to text describing the reason.
>          * * Other negative error values on failure.
>          */
>         int (*move)(struct net_device *dev,
>                     enum shaper_lookup_mode lookup_mode, u32 id,
>                     u32 new_parent_id, struct netlink_ext_ack *extack);
>
>         /* add - Add a shaper inside the shaper hierarchy
>          * @dev: netdevice to operate on.
>          * @shaper: configuration of shaper.
>          * @extack: Netlink extended ACK for reporting errors.
>          *
>          * @shaper.id must be set to SHAPER_NONE_ID as
>          * the id for the shaper will be automatically allocated.
>          * @shaper.parent_id determines where inside the shaper's tree
>          * this node is inserted.
>          *
>          * Return:
>          * * non-negative shaper id on success
>          * * %-EOPNOTSUPP - Operation is not supported by hardware, driver,
>          *                  or core for any reason. @extack should be set
>          *                  to text describing the reason.
>          * * Other negative error values on failure.
>          *
>          * Examples or reasons this operation may fail include:
>          * * H/W resources limits.
>          * * The parent is a ‘leaf’ node - attached to a queue.
>          * * Can’t respect the requested bw limits.
>          */
>         int (*add)(struct net_device *dev, const struct shaper_info *shaper,
>                    struct netlink_ext_ack *extack);
>
>         /* delete - Add a shaper inside the shaper hierarchy
>          * @dev: netdevice to operate on.
>          * @lookup_mode: how to perform the shaper lookup
>          * @id: ID of the specified shaper,
>          *                      relative to the specified @access_type.
>          * @shaper: Object to return the deleted shaper configuration.
>          *              Ignored if NULL.
>          * @extack: Netlink extended ACK for reporting errors.
>          *
>          * Return:
>          * * %0 - Success
>          * * %-EOPNOTSUPP - Operation is not supported by hardware, driver,
>          *                  or core for any reason. @extack should be set
>          *                  to text describing the reason.
>          * * Other negative error values on failure.
>          */
>         int (*delete)(struct net_device *dev,
>                       enum shaper_lookup_mode lookup_mode,
>                       u32 id, struct shaper_info *shaper,
>                       struct netlink_ext_ack *extack);
> };
>
> /*
>  * Examples:
>  * - set shaping on a given queue
>  *   struct shaper_info info = { // fill this };
>  *   dev->shaper_ops->set(dev, SHAPER_LOOKUP_BY_QUEUE, queue_id, &info, NULL);
>  *
>  * - create a queue group with a queue group shaping limits.
>  *   Assuming the following topology already exists:
>  *                    < netdev shaper >
>  *                      /           \
>  *         <queue 0 shaper> . . .  <queue N shaper>
>  *
>  *   struct shaper_info pinfo, ginfo;
>  *   dev->shaper_ops->get(dev, SHAPER_LOOKUP_BY_NETDEV, 0, &pinfo);
>  *
>  *   ginfo.parent_id = pinfo.id;
>  *   // fill-in other shaper params...
>  *   new_node_id = dev->shaper_ops->add(dev, &ginfo);
>  *
>  *   // now topology is:
>  *   //                  <    netdev shaper    >
>  *   //                  /            |        \
>  *   //                 /             |        <newly created shaper>
>  *   //                /              |
>  *   // <queue 0 shaper> . . . <queue N shaper>
>  *
>  *   // move a shapers for queues 3..n out of such queue group
>  *   for (i = 0; i <= 2; ++i)
>  *           dev->shaper_ops->move(dev, SHAPER_LOOKUP_BY_QUEUE, i, new_node_id);
>  *
>  *   // now topology is:
>  *   //                   < netdev shaper >
>  *   //                   /              \
>  *   //        <newly created shaper>   <queue 3 shaper> ... <queue n shaper>
>  *   //         /                   \
>  *   // <queue 0 shaper> ... <queue 2 shaper>
>  */
> #endif
>
>