[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:
<BY3PR18MB47379D035EB5DD965D6D187FC6122@BY3PR18MB4737.namprd18.prod.outlook.com>
Date: Mon, 22 Apr 2024 11:30:12 +0000
From: Sunil Kovvuri Goutham <sgoutham@...vell.com>
To: Simon Horman <horms@...nel.org>,
"netdev@...r.kernel.org"
<netdev@...r.kernel.org>
CC: Jakub Kicinski <kuba@...nel.org>, Jiri Pirko <jiri@...nulli.us>,
Madhu
Chittim <madhu.chittim@...el.com>,
Sridhar Samudrala
<sridhar.samudrala@...el.com>,
Paolo Abeni <pabeni@...hat.com>
Subject: RE: [RFC] HW TX Rate Limiting Driver API
> -----Original Message-----
> From: Simon Horman <horms@...nel.org>
> Sent: Friday, April 5, 2024 3:53 PM
> To: netdev@...r.kernel.org
> Cc: Jakub Kicinski <kuba@...nel.org>; Jiri Pirko <jiri@...nulli.us>; Madhu
> Chittim <madhu.chittim@...el.com>; Sridhar Samudrala
> <sridhar.samudrala@...el.com>; Paolo Abeni <pabeni@...hat.com>
> Subject: [EXTERNAL] [RFC] HW TX Rate Limiting Driver API
>
> Hi,
>
> This is follow-up to the ongoing discussion started by Intel to extend the
> support for TX shaping H/W offload [1].
>
> The goal is allowing the user-space to configure TX shaping offload on a per-
> queue basis with min guaranteed B/W, max B/W limit and burst size on a VF
> device.
>
>
> In the past few months several different solutions were attempted and
> discussed, without finding a perfect fit:
>
> - devlink_rate APIs are not appropriate for to control TX shaping on netdevs
> - No existing TC qdisc offload covers the required feature set
> - HTB does not allow direct queue configuration
> - MQPRIO imposes constraint on the maximum number of TX queues
> - TBF does not support max B/W limit
> - ndo_set_tx_maxrate() only controls the max B/W limit
>
> A new H/W offload API is needed, but offload API proliferation should be
> avoided.
>
> The following proposal intends to cover the above specified requirement and
> provide a possible base to unify all the shaping offload APIs mentioned above.
>
> The following only defines the in-kernel interface between the core and
> drivers. The intention is to expose the feature to user-space via Netlink.
> Hopefully the latter part should be straight-forward after agreement on the in-
> kernel interface.
>
> All feedback and comment is more then welcome!
>
> [1] https://urldefense.proofpoint.com/v2/url?u=https-
> 3A__lore.kernel.org_netdev_20230808015734.1060525-2D1-
> 2Dwenjun1.wu-
> 40intel.com_&d=DwIDaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=q3VKxXQKiboRw
> _F01ggTzHuhwawxR1P9_tMCN2FODU4&m=015N0GdRd3FIBUwfXqM-
> HTLdWkiKKaV6waaJqEgl39nha-
> lqR10J3SmjlrjKbtBA&s=VbuyXcaL4DZCVXveM_Y9iSadF7UJvPNDmTAgCbwSD
> FA&e=
>
> Regards,
> Simon with much assistance from Paolo
>
> ---
> /* SPDX-License-Identifier: GPL-2.0-or-later */
>
> #ifndef _NET_SHAPER_H_
> #define _NET_SHAPER_H_
>
> /**
> * enum shaper_metric - the metric of the shaper
> * @SHAPER_METRIC_PPS: Shaper operates on a packets per second basis
> * @SHAPER_METRIC_BPS: Shaper operates on a bits per second basis */
> enum shaper_metric {
> SHAPER_METRIC_PPS;
> SHAPER_METRIC_BPS;
> };
>
> #define SHAPER_ROOT_ID 0
> #define SHAPER_NONE_ID UINT_MAX
>
> /**
> * struct shaper_info - represent a node of the shaper hierarchy
> * @id: Unique identifier inside the shaper tree.
> * @parent_id: ID of parent shaper, or SHAPER_NONE_ID if the shaper has
> * no parent. Only the root shaper has no parent.
> * @metric: Specify if the bw limits refers to PPS or BPS
> * @bw_min: Minimum guaranteed rate for this shaper
> * @bw_max: Maximum peak bw allowed for this shaper
> * @burst: Maximum burst for the peek rate of this shaper
> * @priority: Scheduling priority for this shaper
> * @weight: Scheduling weight for this shaper
> *
> * The full shaper hierarchy is maintained only by the
> * NIC driver (or firmware), possibly in a NIC-specific format
> * and/or in H/W tables.
> * The kernel uses this representation and the shaper_ops to
> * access, traverse, and update it.
> */
> struct shaper_info {
> /* The following fields allow the full traversal of the whole
> * hierarchy.
> */
> u32 id;
> u32 parent_id;
>
> /* The following fields define the behavior of the shaper. */
> enum shaper_metric metric;
> u64 bw_min;
> u64 bw_max;
> u32 burst;
> u32 priority;
> u32 weight;
> };
>
'bw_min/max' is u64 and 'burst' / 'weight' are u32, any specific reason ?
> /**
> * enum shaper_lookup_mode - Lookup method used to access a shaper
> * @SHAPER_LOOKUP_BY_PORT: The root shaper for the whole H/W, @id is
> unused
> * @SHAPER_LOOKUP_BY_NETDEV: The main shaper for the given network
> device,
> * @id is unused
> * @SHAPER_LOOKUP_BY_VF: @id is a virtual function number.
> * @SHAPER_LOOKUP_BY_QUEUE: @id is a queue identifier.
> * @SHAPER_LOOKUP_BY_TREE_ID: @id is the unique shaper identifier inside
> the
> * shaper hierarchy as in shaper_info.id
> *
> * SHAPER_LOOKUP_BY_PORT and SHAPER_LOOKUP_BY_VF,
> SHAPER_LOOKUP_BY_TREE_ID are
> * only available on PF devices, usually inside the host/hypervisor.
> * SHAPER_LOOKUP_BY_NETDEV is available on both PFs and VFs devices, but
> * only if the latter are privileged ones.
> * The same shaper can be reached with different lookup mode/id pairs,
> * mapping network visible objects (devices, VFs, queues) to the scheduler
> * hierarchy and vice-versa.
> */
> enum shaper_lookup_mode {
> SHAPER_LOOKUP_BY_PORT,
> SHAPER_LOOKUP_BY_NETDEV,
> SHAPER_LOOKUP_BY_VF,
> SHAPER_LOOKUP_BY_QUEUE,
> SHAPER_LOOKUP_BY_TREE_ID,
> };
>
>
> /**
> * struct shaper_ops - Operations on shaper hierarchy
> * @get: Access the specified shaper.
> * @set: Modify the specifier shaper.
> * @move: Move the specifier shaper inside the hierarchy.
> * @add: Add a shaper inside the shaper hierarchy.
> * @delete: Delete the specified shaper .
> *
> * The netdevice exposes a pointer to these ops.
> *
> * It’s up to the driver or firmware to create the default shapers hierarchy,
> * according to the H/W capabilities.
> */
> struct shaper_ops {
> /* get - Fetch the specified shaper, if it exists
> * @dev: Netdevice to operate on.
> * @lookup_mode: How to perform the shaper lookup
> * @id: ID of the specified shaper,
> * relative to the specified @lookup_mode.
> * @shaper: Object to return shaper.
> * @extack: Netlink extended ACK for reporting errors.
> *
> * Multiple placement domain/id pairs can refer to the same shaper.
> * And multiple entities (e.g. VF and PF) can try to access the same
> * shaper concurrently.
> *
> * Values of @id depend on the @access_type:
> * * If @access_type is SHAPER_LOOKUP_BY_PORT or
> * SHAPER_LOOKUP_BY_NETDEV, then @placement_id is unused.
> * * If @access_type is SHAPER_LOOKUP_BY_VF,
> * then @id is a virtual function number, relative to @dev
> * which should be phisical function
> * * If @access_type is SHAPER_LOOKUP_BY_QUEUE,
> * Then @id represents the queue number, relative to @dev
> * * If @access_type is SHAPER_LOOKUP_BY_TREE_ID,
> * then @id is a @shaper_info.id and any shaper inside the
> * hierarcy can be accessed directly.
> *
> * Return:
> * * %0 - Success
> * * %-EOPNOTSUPP - Operation is not supported by hardware, driver,
> * or core for any reason. @extack should be set
> * to text describing the reason.
> * * Other negative error value on failure.
> */
> int (*get)(struct net_device *dev,
> enum shaper_lookup_mode lookup_mode, u32 id,
> struct shaper_info *shaper, struct netlink_ext_ack *extack);
>
> /* set - Update the specified shaper, if it exists
> * @dev: Netdevice to operate on.
> * @lookup_mode: How to perform the shaper lookup
> * @id: ID of the specified shaper,
> * relative to the specified @access_type.
> * @shaper: Configuration of shaper.
> * @extack: Netlink extended ACK for reporting errors.
> *
> * Configure the parameters of @shaper according to values supplied
> * in the following fields:
> * * @shaper.metric
> * * @shaper.bw_min
> * * @shaper.bw_max
> * * @shaper.burst
> * * @shaper.priority
> * * @shaper.weight
> * Values supplied in other fields of @shaper must be zero and,
> * other than verifying that, are ignored.
> *
> * Return:
> * * %0 - Success
> * * %-EOPNOTSUPP - Operation is not supported by hardware, driver,
> * or core for any reason. @extack should be set
> * to text describing the reason.
> * * Other negative error values on failure.
> */
> int (*set)(struct net_device *dev,
> enum shaper_lookup_mode lookup_mode, u32 id,
> const struct shaper_info *shaper,
> struct netlink_ext_ack *extack);
>
> /* Move - change the parent id of the specified shaper
> * @dev: netdevice to operate on.
> * @lookup_mode: how to perform the shaper lookup
> * @id: ID of the specified shaper,
> * relative to the specified @access_mode.
> * @new_parent_id: new ID of the parent shapers,
> * always relative to the SHAPER_LOOKUP_BY_TREE_ID
> * lookup mode
> * @extack: Netlink extended ACK for reporting errors.
> *
> * Move the specified shaper in the hierarchy replacing its
> * current parent shaper with @new_parent_id
> *
> * Return:
> * * %0 - Success
> * * %-EOPNOTSUPP - Operation is not supported by hardware, driver,
> * or core for any reason. @extack should be set
> * to text describing the reason.
> * * Other negative error values on failure.
> */
> int (*move)(struct net_device *dev,
> enum shaper_lookup_mode lookup_mode, u32 id,
> u32 new_parent_id, struct netlink_ext_ack *extack);
>
> /* add - Add a shaper inside the shaper hierarchy
> * @dev: netdevice to operate on.
> * @shaper: configuration of shaper.
> * @extack: Netlink extended ACK for reporting errors.
> *
> * @shaper.id must be set to SHAPER_NONE_ID as
> * the id for the shaper will be automatically allocated.
> * @shaper.parent_id determines where inside the shaper's tree
> * this node is inserted.
> *
> * Return:
> * * non-negative shaper id on success
> * * %-EOPNOTSUPP - Operation is not supported by hardware, driver,
> * or core for any reason. @extack should be set
> * to text describing the reason.
> * * Other negative error values on failure.
> *
> * Examples or reasons this operation may fail include:
> * * H/W resources limits.
> * * The parent is a ‘leaf’ node - attached to a queue.
> * * Can’t respect the requested bw limits.
> */
> int (*add)(struct net_device *dev, const struct shaper_info *shaper,
> struct netlink_ext_ack *extack);
>
> /* delete - Add a shaper inside the shaper hierarchy
> * @dev: netdevice to operate on.
> * @lookup_mode: how to perform the shaper lookup
> * @id: ID of the specified shaper,
> * relative to the specified @access_type.
> * @shaper: Object to return the deleted shaper configuration.
> * Ignored if NULL.
> * @extack: Netlink extended ACK for reporting errors.
> *
> * Return:
> * * %0 - Success
> * * %-EOPNOTSUPP - Operation is not supported by hardware, driver,
> * or core for any reason. @extack should be set
> * to text describing the reason.
> * * Other negative error values on failure.
> */
> int (*delete)(struct net_device *dev,
> enum shaper_lookup_mode lookup_mode,
> u32 id, struct shaper_info *shaper,
> struct netlink_ext_ack *extack); };
>
> /*
> * Examples:
> * - set shaping on a given queue
> * struct shaper_info info = { // fill this };
> * dev->shaper_ops->set(dev, SHAPER_LOOKUP_BY_QUEUE, queue_id,
> &info, NULL);
> *
> * - create a queue group with a queue group shaping limits.
> * Assuming the following topology already exists:
> * < netdev shaper >
> * / \
> * <queue 0 shaper> . . . <queue N shaper>
> *
> * struct shaper_info pinfo, ginfo;
> * dev->shaper_ops->get(dev, SHAPER_LOOKUP_BY_NETDEV, 0, &pinfo);
> *
> * ginfo.parent_id = pinfo.id;
> * // fill-in other shaper params...
> * new_node_id = dev->shaper_ops->add(dev, &ginfo);
> *
> * // now topology is:
> * // < netdev shaper >
> * // / | \
> * // / | <newly created shaper>
> * // / |
> * // <queue 0 shaper> . . . <queue N shaper>
> *
> * // move a shapers for queues 3..n out of such queue group
> * for (i = 0; i <= 2; ++i)
> * dev->shaper_ops->move(dev, SHAPER_LOOKUP_BY_QUEUE, i,
> new_node_id);
> *
> * // now topology is:
> * // < netdev shaper >
> * // / \
> * // <newly created shaper> <queue 3 shaper> ... <queue n shaper>
> * // / \
> * // <queue 0 shaper> ... <queue 2 shaper>
> */
> #endif
>
Simon / Paolo,
Can you please confirm below
1. Does privileged VF mean trusted VF ?
2. With this we can setup below as well, from PF ?
-- Per-VF Quantum
-- Per-VF Strict priority
-- Per-VF ratelimit
3. Wondering if it would be beneficial to apply this to ingress traffic as well, all modes may or may not apply,
but at the least it could be possible to apply PPS/BPS ratelimit. So that the configuration methodology
remains same across egress and ingress.
Thanks,
Sunil.
Powered by blists - more mailing lists