Message-ID: <7195630a-1021-4e1e-b48b-a07945477863@redhat.com>
Date: Tue, 30 Jul 2024 15:37:38 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Jiri Pirko <jiri@...nulli.us>
Cc: Cosmin Ratiu <cratiu@...dia.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"jhs@...atatu.com" <jhs@...atatu.com>,
"sridhar.samudrala@...el.com" <sridhar.samudrala@...el.com>,
"john.fastabend@...il.com" <john.fastabend@...il.com>,
"madhu.chittim@...el.com" <madhu.chittim@...el.com>,
"horms@...nel.org" <horms@...nel.org>,
"sgoutham@...vell.com" <sgoutham@...vell.com>,
"kuba@...nel.org" <kuba@...nel.org>
Subject: Re: [RFC PATCH] net: introduce HW Rate Limiting Driver API

On 7/30/24 14:10, Jiri Pirko wrote:
> Wed, Jun 05, 2024 at 05:52:32PM CEST, pabeni@...hat.com wrote:
>> On Wed, 2024-06-05 at 15:04 +0000, Cosmin Ratiu wrote:
>>> On Wed, 2024-05-08 at 22:20 +0200, Paolo Abeni wrote:
>>>
>>>> +/**
>>>> + * struct net_shaper_info - represents a shaping node on the NIC H/W
>>>> + * @metric: Specify if the bw limits refer to PPS or BPS
>>>> + * @bw_min: Minimum guaranteed rate for this shaper
>>>> + * @bw_max: Maximum peak bw allowed for this shaper
>>>> + * @burst: Maximum burst for the peak rate of this shaper
>>>> + * @priority: Scheduling priority for this shaper
>>>> + * @weight: Scheduling weight for this shaper
>>>> + */
>>>> +struct net_shaper_info {
>>>> + enum net_shaper_metric metric;
>>>> + u64 bw_min; /* minimum guaranteed bandwidth, according to metric */
>>>> + u64 bw_max; /* maximum allowed bandwidth */
>>>> + u32 burst; /* maximum burst in bytes for bw_max */
>>>
>>> 'burst' really should be u64 if it can deal with bytes. In a 400Gbps
>>> link, u32 really is peanuts.
>>>
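For scale, here is a rough sketch of the arithmetic behind the remark above: how long a burst of a given size lasts at line rate. The helper name is made up for illustration and is not part of the proposed API; it assumes the burst is expressed in bytes, as in the quoted comment.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Microseconds needed to drain `burst_bytes` at `link_bps` bits per second.
 * Hypothetical helper, for illustration only. The intermediate product
 * fits in 64 bits for any u32-sized burst; larger bursts would need
 * wider arithmetic, so they are rejected here.
 */
static uint64_t ex_burst_duration_us(uint64_t burst_bytes, uint64_t link_bps)
{
	assert(link_bps != 0);
	assert(burst_bytes <= UINT32_MAX);
	/* bytes -> bits, scale to microseconds, then divide by the rate */
	return burst_bytes * 8 * 1000000ULL / link_bps;
}
```

Under these assumptions, a maximal u32 burst (UINT32_MAX bytes) on a 400 Gbps link works out to roughly 86 ms of line-rate traffic.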
>>>> +/**
>>>> + * enum net_shaper_scope - the different scopes where a shaper could be attached
>>>> + * @NET_SHAPER_SCOPE_PORT: The root shaper for the whole H/W.
>>>> + * @NET_SHAPER_SCOPE_NETDEV: The main shaper for the given network device.
>>>> + * @NET_SHAPER_SCOPE_VF: The shaper is attached to the given virtual
>>>> + * function.
>>>> + * @NET_SHAPER_SCOPE_QUEUE_GROUP: The shaper groups multiple queues under the
>>>> + * same device.
>>>> + * @NET_SHAPER_SCOPE_QUEUE: The shaper is attached to the given device queue.
>>>> + *
>>>> + * NET_SHAPER_SCOPE_PORT and NET_SHAPER_SCOPE_VF are only available on
>>>> + * PF devices, usually inside the host/hypervisor.
>>>> + * NET_SHAPER_SCOPE_NETDEV, NET_SHAPER_SCOPE_QUEUE_GROUP and
>>>> + * NET_SHAPER_SCOPE_QUEUE are available on both PFs and VFs devices.
>>>> + */
>>>> +enum net_shaper_scope {
>>>> + NET_SHAPER_SCOPE_PORT,
>>>> + NET_SHAPER_SCOPE_NETDEV,
>>>> + NET_SHAPER_SCOPE_VF,
>>>> + NET_SHAPER_SCOPE_QUEUE_GROUP,
>>>> + NET_SHAPER_SCOPE_QUEUE,
>>>> +};
>>>
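The availability rule stated in the quoted kernel-doc comment could be sketched roughly as below. The enum is copied from the quoted patch; the helper name and the `is_pf` flag are hypothetical and only illustrate the rule, they are not part of the proposal.

```c
#include <stdbool.h>

/* Scopes as listed in the quoted patch. */
enum net_shaper_scope {
	NET_SHAPER_SCOPE_PORT,
	NET_SHAPER_SCOPE_NETDEV,
	NET_SHAPER_SCOPE_VF,
	NET_SHAPER_SCOPE_QUEUE_GROUP,
	NET_SHAPER_SCOPE_QUEUE,
};

/*
 * Hypothetical check: can a shaper with the given scope be attached on
 * this device? PORT and VF scopes are PF-only; the remaining scopes are
 * available on both PFs and VFs.
 */
static bool ex_scope_available(enum net_shaper_scope scope, bool is_pf)
{
	switch (scope) {
	case NET_SHAPER_SCOPE_PORT:
	case NET_SHAPER_SCOPE_VF:
		return is_pf;
	case NET_SHAPER_SCOPE_NETDEV:
	case NET_SHAPER_SCOPE_QUEUE_GROUP:
	case NET_SHAPER_SCOPE_QUEUE:
		return true;
	}
	return false; /* unknown scope value */
}
```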
>>> What would modelling groups of VFs (as implemented in [1]) look like
>>> with this proposal?
>>> I could imagine a NET_SHAPER_SCOPE_VF_GROUP scope, with a shared shaper
>>> across multiple VFs.
>>
>> Following up on yesterday's reviewer meeting, which was spent mainly on
>> this topic: the current direction is to replace
>> NET_SHAPER_SCOPE_QUEUE_GROUP with a more generic 'scope', grouping
>> either queues, VFs/netdevs or even other groups (allowing nesting).
>>
>>> What would managing membership of VFs in a group
>>> look like? Will the devlink API continue to be used for that? Or will
>>> something else be introduced?
>>
>> The idea is to introduce a new generic netlink interface, yaml-based,
>> to expose these features to user-space.
>>
>>> Looking a bit into the future now...
>>> I am nowadays thinking about extending the mlx5 VF group rate limit
>>> feature to support VFs from multiple PFs from the same NIC (the
>>> hardware can be configured to use a shared shaper across multiple
>>> ports), how could that feature be represented in this API, given that
>>> ops relate to a netdevice? Which netdevice should be used for this
>>> scenario?
>>
>> I must admit we[1] haven't thought yet about the scenario you describe
>> above. I guess we could encode the PF number and the VF number in the
>> handle major/minor and operate on any PF device belonging to the same
>> silicon, WDYT?
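To make the PF/VF-in-handle idea above concrete, one possible packing is sketched below. The thread does not specify the handle layout, so the 32-bit handle, the 16/16 bit split and all `ex_` names are assumptions made purely for illustration.

```c
#include <stdint.h>

/* Assumed layout: upper 16 bits = major (PF number), lower 16 = minor (VF number). */
#define EX_HANDLE_MINOR_BITS 16
#define EX_HANDLE_MINOR_MASK ((1u << EX_HANDLE_MINOR_BITS) - 1)

/* Pack a PF number and a VF number into a single hypothetical handle. */
static inline uint32_t ex_handle_make(uint16_t pf, uint16_t vf)
{
	return ((uint32_t)pf << EX_HANDLE_MINOR_BITS) | vf;
}

/* Extract the PF number (major part) from the handle. */
static inline uint16_t ex_handle_pf(uint32_t handle)
{
	return (uint16_t)(handle >> EX_HANDLE_MINOR_BITS);
}

/* Extract the VF number (minor part) from the handle. */
static inline uint16_t ex_handle_vf(uint32_t handle)
{
	return (uint16_t)(handle & EX_HANDLE_MINOR_MASK);
}
```

With such a packing, an op issued on any PF netdevice of the same silicon could still name a VF that belongs to a different PF, which is the scenario raised above.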
>
> Sometimes, there is no netdevice at all. The infra still should work I
> believe.

Note that the most recent incarnation of the shaper APIs drops any
support for shapers 'above' the network device level (e.g. no
device/VF groups). The idea is that devlink should be used for such
scenarios.

Cheers,
Paolo