Message-ID: <a6beaa28-cd5d-4a8b-9df5-9f09b2632849@nvidia.com>
Date: Wed, 23 Apr 2025 09:50:34 +0300
From: Carolina Jubran <cjubran@...dia.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Cosmin Ratiu <cratiu@...dia.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"horms@...nel.org" <horms@...nel.org>,
"andrew+netdev@...n.ch" <andrew+netdev@...n.ch>,
"davem@...emloft.net" <davem@...emloft.net>, Tariq Toukan
<tariqt@...dia.com>, Gal Pressman <gal@...dia.com>,
"jiri@...nulli.us" <jiri@...nulli.us>,
"edumazet@...gle.com" <edumazet@...gle.com>,
Saeed Mahameed <saeedm@...dia.com>, "pabeni@...hat.com" <pabeni@...hat.com>
Subject: Re: net-shapers plan

On 14/04/2025 19:27, Jakub Kicinski wrote:
> On Mon, 14 Apr 2025 11:27:00 +0300 Carolina Jubran wrote:
>>> I hope you understand my concern, tho. Since you're providing the first
>>> implementation, if the users can grow dependent on such behavior we'd
>>> be in no position to explain later that it's just a quirk of mlx5 and
>>> not how the API is intended to operate.
>>
>> Thanks for bringing this up. I want to make it clear that traffic
>> classes must be properly matched to queues. We don’t rely on the
>> hardware fallback behavior in mlx5. If the driver or firmware isn’t
>> configured correctly, traffic class bandwidth control won’t work as
>> expected — the user will suffer from constant switching of the TX queue
>> between scheduling queues and head-of-line blocking. As a result, users
>> shouldn’t expect reliable performance or correct bandwidth allocation.
>> We don’t encourage configuring this without proper TX queue mapping, so
>> users won’t grow dependent on behavior that only happens to work without it.
>> We tried to highlight this in the plan section discussing queue
>> selection and head-of-line blocking: To make traffic class shaping work,
>> we must keep traffic classes separate for each transmit queue.
>
> Right, my concern is more that there is no requirement for explicit
> configuration of the queues, as long as traffic arrives silo'ed WRT
> DSCP markings. As long as a VF sorts the traffic it does not have
> to explicitly say (or even know) that queue A will land in TC N.
>
Even if the VF sends DSCP-marked traffic, the packet's classification
into a traffic class still depends on the prio-to-TC mapping set by the
hypervisor. Without that mapping, the hardware can't reliably classify
packets, and traffic may not land in the intended TC.

Overall, for traffic class separation and scheduling to work as
intended, the VF and hypervisor need to be in sync. The VF provides the
markings, but the hypervisor owns the classification logic: it sets up
the classification mechanism, and it's up to the VFs to use it
correctly. Otherwise, packets will be misclassified. In a virtualized
setup, VFs are untrusted and don't control classification or shaping;
they just select which queue to transmit on.
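
To illustrate the two-step dependency described above, here is a minimal
sketch, assuming a DSCP trust mode in which a DSCP value is first mapped
to a priority and the priority is then mapped to a traffic class. The
struct, the function names and the default table contents are
hypothetical and are not taken from mlx5 or any kernel API.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_DSCP 64
#define NUM_PRIO 8

/* Hypothetical hypervisor-owned classification state; not an mlx5 structure. */
struct classifier {
	uint8_t dscp_to_prio[NUM_DSCP];	/* DSCP value -> priority */
	uint8_t prio_to_tc[NUM_PRIO];	/* priority -> traffic class */
};

/* Illustrative default: every DSCP maps to priority 0, every priority to TC 0. */
static void classifier_init(struct classifier *c)
{
	for (int i = 0; i < NUM_DSCP; i++)
		c->dscp_to_prio[i] = 0;
	for (int i = 0; i < NUM_PRIO; i++)
		c->prio_to_tc[i] = 0;
}

/* The VF only chooses the DSCP marking; the TC follows from both tables. */
static uint8_t classify_dscp(const struct classifier *c, uint8_t dscp)
{
	assert(dscp < NUM_DSCP);
	uint8_t prio = c->dscp_to_prio[dscp];

	assert(prio < NUM_PRIO);
	return c->prio_to_tc[prio];
}
```

In this sketch, marking a packet with a given DSCP value changes nothing
until both tables carry matching entries, which is the sense in which the
VF and the hypervisor have to be in sync.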

> BTW the classification is before all rewrites? IOW flower or any other
> forwarding rules cannot affect scheduling?

No, the classification happens after forwarding actions, so forwarding
rules can affect scheduling. If the user modifies DSCP or VLAN priority
as part of a tc (e.g. flower) rule, that rewritten value is what we use
for classification and scheduling. The classification reflects how the
packet will look on the wire.
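
The ordering can be sketched as follows. This is only an illustration of
"rewrite first, classify second": the packet struct, the helper names and
the DSCP-to-TC rule (top three DSCP bits) are invented for the example
and do not describe the actual hardware pipeline.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packet metadata; only the field relevant here. */
struct pkt {
	uint8_t dscp;	/* 6-bit DSCP value, 0..63 */
};

/* Stand-in for a forwarding rule that rewrites DSCP (illustrative only). */
static void apply_dscp_rewrite(struct pkt *p, uint8_t new_dscp)
{
	assert(new_dscp < 64);
	p->dscp = new_dscp;
}

/* Invented mapping for illustration: the top three DSCP bits select the TC. */
static uint8_t classify(const struct pkt *p)
{
	return p->dscp >> 3;
}

/* Forwarding actions run first, then classification sees the result. */
static uint8_t forward_then_classify(struct pkt *p, int rewrite, uint8_t new_dscp)
{
	if (rewrite)
		apply_dscp_rewrite(p, new_dscp);
	return classify(p);
}
```

With this ordering, the traffic class is derived from the rewritten DSCP
value, not from the value the sender originally set.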