Message-ID: <732253d6-69a4-e7ab-99a2-f310c0f22b12@intel.com>
Date: Fri, 23 Sep 2022 14:11:08 +0200
From: "Wilczynski, Michal" <michal.wilczynski@...el.com>
To: Jakub Kicinski <kuba@...nel.org>
CC: Edward Cree <ecree.xilinx@...il.com>, <netdev@...r.kernel.org>,
<alexandr.lobakin@...el.com>, <dchumak@...dia.com>,
<maximmi@...dia.com>, <jiri@...nulli.us>,
<simon.horman@...igine.com>, <jacob.e.keller@...el.com>,
<jesse.brandeburg@...el.com>, <przemyslaw.kitszel@...el.com>
Subject: Re: [RFC PATCH net-next v4 2/6] devlink: Extend devlink-rate api with
queues and new parameters
On 9/22/2022 10:29 PM, Jakub Kicinski wrote:
> On Thu, 22 Sep 2022 15:45:55 +0200 Wilczynski, Michal wrote:
>> On 9/22/2022 2:50 PM, Jakub Kicinski wrote:
>>> Anyway. My gut feeling is that this is cutting a corner. Seems
>>> most natural for the VF/PF level to be controlled by the admin
>>> and the queue level by whoever owns the queue. The hypervisor
>>> driver/FW should reconcile the two and compile the full hierarchy.
I'm not sure whether this is allowed on the mailing list, but I'm
attaching a text file with an ASCII drawing of the tree that I
previously sent in linear form. I hope you'll find it easier to read.
>> We tried already tc-htb, and it doesn't work for a couple of reasons,
>> even in this potential hybrid with devlink-rate. One of the problems
>> with tc-htb offload is that it forces you to allocate a new
>> queue, it doesn't allow for reassigning an existing queue to another
>> scheduling node. This is our main use case.
>>
>> Here's a discussion about tc-htb:
>> https://lore.kernel.org/netdev/20220704114513.2958937-1-michal.wilczynski@intel.com/
> This is a problem only for "SR-IOV case" or also for just the PF?
The way tc-htb is coded, it's NOT possible to reassign queues from one
scheduling node to another. This is a generic problem with the
implementation, regardless of SR-IOV or PF. So even if we wanted to
reassign queues only for PFs, this wouldn't be possible.
I feel like an example would help. So let's say I do this:
tc qdisc replace dev ens785 root handle 1: htb offload
tc class add dev ens785 parent 1: classid 1:2 htb rate 1000 ceil 2000
tc class add dev ens785 parent 1:2 classid 1:3 htb rate 1000 ceil 2000
tc class add dev ens785 parent 1:2 classid 1:4 htb rate 1000 ceil 2000
tc class add dev ens785 parent 1:3 classid 1:5 htb rate 1000 ceil 2000
tc class add dev ens785 parent 1:4 classid 1:6 htb rate 1000 ceil 2000
        1:        <-- root qdisc
        |
       1:2
      /   \
     /     \
   1:3     1:4
    |       |
    |       |
   1:5     1:6
    |       |
   QID     QID    <---- here we'll have PFIFO qdiscs
At this point I would have two additional queues in the system, and the
kernel would enqueue packets to those new queues according to the
'tc flower' configuration. So theoretically we should create a new queue
in hardware and put it in a privileged position in the scheduling tree.
And I would happily write it this way, but this is NOT what our customer
wants. He doesn't want any extra queues in the system, he just wants to
make existing queues more privileged. And not just PF queues - he's
mostly interested in VF queues.
I'm not sure how to state the use case more clearly.
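To make the distinction concrete, here is a minimal toy model of the two
operations. It is only a sketch: it is not kernel, tc, or devlink code,
and every name in it (Node, Tree, add_leaf_htb_style, move_queue) is
hypothetical and chosen for illustration. It assumes four pre-existing
queues and shows that re-parenting one of them leaves the queue count
unchanged, while adding a leaf the way the htb offload is described
above brings a new queue into the system.

```python
# Toy model only: not kernel, tc, or devlink code. All names are hypothetical.


class Node:
    """A scheduling node. Queues (plain integer IDs) hang off a node."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.queues = []
        if parent is not None:
            parent.children.append(self)


class Tree:
    def __init__(self, existing_queues):
        self.root = Node("root")
        # Assumption: queues that already exist start directly under the root.
        self.root.queues = list(existing_queues)
        self.next_qid = max(existing_queues, default=-1) + 1

    def nodes(self):
        """Yield every scheduling node in the tree."""
        stack = [self.root]
        while stack:
            node = stack.pop()
            yield node
            stack.extend(node.children)

    def queue_count(self):
        return sum(len(node.queues) for node in self.nodes())

    def add_leaf_htb_style(self, parent):
        """Behaviour described for tc-htb offload: a new leaf means a NEW queue."""
        qid = self.next_qid
        self.next_qid += 1
        parent.queues.append(qid)
        return qid

    def move_queue(self, qid, new_parent):
        """Requested behaviour: re-parent an EXISTING queue, creating nothing."""
        for node in self.nodes():
            if qid in node.queues:
                node.queues.remove(qid)
                new_parent.queues.append(qid)
                return
        raise KeyError(f"queue {qid} not found")


tree = Tree(existing_queues=[0, 1, 2, 3])
fast = Node("fast", parent=tree.root)  # a new branch created from the root

tree.move_queue(2, fast)  # existing queue 2 moves under the new branch
count_after_move = tree.queue_count()

new_qid = tree.add_leaf_htb_style(fast)  # htb-style leaf: an extra queue appears
count_after_add = tree.queue_count()
```

In this model the first operation is the one our use case needs: the set
of queues stays the same and only their position in the tree changes.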
>
>> So either I would have to invent a new offload type (?) for tc, or
>> completely rewrite and probably break tc-htb that mellanox implemented.
>> Also in our use case it's possible to create completely new branches
>> from the root and reassigning queues there. This wouldn't be possible
>> with the method you're proposing.
>>
>> So existing interface doesn't allow us to do what is required.
> For some definition of "what is required" which was not really
> disclosed clearly. Or I'm to slow to grasp.
In the most basic variant, what we want is a way to make hardware queues
more privileged and to modify the hierarchy of nodes/queues freely. We
don't want to create new queues, as required by the tc-htb
implementation. This is the main reason why a tc-htb and devlink-rate
hybrid doesn't work for us.
BR,
Michał
View attachment "tx_tree.txt" of type "text/plain" (3353 bytes)