Message-ID: <c89ce464-4374-a3c3-3f58-727a913af870@intel.com>
Date: Wed, 28 Sep 2022 13:53:24 +0200
From: "Wilczynski, Michal" <michal.wilczynski@...el.com>
To: Jiri Pirko <jiri@...nulli.us>, Edward Cree <ecree.xilinx@...il.com>
CC: <netdev@...r.kernel.org>, <alexandr.lobakin@...el.com>,
<dchumak@...dia.com>, <maximmi@...dia.com>,
<simon.horman@...igine.com>, <jacob.e.keller@...el.com>,
<jesse.brandeburg@...el.com>, <przemyslaw.kitszel@...el.com>
Subject: Re: [RFC PATCH net-next v4 2/6] devlink: Extend devlink-rate api with
queues and new parameters
On 9/26/2022 1:58 PM, Jiri Pirko wrote:
> Tue, Sep 20, 2022 at 01:09:04PM CEST, ecree.xilinx@...il.com wrote:
>> On 19/09/2022 14:12, Wilczynski, Michal wrote:
>>> Maybe a switchdev case would be a good parallel here. When you enable switchdev, you get port representors on
>>> the host for each VF that is already attached to the VM. Something that gives the host the power to configure a
>>> netdev that it doesn't 'own'. So it seems to me like giving the user more power to configure things from the host
> Well, not really. It gives the user on the hypervisor the possibility
> to configure the eswitch vport side. The other side of the wire, which
> is in the VM, is autonomous.
Frankly speaking, the VM is still free to assign traffic to queues as
before; the NIC's scheduling algorithm will simply drain those queues
at a different pace.
>
>
>>> is acceptable.
>> Right, that's the thing though: I instinctively _want_ this to be done
>> through representors somehow, because it _looks_ like it ought to
>> be scoped to a single netdev; but that forces the hierarchy to
>> respect netdev boundaries, which, as we've discussed, is an unwelcome
>> limitation.
> Why exactly? Do you want to share a single queue between multiple vports?
> Or what exactly would be the use case where you hit the limitation?
Like you've noticed in the previous comment, traffic is assigned to
queues from inside the VM; this tree simply represents the scheduling
algorithm in the HW, i.e. how fast the card will drain each queue. So
if you have a queue carrying real-time data and the rest carrying bulk
traffic, you might want to prioritize the real-time data, i.e. put it
on a completely different branch of the scheduling tree.
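As a rough sketch of what I have in mind (node names are made up, the
syntax loosely follows the existing devlink-rate node commands, and the
queue leaf form is only the idea behind this RFC, not final syntax or
something iproute2 supports today):

  # two scheduling nodes under the port's root: real-time vs. bulk
  devlink port function rate add pci/0000:4b:00.0/node_rt tx_max 5gbit
  devlink port function rate add pci/0000:4b:00.0/node_bulk tx_max 1gbit

  # re-parent the queue carrying real-time data onto its own branch,
  # so the HW scheduler drains it independently of the bulk queues
  devlink port function rate set pci/0000:4b:00.0/txq/25 parent node_rt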
BR,
Michał
>
>
>>> In my mind this is a device-wide configuration, since the ice driver registers each port as a separate PCI device.
>>> And each of these devices has its own hardware Tx Scheduler tree global to that port. The queues that we're
>>> discussing are actually hardware queues, identified by a hardware-assigned txq_id.
>> In general, hardware being a single unit at the device level does
>> not necessarily mean its configuration should be device-wide.
>> For instance, in many NICs each port has a single hardware v-switch,
>> but we do not have some kind of "devlink filter" API to program it
>> directly. Instead we attach TC rules to _many_ netdevs, and driver
>> code transforms and combines these to program the unitary device.
>> "device-wide configuration" originally meant things like firmware
>> version or operating mode (legacy vs. switchdev) that do not relate
>> directly to netdevs.
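Right, the classic example being a flower rule attached to a VF
representor on the host (interface names made up), something like:

  tc filter add dev eth0_pf0vf1 ingress protocol ip flower \
      dst_ip 10.0.0.1 action mirred egress redirect dev eth0_pf0vf2

which the driver then translates into the device-wide v-switch tables.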
>>
>> But I agree with you that your approach is the "least evil method";
>> if properly explained and documented then I don't have any
>> remaining objection to your patch, even though I'm continuing to
>> take the opportunity to proselytise for "reprs >> devlink" ;)
>>
>> -ed