Message-ID: <YzVEZWioeVNgMNvK@nanopsycho>
Date: Thu, 29 Sep 2022 09:08:21 +0200
From: Jiri Pirko <jiri@...nulli.us>
To: "Wilczynski, Michal" <michal.wilczynski@...el.com>
Cc: Edward Cree <ecree.xilinx@...il.com>, netdev@...r.kernel.org,
alexandr.lobakin@...el.com, dchumak@...dia.com, maximmi@...dia.com,
simon.horman@...igine.com, jacob.e.keller@...el.com,
jesse.brandeburg@...el.com, przemyslaw.kitszel@...el.com
Subject: Re: [RFC PATCH net-next v4 2/6] devlink: Extend devlink-rate api
with queues and new parameters
Wed, Sep 28, 2022 at 01:53:24PM CEST, michal.wilczynski@...el.com wrote:
>
>
>On 9/26/2022 1:58 PM, Jiri Pirko wrote:
>> Tue, Sep 20, 2022 at 01:09:04PM CEST, ecree.xilinx@...il.com wrote:
>> > On 19/09/2022 14:12, Wilczynski, Michal wrote:
>> > > Maybe a switchdev case would be a good parallel here. When you enable switchdev, you get port representors on
>> > > the host for each VF that is already attached to the VM. Something that gives the host the power to configure
>> > > a netdev that it doesn't 'own'. So it seems to me like giving the user more power to configure things from the host
>> Well, not really. It gives the user on the hypervisor the possibility
>> to configure the eswitch vport side. The other side of the wire, which
>> is in the VM, is autonomous.
>
>Frankly speaking, the VM is still free to assign traffic to queues as before.
>I guess the networking card's scheduling algorithm will just drain those
>queues at a different pace.
That was not my point. My point is that with per-queue shaping, you are
basically configuring the other side of the wire (the VF), while this config
is outside the hypervisor's domain.
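
For contrast, the eswitch vport side is what today's devlink-rate already
lets the hypervisor shape. Roughly something like this (untested sketch; the
PCI address and port index are made up):

  $ devlink dev eswitch set pci/0000:03:00.0 mode switchdev
  $ devlink port function rate set pci/0000:03:00.0/1 tx_max 100mbit

That stays entirely on the hypervisor's side of the wire, whereas which
traffic lands in which queue is decided from inside the VM.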
>
>>
>>
>> > > is acceptable.
>> > Right, that's the thing though: I instinctively Want this to be done
>> > through representors somehow, because it _looks_ like it ought to
>> > be scoped to a single netdev; but that forces the hierarchy to
>> > respect netdev boundaries which as we've discussed is an unwelcome
>> > limitation.
>> Why exactly? Do you want to share a single queue between multiple vports?
>> Or what exactly would be the use case where you hit the limitation?
>
>Like you've noticed in a previous comment, traffic is assigned from inside
>the VM; this tree simply represents the scheduling algorithm in the HW, i.e.
>how fast the card will drain each queue. So if you have a queue carrying
>real-time data and the rest carrying bulk, you might want to prioritize the
>real-time data, i.e. put it on a completely different branch of the
>scheduling tree.
Yep, so, if you forget about how this is implemented in HW/FW, this is
the VM-side config, correct?
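
Just to make the "branch" idea concrete with the existing devlink-rate
objects (untested sketch; device, node names and port index are made up, and
the RFC's exact per-queue syntax is not reproduced here):

  $ devlink port function rate add pci/0000:03:00.0/rt_branch tx_max 100mbit
  $ devlink port function rate add pci/0000:03:00.0/bulk_branch tx_max 50mbit
  $ devlink port function rate set pci/0000:03:00.0/1 parent rt_branch

With the current interface the leaves of such a tree are vports; the RFC
extends this to per-queue leaves, which is exactly the part that looks like
VM-side config.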
>
>BR,
>Michał
>
>>
>>
>> > > In my mind this is a device-wide configuration, since the ice driver registers each port as a separate PCI device.
>> > > And each of these devices has its own hardware Tx Scheduler tree global to that port. Queues that we're
>> > > discussing are actually hardware queues, and are identified by a hardware-assigned txq_id.
>> > In general, hardware being a single unit at the device level does
>> > not necessarily mean its configuration should be device-wide.
>> > For instance, in many NICs each port has a single hardware v-switch,
>> > but we do not have some kind of "devlink filter" API to program it
>> > directly. Instead we attach TC rules to _many_ netdevs, and driver
>> > code transforms and combines these to program the unitary device.
>> > "device-wide configuration" originally meant things like firmware
>> > version or operating mode (legacy vs. switchdev) that do not relate
>> > directly to netdevs.
>> >
>> > But I agree with you that your approach is the "least evil method";
>> > if properly explained and documented then I don't have any
>> > remaining objection to your patch, despite that I'm continuing to
>> > take the opportunity to proselytise for "reprs >> devlink" ;)
>> >
>> > -ed
>