netdev - Re: [RFC PATCH] net: introduce HW Rate Limiting Driver API

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c5dcf167aad610c6c623c5958bb252647773fddd.camel@redhat.com>
Date: Mon, 03 Jun 2024 13:11:39 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: netdev@...r.kernel.org, Jiri Pirko <jiri@...nulli.us>, Madhu Chittim
 <madhu.chittim@...el.com>, Sridhar Samudrala <sridhar.samudrala@...el.com>,
  Simon Horman <horms@...nel.org>, John Fastabend
 <john.fastabend@...il.com>, Sunil Kovvuri Goutham <sgoutham@...vell.com>,
 Jamal Hadi Salim <jhs@...atatu.com>
Subject: Re: [RFC PATCH] net: introduce HW Rate Limiting Driver API

Hi,

It looks like most of the open points here are related lack of clarity
or misunderstanding. I propose to discuss them in the upcoming
reviewer's meeting.

Some replies below, just in case I magically achieved superior written
natural language skills meanwhile ;) 

On Fri, 2024-05-31 at 09:00 -0700, Jakub Kicinski wrote:
> On Fri, 31 May 2024 11:22:46 +0200 Paolo Abeni wrote:
> > On Tue, 2024-05-28 at 10:18 -0700, Jakub Kicinski wrote:
> > > > +	u32 priority;	/* scheduling strict priority */
> > > > +	u32 weight;	/* scheduling WRR weight*/  
> > > 
> > > I wonder if we should somehow more clearly specify what a node can do.
> > > Like Andrew pointed out, if we have a WRR node, presumably the weights
> > > go on the children, since there's only one weigh param. But then the
> > > WRR node itself is either empty (no params) or has rate params... which
> > > is odd.
> > > 
> > > Maybe shaping nodes and RR nodes should be separate node classes,
> > > somehow?  
> > 
> > Possibly clarifying the meaning of 'weight' field would help?  
> > It means: this node is scheduled WRR according to the specified weight
> > among the sibling shapers with the same priority.
> > 
> > I think it's quite simpler than introducing different node classes with
> > separate handling. My understanding is that the latter would help only
> > to workaround some H/W limitation and will make the implementation more
> > difficult for more capable H/W.
> > 
> > What kind of problems do you foresee with the above definition?
> 
> The problem Andrew mentioned, basically. There may not be a path to
> transition one fully offloaded hierarchy to another (e.g. switching
> strict prio to WRR). TBH I haven't really grasped your proposal in:
> https://lore.kernel.org/all/db51b7ccff835dd5a96293fb84d527be081de062.camel@redhat.com/

The problem Andrew reported is that some H/W, in some configuration,
can't change per-queue shapers individually.

The solution proposed in the above link is to let the API change
multiple shapers with a single op "atomically".

> > > > + * NET_SHAPER_SCOPE_PORT and NET_SHAPER_SCOPE_VF are only available on
> > > > + * PF devices, usually inside the host/hypervisor.
> > > > + * NET_SHAPER_SCOPE_NETDEV, NET_SHAPER_SCOPE_QUEUE_GROUP and
> > > > + * NET_SHAPER_SCOPE_QUEUE are available on both PFs and VFs devices.
> > > > + */
> > > > +enum net_shaper_scope {
> > > > +	NET_SHAPER_SCOPE_PORT,
> > > > +	NET_SHAPER_SCOPE_NETDEV,
> > > > +	NET_SHAPER_SCOPE_VF,  
> > > 
> > > I realized now that we do indeed need this VF node (if we want to
> > > internally express the legacy SRIOV NDOs in this API), as much 
> > > as I hate it. Could you annotate somehow my nack on ever exposing
> > > the ability to hook on the VF to user space?  
> > 
> > This work sparked from the need to allow configuring a shaper on
> > specific queues of some VF from the host. I hope this is not what you
> > are nacking here? Could you please elaborate a bit what concern you
> > with 'hook on the VF to user space'? Would the ability of attaching a
> > shaper to the VF from the host hit your nack?
> 
> Queue configuration, for the VF, from the hypervisor?
> I thought it was from the VF.
> In any case, hypervisor has the representors. 
> Use the representor's NETDEV scope?

It looks like even the NETDEV scope + representor alternative should
fit the intended use-case.

> > > > +	NET_SHAPER_SCOPE_QUEUE_GROUP,  
> > > 
> > > We may need a definition for a queue group. Did I suggest this?  
> > 
> > I think this was mentioned separately by you, John Fastabend and Intel.
> 
> Oh ugh, I can't type. I think I meant to say "Why do we need..."
> 
> > > Isn't queue group just a bunch of queues feeding a trivial RR node?
> > > Why does it need to be a "scope"?  
> > 
> > The goal is allowing arbitrary manipulation at the queue group level.
> > e.g. you can have different queue groups with different priority, or
> > weigh or shaping, and below them arbitrary shaping at the queue level.
> > 
> > Note that a similar concept could be introduced for device (or VFs)
> > groups.
> > 
> > Why would you like to constraint the features avail at the queue
> > groups?
> 
> Wait! You don't have a way to create pure RR nodes other than queue
> group now? Perhaps that's what I'm missing...

No, it's not needed. To have RR on some queues, just set the same
weight and priority on them.

Queue groups could be used to do something more complex, e.g. shaping
on the specified set of RR queues.

> > > > +	NET_SHAPER_SCOPE_QUEUE,
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct net_shaper_ops - Operations on device H/W shapers
> > > > + * @add: Creates a new shaper in the specified scope.  
> > > 
> > > "in a scope"? Isn't the scope just defining the ingress and egress
> > > points of the scheduling hierarchy?  
> > 
> > This is purely lexical matter, right? The scope, and more specifically
> > the full 'handle' comprising more scoped-related information, specifies
> > 'where' the shaper is located. Do you have a suggested alternative
> > wording?
> 
> ... and also what confused me here.
> 
> How are you going to do 2 layers of grouping with arbitrary shaping?
> We need arbitrary inner nodes. Unless I'm missing a trick.

I guess this part really needs some talk. I also don't understand your
doubt above.

> > 
I hope we can discuss this tomorrow (and the other points, if/as
needed).

Thanks!

Paolo