Message-ID: <CAHS8izMLPkw1y93iRwoT5yuscSHZGuwhg1tfkF7SSkKAbgQKsg@mail.gmail.com>
Date: Tue, 19 Aug 2025 18:31:08 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: Pavel Begunkov <asml.silence@...il.com>
Cc: Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
Eric Dumazet <edumazet@...gle.com>, Willem de Bruijn <willemb@...gle.com>,
Paolo Abeni <pabeni@...hat.com>, andrew+netdev@...n.ch, horms@...nel.org,
davem@...emloft.net, sdf@...ichev.me, dw@...idwei.uk,
michael.chan@...adcom.com, dtatulea@...dia.com, ap420073@...il.com,
linux-kernel@...r.kernel.org, io-uring@...r.kernel.org
Subject: Re: [PATCH net-next v3 14/23] net: add queue config validation callback
On Tue, Aug 19, 2025 at 2:54 PM Mina Almasry <almasrymina@...gle.com> wrote:
>
> On Mon, Aug 18, 2025 at 6:56 AM Pavel Begunkov <asml.silence@...il.com> wrote:
> >
> > From: Jakub Kicinski <kuba@...nel.org>
> >
> > I imagine (tm) that as the number of per-queue configuration
> > options grows some of them may conflict for certain drivers.
> > While the drivers can obviously do all the validation locally
> > doing so is fairly inconvenient as the config is fed to drivers
> > piecemeal via different ops (for different params and NIC-wide
> > vs per-queue).
> >
> > Add a centralized callback for validating the queue config
> > in queue ops. The callback gets invoked before each queue restart
> > and when ring params are modified.
> >
> > For NIC-wide changes the callback gets invoked for each active
> > (or active to-be) queue, and additionally with a negative queue
> > index for NIC-wide defaults. The NIC-wide check is needed in
> > case all queues have an override active when NIC-wide setting
> > is changed to an unsupported one. Alternatively we could check
> > the settings when new queues are enabled (in the channel API),
> > but accepting invalid config is a bad idea. Users may expect
> > that resetting a queue override will always work.
> >
> > The "trick" of passing a negative index is a bit ugly, we may
> > want to revisit if it causes confusion and bugs. Existing drivers
> > don't care about the index so it "just works".
> >
> > Signed-off-by: Jakub Kicinski <kuba@...nel.org>
> > Signed-off-by: Pavel Begunkov <asml.silence@...il.com>
> > ---
> > include/net/netdev_queues.h | 12 ++++++++++++
> > net/core/dev.h | 2 ++
> > net/core/netdev_config.c | 20 ++++++++++++++++++++
> > net/core/netdev_rx_queue.c | 6 ++++++
> > net/ethtool/rings.c | 5 +++++
> > 5 files changed, 45 insertions(+)
> >
> > diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h
> > index b850cff71d12..d0cc475ec51e 100644
> > --- a/include/net/netdev_queues.h
> > +++ b/include/net/netdev_queues.h
> > @@ -147,6 +147,14 @@ void netdev_stat_queue_sum(struct net_device *netdev,
> > * defaults. Queue config structs are passed to this
> > * helper before the user-requested settings are applied.
> > *
> > + * @ndo_queue_cfg_validate: (Optional) Check if queue config is supported.
> > + * Called when configuration affecting a queue may be
> > + * changing, either due to NIC-wide config, or config
> > + * scoped to the queue at a specified index.
> > + * When NIC-wide config is changed the callback will
> > + * be invoked for all queues, and in addition to that
> > + * with a negative queue index for the base settings.
> > + *
> > * @ndo_queue_mem_alloc: Allocate memory for an RX queue at the specified index.
> > * The new memory is written at the specified address.
> > *
> > @@ -167,6 +175,10 @@ struct netdev_queue_mgmt_ops {
> > void (*ndo_queue_cfg_defaults)(struct net_device *dev,
> > int idx,
> > struct netdev_queue_config *qcfg);
> > + int (*ndo_queue_cfg_validate)(struct net_device *dev,
> > + int idx,
> > + struct netdev_queue_config *qcfg,
> > + struct netlink_ext_ack *extack);
> > int (*ndo_queue_mem_alloc)(struct net_device *dev,
> > struct netdev_queue_config *qcfg,
> > void *per_queue_mem,
> > diff --git a/net/core/dev.h b/net/core/dev.h
> > index a553a0f1f846..523d50e6f88d 100644
> > --- a/net/core/dev.h
> > +++ b/net/core/dev.h
> > @@ -99,6 +99,8 @@ void netdev_free_config(struct net_device *dev);
> > int netdev_reconfig_start(struct net_device *dev);
> > void __netdev_queue_config(struct net_device *dev, int rxq,
> > struct netdev_queue_config *qcfg, bool pending);
> > +int netdev_queue_config_revalidate(struct net_device *dev,
> > + struct netlink_ext_ack *extack);
> >
> > /* netdev management, shared between various uAPI entry points */
> > struct netdev_name_node {
> > diff --git a/net/core/netdev_config.c b/net/core/netdev_config.c
> > index bad2d53522f0..fc700b77e4eb 100644
> > --- a/net/core/netdev_config.c
> > +++ b/net/core/netdev_config.c
> > @@ -99,3 +99,23 @@ void netdev_queue_config(struct net_device *dev, int rxq,
> > __netdev_queue_config(dev, rxq, qcfg, true);
> > }
> > EXPORT_SYMBOL(netdev_queue_config);
> > +
> > +int netdev_queue_config_revalidate(struct net_device *dev,
> > + struct netlink_ext_ack *extack)
> > +{
> > + const struct netdev_queue_mgmt_ops *qops = dev->queue_mgmt_ops;
> > + struct netdev_queue_config qcfg;
> > + int i, err;
> > +
> > + if (!qops || !qops->ndo_queue_cfg_validate)
> > + return 0;
> > +
> > + for (i = -1; i < (int)dev->real_num_rx_queues; i++) {
> > + netdev_queue_config(dev, i, &qcfg);
>
> This function as written feels very useless tbh. There is no config
> passed in from the caller, so the function does a netdev_queue_config,
> which grabs the current-or-default-config (I'm not sure which tbh),
> and then validates that is applicable. But of course the current or
> default configs can be applied, right?
>
> I thought there would be a refactor in a future patch that makes this
> function useful, but I don't see one.
>
> The qcfg being applied needs to be passed in by the caller of this
> function, no? That would make sense to me (the caller is wondering if
> this new config is applicable).
>
OK, I misunderstood how this works on first read. netdev_queue_config
returns the pending config, not the current one, and that is what's
being validated. I'll give this a closer look.
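
To make that reading concrete, here is a standalone userspace sketch, not
kernel code. Only the loop from -1 to the last RX queue mirrors the patch.
struct toy_dev, toy_pending_config(), toy_validate() and the 16384 limit
are hypothetical names and values made up for illustration, and the real
netdev_queue_config() may resolve overrides differently.

```c
#include <errno.h>

#define NUM_RXQ     2
#define MAX_BUF_LEN 16384	/* hypothetical driver limit */

/* Hypothetical stand-in for struct netdev_queue_config. */
struct queue_cfg {
	int rx_buf_len;
};

/* Hypothetical device model holding only the *pending* settings. */
struct toy_dev {
	struct queue_cfg pending_base;		/* NIC-wide defaults */
	struct queue_cfg pending_q[NUM_RXQ];	/* per-queue overrides */
	int has_override[NUM_RXQ];
};

/* A negative index selects the NIC-wide defaults; otherwise the
 * per-queue override wins if one is set (assumed resolution order). */
static void toy_pending_config(const struct toy_dev *dev, int idx,
			       struct queue_cfg *out)
{
	if (idx >= 0 && dev->has_override[idx])
		*out = dev->pending_q[idx];
	else
		*out = dev->pending_base;
}

/* Stand-in for a driver's validate callback; ignores the index, like
 * the existing drivers mentioned in the commit message. */
static int toy_validate(int idx, const struct queue_cfg *cfg)
{
	(void)idx;
	return cfg->rx_buf_len > MAX_BUF_LEN ? -EINVAL : 0;
}

/* Same shape as the loop in the patch: index -1 first, then each queue. */
static int toy_revalidate(const struct toy_dev *dev)
{
	struct queue_cfg cfg;
	int i, err;

	for (i = -1; i < NUM_RXQ; i++) {
		toy_pending_config(dev, i, &cfg);
		err = toy_validate(i, &cfg);
		if (err)
			return err;
	}
	return 0;
}
```

With every queue overridden to a supported value and pending_base set to
an unsupported one, only the idx == -1 pass fails in this model. That is
the case the commit message describes for NIC-wide changes.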
--
Thanks,
Mina