[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250425162437.45765bc0@kernel.org>
Date: Fri, 25 Apr 2025 16:24:37 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Mina Almasry <almasrymina@...gle.com>
Cc: davem@...emloft.net, netdev@...r.kernel.org, edumazet@...gle.com,
pabeni@...hat.com, andrew+netdev@...n.ch, horms@...nel.org,
donald.hunter@...il.com, sdf@...ichev.me, dw@...idwei.uk,
asml.silence@...il.com, ap420073@...il.com, jdamato@...tly.com,
dtatulea@...dia.com, michael.chan@...adcom.com
Subject: Re: [RFC net-next 11/22] net: allocate per-queue config structs and
pass them thru the queue API
On Wed, 23 Apr 2025 14:17:30 -0700 Mina Almasry wrote:
> > @@ -129,6 +136,10 @@ struct netdev_stat_ops {
> > *
> > * @ndo_queue_mem_size: Size of the struct that describes a queue's memory.
> > *
> > + * @ndo_queue_cfg_defaults: (Optional) Populate queue config struct with
> > + * defaults. Queue config structs are passed to this
> > + * helper before the user-requested settings are applied.
> > + *
> > * @ndo_queue_mem_alloc: Allocate memory for an RX queue at the specified index.
> > * The new memory is written at the specified address.
> > *
> > @@ -146,12 +157,17 @@ struct netdev_stat_ops {
> > */
> > struct netdev_queue_mgmt_ops {
> > size_t ndo_queue_mem_size;
> > + void (*ndo_queue_cfg_defaults)(struct net_device *dev,
> > + int idx,
> > + struct netdev_queue_config *qcfg);
> > int (*ndo_queue_mem_alloc)(struct net_device *dev,
> > + struct netdev_queue_config *qcfg,
> > void *per_queue_mem,
> > int idx);
> > void (*ndo_queue_mem_free)(struct net_device *dev,
> > void *per_queue_mem);
> > int (*ndo_queue_start)(struct net_device *dev,
> > + struct netdev_queue_config *qcfg,
> > void *per_queue_mem,
> > int idx);
> > int (*ndo_queue_stop)(struct net_device *dev,
>
> Doesn't the stop call need to return the config of the queue that was
> stopped? Otherwise what do you pass to the start command if you want
> to restart the old queue?
As you say below, I expect driver to retain the config within qmem.
> In general I thought what when we extend the queue API to handle
> configs, the config of each queue would be part of the per_queue_mem
> attribute, which is now a void *. Because seems to me the config needs
> to be passed around with the per_queue_mem to all the functions?
> Maybe.
>
> I imagined you'd create a new wrapper struct that is the per queue
> mem, and that one will contain a void * that is driver specific
> memory, and a netdev_queue_config * inside of it as well, then the
> queue API will use the new struct instead of void * for all the
> per_queue_mem. At least that's what I was thinking.
That sounds nice from the API perspective, but I was worried about
perf impact. Some driver folks count each cycle and we may be mixing
cache cold and cache hot data in the config. At the very least not
all drivers will support all fields. So my gut feeling was that driver
authors will want to assign the fields to their own struct members,
anyway. Perhaps something we can revisit once we have more mileage?
> > + maxqs = max(dev->num_rx_queues, dev->num_tx_queues);
> > + cfg->qcfg = kcalloc(maxqs, sizeof(*cfg->qcfg), GFP_KERNEL_ACCOUNT);
>
> I just thought of this, but for devices that can new rx/tx queues on
> the fly, isn't num_rx_queues dynamically changing? How do you size
> this qcfg array in this case?
>
> Or do they go through a full driver reset everytime they add a queue
> which reallocates dev->num_rx_queues?
Yes, good question. Easiest way to deal with that I thought of was
to act like old memory school management. "normal" queues allocate
indexes 0 -> n, "dynamic" queues allocate n -> 0. Like stack vs heap
growth. And we need to patch up all this code that naively checks
queue ID vs num_real, we'll need a bitmap or a member in the structs.
>>> + __netdev_queue_config(dev, rxq_idx, &qcfg, false);
>
> Ah, on error, you reset the queue to its defaults, which I'm not sure
> is desirable. I think we want to restart the queue with whatever
> config it had before.
Not defaults to be clear. The last argument here (false) tells
the config code to use the _current_ config not the pending one.
So we should be reverting to what's currently installed.
Powered by blists - more mailing lists