Message-ID: <20250708133132.GL452973@horms.kernel.org>
Date: Tue, 8 Jul 2025 14:31:32 +0100
From: Simon Horman <horms@...nel.org>
To: Jeroen de Borst <jeroendb@...gle.com>
Cc: netdev@...r.kernel.org, hramamurthy@...gle.com, davem@...emloft.net,
edumazet@...gle.com, kuba@...nel.org, willemb@...gle.com,
pabeni@...hat.com, Bailey Forrest <bcf@...gle.com>,
Joshua Washington <joshwash@...gle.com>
Subject: Re: [PATCH net-next v2] gve: make IRQ handlers and page allocation
NUMA aware
On Mon, Jul 07, 2025 at 02:01:07PM -0700, Jeroen de Borst wrote:
> From: Bailey Forrest <bcf@...gle.com>
>
> All memory in GVE is currently allocated without regard for the NUMA
> node of the device. Because access to NUMA-local memory is
> significantly cheaper than access to a remote node, this change attempts
> to ensure that page frags used in the RX path, including page pool
> frags, are allocated on the NUMA node local to the gVNIC device. Note
> that this attempt is best-effort. If necessary, the driver will still
> allocate non-local memory, as __GFP_THISNODE is not passed. Descriptor
> ring allocations are not updated, as dma_alloc_coherent handles that.
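
(Aside, for illustration only and not the gve code itself: a minimal sketch
of best-effort node-local RX buffer allocation, assuming dev_to_node()
reports the device's node and using the page_pool .nid parameter. Since
__GFP_THISNODE is not passed, the allocator may still fall back to a
remote node.)

	#include <linux/device.h>
	#include <linux/gfp.h>
	#include <net/page_pool/helpers.h>

	/* Hypothetical helper: allocate a page frag's backing page as close
	 * to the device as possible. dev_to_node() may return NUMA_NO_NODE,
	 * in which case the allocation simply follows the default policy.
	 */
	static struct page *rx_alloc_page_near_dev(struct device *dev, gfp_t gfp)
	{
		return alloc_pages_node(dev_to_node(dev), gfp, 0);
	}

	/* Hypothetical page pool setup: .nid steers pool pages towards the
	 * device's node, again on a best-effort basis.
	 */
	static struct page_pool *rx_create_pool(struct device *dev)
	{
		struct page_pool_params pp = {
			.order		= 0,
			.pool_size	= 1024,
			.nid		= dev_to_node(dev),
			.dev		= dev,
		};

		return page_pool_create(&pp);
	}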
>
> This change also modifies the IRQ affinity setting to only select CPUs
> from the node local to the device, preserving the behavior that TX and
> RX queues of the same index share CPU affinity.
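
(Again just a sketch rather than the driver code: the affinity scheme
described above in standalone form, assuming cpumask_of_node() is given a
valid node and that the first 'ntx' vectors are TX and the rest RX. The
actual gve change is in the diff below.)

	#include <linux/cpumask.h>
	#include <linux/interrupt.h>
	#include <linux/topology.h>

	/* Hypothetical: spread nvecs IRQs over the CPUs of one NUMA node,
	 * restarting from the first CPU when the node is exhausted or when
	 * the RX vectors begin, so TX and RX queue i share a CPU.
	 */
	static void set_node_local_affinity(int node, unsigned int *irqs,
					    int nvecs, int ntx)
	{
		const struct cpumask *node_mask = cpumask_of_node(node);
		int cpu = cpumask_first(node_mask);
		int i;

		for (i = 0; i < nvecs; i++) {
			irq_set_affinity_and_hint(irqs[i], cpumask_of(cpu));
			cpu = cpumask_next(cpu, node_mask);
			if (cpu >= nr_cpu_ids || i + 1 == ntx)
				cpu = cpumask_first(node_mask);
		}
	}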
>
> Signed-off-by: Bailey Forrest <bcf@...gle.com>
> Signed-off-by: Joshua Washington <joshwash@...gle.com>
> Reviewed-by: Willem de Bruijn <willemb@...gle.com>
> Signed-off-by: Harshitha Ramamurthy <hramamurthy@...gle.com>
> Signed-off-by: Jeroen de Borst <jeroendb@...gle.com>
> ---
> v1: https://lore.kernel.org/netdev/20250627183141.3781516-1-hramamurthy@google.com/
> v2:
> - Utilize kvcalloc_node instead of kvzalloc_node for array-type
> allocations.
Thanks for the update.
I note that this addresses Jakub's review of v1.

I have a minor suggestion below, but I don't think it warrants
blocking progress of this patch.

Reviewed-by: Simon Horman <horms@...nel.org>
...
> diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
...
> @@ -533,6 +540,8 @@ static int gve_alloc_notify_blocks(struct gve_priv *priv)
>  	}
>
>  	/* Setup the other blocks - the first n-1 vectors */
> +	node_mask = gve_get_node_mask(priv);
> +	cur_cpu = cpumask_first(node_mask);
>  	for (i = 0; i < priv->num_ntfy_blks; i++) {
>  		struct gve_notify_block *block = &priv->ntfy_blocks[i];
>  		int msix_idx = i;
> @@ -549,9 +558,17 @@ static int gve_alloc_notify_blocks(struct gve_priv *priv)
>  			goto abort_with_some_ntfy_blocks;
>  		}
>  		block->irq = priv->msix_vectors[msix_idx].vector;
> -		irq_set_affinity_hint(priv->msix_vectors[msix_idx].vector,
> -				      get_cpu_mask(i % active_cpus));
> +		irq_set_affinity_and_hint(block->irq,
> +					  cpumask_of(cur_cpu));
>  		block->irq_db_index = &priv->irq_db_indices[i].index;
> +
> +		cur_cpu = cpumask_next(cur_cpu, node_mask);
> +		/* Wrap once CPUs in the node have been exhausted, or when
> +		 * starting RX queue affinities. TX and RX queues of the same
> +		 * index share affinity.
> +		 */
> +		if (cur_cpu >= nr_cpu_ids || (i + 1) == priv->tx_cfg.max_queues)
> +			cur_cpu = cpumask_first(node_mask);
FWIW, maybe this can be written more succinctly as follows.
(Completely untested!)

		/* TX and RX queues of the same index share affinity. */
		if (i + 1 == priv->tx_cfg.max_queues)
			cur_cpu = cpumask_first(node_mask);
		else
			cur_cpu = cpumask_next_wrap(cur_cpu, node_mask);
...