Message-ID:
<SN6PR02MB41577E2FAA79E2803C3384B0D491A@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Wed, 14 May 2025 04:53:34 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Shradha Gupta <shradhagupta@...ux.microsoft.com>, Dexuan Cui
<decui@...rosoft.com>, Wei Liu <wei.liu@...nel.org>, Haiyang Zhang
<haiyangz@...rosoft.com>, "K. Y. Srinivasan" <kys@...rosoft.com>, Andrew Lunn
<andrew+netdev@...n.ch>, "David S. Miller" <davem@...emloft.net>, Eric
Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni
<pabeni@...hat.com>, Konstantin Taranov <kotaranov@...rosoft.com>, Simon
Horman <horms@...nel.org>, Leon Romanovsky <leon@...nel.org>, Maxim Levitsky
<mlevitsk@...hat.com>, Erni Sri Satya Vennela <ernis@...ux.microsoft.com>,
Peter Zijlstra <peterz@...radead.org>
CC: "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, Nipun Gupta
<nipun.gupta@....com>, Yury Norov <yury.norov@...il.com>, Jason Gunthorpe
<jgg@...pe.ca>, Jonathan Cameron <Jonathan.Cameron@...ei.com>, Anna-Maria
Behnsen <anna-maria@...utronix.de>, Kevin Tian <kevin.tian@...el.com>, Long
Li <longli@...rosoft.com>, Thomas Gleixner <tglx@...utronix.de>, Bjorn
Helgaas <bhelgaas@...gle.com>, Rob Herring <robh@...nel.org>, Manivannan
Sadhasivam <manivannan.sadhasivam@...aro.org>,
Krzysztof Wilczyński <kw@...ux.com>, Lorenzo
Pieralisi <lpieralisi@...nel.org>, "netdev@...r.kernel.org"
<netdev@...r.kernel.org>, "linux-rdma@...r.kernel.org"
<linux-rdma@...r.kernel.org>, Paul Rosswurm <paulros@...rosoft.com>, Shradha
Gupta <shradhagupta@...rosoft.com>
Subject: RE: [PATCH v3 3/4] net: mana: Allow irq_setup() to skip cpus for
affinity
From: Shradha Gupta <shradhagupta@...ux.microsoft.com> Sent: Friday, May 9, 2025 3:14 AM
>
> In order to prepare the MANA driver to allocate the MSI-X IRQs
> dynamically, we need to prepare the irq_setup() to allow skipping
s/prepare the irq_setup()/enhance irq_setup()/
> affinitizing IRQs to first CPU sibling group.
s/to first/to the first/
>
> This would be for cases when number of IRQs is less than or equal
s/when number/when the number/
> to number of online CPUs. In such cases for dynamically added IRQs
s/to number/to the number/
> the first CPU sibling group would already be affinitized with HWC IRQ
Add a period at the end of the sentence.
>
> Signed-off-by: Shradha Gupta <shradhagupta@...ux.microsoft.com>
> Reviewed-by: Haiyang Zhang <haiyangz@...rosoft.com>
> ---
> drivers/net/ethernet/microsoft/mana/gdma_main.c | 16 ++++++++++++++--
> 1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> index 4ffaf7588885..2de42ce43373 100644
> --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> @@ -1288,7 +1288,8 @@ void mana_gd_free_res_map(struct gdma_resource *r)
> r->size = 0;
> }
>
> -static int irq_setup(unsigned int *irqs, unsigned int len, int node)
> +static int irq_setup(unsigned int *irqs, unsigned int len, int node,
> + bool skip_first_cpu)
> {
> const struct cpumask *next, *prev = cpu_none_mask;
> cpumask_var_t cpus __free(free_cpumask_var);
> @@ -1303,9 +1304,20 @@ static int irq_setup(unsigned int *irqs, unsigned int len, int node)
> while (weight > 0) {
> cpumask_andnot(cpus, next, prev);
> for_each_cpu(cpu, cpus) {
> + /*
> + * if the CPU sibling set is to be skipped we
> + * just move on to the next CPUs without len--
> + */
> + if (unlikely(skip_first_cpu)) {
> + skip_first_cpu = false;
> + goto next_cpumask;
> + }
> +
> if (len-- == 0)
> goto done;
> +
> irq_set_affinity_and_hint(*irqs++, topology_sibling_cpumask(cpu));
> +next_cpumask:
> cpumask_andnot(cpus, cpus, topology_sibling_cpumask(cpu));
> --weight;
> }
With a little bit of reordering of the code, you could avoid the need for the "next_cpumask"
label and goto statement. "continue" is usually cleaner than a "goto". Here's what I'm thinking:
	for_each_cpu(cpu, cpus) {
		cpumask_andnot(cpus, cpus, topology_sibling_cpumask(cpu));
		--weight;

		if (unlikely(skip_first_cpu)) {
			skip_first_cpu = false;
			continue;
		}

		if (len-- == 0)
			goto done;

		irq_set_affinity_and_hint(*irqs++, topology_sibling_cpumask(cpu));
	}
I wish there were some comments in irq_setup() explaining the overall intention of
the algorithm. I can see how the goal is to first assign CPUs that are local to the current
NUMA node, and then expand outward to CPUs that are further away. And you want
to *not* assign both siblings in a hyper-threaded core. But I can't figure out what
"weight" is trying to accomplish. Maybe this was discussed when the code first
went in, but I can't remember now. :-(
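To make that concrete, here's the kind of commenting I have in mind, written against
the loop as it exists before this patch. Treat the comments as my best guess at the
intent -- in particular, I can't see where "weight" is initialized in this hunk, so
that part is speculation on my part:

	/*
	 * "cpus" starts as the CPUs contributed by the current hop
	 * (next minus prev), so nearer CPUs are consumed before we
	 * move outward to more distant ones.
	 */
	while (weight > 0) {
		cpumask_andnot(cpus, next, prev);
		for_each_cpu(cpu, cpus) {
			if (len-- == 0)
				goto done;
			/*
			 * Affinitize one IRQ to this core's full sibling mask,
			 * then drop all of its siblings from "cpus" so no other
			 * IRQ in this pass lands on the same physical core.
			 */
			irq_set_affinity_and_hint(*irqs++, topology_sibling_cpumask(cpu));
			cpumask_andnot(cpus, cpus, topology_sibling_cpumask(cpu));
			/* Unclear to me: does weight count remaining CPUs in this hop? */
			--weight;
		}
	}

If comments along those lines (with the "weight" question actually answered) could be
added, the new skip_first_cpu logic would be much easier to review.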
Michael
> @@ -1403,7 +1415,7 @@ static int mana_gd_setup_irqs(struct pci_dev *pdev)
> }
> }
>
> - err = irq_setup(irqs, (nvec - start_irq_index), gc->numa_node);
> + err = irq_setup(irqs, (nvec - start_irq_index), gc->numa_node, false);
> if (err)
> goto free_irq;
>
> --
> 2.34.1
>