Message-ID:
<MW1PEPF0000E6910254736DF77F99E04DD4CA6A2@MW1PEPF0000E691.namprd21.prod.outlook.com>
Date: Tue, 9 Jan 2024 20:20:31 +0000
From: Haiyang Zhang <haiyangz@...rosoft.com>
To: Michael Kelley <mhklinux@...look.com>, Souradeep Chakrabarti
<schakrabarti@...ux.microsoft.com>, KY Srinivasan <kys@...rosoft.com>,
"wei.liu@...nel.org" <wei.liu@...nel.org>, Dexuan Cui <decui@...rosoft.com>,
"davem@...emloft.net" <davem@...emloft.net>, "edumazet@...gle.com"
<edumazet@...gle.com>, "kuba@...nel.org" <kuba@...nel.org>,
"pabeni@...hat.com" <pabeni@...hat.com>, Long Li <longli@...rosoft.com>,
"yury.norov@...il.com" <yury.norov@...il.com>, "leon@...nel.org"
<leon@...nel.org>, "cai.huoqing@...ux.dev" <cai.huoqing@...ux.dev>,
"ssengar@...ux.microsoft.com" <ssengar@...ux.microsoft.com>,
"vkuznets@...hat.com" <vkuznets@...hat.com>, "tglx@...utronix.de"
<tglx@...utronix.de>, "linux-hyperv@...r.kernel.org"
<linux-hyperv@...r.kernel.org>, "netdev@...r.kernel.org"
<netdev@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "linux-rdma@...r.kernel.org"
<linux-rdma@...r.kernel.org>
CC: Souradeep Chakrabarti <schakrabarti@...rosoft.com>, Paul Rosswurm
<paulros@...rosoft.com>
Subject: RE: [PATCH 3/4 net-next] net: mana: add a function to spread IRQs per
CPUs
> -----Original Message-----
> From: Michael Kelley <mhklinux@...look.com>
> Sent: Tuesday, January 9, 2024 2:23 PM
> To: Souradeep Chakrabarti <schakrabarti@...ux.microsoft.com>; KY Srinivasan
> <kys@...rosoft.com>; Haiyang Zhang <haiyangz@...rosoft.com>;
> wei.liu@...nel.org; Dexuan Cui <decui@...rosoft.com>;
> davem@...emloft.net; edumazet@...gle.com; kuba@...nel.org;
> pabeni@...hat.com; Long Li <longli@...rosoft.com>; yury.norov@...il.com;
> leon@...nel.org; cai.huoqing@...ux.dev; ssengar@...ux.microsoft.com;
> vkuznets@...hat.com; tglx@...utronix.de; linux-hyperv@...r.kernel.org;
> netdev@...r.kernel.org; linux-kernel@...r.kernel.org; linux-
> rdma@...r.kernel.org
> Cc: Souradeep Chakrabarti <schakrabarti@...rosoft.com>; Paul Rosswurm
> <paulros@...rosoft.com>
> Subject: RE: [PATCH 3/4 net-next] net: mana: add a function to spread IRQs per
> CPUs
>
> From: Souradeep Chakrabarti <schakrabarti@...ux.microsoft.com> Sent:
> Tuesday, January 9, 2024 2:51 AM
> >
> > From: Yury Norov <yury.norov@...il.com>
> >
> > Souradeep found that the driver performs faster if IRQs are
> > spread across CPUs with the following heuristics:
> >
> > 1. No more than one IRQ per CPU, if possible;
> > 2. NUMA locality is the second priority;
> > 3. Sibling dislocality is the last priority.
> >
> > Let's consider this topology:
> >
> > Node            0               1
> > Core        0       1       2       3
> > CPU       0   1   2   3   4   5   6   7
> >
> > The most performant IRQ distribution based on the above topology
> > and heuristics may look like this:
> >
> > IRQ     Nodes   Cores   CPUs
> > 0       1       0       0-1
> > 1       1       1       2-3
> > 2       1       0       0-1
> > 3       1       1       2-3
> > 4       2       2       4-5
> > 5       2       3       6-7
> > 6       2       2       4-5
> > 7       2       3       6-7
>
> I didn't pay attention to the detailed discussion of this issue
> over the past 2 to 3 weeks during the holidays in the U.S., but
> the above doesn't align with the original problem as I understood
> it. I thought the original problem was to avoid putting IRQs on
> both hyper-threads in the same core, and that the perf
> improvements are based on that configuration. At least that's
> what the commit message for Patch 4/4 in this series says.
>
> The above chart results in 8 IRQs being assigned to the 8 CPUs,
> probably with 1 IRQ per CPU. At least on x86, if the affinity
> mask for an IRQ contains multiple CPUs, matrix_find_best_cpu()
> should balance the IRQ assignments between the CPUs in the mask.
> So the original problem is still present because both hyper-threads
> in a core are likely to have an IRQ assigned.
>
> Of course, this example has 8 IRQs and 8 CPUs, so assigning an
> IRQ to every hyper-thread may be the only choice. If that's the
> case, maybe this just isn't a good example to illustrate the
> original problem and solution. But even with a better example
> where the # of IRQs is <= half the # of CPUs in a NUMA node,
> I don't think the code below accomplishes the original intent.
>
> Maybe I've missed something along the way in getting to this
> version of the patch. Please feel free to set me straight. :-)
>
> Michael
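
To make Michael's point concrete, here is a toy model in Python. It is
only a sketch: place_irqs() is a hypothetical helper, and the "least
loaded CPU in the mask, lowest CPU number on ties" rule is my assumption
standing in for the balancing he attributes to matrix_find_best_cpu().
It is not the kernel's actual logic.

```python
def place_irqs(irq_masks):
    """Assign each IRQ to the least-loaded CPU in its affinity mask.

    Ties go to the lowest CPU number. This is an assumed stand-in for
    the kernel's vector balancing, not a copy of it.
    """
    load = {}        # CPU -> number of IRQs placed on it so far
    placement = {}   # IRQ -> CPU chosen for it
    for irq, mask in irq_masks.items():
        cpu = min(mask, key=lambda c: (load.get(c, 0), c))
        load[cpu] = load.get(cpu, 0) + 1
        placement[irq] = cpu
    return placement


# Affinity masks from the quoted 8-IRQ table: each mask is the
# pair of hyper-threads in one core.
masks = {0: [0, 1], 1: [2, 3], 2: [0, 1], 3: [2, 3],
         4: [4, 5], 5: [6, 7], 6: [4, 5], 7: [6, 7]}
placement = place_irqs(masks)
```

Under that assumption every one of the 8 CPUs ends up with exactly one
IRQ, so both hyper-threads of each core are in use.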

I have the same question as Michael. I'm also asking Souradeep
in another channel: the algorithm still uses up all CPUs in the
current NUMA node before moving on to the next NUMA node, right?
The only difference is that each IRQ is now affinitized to 2 CPUs.

For example, on a system with 2 IRQs:

IRQ     Nodes   Cores   CPUs
0       1       0       0-1
1       1       1       2-3

Does this perform better than the algorithm in the earlier patches,
which would give the following?

IRQ     Nodes   Cores   CPUs
0       1       0       0
1       1       1       2
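
The two layouts I am comparing can be generated with a short Python
sketch. spread_irqs() and TOPOLOGY are hypothetical names for
illustration only, and the single-CPU ordering (first sibling of every
core in the node, then the second siblings) is my assumption about the
earlier patches, not something taken from the driver code.

```python
# Toy topology from the quoted example: a list of NUMA nodes, each a
# list of cores, each core a list of sibling CPUs.
TOPOLOGY = [
    [[0, 1], [2, 3]],   # first NUMA node
    [[4, 5], [6, 7]],   # second NUMA node
]


def spread_irqs(topology, nr_irqs, whole_core):
    """Return one affinity list per IRQ, filling a node before the next.

    whole_core=True:  each IRQ gets all siblings of a core, e.g. [0, 1].
    whole_core=False: each IRQ gets a single CPU, e.g. [0].
    The sketch stops once every CPU has been handed out once.
    """
    masks = []
    for node in topology:
        cpus_in_node = sum(len(core) for core in node)
        for i in range(cpus_in_node):
            if len(masks) == nr_irqs:
                return masks
            core = node[i % len(node)]          # cycle over the node's cores
            if whole_core:
                masks.append(list(core))
            else:
                # Use a new sibling only after every core was used once.
                sibling = (i // len(node)) % len(core)
                masks.append([core[sibling]])
    return masks


per_core = spread_irqs(TOPOLOGY, 2, whole_core=True)
per_cpu = spread_irqs(TOPOLOGY, 2, whole_core=False)
```

With 2 IRQs, per_core holds the CPU pairs 0-1 and 2-3, and per_cpu holds
CPUs 0 and 2, matching the two tables above. With 8 IRQs and
whole_core=True the sketch reproduces the table in the commit message.
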
Thanks,
- Haiyang