Message-ID:
<DS1PEPF00012A5F513F916690B9F94D3262CA6F2@DS1PEPF00012A5F.namprd21.prod.outlook.com>
Date: Fri, 12 Jan 2024 18:30:44 +0000
From: Haiyang Zhang <haiyangz@...rosoft.com>
To: Michael Kelley <mhklinux@...look.com>, Souradeep Chakrabarti
<schakrabarti@...ux.microsoft.com>
CC: Yury Norov <yury.norov@...il.com>, KY Srinivasan <kys@...rosoft.com>,
"wei.liu@...nel.org" <wei.liu@...nel.org>, Dexuan Cui <decui@...rosoft.com>,
"davem@...emloft.net" <davem@...emloft.net>, "edumazet@...gle.com"
<edumazet@...gle.com>, "kuba@...nel.org" <kuba@...nel.org>,
"pabeni@...hat.com" <pabeni@...hat.com>, Long Li <longli@...rosoft.com>,
"leon@...nel.org" <leon@...nel.org>, "cai.huoqing@...ux.dev"
<cai.huoqing@...ux.dev>, "ssengar@...ux.microsoft.com"
<ssengar@...ux.microsoft.com>, "vkuznets@...hat.com" <vkuznets@...hat.com>,
"tglx@...utronix.de" <tglx@...utronix.de>, "linux-hyperv@...r.kernel.org"
<linux-hyperv@...r.kernel.org>, "netdev@...r.kernel.org"
<netdev@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "linux-rdma@...r.kernel.org"
<linux-rdma@...r.kernel.org>, Souradeep Chakrabarti
<schakrabarti@...rosoft.com>, Paul Rosswurm <paulros@...rosoft.com>
Subject: RE: [PATCH 3/4 net-next] net: mana: add a function to spread IRQs per
CPUs
> -----Original Message-----
> From: Michael Kelley <mhklinux@...look.com>
> Sent: Friday, January 12, 2024 11:37 AM
> To: Souradeep Chakrabarti <schakrabarti@...ux.microsoft.com>
> Cc: Yury Norov <yury.norov@...il.com>; KY Srinivasan <kys@...rosoft.com>;
> Haiyang Zhang <haiyangz@...rosoft.com>; wei.liu@...nel.org; Dexuan Cui
> <decui@...rosoft.com>; davem@...emloft.net; edumazet@...gle.com;
> kuba@...nel.org; pabeni@...hat.com; Long Li <longli@...rosoft.com>;
> leon@...nel.org; cai.huoqing@...ux.dev; ssengar@...ux.microsoft.com;
> vkuznets@...hat.com; tglx@...utronix.de; linux-hyperv@...r.kernel.org;
> netdev@...r.kernel.org; linux-kernel@...r.kernel.org; linux-
> rdma@...r.kernel.org; Souradeep Chakrabarti <schakrabarti@...rosoft.com>;
> Paul Rosswurm <paulros@...rosoft.com>
> Subject: RE: [PATCH 3/4 net-next] net: mana: add a function to spread
> IRQs per CPUs
>
> From: Souradeep Chakrabarti <schakrabarti@...ux.microsoft.com> Sent:
> Wednesday, January 10, 2024 10:13 PM
> >
> > The test topology used to compare the performance of
> > cpu_local_spread() with the new approach is:
> > Case 1
> > IRQ  Nodes  Cores  CPUs
> > 0    1      0      0-1
> > 1    1      1      2-3
> > 2    1      2      4-5
> > 3    1      3      6-7
> >
> > and with existing cpu_local_spread()
> > Case 2
> > IRQ  Nodes  Cores  CPUs
> > 0    1      0      0
> > 1    1      0      1
> > 2    1      1      2
> > 3    1      1      3
> >
> > A total of 4 channels were used, set up with ethtool.
> > With ntttcp, Case 1 gave 15 percent better performance than
> > Case 2. irqbalance was disabled during the test.
> >
> > Also, you are right: on a 64-CPU system this approach will spread
> > the IRQs the same way cpu_local_spread() does. But in the future we
> > will offer MANA nodes with more than 64 CPUs. There, this new design
> > will give better performance.
> >
> > I will add these performance details to the commit message of the
> > next version.
>
> Here are my concerns:
>
> 1. The most commonly used VMs these days have 64 or fewer
> vCPUs and won't see any performance benefit.
>
> 2. Larger VMs probably won't see the full 15% benefit because
> all vCPUs in the local NUMA node will be assigned IRQs. For
> example, in a VM with 96 vCPUs and 2 NUMA nodes, all 48
> vCPUs in NUMA node 0 will be assigned IRQs. The remaining
> 16 IRQs will be spread out on the 48 CPUs in NUMA node 1
> in a way that avoids sharing a core. But overall this means
> that 75% of the IRQs will still be sharing a core and
> presumably not see any perf benefit.
>
> 3. Your experiment was on a relatively small scale: 4 IRQs
> spread across 2 cores vs. across 4 cores. Have you run any
> experiments on VMs with 128 vCPUs (for example) where
> most of the IRQs are not sharing a core? I'm wondering if
> the results with 4 IRQs really scale up to 64 IRQs. A lot can
> be different in a VM with 64 cores and 2 NUMA nodes vs.
> 4 cores in a single node.
>
> 4. The new algorithm prefers assigning to all vCPUs in
> each NUMA hop over assigning to separate cores. Are there
> experiments showing that is the right tradeoff? What
> are the results if assigning to separate cores is preferred?
I remember that in a customer case, putting the IRQs on the same
NUMA node gave better perf. But I agree, this should be re-tested
on the MANA NIC.
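For reference, the arithmetic in concern 2 can be reproduced with a small
userspace sketch. This is not the patch's code; it only models the policy
as described in this thread: fill the nodes in hop order, and within a
node use one CPU per core before using sibling CPUs. The function names,
the 2-threads-per-core layout, and the total of 64 IRQs (inferred from
48 + 16) are assumptions made for illustration.

```python
def assign_irqs(num_irqs, cpus_per_node, threads_per_core=2):
    """Return one (node, core, cpu) tuple per IRQ.

    Hypothetical model, not the MANA driver's code. Nodes are visited in
    list order (index 0 is the local node). Within a node, one CPU per
    core is used first; sibling CPUs are used only after every core in
    that node already has an IRQ.
    """
    assignment = []
    base = 0  # first global CPU number of the current node
    for node, ncpus in enumerate(cpus_per_node):
        cores = ncpus // threads_per_core
        for sibling in range(threads_per_core):
            for core in range(cores):
                if len(assignment) == num_irqs:
                    return assignment
                cpu = base + core * threads_per_core + sibling
                assignment.append((node, core, cpu))
        base += ncpus
    return assignment


def shared_core_fraction(assignment):
    """Fraction of IRQs whose core hosts at least one other IRQ."""
    per_core = {}
    for node, core, _cpu in assignment:
        per_core[(node, core)] = per_core.get((node, core), 0) + 1
    shared = sum(count for count in per_core.values() if count > 1)
    return shared / len(assignment)


# 96 vCPUs in 2 NUMA nodes, 64 IRQs: 48 IRQs share cores on node 0,
# 16 IRQs get a core each on node 1.
print(shared_core_fraction(assign_irqs(64, [48, 48])))  # 0.75
```

With 4 IRQs on a single 8-CPU node the same model puts one IRQ on each
of the 4 cores, which matches Case 1 above.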
- Haiyang