Message-ID: <PUZP153MB07886CE88351F6B7A2AA0096CC97A@PUZP153MB0788.APCP153.PROD.OUTLOOK.COM>
Date: Tue, 19 Dec 2023 10:18:49 +0000
From: Souradeep Chakrabarti <schakrabarti@...rosoft.com>
To: Yury Norov <yury.norov@...il.com>,
	Souradeep Chakrabarti <schakrabarti@...ux.microsoft.com>,
	KY Srinivasan <kys@...rosoft.com>,
	Haiyang Zhang <haiyangz@...rosoft.com>,
	"wei.liu@...nel.org" <wei.liu@...nel.org>,
	Dexuan Cui <decui@...rosoft.com>,
	"davem@...emloft.net" <davem@...emloft.net>,
	"edumazet@...gle.com" <edumazet@...gle.com>,
	"kuba@...nel.org" <kuba@...nel.org>,
	"pabeni@...hat.com" <pabeni@...hat.com>,
	Long Li <longli@...rosoft.com>,
	"leon@...nel.org" <leon@...nel.org>,
	"cai.huoqing@...ux.dev" <cai.huoqing@...ux.dev>,
	"ssengar@...ux.microsoft.com" <ssengar@...ux.microsoft.com>,
	"vkuznets@...hat.com" <vkuznets@...hat.com>,
	"tglx@...utronix.de" <tglx@...utronix.de>,
	"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>
CC: Paul Rosswurm <paulros@...rosoft.com>
Subject: RE: [EXTERNAL] [PATCH 3/3] net: mana: add a function to spread IRQs per CPUs

>-----Original Message-----
>From: Yury Norov <yury.norov@...il.com>
>Sent: Monday, December 18, 2023 3:02 AM
>To: Souradeep Chakrabarti <schakrabarti@...ux.microsoft.com>; KY Srinivasan
><kys@...rosoft.com>; Haiyang Zhang <haiyangz@...rosoft.com>;
>wei.liu@...nel.org; Dexuan Cui <decui@...rosoft.com>; davem@...emloft.net;
>edumazet@...gle.com; kuba@...nel.org; pabeni@...hat.com; Long Li
><longli@...rosoft.com>; yury.norov@...il.com; leon@...nel.org;
>cai.huoqing@...ux.dev; ssengar@...ux.microsoft.com; vkuznets@...hat.com;
>tglx@...utronix.de; linux-hyperv@...r.kernel.org; netdev@...r.kernel.org;
>linux-kernel@...r.kernel.org; linux-rdma@...r.kernel.org
>Cc: Souradeep Chakrabarti <schakrabarti@...rosoft.com>; Paul Rosswurm
><paulros@...rosoft.com>
>Subject: [EXTERNAL] [PATCH 3/3] net: mana: add a function to spread IRQs per
>CPUs
>
>[Some people who received this message don't often get email from
>yury.norov@...il.com. Learn why this is important at
>https://aka.ms/LearnAboutSenderIdentification ]
>
>Souradeep investigated that the driver performs faster if IRQs are spread on CPUs
>with the following heuristics:
>
>1. No more than one IRQ per CPU, if possible;
>2. NUMA locality is the second priority;
>3. Sibling dislocality is the last priority.
>
>Let's consider this topology:
>
>Node            0               1
>Core        0       1       2       3
>CPU       0   1   2   3   4   5   6   7
>
>The most performant IRQ distribution based on the above topology and heuristics
>may look like this:
>
>IRQ     Nodes   Cores   CPUs
>0       1       0       0-1
>1       1       1       2-3
>2       1       0       0-1
>3       1       1       2-3
>4       2       2       4-5
>5       2       3       6-7
>6       2       2       4-5
>7       2       3       6-7
>
>The irq_setup() routine introduced in this patch leverages the
>for_each_numa_hop_mask() iterator and assigns IRQs to sibling groups as
>described above.
>
>According to [1], for NUMA-aware but sibling-ignorant IRQ distribution based on
>cpumask_local_spread() performance test results look like this:
>
>./ntttcp -r -m 16
>NTTTCP for Linux 1.4.0
>---------------------------------------------------------
>08:05:20 INFO: 17 threads created
>08:05:28 INFO: Network activity progressing...
>08:06:28 INFO: Test run completed.
>08:06:28 INFO: Test cycle finished.
>08:06:28 INFO: #####  Totals:  #####
>08:06:28 INFO: test duration    :60.00 seconds
>08:06:28 INFO: total bytes      :630292053310
>08:06:28 INFO:   throughput     :84.04Gbps
>08:06:28 INFO:   retrans segs   :4
>08:06:28 INFO: cpu cores        :192
>08:06:28 INFO:   cpu speed      :3799.725MHz
>08:06:28 INFO:   user           :0.05%
>08:06:28 INFO:   system         :1.60%
>08:06:28 INFO:   idle           :96.41%
>08:06:28 INFO:   iowait         :0.00%
>08:06:28 INFO:   softirq        :1.94%
>08:06:28 INFO:   cycles/byte    :2.50
>08:06:28 INFO: cpu busy (all)   :534.41%
>
>For NUMA- and sibling-aware IRQ distribution, the same test works 15% faster:
>
>./ntttcp -r -m 16
>NTTTCP for Linux 1.4.0
>---------------------------------------------------------
>08:08:51 INFO: 17 threads created
>08:08:56 INFO: Network activity progressing...
>08:09:56 INFO: Test run completed.
>08:09:56 INFO: Test cycle finished.
>08:09:56 INFO: #####  Totals:  #####
>08:09:56 INFO: test duration    :60.00 seconds
>08:09:56 INFO: total bytes      :741966608384
>08:09:56 INFO:   throughput     :98.93Gbps
>08:09:56 INFO:   retrans segs   :6
>08:09:56 INFO: cpu cores        :192
>08:09:56 INFO:   cpu speed      :3799.791MHz
>08:09:56 INFO:   user           :0.06%
>08:09:56 INFO:   system         :1.81%
>08:09:56 INFO:   idle           :96.18%
>08:09:56 INFO:   iowait         :0.00%
>08:09:56 INFO:   softirq        :1.95%
>08:09:56 INFO:   cycles/byte    :2.25
>08:09:56 INFO: cpu busy (all)   :569.22%
>
>[1] https://lore.kernel.org/all/20231211063726.GA4977@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net/
>
>Signed-off-by: Yury Norov <yury.norov@...il.com>
>Co-developed-by: Souradeep Chakrabarti <schakrabarti@...ux.microsoft.com>
>---
> .../net/ethernet/microsoft/mana/gdma_main.c   | 28 +++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
>diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
>index 6367de0c2c2e..11e64e42e3b2 100644
>--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
>+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
>@@ -1243,6 +1243,34 @@ void mana_gd_free_res_map(struct gdma_resource *r)
> 	r->size = 0;
> }
>
>+static __maybe_unused int irq_setup(unsigned int *irqs, unsigned int len, int node)
>+{
>+	const struct cpumask *next, *prev = cpu_none_mask;
>+	cpumask_var_t cpus __free(free_cpumask_var);
>+	int cpu, weight;
>+
>+	if (!alloc_cpumask_var(&cpus, GFP_KERNEL))
>+		return -ENOMEM;
>+
>+	rcu_read_lock();
>+	for_each_numa_hop_mask(next, node) {
>+		weight = cpumask_weight_andnot(next, prev);
>+		while (weight-- > 0) {
Make it while (weight > 0) {
>+			cpumask_andnot(cpus, next, prev);
>+			for_each_cpu(cpu, cpus) {
>+				if (len-- == 0)
>+					goto done;
>+				irq_set_affinity_and_hint(*irqs++, topology_sibling_cpumask(cpu));
>+				cpumask_andnot(cpus, cpus, topology_sibling_cpumask(cpu));
Here do --weight, else this code will traverse the same node N^2 times, where each node has N cpus.
>+			}
>+		}
>+		prev = next;
>+	}
>+done:
>+	rcu_read_unlock();
>+	return 0;
>+}
>+
> static int mana_gd_setup_irqs(struct pci_dev *pdev)
> {
> 	unsigned int max_queues_per_port = num_online_cpus();
>--
>2.40.1
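
For illustration only, here is a minimal userspace sketch of the loop with both suggested changes applied (while (weight > 0), and --weight after each assignment). It is not the driver code: mask_t, mask_weight() and irq_spread_sketch() are hypothetical stand-ins for struct cpumask, cpumask_weight_andnot() and irq_setup(), the hop masks are passed in as a plain array instead of coming from for_each_numa_hop_mask(), and the result is written to an output array instead of calling irq_set_affinity_and_hint().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CPUS 64

/* Hypothetical stand-in for a cpumask: bit N set means CPU N. */
typedef uint64_t mask_t;

/* Count the set bits in a mask. */
static int mask_weight(mask_t m)
{
	int w = 0;

	for (; m; m &= m - 1)
		w++;
	return w;
}

/*
 * hops[i] is the set of CPUs within i NUMA hops of the chosen node; each
 * mask is a superset of the previous one. siblings[cpu] is the sibling
 * group of that CPU and must be valid for every CPU set in hops[].
 * out[i] receives the affinity mask chosen for IRQ i.
 * Returns the number of IRQs that were assigned a mask.
 */
static unsigned int irq_spread_sketch(const mask_t *hops, size_t nhops,
				      const mask_t *siblings,
				      mask_t *out, unsigned int len)
{
	mask_t prev = 0;
	unsigned int done = 0;

	assert(hops && siblings && out);

	for (size_t h = 0; h < nhops; h++) {
		mask_t next = hops[h];
		int weight = mask_weight(next & ~prev);

		/* One IRQ per CPU that is new in this hop. */
		while (weight > 0) {
			mask_t cpus = next & ~prev;

			for (int cpu = 0; cpu < MAX_CPUS; cpu++) {
				if (!(cpus & ((mask_t)1 << cpu)))
					continue;
				if (done == len)
					return done;
				out[done++] = siblings[cpu];
				/* Skip the rest of this sibling group in this pass. */
				cpus &= ~siblings[cpu];
				--weight;
			}
		}
		prev = next;
	}
	return done;
}
```

With the topology from the commit message (hops = { 0x0f, 0xff }, sibling groups 0x03, 0x0c, 0x30, 0xc0) and len = 8, the sketch assigns 0x03, 0x0c, 0x03, 0x0c, 0x30, 0xc0, 0x30, 0xc0, i.e. CPUs 0-1, 2-3, 0-1, 2-3, 4-5, 6-7, 4-5, 6-7, which matches the CPUs column of the table above. Each hop is walked once per sibling of a group rather than once per CPU, since weight now drops by one for every IRQ placed.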