lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 16 Nov 2023 11:21:08 +0000
From: Souradeep Chakrabarti <schakrabarti@...rosoft.com>
To: Haiyang Zhang <haiyangz@...rosoft.com>, Souradeep Chakrabarti
	<schakrabarti@...ux.microsoft.com>, KY Srinivasan <kys@...rosoft.com>,
	"wei.liu@...nel.org" <wei.liu@...nel.org>, Dexuan Cui <decui@...rosoft.com>,
	"davem@...emloft.net" <davem@...emloft.net>, "edumazet@...gle.com"
	<edumazet@...gle.com>, "kuba@...nel.org" <kuba@...nel.org>,
	"pabeni@...hat.com" <pabeni@...hat.com>, Long Li <longli@...rosoft.com>,
	"sharmaajay@...rosoft.com" <sharmaajay@...rosoft.com>, "leon@...nel.org"
	<leon@...nel.org>, "cai.huoqing@...ux.dev" <cai.huoqing@...ux.dev>,
	"ssengar@...ux.microsoft.com" <ssengar@...ux.microsoft.com>,
	"vkuznets@...hat.com" <vkuznets@...hat.com>, "tglx@...utronix.de"
	<tglx@...utronix.de>, "linux-hyperv@...r.kernel.org"
	<linux-hyperv@...r.kernel.org>, "netdev@...r.kernel.org"
	<netdev@...r.kernel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "linux-rdma@...r.kernel.org"
	<linux-rdma@...r.kernel.org>
CC: Paul Rosswurm <paulros@...rosoft.com>
Subject: RE: [PATCH net-next] net: mana: Assigning IRQ affinity on HT cores



>-----Original Message-----
>From: Haiyang Zhang <haiyangz@...rosoft.com>
>Sent: Thursday, November 16, 2023 4:25 AM
>To: Souradeep Chakrabarti <schakrabarti@...ux.microsoft.com>; KY Srinivasan
><kys@...rosoft.com>; wei.liu@...nel.org; Dexuan Cui <decui@...rosoft.com>;
>davem@...emloft.net; edumazet@...gle.com; kuba@...nel.org;
>pabeni@...hat.com; Long Li <longli@...rosoft.com>;
>sharmaajay@...rosoft.com; leon@...nel.org; cai.huoqing@...ux.dev;
>ssengar@...ux.microsoft.com; vkuznets@...hat.com; tglx@...utronix.de; linux-
>hyperv@...r.kernel.org; netdev@...r.kernel.org; linux-kernel@...r.kernel.org;
>linux-rdma@...r.kernel.org
>Cc: Souradeep Chakrabarti <schakrabarti@...rosoft.com>; Paul Rosswurm
><paulros@...rosoft.com>
>Subject: RE: [PATCH net-next] net: mana: Assigning IRQ affinity on HT cores
>
>
>
>> -----Original Message-----
>> From: Souradeep Chakrabarti <schakrabarti@...ux.microsoft.com>
>> Sent: Wednesday, November 15, 2023 8:49 AM
>> To: KY Srinivasan <kys@...rosoft.com>; Haiyang Zhang
>> <haiyangz@...rosoft.com>; wei.liu@...nel.org; Dexuan Cui
>> <decui@...rosoft.com>; davem@...emloft.net; edumazet@...gle.com;
>> kuba@...nel.org; pabeni@...hat.com; Long Li <longli@...rosoft.com>;
>> sharmaajay@...rosoft.com; leon@...nel.org; cai.huoqing@...ux.dev;
>> ssengar@...ux.microsoft.com; vkuznets@...hat.com; tglx@...utronix.de;
>> linux-hyperv@...r.kernel.org; netdev@...r.kernel.org; linux-
>> kernel@...r.kernel.org; linux-rdma@...r.kernel.org
>> Cc: Souradeep Chakrabarti <schakrabarti@...rosoft.com>; Paul Rosswurm
>> <paulros@...rosoft.com>; Souradeep Chakrabarti
>> <schakrabarti@...ux.microsoft.com>
>> Subject: [PATCH net-next] net: mana: Assigning IRQ affinity on HT
>> cores
>>
>> Existing MANA design assigns IRQ affinity to every sibling CPUs, which
>> causes IRQ coalescing and may reduce the network performance with RSS.
>>
>> Improve the performance by adhering the configuration for RSS, which
>> prioritise IRQ affinity on HT cores.
>>
>> Signed-off-by: Souradeep Chakrabarti
>> <schakrabarti@...ux.microsoft.com>
>> ---
>>  .../net/ethernet/microsoft/mana/gdma_main.c   | 126 ++++++++++++++++-
>> -
>>  1 file changed, 117 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c
>> b/drivers/net/ethernet/microsoft/mana/gdma_main.c
>> index 6367de0c2c2e..839be819d46e 100644
>> --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
>> +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
>> @@ -1243,13 +1243,115 @@ void mana_gd_free_res_map(struct
>> gdma_resource *r)
>>  	r->size = 0;
>>  }
>>
>> +static void cpu_mask_set(cpumask_var_t *filter_mask, cpumask_var_t
>> **filter_mask_list)
>> +{
>> +	unsigned int core_count = 0, cpu;
>> +	cpumask_var_t *filter_mask_list_tmp;
>> +
>> +	BUG_ON(!filter_mask || !filter_mask_list);
>> +	filter_mask_list_tmp = *filter_mask_list;
>> +	cpumask_copy(*filter_mask, cpu_online_mask);
>> +	/* for each core create a cpumask lookup table,
>> +	 * which stores all the corresponding siblings
>> +	 */
>> +	for_each_cpu(cpu, *filter_mask) {
>> +
>> 	BUG_ON(!alloc_cpumask_var(&filter_mask_list_tmp[core_count],
>> GFP_KERNEL));
>> +		cpumask_or(filter_mask_list_tmp[core_count],
>> filter_mask_list_tmp[core_count],
>> +			   topology_sibling_cpumask(cpu));
>> +		cpumask_andnot(*filter_mask, *filter_mask,
>> topology_sibling_cpumask(cpu));
>> +		core_count++;
>> +	}
>> +}
>> +
>> +static int irq_setup(int *irqs, int nvec) {
>> +	cpumask_var_t filter_mask;
>> +	cpumask_var_t *filter_mask_list;
>> +	unsigned int cpu_first, cpu, irq_start, cores = 0;
>> +	int i, core_count = 0, numa_node, cpu_count = 0, err = 0;
>> +
>> +	BUG_ON(!alloc_cpumask_var(&filter_mask, GFP_KERNEL));
>> +	cpus_read_lock();
>> +	cpumask_copy(filter_mask, cpu_online_mask);
>> +	/* count the number of cores
>> +	 */
>> +	for_each_cpu(cpu, filter_mask) {
>> +		cpumask_andnot(filter_mask, filter_mask,
>> topology_sibling_cpumask(cpu));
>> +		cores++;
>> +	}
>> +	filter_mask_list = kcalloc(cores, sizeof(cpumask_var_t), GFP_KERNEL);
>> +	if (!filter_mask_list) {
>> +		err = -ENOMEM;
>> +		goto free_irq;
>> +	}
>> +	/* if number of cpus are equal to max_queues per port, then
>> +	 * one extra interrupt for the hardware channel communication.
>> +	 */
>> +	if (nvec - 1 == num_online_cpus()) {
>> +		irq_start = 1;
>> +		cpu_first = cpumask_first(cpu_online_mask);
>> +		irq_set_affinity_and_hint(irqs[0], cpumask_of(cpu_first));
>> +	} else {
>> +		irq_start = 0;
>> +	}
>> +	/* reset the core_count and num_node to 0.
>> +	 */
>> +	core_count = 0;
>> +	numa_node = 0;
>
>Please start with gc->numa_node here. I know it's 0 for now. But the host will
>provide real numa node# close to the device in the future.
Thank you. Will take care of it in next version.
>
>Also, as we discussed, consider using the NUMA distance to select the next numa
>node (in a separate patch).
>
>> +	cpu_mask_set(&filter_mask, &filter_mask_list);
>> +	/* for each interrupt find the cpu of a particular
>> +	 * sibling set and if it belongs to the specific numa
>> +	 * then assign irq to it and clear the cpu bit from
>> +	 * the corresponding sibling list from filter_mask_list.
>> +	 * Increase the cpu_count for that node.
>> +	 * Once all cpus for a numa node is assigned, then
>> +	 * move to different numa node and continue the same.
>> +	 */
>> +	for (i = irq_start; i < nvec; ) {
>> +		cpu_first = cpumask_first(filter_mask_list[core_count]);
>> +		if (cpu_first < nr_cpu_ids && cpu_to_node(cpu_first) ==
>> numa_node) {
>> +			irq_set_affinity_and_hint(irqs[i],
>> cpumask_of(cpu_first));
>> +			cpumask_clear_cpu(cpu_first,
>> filter_mask_list[core_count]);
>> +			cpu_count = cpu_count + 1;
>> +			i = i + 1;
>> +			/* checking if all the cpus are used from the
>> +			 * particular node.
>> +			 */
>> +			if (cpu_count == nr_cpus_node(numa_node)) {
>> +				numa_node = numa_node + 1;
>> +				if (numa_node == num_online_nodes()) {
>> +					cpu_mask_set(&filter_mask,
>> &filter_mask_list);
>> +					numa_node = 0;
>Ditto.
>
>Other things look good to me.
>
>Thanks,
>- Haiyang


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ