Message-ID: <87v7x8woeq.ffs@tglx>
Date: Thu, 31 Oct 2024 11:35:25 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: 'Guanjun' <guanjun@...ux.alibaba.com>, corbet@....net, axboe@...nel.dk,
 mst@...hat.com, jasowang@...hat.com, xuanzhuo@...ux.alibaba.com,
 eperezma@...hat.com, vgoyal@...hat.com, stefanha@...hat.com,
 miklos@...redi.hu, peterz@...radead.org, akpm@...ux-foundation.org,
 paulmck@...nel.org, thuth@...hat.com, rostedt@...dmis.org, bp@...en8.de,
 xiongwei.song@...driver.com, linux-doc@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-block@...r.kernel.org,
 virtualization@...ts.linux.dev, linux-fsdevel@...r.kernel.org
Cc: guanjun@...ux.alibaba.com
Subject: Re: [PATCH RFC v1 1/2] genirq/affinity: add support for limiting
 managed interrupts

On Thu, Oct 31 2024 at 15:46, guanjun@...ux.alibaba.com wrote:
>  #ifdef CONFIG_SMP
>  
> +static unsigned int __read_mostly managed_irqs_per_node;
> +static struct cpumask managed_irqs_cpumsk[MAX_NUMNODES] __cacheline_aligned_in_smp = {
> +	[0 ... MAX_NUMNODES-1] = {CPU_BITS_ALL}
> +};
>  
> +static void __group_prepare_affinity(struct cpumask *premask,
> +				     cpumask_var_t *node_to_cpumask)
> +{
> +	nodemask_t nodemsk = NODE_MASK_NONE;
> +	unsigned int ncpus, n;
> +
> +	get_nodes_in_cpumask(node_to_cpumask, premask, &nodemsk);
> +
> +	for_each_node_mask(n, nodemsk) {
> +		cpumask_and(&managed_irqs_cpumsk[n], &managed_irqs_cpumsk[n], premask);
> +		cpumask_and(&managed_irqs_cpumsk[n], &managed_irqs_cpumsk[n], node_to_cpumask[n]);

How is this managed_irqs_cpumsk array protected against concurrency?

> +		ncpus = cpumask_weight(&managed_irqs_cpumsk[n]);
> +		if (ncpus < managed_irqs_per_node) {
> +			/* Reset node n to current node cpumask */
> +			cpumask_copy(&managed_irqs_cpumsk[n], node_to_cpumask[n]);

This whole logic is incomprehensible, and aside from the concurrency
problem it is broken when CPUs are made present at run-time, because
these CPU masks are static and represent the stale state of the last
invocation.
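
To illustrate the stale-state problem, here is a minimal userspace sketch,
not kernel code. It mirrors only the three quoted operations (two ANDs and
the reset on too few CPUs). mask_t, cached_mask, mask_weight() and
prepare_node() are hypothetical stand-ins for struct cpumask,
managed_irqs_cpumsk[], cpumask_weight() and __group_prepare_affinity(); the
assumption is that premask grows between calls as CPUs become present.

```c
#include <assert.h>
#include <stdint.h>

#define NR_NODES 2

/* Hypothetical stand-in for struct cpumask: one bit per CPU, max 64 CPUs. */
typedef uint64_t mask_t;

/* Static state that survives across calls, like managed_irqs_cpumsk[]. */
static mask_t cached_mask[NR_NODES] = { ~0ULL, ~0ULL };

/* Count the set bits, analogous to cpumask_weight(). */
static unsigned int mask_weight(mask_t m)
{
	unsigned int w = 0;

	for (; m; m &= m - 1)
		w++;
	return w;
}

/*
 * Mirror of the quoted logic: narrow the cached per-node mask by premask
 * and by the node's CPUs, and reset it only if too few CPUs remain.
 */
static mask_t prepare_node(unsigned int node, mask_t premask,
			   mask_t node_cpus, unsigned int min_cpus)
{
	assert(node < NR_NODES);

	cached_mask[node] &= premask;
	cached_mask[node] &= node_cpus;
	if (mask_weight(cached_mask[node]) < min_cpus)
		cached_mask[node] = node_cpus;
	return cached_mask[node];
}
```

Calling prepare_node(0, 0x0F, 0xFF, 2) and then prepare_node(0, 0xFF, 0xFF, 2)
returns 0x0F both times. The second call never sees CPUs 4-7, although they
are in premask by then, because the mask cached by the first call can only
shrink until the reset condition triggers.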

Given the limitations of the x86 vector space, which is not going away
anytime soon, there are only two options IMO to handle such a scenario.

   1) Tell the nvme/block layer to disable queue affinity management

   2) Restrict the devices and queues to the nodes they sit on
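
As a rough userspace sketch of what option 2 could look like (an
illustration under assumptions, not a proposed implementation): only the
CPUs of the device's home node are handed out to queues, so no queue
interrupt needs a vector on a remote node. mask_t and spread_on_node() are
hypothetical names; a real implementation would operate on struct cpumask
and the existing affinity spreading code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU, max 64 CPUs. */
typedef uint64_t mask_t;

/*
 * Distribute the CPUs of the device's home node round-robin over the
 * queues. CPUs outside node_cpus are never assigned. Returns the number
 * of queues that received at least one CPU.
 */
static unsigned int spread_on_node(mask_t node_cpus, unsigned int nr_queues,
				   mask_t *queue_masks)
{
	unsigned int cpu, q, used = 0;

	assert(nr_queues > 0);

	for (q = 0; q < nr_queues; q++)
		queue_masks[q] = 0;

	q = 0;
	for (cpu = 0; cpu < 64; cpu++) {
		if (!(node_cpus & (1ULL << cpu)))
			continue;
		if (!queue_masks[q])
			used++;		/* first CPU for this queue */
		queue_masks[q] |= 1ULL << cpu;
		q = (q + 1) % nr_queues;
	}
	return used;
}
```

With node_cpus = 0xF0 (CPUs 4-7) and eight requested queues, only four
queues end up with a CPU, so the number of vectors consumed is bounded by
the node's CPU count rather than by the number of CPUs in the system.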

Thanks,

        tglx
