linux-kernel - Re: [PATCH v1 1/1] irqchip/gic-v3-its: fixup IRQ affinities to account for online CPUs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87bkyktzeh.wl-maz@kernel.org>
Date:   Sat, 05 Mar 2022 11:24:54 +0000
From:   Marc Zyngier <maz@...nel.org>
To:     David Decotigny <decot+git@...gle.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org,
        David Decotigny <ddecotig@...gle.com>
Subject: Re: [PATCH v1 1/1] irqchip/gic-v3-its: fixup IRQ affinities to account for online CPUs

On Fri, 04 Mar 2022 19:52:38 +0000,
David Decotigny <decot+git@...gle.com> wrote:
> 
> From: David Decotigny <ddecotig@...gle.com>
> 
> In some cases (eg. when booting with maxcpus=X), it is possible that
> the preset IRQ affinity masks don't intersect with the set of online
> CPUs. This patch extends the fallback strategy implemented when
> IRQD_AFFINITY_MANAGED is not set to all cases. This is logged the
> first time that happens.
> 
> Fixes: c5d6082d35e0 ("irqchip/gic-v3-its: Balance initial LPI affinity across CPUs")
> 

Missing SoB?

> ---
>  drivers/irqchip/irq-gic-v3-its.c | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index cd772973114a..de862fd9ad73 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -1618,11 +1618,6 @@ static int its_select_cpu(struct irq_data *d,
>  		/* Try the intersection of the affinity and online masks */
>  		cpumask_and(tmpmask, aff_mask, cpu_online_mask);
>  
> -		/* If that doesn't fly, the online mask is the last resort */
> -		if (cpumask_empty(tmpmask))
> -			cpumask_copy(tmpmask, cpu_online_mask);
> -
> -		cpu = cpumask_pick_least_loaded(d, tmpmask);
>  	} else {
>  		cpumask_and(tmpmask, irq_data_get_affinity_mask(d), cpu_online_mask);
>  
> @@ -1630,9 +1625,13 @@ static int its_select_cpu(struct irq_data *d,
>  		if ((its_dev->its->flags & ITS_FLAGS_WORKAROUND_CAVIUM_23144) &&
>  		    node != NUMA_NO_NODE)
>  			cpumask_and(tmpmask, tmpmask, cpumask_of_node(node));
> -
> -		cpu = cpumask_pick_least_loaded(d, tmpmask);
>  	}
> +
> +	/* If that doesn't fly, the online mask is the last resort */
> +	if (WARN_ON_ONCE(cpumask_empty(tmpmask)))
> +		cpumask_copy(tmpmask, cpu_online_mask);
> +
> +	cpu = cpumask_pick_least_loaded(d, tmpmask);
>  out:
>  	free_cpumask_var(tmpmask);
>  

Known issue, see [1] and [2].

I don't think the above is the right approach. For managed interrupts,
we shouldn't try and enable these interrupts until the CPUs are up,
and activating them on *another* CPU breaks the abstraction entirely.

The right way to do it would be to not activate them (marking them as
shutdown) until the CPUs are up and running, similarly to what x86
does (but we obviously can't reuse the matrix allocator for that).
Looking into this is on my TODO list, just didn't get the time to work
on it.

	M.

[1] https://lore.kernel.org/r/78615d08-1764-c895-f3b7-bfddfbcbdfb9@huawei.com
[2] https://lore.kernel.org/r/20220124073440.88598-1-wangxiongfeng2@huawei.com

-- 
Without deviation from the norm, progress is not possible.