[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <877d7sar5k.wl-maz@kernel.org>
Date: Thu, 14 Apr 2022 11:35:51 +0100
From: Marc Zyngier <maz@...nel.org>
To: Marek Szyprowski <m.szyprowski@...sung.com>
Cc: linux-kernel <linux-kernel@...r.kernel.org>,
'Linux Samsung SOC' <linux-samsung-soc@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
John Garry <john.garry@...wei.com>,
Xiongfeng Wang <wangxiongfeng2@...wei.com>,
David Decotigny <ddecotig@...gle.com>,
Krzysztof Kozlowski <krzk@...nel.org>
Subject: Re: [PATCH v3 2/3] genirq: Always limit the affinity to online CPUs
Hi Marek,
On Thu, 14 Apr 2022 10:09:31 +0100,
Marek Szyprowski <m.szyprowski@...sung.com> wrote:
>
> Hi Marc,
>
> On 13.04.2022 19:26, Marc Zyngier wrote:
> > Hi Marek,
> >
> > On Wed, 13 Apr 2022 15:59:21 +0100,
> > Marek Szyprowski <m.szyprowski@...sung.com> wrote:
> >> Hi Marc,
> >>
> >> On 05.04.2022 20:50, Marc Zyngier wrote:
> >>> When booting with maxcpus=<small number> (or even loading a driver
> >>> while most CPUs are offline), it is pretty easy to observe managed
> >>> affinities containing a mix of online and offline CPUs being passed
> >>> to the irqchip driver.
> >>>
> >>> This means that the irqchip cannot trust the affinity passed down
> >>> from the core code, which is a bit annoying and requires (at least
> >>> in theory) all drivers to implement some sort of affinity narrowing.
> >>>
> >>> In order to address this, always limit the cpumask to the set of
> >>> online CPUs.
> >>>
> >>> Signed-off-by: Marc Zyngier <maz@...nel.org>
> >> This patch landed in linux next-20220413 as commit 33de0aa4bae9
> >> ("genirq: Always limit the affinity to online CPUs"). Unfortunately it
> >> breaks booting of most ARM 32bit Samsung Exynos based boards.
> >>
> >> I don't see anything specific in the log, though. Booting just hangs at
> >> some point. The only Samsung Exynos boards that boot properly are those
> >> Exynos4412 based.
> >>
> >> I assume that this is related to the Multi Core Timer IRQ configuration
> >> specific for that SoCs. Exynos4412 uses PPI interrupts, while all other
> >> Exynos SoCs have separate IRQ lines for each CPU.
> >>
> >> Let me know how I can help debugging this issue.
> > Thanks for the heads up. Can you pick the last working kernel, enable
> > CONFIG_GENERIC_IRQ_DEBUGFS, and dump the /sys/kernel/debug/irq/irqs/
> > entries for the timer IRQs?
>
> Exynos4210, Trats board, next-20220411:
Thanks for all of the debug, super helpful. The issue is that we don't
handle the 'force' case, which a handful of drivers are using when
bringing up CPUs (and doing so before the CPUs are marked online).
Can you please give the below hack a go?
Thanks,
M.
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index f71ecc100545..f1d5a94c6c9f 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -266,10 +266,16 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
prog_mask = mask;
}
- /* Make sure we only provide online CPUs to the irqchip */
+ /*
+ * Make sure we only provide online CPUs to the irqchip,
+ * unless we are being asked to force the affinity (in which
+ * case we do as we are told).
+ */
cpumask_and(&tmp_mask, prog_mask, cpu_online_mask);
- if (!cpumask_empty(&tmp_mask))
+ if (!force && !cpumask_empty(&tmp_mask))
ret = chip->irq_set_affinity(data, &tmp_mask, force);
+ else if (force)
+ ret = chip->irq_set_affinity(data, mask, force);
else
ret = -EINVAL;
--
Without deviation from the norm, progress is not possible.
Powered by blists - more mailing lists