[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5dcf8d22-e9b3-f306-4c5f-256707e08fbf@samsung.com>
Date: Thu, 14 Apr 2022 11:09:31 +0200
From: Marek Szyprowski <m.szyprowski@...sung.com>
To: Marc Zyngier <maz@...nel.org>
Cc: linux-kernel <linux-kernel@...r.kernel.org>,
'Linux Samsung SOC' <linux-samsung-soc@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
John Garry <john.garry@...wei.com>,
Xiongfeng Wang <wangxiongfeng2@...wei.com>,
David Decotigny <ddecotig@...gle.com>,
Krzysztof Kozlowski <krzk@...nel.org>
Subject: Re: [PATCH v3 2/3] genirq: Always limit the affinity to online CPUs
Hi Marc,
On 13.04.2022 19:26, Marc Zyngier wrote:
> Hi Marek,
>
> On Wed, 13 Apr 2022 15:59:21 +0100,
> Marek Szyprowski <m.szyprowski@...sung.com> wrote:
>> Hi Marc,
>>
>> On 05.04.2022 20:50, Marc Zyngier wrote:
>>> When booting with maxcpus=<small number> (or even loading a driver
>>> while most CPUs are offline), it is pretty easy to observe managed
>>> affinities containing a mix of online and offline CPUs being passed
>>> to the irqchip driver.
>>>
>>> This means that the irqchip cannot trust the affinity passed down
>>> from the core code, which is a bit annoying and requires (at least
>>> in theory) all drivers to implement some sort of affinity narrowing.
>>>
>>> In order to address this, always limit the cpumask to the set of
>>> online CPUs.
>>>
>>> Signed-off-by: Marc Zyngier <maz@...nel.org>
>> This patch landed in linux next-20220413 as commit 33de0aa4bae9
>> ("genirq: Always limit the affinity to online CPUs"). Unfortunately it
>> breaks booting of most ARM 32bit Samsung Exynos based boards.
>>
>> I don't see anything specific in the log, though. Booting just hangs at
>> some point. The only Samsung Exynos boards that boot properly are those
>> Exynos4412 based.
>>
>> I assume that this is related to the Multi Core Timer IRQ configuration
>> specific for that SoCs. Exynos4412 uses PPI interrupts, while all other
>> Exynos SoCs have separate IRQ lines for each CPU.
>>
>> Let me know how I can help debugging this issue.
> Thanks for the heads up. Can you pick the last working kernel, enable
> CONFIG_GENERIC_IRQ_DEBUGFS, and dump the /sys/kernel/debug/irq/irqs/
> entries for the timer IRQs?
Exynos4210, Trats board, next-20220411:
root@...get:~# cat /proc/interrupts | grep mct
40: 0 0 GIC-0 89 Level mct_comp_irq
41: 4337 0 GIC-0 74 Level mct_tick0
42: 0 11061 GIC-0 80 Level mct_tick1
root@...get:~# cat /sys/kernel/debug/irq/irqs/41
handler: handle_fasteoi_irq
device: (null)
status: 0x00003504
_IRQ_NOPROBE
_IRQ_NOAUTOEN
istate: 0x00000000
ddepth: 0
wdepth: 0
dstate: 0x13403604
IRQ_TYPE_LEVEL_HIGH
IRQD_LEVEL
IRQD_ACTIVATED
IRQD_IRQ_STARTED
IRQD_NO_BALANCING
IRQD_SINGLE_TARGET
IRQD_AFFINITY_SET
IRQD_DEFAULT_TRIGGER_SET
IRQD_HANDLE_ENFORCE_IRQCTX
node: 0
affinity: 0
effectiv: 0
domain: :soc:interrupt-controller@...90000
hwirq: 0x4a
chip: GIC-0
flags: 0x15
IRQCHIP_SET_TYPE_MASKED
IRQCHIP_MASK_ON_SUSPEND
IRQCHIP_SKIP_SET_WAKE
root@...get:~# cat /sys/kernel/debug/irq/irqs/42
handler: handle_fasteoi_irq
device: (null)
status: 0x00003504
_IRQ_NOPROBE
_IRQ_NOAUTOEN
istate: 0x00000000
ddepth: 0
wdepth: 0
dstate: 0x13403604
IRQ_TYPE_LEVEL_HIGH
IRQD_LEVEL
IRQD_ACTIVATED
IRQD_IRQ_STARTED
IRQD_NO_BALANCING
IRQD_SINGLE_TARGET
IRQD_AFFINITY_SET
IRQD_DEFAULT_TRIGGER_SET
IRQD_HANDLE_ENFORCE_IRQCTX
node: 0
affinity: 1
effectiv: 1
domain: :soc:interrupt-controller@...90000
hwirq: 0x50
chip: GIC-0
flags: 0x15
IRQCHIP_SET_TYPE_MASKED
IRQCHIP_MASK_ON_SUSPEND
IRQCHIP_SKIP_SET_WAKE
root@...get:~#
> Also, see below.
>
>>> ---
>>> kernel/irq/manage.c | 25 +++++++++++++++++--------
>>> 1 file changed, 17 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
>>> index c03f71d5ec10..f71ecc100545 100644
>>> --- a/kernel/irq/manage.c
>>> +++ b/kernel/irq/manage.c
>>> @@ -222,11 +222,16 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
>>> {
>>> struct irq_desc *desc = irq_data_to_desc(data);
>>> struct irq_chip *chip = irq_data_get_irq_chip(data);
>>> + const struct cpumask *prog_mask;
>>> int ret;
>>>
>>> + static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
>>> + static struct cpumask tmp_mask;
>>> +
>>> if (!chip || !chip->irq_set_affinity)
>>> return -EINVAL;
>>>
>>> + raw_spin_lock(&tmp_mask_lock);
>>> /*
>>> * If this is a managed interrupt and housekeeping is enabled on
>>> * it check whether the requested affinity mask intersects with
>>> @@ -248,24 +253,28 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
>>> */
>>> if (irqd_affinity_is_managed(data) &&
>>> housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) {
>>> - const struct cpumask *hk_mask, *prog_mask;
>>> -
>>> - static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
>>> - static struct cpumask tmp_mask;
>>> + const struct cpumask *hk_mask;
>>>
>>> hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
>>>
>>> - raw_spin_lock(&tmp_mask_lock);
>>> cpumask_and(&tmp_mask, mask, hk_mask);
>>> if (!cpumask_intersects(&tmp_mask, cpu_online_mask))
>>> prog_mask = mask;
>>> else
>>> prog_mask = &tmp_mask;
>>> - ret = chip->irq_set_affinity(data, prog_mask, force);
>>> - raw_spin_unlock(&tmp_mask_lock);
>>> } else {
>>> - ret = chip->irq_set_affinity(data, mask, force);
>>> + prog_mask = mask;
>>> }
>>> +
>>> + /* Make sure we only provide online CPUs to the irqchip */
>>> + cpumask_and(&tmp_mask, prog_mask, cpu_online_mask);
>>> + if (!cpumask_empty(&tmp_mask))
>>> + ret = chip->irq_set_affinity(data, &tmp_mask, force);
>>> + else
>>> + ret = -EINVAL;
> Can you also check that with the patch applied, it is this path that
> is taken and that it is the timer interrupts that get rejected? If
> that's the case, can you put a dump_stack() here and give me that
> stack trace? The use of irq_force_affinity() in the driver looks
> suspicious...
[ 0.158241] smp: Bringing up secondary CPUs ...
[ 0.166118] irq_do_set_affinity irq 42
[ 0.166160] CPU: 1 PID: 0 Comm: swapper/1 Not tainted
5.18.0-rc1-00002-g33de0aa4bae9-dirty #11708
[ 0.166176] Hardware name: Samsung Exynos (Flattened Device Tree)
[ 0.166188] unwind_backtrace from show_stack+0x10/0x14
[ 0.166220] show_stack from dump_stack_lvl+0x58/0x70
[ 0.166239] dump_stack_lvl from irq_do_set_affinity+0x188/0x1c8
[ 0.166258] irq_do_set_affinity from irq_set_affinity_locked+0xf8/0x17c
[ 0.166274] irq_set_affinity_locked from irq_force_affinity+0x34/0x54
[ 0.166290] irq_force_affinity from exynos4_mct_starting_cpu+0xdc/0x11c
[ 0.166308] exynos4_mct_starting_cpu from
cpuhp_invoke_callback+0x190/0x38c
[ 0.166328] cpuhp_invoke_callback from
cpuhp_invoke_callback_range+0x98/0xb4
[ 0.166345] cpuhp_invoke_callback_range from
notify_cpu_starting+0x54/0x94
[ 0.166364] notify_cpu_starting from secondary_start_kernel+0x160/0x26c
[ 0.166383] secondary_start_kernel from 0x40101a00
[ 0.166498] CPU1: thread -1, cpu 1, socket 9, mpidr 80000901
[ 0.166515] CPU1: Spectre v2: using BPIALL workaround
[ 0.265631] smp: Brought up 1 node, 2 CPUs
[ 0.268660] SMP: Total of 2 processors activated (96.00 BogoMIPS).
[ 0.274583] CPU: All CPU(s) started in SVC mode.
> Finally, is there a QEMU emulation of one of these failing boards?
yes, smdkc210, I've reproduced it with the following command:
qemu-system-arm -serial null -serial stdio -kernel /tftpboot/qemu/zImage
-dtb /tftpboot/qemu/exynos4210-smdkv310.dtb -append
"console=ttySAC1,115200n8 earlycon root=/dev/mmcblk0p2 rootwait
init=/bin/bash ip=::::target::off cpuidle.off=1" -M smdkc210 -drive
file=qemu-smdkv310-mmcblk0.raw,if=sd,bus=0,index=2,format=raw
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Powered by blists - more mailing lists