lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 14 Apr 2022 11:09:31 +0200
From:   Marek Szyprowski <m.szyprowski@...sung.com>
To:     Marc Zyngier <maz@...nel.org>
Cc:     linux-kernel <linux-kernel@...r.kernel.org>,
        'Linux Samsung SOC' <linux-samsung-soc@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        John Garry <john.garry@...wei.com>,
        Xiongfeng Wang <wangxiongfeng2@...wei.com>,
        David Decotigny <ddecotig@...gle.com>,
        Krzysztof Kozlowski <krzk@...nel.org>
Subject: Re: [PATCH v3 2/3] genirq: Always limit the affinity to online CPUs

Hi Marc,

On 13.04.2022 19:26, Marc Zyngier wrote:
> Hi Marek,
>
> On Wed, 13 Apr 2022 15:59:21 +0100,
> Marek Szyprowski <m.szyprowski@...sung.com> wrote:
>> Hi Marc,
>>
>> On 05.04.2022 20:50, Marc Zyngier wrote:
>>> When booting with maxcpus=<small number> (or even loading a driver
>>> while most CPUs are offline), it is pretty easy to observe managed
>>> affinities containing a mix of online and offline CPUs being passed
>>> to the irqchip driver.
>>>
>>> This means that the irqchip cannot trust the affinity passed down
>>> from the core code, which is a bit annoying and requires (at least
>>> in theory) all drivers to implement some sort of affinity narrowing.
>>>
>>> In order to address this, always limit the cpumask to the set of
>>> online CPUs.
>>>
>>> Signed-off-by: Marc Zyngier <maz@...nel.org>
>> This patch landed in linux next-20220413 as commit 33de0aa4bae9
>> ("genirq: Always limit the affinity to online CPUs"). Unfortunately it
>> breaks booting of most ARM 32bit Samsung Exynos based boards.
>>
>> I don't see anything specific in the log, though. Booting just hangs at
>> some point. The only Samsung Exynos boards that boot properly are those
>> Exynos4412 based.
>>
>> I assume that this is related to the Multi Core Timer IRQ configuration
>> specific for that SoCs. Exynos4412 uses PPI interrupts, while all other
>> Exynos SoCs have separate IRQ lines for each CPU.
>>
>> Let me know how I can help debugging this issue.
> Thanks for the heads up. Can you pick the last working kernel, enable
> CONFIG_GENERIC_IRQ_DEBUGFS, and dump the /sys/kernel/debug/irq/irqs/
> entries for the timer IRQs?

Exynos4210, Trats board, next-20220411:

root@...get:~# cat /proc/interrupts | grep mct
  40:          0          0 GIC-0  89 Level     mct_comp_irq
  41:       4337          0 GIC-0  74 Level     mct_tick0
  42:          0      11061 GIC-0  80 Level     mct_tick1
root@...get:~# cat /sys/kernel/debug/irq/irqs/41
handler:  handle_fasteoi_irq
device:   (null)
status:   0x00003504
             _IRQ_NOPROBE
             _IRQ_NOAUTOEN
istate:   0x00000000
ddepth:   0
wdepth:   0
dstate:   0x13403604
             IRQ_TYPE_LEVEL_HIGH
             IRQD_LEVEL
             IRQD_ACTIVATED
             IRQD_IRQ_STARTED
             IRQD_NO_BALANCING
             IRQD_SINGLE_TARGET
             IRQD_AFFINITY_SET
             IRQD_DEFAULT_TRIGGER_SET
             IRQD_HANDLE_ENFORCE_IRQCTX
node:     0
affinity: 0
effectiv: 0
domain:  :soc:interrupt-controller@...90000
  hwirq:   0x4a
  chip:    GIC-0
   flags:   0x15
              IRQCHIP_SET_TYPE_MASKED
              IRQCHIP_MASK_ON_SUSPEND
              IRQCHIP_SKIP_SET_WAKE
root@...get:~# cat /sys/kernel/debug/irq/irqs/42
handler:  handle_fasteoi_irq
device:   (null)
status:   0x00003504
             _IRQ_NOPROBE
             _IRQ_NOAUTOEN
istate:   0x00000000
ddepth:   0
wdepth:   0
dstate:   0x13403604
             IRQ_TYPE_LEVEL_HIGH
             IRQD_LEVEL
             IRQD_ACTIVATED
             IRQD_IRQ_STARTED
             IRQD_NO_BALANCING
             IRQD_SINGLE_TARGET
             IRQD_AFFINITY_SET
             IRQD_DEFAULT_TRIGGER_SET
             IRQD_HANDLE_ENFORCE_IRQCTX
node:     0
affinity: 1
effectiv: 1
domain:  :soc:interrupt-controller@...90000
  hwirq:   0x50
  chip:    GIC-0
   flags:   0x15
              IRQCHIP_SET_TYPE_MASKED
              IRQCHIP_MASK_ON_SUSPEND
              IRQCHIP_SKIP_SET_WAKE
root@...get:~#

> Also, see below.
>
>>> ---
>>>    kernel/irq/manage.c | 25 +++++++++++++++++--------
>>>    1 file changed, 17 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
>>> index c03f71d5ec10..f71ecc100545 100644
>>> --- a/kernel/irq/manage.c
>>> +++ b/kernel/irq/manage.c
>>> @@ -222,11 +222,16 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
>>>    {
>>>    	struct irq_desc *desc = irq_data_to_desc(data);
>>>    	struct irq_chip *chip = irq_data_get_irq_chip(data);
>>> +	const struct cpumask  *prog_mask;
>>>    	int ret;
>>>    
>>> +	static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
>>> +	static struct cpumask tmp_mask;
>>> +
>>>    	if (!chip || !chip->irq_set_affinity)
>>>    		return -EINVAL;
>>>    
>>> +	raw_spin_lock(&tmp_mask_lock);
>>>    	/*
>>>    	 * If this is a managed interrupt and housekeeping is enabled on
>>>    	 * it check whether the requested affinity mask intersects with
>>> @@ -248,24 +253,28 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
>>>    	 */
>>>    	if (irqd_affinity_is_managed(data) &&
>>>    	    housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) {
>>> -		const struct cpumask *hk_mask, *prog_mask;
>>> -
>>> -		static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
>>> -		static struct cpumask tmp_mask;
>>> +		const struct cpumask *hk_mask;
>>>    
>>>    		hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
>>>    
>>> -		raw_spin_lock(&tmp_mask_lock);
>>>    		cpumask_and(&tmp_mask, mask, hk_mask);
>>>    		if (!cpumask_intersects(&tmp_mask, cpu_online_mask))
>>>    			prog_mask = mask;
>>>    		else
>>>    			prog_mask = &tmp_mask;
>>> -		ret = chip->irq_set_affinity(data, prog_mask, force);
>>> -		raw_spin_unlock(&tmp_mask_lock);
>>>    	} else {
>>> -		ret = chip->irq_set_affinity(data, mask, force);
>>> +		prog_mask = mask;
>>>    	}
>>> +
>>> +	/* Make sure we only provide online CPUs to the irqchip */
>>> +	cpumask_and(&tmp_mask, prog_mask, cpu_online_mask);
>>> +	if (!cpumask_empty(&tmp_mask))
>>> +		ret = chip->irq_set_affinity(data, &tmp_mask, force);
>>> +	else
>>> +		ret = -EINVAL;
> Can you also check that with the patch applied, it is this path that
> is taken and that it is the timer interrupts that get rejected? If
> that's the case, can you put a dump_stack() here and give me that
> stack trace? The use of irq_force_affinity() in the driver looks
> suspicious...

[    0.158241] smp: Bringing up secondary CPUs ...
[    0.166118] irq_do_set_affinity irq 42
[    0.166160] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 
5.18.0-rc1-00002-g33de0aa4bae9-dirty #11708
[    0.166176] Hardware name: Samsung Exynos (Flattened Device Tree)
[    0.166188]  unwind_backtrace from show_stack+0x10/0x14
[    0.166220]  show_stack from dump_stack_lvl+0x58/0x70
[    0.166239]  dump_stack_lvl from irq_do_set_affinity+0x188/0x1c8
[    0.166258]  irq_do_set_affinity from irq_set_affinity_locked+0xf8/0x17c
[    0.166274]  irq_set_affinity_locked from irq_force_affinity+0x34/0x54
[    0.166290]  irq_force_affinity from exynos4_mct_starting_cpu+0xdc/0x11c
[    0.166308]  exynos4_mct_starting_cpu from 
cpuhp_invoke_callback+0x190/0x38c
[    0.166328]  cpuhp_invoke_callback from 
cpuhp_invoke_callback_range+0x98/0xb4
[    0.166345]  cpuhp_invoke_callback_range from 
notify_cpu_starting+0x54/0x94
[    0.166364]  notify_cpu_starting from secondary_start_kernel+0x160/0x26c
[    0.166383]  secondary_start_kernel from 0x40101a00
[    0.166498] CPU1: thread -1, cpu 1, socket 9, mpidr 80000901
[    0.166515] CPU1: Spectre v2: using BPIALL workaround
[    0.265631] smp: Brought up 1 node, 2 CPUs
[    0.268660] SMP: Total of 2 processors activated (96.00 BogoMIPS).
[    0.274583] CPU: All CPU(s) started in SVC mode.

> Finally, is there a QEMU emulation of one of these failing boards?

yes, smdkc210, I've reproduced it with the following command:

qemu-system-arm -serial null -serial stdio -kernel /tftpboot/qemu/zImage 
-dtb /tftpboot/qemu/exynos4210-smdkv310.dtb -append 
"console=ttySAC1,115200n8 earlycon root=/dev/mmcblk0p2 rootwait 
init=/bin/bash ip=::::target::off cpuidle.off=1" -M smdkc210 -drive 
file=qemu-smdkv310-mmcblk0.raw,if=sd,bus=0,index=2,format=raw


Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ