Message-ID: <7d18317b-5a3a-435e-8620-6f978dc84e8f@kernel.org>
Date: Tue, 22 Jul 2025 09:37:53 +0200
From: Jiri Slaby <jirislaby@...nel.org>
To: Thomas Gleixner <tglx@...utronix.de>, LKML <linux-kernel@...r.kernel.org>
Cc: Liangyan <liangyan.peng@...edance.com>,
 Yicong Shen <shenyicong.1023@...edance.com>
Subject: Re: [patch 4/4] genirq: Prevent migration live lock in
 handle_edge_irq()

On 18. 07. 25, 20:54, Thomas Gleixner wrote:
> Yicong reported and Liangyan debugged a live lock in handle_edge_irq()
> related to interrupt migration.
> 
> If the affinity of an edge type interrupt is moved to a new target CPU
> while the interrupt is being handled on the previous target CPU, the
> handler might get stuck on the previous target:
> 
> CPU 0 (previous target)		CPU 1 (new target)
> 
>    handle_edge_irq()
>     repeat:
> 	handle_event()		handle_edge_irq()
> 			        if (INPROGRESS) {
> 				  set(PENDING);
> 				  mask();
> 				  return;
> 				}
> 	if (PENDING) {
> 	  clear(PENDING);
> 	  unmask();
> 	  goto repeat;
> 	}
> 
> The migration in software never completes and CPU0 continues to handle the
> pending events forever. This happens when the device raises interrupts with
> a high rate and always before handle_event() completes and before the CPU0
> handler can clear INPROGRESS so that CPU1 sets the PENDING flag over and
> over. This has been observed in virtual machines.
> 
> Prevent this by checking whether the CPU which observes the INPROGRESS flag
> is the new affinity target. If that's the case, do not set the PENDING flag
> and wait for the INPROGRESS flag to be cleared instead, so that the new
> interrupt is handled on the new target CPU and the previous CPU is released
> from the action.
> 
> This is restricted to the edge type handler and only utilized on systems
> which use single CPU targets for interrupt affinity.
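
The decision described above can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not the code from
the patch: struct model_irq, enum model_action and model_edge_entry() are
hypothetical names, the flags are plain booleans instead of the kernel's
descriptor state, and "wait" is returned as a decision rather than
implemented as a busy wait.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-interrupt state. */
struct model_irq {
	bool inprogress;    /* a handler is currently running on some CPU */
	bool pending;       /* an edge arrived while INPROGRESS was set */
	bool masked;        /* interrupt line is masked */
	bool single_target; /* affinity resolves to exactly one CPU */
	int  target_cpu;    /* the (possibly new) affinity target */
};

enum model_action {
	MODEL_HANDLE,       /* run the handler on this CPU */
	MODEL_MARK_PENDING, /* set PENDING, mask, leave replay to the other CPU */
	MODEL_WAIT,         /* wait for INPROGRESS to clear, then handle here */
};

/* Models what a CPU decides when an edge interrupt arrives on it. */
enum model_action model_edge_entry(struct model_irq *irq, int this_cpu)
{
	assert(irq);

	/* Nobody is handling the interrupt: handle it right away. */
	if (!irq->inprogress)
		return MODEL_HANDLE;

	/*
	 * INPROGRESS is set and this CPU is the new single affinity target:
	 * do not set PENDING, so the previous target is not forced to loop.
	 */
	if (irq->single_target && irq->target_cpu == this_cpu)
		return MODEL_WAIT;

	/* Otherwise keep the existing behaviour: mark pending and mask. */
	irq->pending = true;
	irq->masked = true;
	return MODEL_MARK_PENDING;
}
```

In this model, a call with inprogress set and this_cpu equal to target_cpu
returns MODEL_WAIT and leaves pending and masked untouched, which is the
property that lets the previous target leave its repeat loop.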

LGTM

Reviewed-by: Jiri Slaby <jirislaby@...nel.org>


> Reported-by: Yicong Shen <shenyicong.1023@...edance.com>
> Reported-by: Liangyan <liangyan.peng@...edance.com>
> Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
> Link: https://lore.kernel.org/all/20250701163558.2588435-1-liangyan.peng@bytedance.com

thanks,
-- 
js
suse labs
