Date: Fri, 22 Mar 2024 07:50:54 +0000
From: Liuye <liu.yeC@....com>
To: Jiri Slaby <jirislaby@...nel.org>,
        "daniel.thompson@...aro.org"
	<daniel.thompson@...aro.org>
CC: "dianders@...omium.org" <dianders@...omium.org>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "jason.wessel@...driver.com" <jason.wessel@...driver.com>,
        "kgdb-bugreport@...ts.sourceforge.net"
	<kgdb-bugreport@...ts.sourceforge.net>,
        "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>,
        "linux-serial@...r.kernel.org"
	<linux-serial@...r.kernel.org>
Subject: Re: [PATCH V4] kdb: Fix the deadlock issue in KDB debugging.

>On 21. 03. 24, 12:50, liu.yec@....com wrote:
>> From: LiuYe <liu.yeC@....com>
>> 
>> Currently, if CONFIG_KDB_KEYBOARD is enabled, then kgdboc will attempt 
>> to use schedule_work() to provoke a keyboard reset when transitioning 
>> out of the debugger and back to normal operation.
>> This can cause deadlock because schedule_work() is not NMI-safe.
>> 
>> The stack trace below shows an example of the problem. In this case 
>> the master cpu is not running from NMI but it has parked the slave 
>> CPUs using an NMI and the parked CPUs is holding spinlocks needed by 
>> schedule_work().
>
>I am missing here an explanation (perhaps because I cannot find any docs for irq_work) why irq_work works in this case.

schedule_work() only needs to be postponed until the slave CPUs have left NMI context; after that there is no deadlock problem.
The irq_work callback, which is what calls schedule_work(), runs only after the master CPU has exited the current interrupt context.
By the time the master CPU exits the interrupt context, the other CPUs will naturally have left NMI context as well, so no parked CPU is still holding the spinlocks that schedule_work() needs.

>And why you need to schedule another work in the irq_work and not do the job directly.

kgdboc_restore_input_helper() uses mutex_lock() for protection, and a mutex cannot be taken in interrupt context.
My guess is that the helper therefore needs to run in process context, so the irq_work callback calls schedule_work() instead of doing the job directly. This keeps the original flow unchanged.
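
To illustrate the ordering described above, here is a small userspace model of the two-stage deferral. It is only a sketch and not the kernel code: every name prefixed with model_ is hypothetical, the two flags stand in for irq_work_queue() and schedule_work(), and the assert() stands in for the rule that the mutex in the helper may only be taken in process context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of the model_* names are kernel APIs. */
enum model_ctx { CTX_DEBUGGER, CTX_IRQ_EXIT, CTX_PROCESS };

static enum model_ctx current_ctx = CTX_DEBUGGER;
static bool irq_work_pending;   /* models a queued irq_work item */
static bool work_pending;       /* models a queued work_struct */
static int restore_count;       /* how many times the helper has run */

/* Stands in for kgdboc_restore_input_helper(): it takes a mutex,
 * so in this model it must only ever run in process context. */
static void model_restore_input_helper(void)
{
	assert(current_ctx == CTX_PROCESS);
	restore_count++;
}

/* Stands in for the irq_work callback: it only hands the job over
 * to the workqueue and does not do the job itself. */
static void model_irq_work_cb(void)
{
	work_pending = true;            /* models schedule_work() */
}

/* Called while still inside the debugger: sets a flag, takes no locks. */
static void model_restore_input(void)
{
	irq_work_pending = true;        /* models irq_work_queue() */
}

/* The master CPU leaves the debugger; pending irq_work runs now. */
static void model_leave_debugger(void)
{
	current_ctx = CTX_IRQ_EXIT;
	if (irq_work_pending) {
		irq_work_pending = false;
		model_irq_work_cb();
	}
}

/* Later, a worker thread runs the queued work in process context. */
static void model_run_workqueue(void)
{
	current_ctx = CTX_PROCESS;
	if (work_pending) {
		work_pending = false;
		model_restore_input_helper();
	}
}
```

Calling model_restore_input(), then model_leave_debugger(), then model_run_workqueue() runs the helper exactly once, and only in the last step. Nothing that needs a lock happens while the model is still in the debugger.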
