Message-ID: <186cdeea58094d06b351b07eefa2189d@h3c.com>
Date: Wed, 17 Apr 2024 11:01:56 +0000
From: Liuye <liu.yeC@....com>
To: Greg KH <gregkh@...uxfoundation.org>,
        "daniel.thompson@...aro.org"
	<daniel.thompson@...aro.org>,
        "andy.shevchenko@...il.com"
	<andy.shevchenko@...il.com>
CC: "dianders@...omium.org" <dianders@...omium.org>,
        "jason.wessel@...driver.com" <jason.wessel@...driver.com>,
        "jirislaby@...nel.org" <jirislaby@...nel.org>,
        "kgdb-bugreport@...ts.sourceforge.net"
	<kgdb-bugreport@...ts.sourceforge.net>,
        "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>,
        "linux-serial@...r.kernel.org"
	<linux-serial@...r.kernel.org>
Subject: Re:[PATCH V11] kdb: Fix the deadlock issue in KDB debugging.

>From: LiuYe <liu.yeC@....com>
>
>Currently, if CONFIG_KDB_KEYBOARD is enabled, then kgdboc will
>attempt to use schedule_work() to provoke a keyboard reset when
>transitioning out of the debugger and back to normal operation.
>This can cause deadlock because schedule_work() is not NMI-safe.
>
>The stack trace below shows an example of the problem. In this
>case the master CPU is not running from NMI, but it has parked
>the slave CPUs using an NMI and the parked CPUs are holding
>spinlocks needed by schedule_work().
>
>Example:
>BUG: spinlock lockup suspected on CPU#0. owner_cpu: 1
>CPU1: Call Trace:
>__schedule
>schedule
>schedule_hrtimeout_range_clock
>mutex_unlock
>ep_scan_ready_list
>schedule_hrtimeout_range
>ep_poll
>wake_up_q
>SyS_epoll_wait
>entry_SYSCALL_64_fastpath
>
>CPU0: Call Trace:
>dump_stack
>spin_dump
>do_raw_spin_lock
>_raw_spin_lock
>try_to_wake_up
>wake_up_process
>insert_work
>__queue_work
>queue_work_on
>kgdboc_post_exp_handler
>kgdb_cpu_enter
>kgdb_handle_exception
>__kgdb_notify
>kgdb_notify
>notifier_call_chain
>notify_die
>do_int3
>int3
>
>We fix the problem by using irq_work, which is NMI-safe, to call
>schedule_work() instead of calling it directly. The second
>deferral is still needed because we cannot resynchronize the
>keyboard state from the hardirq context provided by irq_work;
>calls into the input subsystem must be made from task context.
>
>Therefore, we have to defer the work twice. First, safely
>switch from the debug trap context (similar to NMI) to the
>hardirq, and then switch from the hardirq to the system work queue.
>
>Signed-off-by: LiuYe <liu.yeC@....com>
>Co-developed-by: Daniel Thompson <daniel.thompson@...aro.org>
>Signed-off-by: Daniel Thompson <daniel.thompson@...aro.org>
>
>---
>V10 -> V11: Revert to V9
>V9 -> V10 : Add Signed-off-by of Greg KH and Andy Shevchenko, Acked-by of Daniel Thompson
>V8 -> V9: Modify call trace format and move irq_work.h before module.h
>V7 -> V8: Update the description information and comments in the code.
>	: Submit this patch based on version linux-6.9-rc2.
>V6 -> V7: Add comments in the code.
>V5 -> V6: Replace with a more professional and accurate answer.
>V4 -> V5: Answer why schedule another work in the irq_work and not do the job directly.
>V3 -> V4: Add changelogs
>V2 -> V3: Add description information
>V1 -> V2: using irq_work to solve this properly.
>---
>---
> drivers/tty/serial/kgdboc.c | 24 +++++++++++++++++++++++-
> 1 file changed, 23 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
>index 7ce7bb164..32410fec7 100644
>--- a/drivers/tty/serial/kgdboc.c
>+++ b/drivers/tty/serial/kgdboc.c
>@@ -19,6 +19,7 @@
> #include <linux/console.h>
> #include <linux/vt_kern.h>
> #include <linux/input.h>
>+#include <linux/irq_work.h>
> #include <linux/module.h>
> #include <linux/platform_device.h>
> #include <linux/serial_core.h>
>@@ -82,6 +83,19 @@ static struct input_handler kgdboc_reset_handler = {
> 
> static DEFINE_MUTEX(kgdboc_reset_mutex);
> 
>+/*
>+ * This code ensures that the keyboard state, which is changed during kdb
>+ * execution, is resynchronized when we leave the debug trap. The resync
>+ * logic calls into the input subsystem to force a reset. The calls into
>+ * the input subsystem must be executed from normal task context.
>+ *
>+ * We need to trigger the resync from the debug trap, which executes in an
>+ * NMI (or similar) context. To make it safe to call into the input
>+ * subsystem we end up having to use two deferred execution techniques.
>+ * Firstly, we use irq_work, which is NMI-safe, to provoke a callback from
>+ * hardirq context. Then, from the hardirq callback we use the system
>+ * workqueue to provoke the callback that actually performs the resync.
>+ */
> static void kgdboc_restore_input_helper(struct work_struct *dummy)
> {
> 	/*
>@@ -99,10 +113,17 @@ static void kgdboc_restore_input_helper(struct work_struct *dummy)
> 
> static DECLARE_WORK(kgdboc_restore_input_work, kgdboc_restore_input_helper);
> 
>+static void kgdboc_queue_restore_input_helper(struct irq_work *unused)
>+{
>+	schedule_work(&kgdboc_restore_input_work);
>+}
>+
>+static DEFINE_IRQ_WORK(kgdboc_restore_input_irq_work, kgdboc_queue_restore_input_helper);
>+
> static void kgdboc_restore_input(void)
> {
> 	if (likely(system_state == SYSTEM_RUNNING))
>-		schedule_work(&kgdboc_restore_input_work);
>+		irq_work_queue(&kgdboc_restore_input_irq_work);
> }
> 
> static int kgdboc_register_kbd(char **cptr)
>@@ -133,6 +154,7 @@ static void kgdboc_unregister_kbd(void)
> 			i--;
> 		}
> 	}
>+	irq_work_sync(&kgdboc_restore_input_irq_work);
> 	flush_work(&kgdboc_restore_input_work);
> }
> #else /* ! CONFIG_KDB_KEYBOARD */
>-- 
>2.25.1

What is the current status of PATCH V11? Are there any additional modifications needed?
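
For anyone following the thread, the two-stage deferral described in the quoted commit message can be modeled in user space. This is only a rough sketch and not kernel code: every model_* name and all of the state variables are hypothetical, and the single "lock held" flag is an assumed stand-in for the spinlocks that the parked CPU holds in the trace above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model state; none of these names exist in the kernel. */
static atomic_bool irq_work_pending; /* stage 1: lock-free pending flag */
static bool sched_lock_held;         /* a parked CPU "holds" a lock that schedule_work() needs */
static bool work_pending;            /* stage 2: work item is queued */
static int resync_count;             /* keyboard resyncs performed so far */

/*
 * Models schedule_work(): it needs a lock, so it cannot make progress
 * while a parked CPU holds that lock. Returns false in the case where
 * the real call would spin forever.
 */
static bool model_schedule_work(void)
{
	if (sched_lock_held)
		return false;
	work_pending = true;
	return true;
}

/*
 * Models irq_work_queue(): one atomic exchange and no locks.
 * Returns true if newly queued, false if it was already pending.
 */
static bool model_irq_work_queue(void)
{
	return !atomic_exchange(&irq_work_pending, true);
}

/* Models the hardirq callback that runs after the CPUs have resumed. */
static void model_run_irq_work(void)
{
	if (atomic_exchange(&irq_work_pending, false))
		model_schedule_work();
}

/* Models the workqueue running the resync in task context. */
static void model_run_workqueue(void)
{
	if (work_pending) {
		work_pending = false;
		resync_count++;
	}
}

/* Walks through one debugger exit and returns the number of resyncs. */
static int model_demo(void)
{
	resync_count = 0;
	sched_lock_held = true;           /* inside the debug trap, CPUs parked */
	assert(!model_schedule_work());   /* a direct call would deadlock */
	assert(model_irq_work_queue());   /* stage 1 is safe from the trap */
	assert(!model_irq_work_queue());  /* second request is coalesced */
	sched_lock_held = false;          /* debugger exits, CPUs resume */
	model_run_irq_work();             /* hardirq context: stage 2 is queued */
	model_run_workqueue();            /* task context: resync happens */
	return resync_count;
}
```

Calling model_demo() walks through a single debugger exit: the direct call is refused while the lock is held, and the deferred path performs the resync once the parked CPU has resumed.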
