Message-ID: <56ed54fd241c462189d2d030ad51eac6@h3c.com>
Date: Thu, 14 Mar 2024 07:06:22 +0000
From: Liuye <liu.yeC@....com>
To: Daniel Thompson <daniel.thompson@...aro.org>
CC: "jason.wessel@...driver.com" <jason.wessel@...driver.com>,
        "dianders@...omium.org" <dianders@...omium.org>,
        "gregkh@...uxfoundation.org"
	<gregkh@...uxfoundation.org>,
        "jirislaby@...nel.org" <jirislaby@...nel.org>,
        "kgdb-bugreport@...ts.sourceforge.net"
	<kgdb-bugreport@...ts.sourceforge.net>,
        "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>,
        "linux-serial@...r.kernel.org"
	<linux-serial@...r.kernel.org>
Subject: Re: Re: Re: Re: Re: [PATCH] kdb: Fix the deadlock issue in KDB debugging.

>On Wed, Mar 13, 2024 at 01:22:17AM +0000, Liuye wrote:
>> >On Tue, Mar 12, 2024 at 10:04:54AM +0000, Liuye wrote:
>> >> >On Tue, Mar 12, 2024 at 08:37:11AM +0000, Liuye wrote:
>> >> >> I know that you said schedule_work is not NMI safe, which is the
>> >> >> first issue. Perhaps it can be fixed using irq_work_queue. But 
>> >> >> even if irq_work_queue is used to implement it, there will still 
>> >> >> be a deadlock problem because slave cpu1 still has not released 
>> >> >> the running queue lock of master CPU0.
>> >> >
>> >> >This doesn't sound right to me. Why do you think CPU1 won't 
>> >> >release the run queue lock?
>> >>
>> >> In this example, CPU1 is waiting for CPU0 to release 
>> >> dbg_slave_lock.
>> >
>> >That shouldn't be a problem. CPU0 will have released that lock by the 
>> >time the irq work is dispatched.
>>
>> Release dbg_slave_lock in CPU0. Before that, schedule_work needs to be
>> handled, and we are back to the previous issue.
>
>Sorry but I still don't understand what problem you think can happen here. What is wrong with calling schedule_work() from the IRQ work handler?
>
>Both irq_work_queue() and schedule_work() are calls to queue deferred work. It does not matter when the work is queued (providing we are lock safe). What matters is when the work is actually executed.
>
>Please can you describe the problem you think exists based on when the work is executed.

CPU0 enters KDB while handling a serial port interrupt and sends an IPI (NMI) to the other CPUs.
Once things settle, CPU0 is in interrupt context and the other CPUs are in NMI context.
Before one of the other CPUs entered NMI context, it may have acquired CPU0's run queue lock.
When CPU0 then runs kgdboc_restore_input() and calls schedule_work(), need_more_worker() decides whether a worker on system_wq gets woken up.
If it does, CPU0 tries to acquire its own run queue lock, which is still held by the other CPU.
That CPU is stuck in NMI context and cannot release it, because it is waiting for CPU0 to release dbg_slave_lock, which CPU0 only does after schedule_work().

After thinking about it, the problem is not whether schedule_work() is NMI safe. The problem is that workers on system_wq should not be woken up immediately when schedule_work() is called.
I replaced schedule_work() with schedule_delayed_work(), and this solved my problem.

The new patch is as follows:

Index: drivers/tty/serial/kgdboc.c
===================================================================
--- drivers/tty/serial/kgdboc.c (revision 57862)
+++ drivers/tty/serial/kgdboc.c (working copy)
@@ -92,12 +92,12 @@
        mutex_unlock(&kgdboc_reset_mutex);
 }

-static DECLARE_WORK(kgdboc_restore_input_work, kgdboc_restore_input_helper);
+static DECLARE_DELAYED_WORK(kgdboc_restore_input_work, kgdboc_restore_input_helper);

 static void kgdboc_restore_input(void)
 {
        if (likely(system_state == SYSTEM_RUNNING))
-               schedule_work(&kgdboc_restore_input_work);
+               schedule_delayed_work(&kgdboc_restore_input_work, 2 * HZ);
 }

 static int kgdboc_register_kbd(char **cptr)
@@ -128,7 +128,7 @@
                        i--;
                }
        }
-       flush_work(&kgdboc_restore_input_work);
+       flush_delayed_work(&kgdboc_restore_input_work);
 }
 #else /* ! CONFIG_KDB_KEYBOARD */
 #define kgdboc_register_kbd(x) 0
