lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHp75VdZF-Qi-9ahhXTLxdQqVb7wBJu7KqjD8xj6byVC5W+-yw@mail.gmail.com>
Date: Wed, 10 Apr 2024 06:59:30 +0300
From: Andy Shevchenko <andy.shevchenko@...il.com>
To: liu.yec@....com
Cc: daniel.thompson@...aro.org, gregkh@...uxfoundation.org, 
	dianders@...omium.org, jason.wessel@...driver.com, jirislaby@...nel.org, 
	kgdb-bugreport@...ts.sourceforge.net, linux-kernel@...r.kernel.org, 
	linux-serial@...r.kernel.org
Subject: Re: [PATCH V10] kdb: Fix the deadlock issue in KDB debugging.

On Wed, Apr 10, 2024 at 5:07 AM <liu.yec@....com> wrote:
>
> From: LiuYe <liu.yeC@....com>
>
> Currently, if CONFIG_KDB_KEYBOARD is enabled, then kgdboc will
> attempt to use schedule_work() to provoke a keyboard reset when
> transitioning out of the debugger and back to normal operation.
> This can cause deadlock because schedule_work() is not NMI-safe.
>
> The stack trace below shows an example of the problem. In this
> case the master cpu is not running from NMI but it has parked
> the slave CPUs using an NMI and the parked CPUs is holding
> spinlocks needed by schedule_work().
>
> Example:
> BUG: spinlock lockup suspected on CPU#0. owner_cpu: 1
> CPU1: Call Trace:
> __schedule
> schedule
> schedule_hrtimeout_range_clock
> mutex_unlock
> ep_scan_ready_list
> schedule_hrtimeout_range
> ep_poll
> wake_up_q
> SyS_epoll_wait
> entry_SYSCALL_64_fastpath
>
> CPU0: Call Trace:
> dump_stack
> spin_dump
> do_raw_spin_lock
> _raw_spin_lock
> try_to_wake_up
> wake_up_process
> insert_work
> __queue_work
> queue_work_on
> kgdboc_post_exp_handler
> kgdb_cpu_enter
> kgdb_handle_exception
> __kgdb_notify
> kgdb_notify
> notifier_call_chain
> notify_die
> do_int3
> int3
>
> We fix the problem by using irq_work to call schedule_work()
> instead of calling it directly. This is because we cannot
> resynchronize the keyboard state from the hardirq context
> provided by irq_work. This must be done from the task context
> in order to call the input subsystem.
>
> Therefore, we have to defer the work twice. First, safely
> switch from the debug trap context (similar to NMI) to the
> hardirq, and then switch from the hardirq to the system work queue.

..

> Signed-off-by: Greg KH <gregkh@...uxfoundation.org>
> Signed-off-by: Andy Shevchenko <andy.shevchenko@...il.com>

> V9 -> V10 : Add Signed-off-by of Greg KH and Andy Shevchenko, Acked-by of Daniel Thompson

Huh?!

-- 
With Best Regards,
Andy Shevchenko

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ