lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2024041014-padlock-aggregate-4705@gregkh>
Date: Wed, 10 Apr 2024 07:30:22 +0200
From: Greg KH <gregkh@...uxfoundation.org>
To: liu.yec@....com
Cc: andy.shevchenko@...il.com, daniel.thompson@...aro.org,
	dianders@...omium.org, jason.wessel@...driver.com,
	jirislaby@...nel.org, kgdb-bugreport@...ts.sourceforge.net,
	linux-kernel@...r.kernel.org, linux-serial@...r.kernel.org
Subject: Re: [PATCH V10] kdb: Fix the deadlock issue in KDB debugging.

On Wed, Apr 10, 2024 at 10:06:15AM +0800, liu.yec@....com wrote:
> From: LiuYe <liu.yeC@....com>
> 
> Currently, if CONFIG_KDB_KEYBOARD is enabled, then kgdboc will
> attempt to use schedule_work() to provoke a keyboard reset when
> transitioning out of the debugger and back to normal operation.
> This can cause deadlock because schedule_work() is not NMI-safe.
> 
> The stack trace below shows an example of the problem. In this
> case the master cpu is not running from NMI but it has parked
> the slave CPUs using an NMI and the parked CPUs is holding
> spinlocks needed by schedule_work().
> 
> Example:
> BUG: spinlock lockup suspected on CPU#0. owner_cpu: 1
> CPU1: Call Trace:
> __schedule
> schedule
> schedule_hrtimeout_range_clock
> mutex_unlock
> ep_scan_ready_list
> schedule_hrtimeout_range
> ep_poll
> wake_up_q
> SyS_epoll_wait
> entry_SYSCALL_64_fastpath
> 
> CPU0: Call Trace:
> dump_stack
> spin_dump
> do_raw_spin_lock
> _raw_spin_lock
> try_to_wake_up
> wake_up_process
> insert_work
> __queue_work
> queue_work_on
> kgdboc_post_exp_handler
> kgdb_cpu_enter
> kgdb_handle_exception
> __kgdb_notify
> kgdb_notify
> notifier_call_chain
> notify_die
> do_int3
> int3
> 
> We fix the problem by using irq_work to call schedule_work()
> instead of calling it directly. This is because we cannot
> resynchronize the keyboard state from the hardirq context
> provided by irq_work. This must be done from the task context
> in order to call the input subsystem.
> 
> Therefore, we have to defer the work twice. First, safely
> switch from the debug trap context (similar to NMI) to the
> hardirq, and then switch from the hardirq to the system work queue.
> 
> Signed-off-by: LiuYe <liu.yeC@....com>
> Co-developed-by: Daniel Thompson <daniel.thompson@...aro.org>
> Signed-off-by: Daniel Thompson <daniel.thompson@...aro.org>
> Signed-off-by: Greg KH <gregkh@...uxfoundation.org>

I have NOT signed off on this commit.  You just said that I made a legal
statement about this commit without that actually being true???

Sorry, but that is flat out not acceptable at all.  Please go work with
your company lawyers to figure out what you did and come back with an
explaination of exactly what this is and how it will not happen again.

greg k-h

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ