[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bb5cadc3-0940-7f5c-7a1b-8120ddac9039@samsung.com>
Date: Fri, 6 May 2022 13:25:25 +0200
From: Marek Szyprowski <m.szyprowski@...sung.com>
To: Petr Mladek <pmladek@...e.com>,
John Ogness <john.ogness@...utronix.de>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel@...r.kernel.org,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"linux-arm-msm@...r.kernel.org" <linux-arm-msm@...r.kernel.org>,
Bjorn Andersson <bjorn.andersson@...aro.org>,
Andy Gross <agross@...nel.org>
Subject: Re: [PATCH printk v5 1/1] printk: extend console_lock for
per-console locking
Hi All,
On 27.04.2022 09:08, Marek Szyprowski wrote:
> On 26.04.2022 15:16, Petr Mladek wrote:
>> On Tue 2022-04-26 14:07:42, Petr Mladek wrote:
>>> On Mon 2022-04-25 23:04:28, John Ogness wrote:
>>>> Currently threaded console printers synchronize against each
>>>> other using console_lock(). However, different console drivers
>>>> are unrelated and do not require any synchronization between
>>>> each other. Removing the synchronization between the threaded
>>>> console printers will allow each console to print at its own
>>>> speed.
>>>>
>>>> But the threaded consoles printers do still need to synchronize
>>>> against console_lock() callers. Introduce a per-console mutex
>>>> and a new console boolean field @blocked to provide this
>>>> synchronization.
>>>>
>>>> console_lock() is modified so that it must acquire the mutex
>>>> of each console in order to set the @blocked field. Console
>>>> printing threads will acquire their mutex while printing a
>>>> record. If @blocked was set, the thread will go back to sleep
>>>> instead of printing.
>>>>
>>>> The reason for the @blocked boolean field is so that
>>>> console_lock() callers do not need to acquire multiple console
>>>> mutexes simultaneously, which would introduce unnecessary
>>>> complexity due to nested mutex locking. Also, a new field
>>>> was chosen instead of adding a new @flags value so that the
>>>> blocked status could be checked without concern of reading
>>>> inconsistent values due to @flags updates from other contexts.
>>>>
>>>> Threaded console printers also need to synchronize against
>>>> console_trylock() callers. Since console_trylock() may be
>>>> called from any context, the per-console mutex cannot be used
>>>> for this synchronization. (mutex_trylock() cannot be called
>>>> from atomic contexts.) Introduce a global atomic counter to
>>>> identify if any threaded printers are active. The threaded
>>>> printers will also check the atomic counter to identify if the
>>>> console has been locked by another task via console_trylock().
>>>>
>>>> Note that @console_sem is still used to provide synchronization
>>>> between console_lock() and console_trylock() callers.
>>>>
>>>> A locking overview for console_lock(), console_trylock(), and the
>>>> threaded printers is as follows (pseudo code):
>>>>
>>>> console_lock()
>>>> {
>>>> down(&console_sem);
>>>> for_each_console(con) {
>>>> mutex_lock(&con->lock);
>>>> con->blocked = true;
>>>> mutex_unlock(&con->lock);
>>>> }
>>>> /* console_lock acquired */
>>>> }
>>>>
>>>> console_trylock()
>>>> {
>>>> if (down_trylock(&console_sem) == 0) {
>>>> if (atomic_cmpxchg(&console_kthreads_active, 0,
>>>> -1) == 0) {
>>>> /* console_lock acquired */
>>>> }
>>>> }
>>>> }
>>>>
>>>> threaded_printer()
>>>> {
>>>> mutex_lock(&con->lock);
>>>> if (!con->blocked) {
>>>> /* console_lock() callers blocked */
>>>>
>>>> if
>>>> (atomic_inc_unless_negative(&console_kthreads_active)) {
>>>> /* console_trylock() callers blocked */
>>>>
>>>> con->write();
>>>>
>>>> atomic_dec(&console_lock_count);
>>>> }
>>>> }
>>>> mutex_unlock(&con->lock);
>>>> }
>>>>
>>>> The console owner and waiter logic now only applies between contexts
>>>> that have taken the console_lock via console_trylock(). Threaded
>>>> printers never take the console_lock, so they do not have a
>>>> console_lock to handover. Tasks that have used console_lock() will
>>>> block the threaded printers using a mutex and if the console_lock
>>>> is handed over to an atomic context, it would be unable to unblock
>>>> the threaded printers. However, the console_trylock() case is
>>>> really the only scenario that is interesting for handovers anyway.
>>>>
>>>> @panic_console_dropped must change to atomic_t since it is no longer
>>>> protected exclusively by the console_lock.
>>>>
>>>> Since threaded printers remain asleep if they see that the console
>>>> is locked, they now must be explicitly woken in __console_unlock().
>>>> This means wake_up_klogd() calls following a console_unlock() are
>>>> no longer necessary and are removed.
>>>>
>>>> Also note that threaded printers no longer need to check
>>>> @console_suspended. The check for the @blocked field implicitly
>>>> covers the suspended console case.
>>>>
>>>> Signed-off-by: John Ogness <john.ogness@...utronix.de>
>>> Nice, it it better than v4. I am going to push this for linux-next.
>>>
>>> Reviewed-by: Petr Mladek <pmladek@...e.com>
>> JFYI, I have just pushed this patch instead of the one
>> from v4 into printk/linux.git, branch rework/kthreads.
>>
>> It means that this branch has been rebased. It will be
>> used in the next refresh of linux-next.
>
> This patchset landed in linux next-20220426. In my tests I've found
> that it causes deadlock on all my Amlogic Meson G12B/SM1 based boards:
> Odroid C4/N2 and Khadas VIM3/VIM3l. The deadlock happens when system
> boots to userspace and getty (with automated login) is executed. I
> even see the bash prompt, but then the console is freezed. Reverting
> this patch (e00cc0e1cbf4) on top of linux-next (together with
> 6b3d71e87892 to make revert clean) fixes the issue.
>
The Amlogic Meson related issue has been investigated and fixed:
https://lore.kernel.org/all/b7c81f02-039e-e877-d7c3-6834728d2117@samsung.com/
but I just found that there is one more issue.
It appears on QCom-based DragonBoard 410c SBC
(arch/arm64/boot/dts/qcom/apq8016-sbc.dts). To see it on today's linux
next-20220506, one has to revert
42cd402b8fd4672b692400fe5f9eecd55d2794ac, otherwise lockdep triggers
other warning and it is turned off too early:
================================
WARNING: inconsistent lock state
5.18.0-rc5-next-20220506+ #11869 Not tainted
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
swapper/0/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
ffff80000aaa8478 (&port_lock_key){?.+.}-{2:2}, at: msm_uart_irq+0x38/0x750
{HARDIRQ-ON-W} state was registered at:
lock_acquire.part.0+0xe0/0x230
lock_acquire+0x68/0x84
_raw_spin_lock+0x5c/0x80
__msm_console_write+0x1ac/0x220
msm_console_write+0x48/0x60
__console_emit_next_record+0x188/0x420
printk_kthread_func+0x3a0/0x3bc
kthread+0x118/0x11c
ret_from_fork+0x10/0x20
irq event stamp: 12182
hardirqs last enabled at (12181): [<ffff800008e3d2a8>]
cpuidle_enter_state+0xc4/0x30c
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.18.0-rc5-next-20220506+ #11869
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
Call trace:
dump_backtrace.part.0+0xd0/0xe0
show_stack+0x18/0x6c
dump_stack_lvl+0x8c/0xb8
dump_stack+0x18/0x34
print_usage_bug.part.0+0x208/0x22c
mark_lock+0x710/0x954
__lock_acquire+0x9fc/0x20cc
lock_acquire.part.0+0xe0/0x230
lock_acquire+0x68/0x84
_raw_spin_lock_irqsave+0x80/0xcc
msm_uart_irq+0x38/0x750
__handle_irq_event_percpu+0xac/0x3d0
handle_irq_event+0x4c/0x120
handle_fasteoi_irq+0xa4/0x1a0
generic_handle_domain_irq+0x3c/0x60
gic_handle_irq+0x44/0xc4
call_on_irq_stack+0x2c/0x54
do_interrupt_handler+0x80/0x84
el1_interrupt+0x34/0x64
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x64/0x68
cpuidle_enter_state+0xcc/0x30c
cpuidle_enter+0x38/0x50
do_idle+0x22c/0x2bc
cpu_startup_entry+0x28/0x30
rest_init+0x110/0x190
arch_post_acpi_subsys_init+0x0/0x18
start_kernel+0x6c4/0x704
__primary_switched+0xc0/0xc8
INIT: version 2.88 booting
[info] Using makefile-style concurrent boot in runlevel S.
Reverting the following patches on top of linux next-20220506
fixes/hides this lockdep warning:
6b3d71e87892 ("printk: remove @console_locked")
8e274732115f ("printk: extend console_lock for per-console locking")
09c5ba0aa2fc ("printk: add kthread console printers")
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Powered by blists - more mailing lists