Message-Id: <20231106210730.115192-1-john.ogness@linutronix.de>
Date: Mon, 6 Nov 2023 22:13:21 +0106
From: John Ogness <john.ogness@...utronix.de>
To: Petr Mladek <pmladek@...e.com>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel@...r.kernel.org, Mukesh Ojha <quic_mojha@...cinc.com>
Subject: [PATCH printk v2 0/9] fix console flushing
Hi,
While testing various flushing scenarios, I stumbled on a
couple of issues that cause console flushing to fail. While
discussing the v1 [0] series, a couple more issues arose.
This series addresses all the issues:
1. The prb_next_seq() optimization caused inconsistent return
values. Fix prb_next_seq() to the originally intended
behavior but keep an optimization.
2. pr_flush() might not wait until the most recently stored
printk() message if non-finalized records precede it. Fix
pr_flush() to wait for all records to print that are at
least reserved at the time of the call.
3. In panic, the panic messages will not print if non-finalized
records precede them. Add a special condition so that
readers on the panic CPU can drop non-finalized records.
4. It is possible (and easy to reproduce) for the console
owner on the panic CPU to hand over to a waiter on a stopped
CPU. Do not use the handover feature in panic.
5. If messages are being dropped during panic, non-panic CPUs
are silenced. But by then it is already too late and most
likely the panic messages have been dropped. Change the
non-panic CPU silencing logic to restrict non-panic CPUs
from flooding the ringbuffer.
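To illustrate issues 2 and 3, here is a minimal userspace sketch
of the reader-side behavior. It is a toy model only, not the
code from this series: the toy_* names, the two-state enum and
the flat state array are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record states for this toy model only. */
enum toy_state { TOY_RESERVED, TOY_FINALIZED };

struct toy_rb {
	const enum toy_state *state;	/* state[i] belongs to sequence number i */
	uint64_t next_reserve_seq;	/* one past the newest reserved record */
};

/*
 * Sequence number of the first record at or after @seq that a normal
 * reader cannot read yet. The walk stops at the first non-finalized
 * record, even if finalized records follow it.
 */
static uint64_t toy_next_readable_seq(const struct toy_rb *rb, uint64_t seq)
{
	while (seq < rb->next_reserve_seq && rb->state[seq] == TOY_FINALIZED)
		seq++;
	return seq;
}

/*
 * Flush target (issue 2): wait for everything that is at least
 * reserved at the time of the call, not only for what is readable.
 */
static uint64_t toy_flush_target(const struct toy_rb *rb)
{
	return rb->next_reserve_seq;
}

/*
 * Find the next readable record at or after *@seq and store its
 * sequence number in *@seq. A normal reader stops at a non-finalized
 * record. A reader on the panic CPU (issue 3) skips it instead.
 */
static bool toy_read_valid(const struct toy_rb *rb, uint64_t *seq,
			   bool in_panic)
{
	uint64_t s;

	assert(seq != NULL);

	for (s = *seq; s < rb->next_reserve_seq; s++) {
		if (rb->state[s] == TOY_FINALIZED) {
			*seq = s;
			return true;
		}
		if (!in_panic)
			return false;
	}
	return false;
}
```

With records 0 and 2 finalized and record 1 only reserved, a
normal reader stalls at sequence 1 while the flush target is 3.
A reader on the panic CPU skips record 1 and reads record 2.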
This series also performs some minor cleanups to remove
open-coded checks about the panic context and to improve the
documentation language regarding data-less records.
Because of the multiple refactorings done in recent history,
it would be helpful to provide the LTS maintainers with
properly backported patches. I am happy to do this.
The changes since v1:
- Rename NO_LPOS to EMPTY_LINE_LPOS.
- Add and clean up documentation to clarify language regarding
data-less records and special lpos values.
- Implement a new prb_next_seq() optimization to preserve the
intended behavior. This is essentially my RFC [1] with
memory barriers added and based on an alternate implementation
suggested by pmladek [2].
- Introduce new prb_next_reserve_seq() function to return the
sequence number after @head_id.
- Use prb_next_reserve_seq() instead of prb_next_seq() for
pr_flush().
- Implement dropping non-finalized records in panic within
_prb_read_valid() instead of printk_get_next_message(). This
also makes use of the new prb_next_reserve_seq().
- Use the alternate implementation from pmladek [3] to avoid
the handover feature in panic.
- Implement a new strategy to avoid dropping panic messages
when non-panic CPUs are flooding the ringbuffer.
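One way to picture restricting non-panic CPUs is a simple
budget, sketched below. This is an assumption for illustration
only: the cover letter does not describe the actual strategy,
and the toy_* names and the budget value are hypothetical. The
sketch is also not safe for concurrent callers. Real code would
need atomic operations.

```c
#include <stdbool.h>

/* Hypothetical limit. The cover letter does not state one. */
#define TOY_NONPANIC_BUDGET 4

struct toy_panic_gate {
	bool panic_in_progress;
	int panic_cpu;			/* CPU that called panic() */
	unsigned int nonpanic_stored;	/* records stored by other CPUs */
};

/* Return true if @cpu may store a new record in the ringbuffer. */
static bool toy_may_store(struct toy_panic_gate *g, int cpu)
{
	/* No restriction outside of panic or for the panic CPU itself. */
	if (!g->panic_in_progress || cpu == g->panic_cpu)
		return true;

	/* Non-panic CPUs are refused once their budget is used up. */
	if (g->nonpanic_stored >= TOY_NONPANIC_BUDGET)
		return false;

	g->nonpanic_stored++;
	return true;
}
```

In this model the panic CPU can always store its messages, while
the other CPUs together can add at most TOY_NONPANIC_BUDGET
records after the panic has begun.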
John Ogness
[0] https://lore.kernel.org/lkml/20231013204340.1112036-1-john.ogness@linutronix.de
[1] https://lore.kernel.org/lkml/20231019132545.1190490-1-john.ogness@linutronix.de
[2] https://lore.kernel.org/lkml/ZTkxOJbDLPy12n41@alley
[3] https://lore.kernel.org/lkml/ZS-r3QnpKzm7UVip@alley
John Ogness (8):
printk: ringbuffer: Do not skip non-finalized records with
prb_next_seq()
printk: ringbuffer: Clarify special lpos values
printk: For @suppress_panic_printk check for other CPU in panic
printk: Add this_cpu_in_panic()
printk: ringbuffer: Cleanup reader terminology
printk: Wait for all reserved records with pr_flush()
printk: Skip non-finalized records in panic
printk: Avoid non-panic CPUs flooding ringbuffer
Petr Mladek (1):
printk: Disable passing console lock owner completely during panic()
kernel/printk/internal.h | 1 +
kernel/printk/printk.c | 108 ++++++----
kernel/printk/printk_ringbuffer.c | 343 +++++++++++++++++++++++++-----
kernel/printk/printk_ringbuffer.h | 21 +-
4 files changed, 382 insertions(+), 91 deletions(-)
base-commit: b4908d68609b57ad1ba4b80bd72c4d2260387e31
--
2.39.2