Message-ID: <X8ElwBh9tw+OLHF+@alley>
Date: Fri, 27 Nov 2020 17:13:52 +0100
From: Petr Mladek <pmladek@...e.com>
To: Paul Gortmaker <paul.gortmaker@...driver.com>
Cc: linux-kernel@...r.kernel.org, Andi Kleen <ak@...ux.intel.com>,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
John Ogness <john.ogness@...utronix.de>
Subject: Re: [PATCH 0/3] clear_warn_once: add timed interval resetting
On Thu 2020-11-26 01:30:26, Paul Gortmaker wrote:
> The existing clear_warn_once functionality is currently a manually
> issued state reset via the file /sys/kernel/debug/clear_warn_once when
> debugfs is mounted. The idea being that a developer would be running
> some tests, like LTP or similar, and want to check reproducibility
> without having to reboot.
>
> But you currently can't make use of clear_warn_once unless you've got
> debugfs enabled and mounted - which may not be desired by some people
> in some deployment situations.
>
> The functionality added here allows for periodic resets in addition to
> the one-shot reset it already had. Then we allow for a boot-time setting
> of the periodic resets so it can be used even when debugfs isn't mounted.
>
> By having a periodic reset, we also open the door for having the various
> "once" functions act as long period ratelimited messages, where a sysadmin
> can pick an hour or a day reset if they are facing an issue and are
> wondering "did this just happen once, or am I only being informed once?"
What is the primary problem that you wanted to solve, please?
Do you have an example of a particular printk_once() that you were
interested in?
I guess that the main problem is that
/sys/kernel/debug/clear_warn_once is available only when debugfs is
mounted. And the periodic reset is just one possible solution
that looks like a nice-to-have. Do I understand it correctly, please?
I am not completely against the idea. But I have some concerns.
1. It allows converting printk_once() into printk_ratelimited()
with some special semantics and interface. It opens possibilities
for creativity. It might be good, but it might also create
problems that are hard to foresee now.
printk_ratelimited() is definitely problematic, see below.
2. printk_ratelimited() is typically used when a message might get
printed too often. It prevents overloading consoles and log daemons.
It also helps to see other messages that might otherwise get lost.
I have seen many discussions about what is the right ratelimit
for a particular message. I have to admit that it was mainly
related to console speed. The messages were lost with slow
consoles. People want to see more on fast consoles.
The periodic warn once should not have this problem because the
period would typically be long. And it would produce only
one message on each location.
The problem is that it is a global setting. It would reset
all printk_once() callers. And I see two problems here:
+ A periodic reset might cause related problems to be printed
in the wrong order: messages from victims first, messages
about the root of the problem later (from the next cycle).
It might create confusion.
+ People have problems setting the right ratelimit for
a particular message. It would be an even bigger problem
to set the right ratelimit for the entire system.
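To illustrate the ordering concern, here is a minimal userspace sketch. It is not the kernel code. The names warn_once(), reset_all_once(), once_done and log_order are made up for this example, and call sites are simply identified by a small integer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_SITES 4
#define MAX_LOG   16

static int once_done[MAX_SITES];  /* one "already printed" flag per call site */
static int log_order[MAX_LOG];    /* site ids in the order they were printed */
static int log_len;

/* Print a warning for this site only if it has not printed since the last reset. */
static void warn_once(int site)
{
	assert(site >= 0 && site < MAX_SITES);
	if (once_done[site])
		return;
	once_done[site] = 1;
	if (log_len < MAX_LOG)
		log_order[log_len++] = site;
	printf("warning from site %d\n", site);
}

/* Global reset: every call site is re-armed at the same moment. */
static void reset_all_once(void)
{
	memset(once_done, 0, sizeof(once_done));
}
```

If site 0 stands for the root cause and site 1 for a victim, then the sequence warn_once(0), warn_once(1), reset_all_once(), warn_once(1), warn_once(0) logs the sites as 0, 1, 1, 0. In the second cycle the victim is reported before the root cause, only because it happened to trigger first after the reset.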
I do not know. Maybe I am just too paranoid today. Anyway, there
are other possibilities:
+ Move clear_warn_once from debugfs to a location that is always
available, for example /proc.
+ Allow changing printk_once() to printk_n_times() globally. I mean
that it would print the same message only N times instead of once.
It would print only the first few occurrences, so it would not have
the problem with ordering.
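A rough userspace sketch of the N-times idea follows, only to make the semantics concrete. Nothing here exists in the kernel: print_n_times(), print_n_allowed(), print_n_limit and print_n_emitted are hypothetical names, and printf() stands in for printk().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical global limit N; not an existing kernel knob. */
static int print_n_limit = 3;

/* Total messages actually emitted, kept only so the sketch can be checked. */
static int print_n_emitted;

/* Return 1 while this call site has printed fewer than N messages. */
static int print_n_allowed(int *seen)
{
	assert(seen != NULL);
	if (*seen >= print_n_limit)
		return 0;
	(*seen)++;
	print_n_emitted++;
	return 1;
}

/* Each expansion gets its own static counter, so the limit is per call site. */
#define print_n_times(...)				\
	do {						\
		static int seen_;			\
		if (print_n_allowed(&seen_))		\
			printf(__VA_ARGS__);		\
	} while (0)

/* Example call site that may be hit many times. */
static void report_timeout(int attempt)
{
	print_n_times("device timeout (attempt %d)\n", attempt);
}
```

Calling report_timeout() ten times in a row prints only the first three attempts. Because nothing is ever re-armed, the first occurrences from every call site keep their natural order.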
Any other opinion?
Best Regards,
Petr