Message-ID: <5b00e7f5-385d-bbc8-886a-d09cc844d07d@yandex-team.ru>
Date: Thu, 13 Feb 2020 16:05:17 +0300
From: Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
To: Petr Mladek <pmladek@...e.com>
Cc: Kirill Tkhai <ktkhai@...tuozzo.com>,
Peter Zijlstra <peterz@...radead.org>,
linux-kernel@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Dmitry Monakhov <dmtrmonakhov@...dex-team.ru>
Subject: Re: [PATCH] kernel/watchdog: flush all printk nmi buffers when
hardlockup detected
On 12/02/2020 17.54, Petr Mladek wrote:
> On Tue 2020-02-11 15:36:02, Konstantin Khlebnikov wrote:
>> On 11/02/2020 11.14, Kirill Tkhai wrote:
>>> Hi, Konstantin,
>>>
>>> On 10.02.2020 12:48, Konstantin Khlebnikov wrote:
>>>> In NMI context printk() could save messages into per-cpu buffers and
>>>> schedule flush by irq_work when IRQ are unblocked. This means message
>>>> about hardlockup appears in kernel log only when/if lockup is gone.
>>>>
>>>> Comment in irq_work_queue_on() states that remote IPI aren't NMI safe
>>>> thus printk() cannot schedule flush work to another cpu.
>>>>
>>>> This patch adds a simple atomic counter of detected hardlockups and
>>>> flushes all per-cpu printk buffers in the context of the softlockup
>>>> watchdog on any other cpu when it sees a change of this counter.
>>>>
>>>> Signed-off-by: Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
>>>> ---
>>>> include/linux/nmi.h | 1 +
>>>> kernel/watchdog.c | 22 ++++++++++++++++++++++
>>>> kernel/watchdog_hld.c | 1 +
>>>> 3 files changed, 24 insertions(+)
>>>>
>>>> diff --git a/include/linux/nmi.h b/include/linux/nmi.h
>>>> index 9003e29cde46..8406df72ae5a 100644
>>>> --- a/include/linux/nmi.h
>>>> +++ b/include/linux/nmi.h
>>>> @@ -84,6 +84,7 @@ static inline void reset_hung_task_detector(void) { }
>>>> #if defined(CONFIG_HARDLOCKUP_DETECTOR)
>>>> extern void hardlockup_detector_disable(void);
>>>> extern unsigned int hardlockup_panic;
>>>> +extern atomic_t hardlockup_detected;
>>>> #else
>>>> static inline void hardlockup_detector_disable(void) {}
>>>> #endif
>>>> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
>>>> index b6b1f54a7837..9f5c68fababe 100644
>>>> --- a/kernel/watchdog.c
>>>> +++ b/kernel/watchdog.c
>>>> @@ -92,6 +92,26 @@ static int __init hardlockup_all_cpu_backtrace_setup(char *str)
>>>> }
>>>> __setup("hardlockup_all_cpu_backtrace=", hardlockup_all_cpu_backtrace_setup);
>>>> # endif /* CONFIG_SMP */
>>>> +
>>>> +atomic_t hardlockup_detected = ATOMIC_INIT(0);
>>>> +
>>>> +static inline void flush_hardlockup_messages(void)
>>>> +{
>>>> +	static atomic_t flushed = ATOMIC_INIT(0);
>>>> +
>>>> +	/* flush messages from hard lockup detector */
>>>> +	if (atomic_read(&hardlockup_detected) != atomic_read(&flushed)) {
>>>> +		atomic_set(&flushed, atomic_read(&hardlockup_detected));
>>>> +		printk_safe_flush();
>>>> +	}
>>>> +}
>>>
>>> Do we really need two variables here? They may come into two different
>>> cache lines, and there will be double cache pollution just because of
>>> this simple check. Why not the below?
>>
>> I don't think anybody could notice a read-only access to a second variable.
>> This executes once every several seconds.
>>
>> Watchdogs already use the same pattern (monotonic counter + snapshot) in
>> a couple of places, so the code looks cleaner this way.
>
> It is not only about speed. It is also about code complexity
> and correctness. Using two variables is more complex.
> Calling atomic_read(&hardlockup_detected) twice does
> not look like a correct pattern.
It guarantees "at least once" which is enough for this case.
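
For illustration only, the counter + snapshot pattern can be sketched in
userspace C11 as below. This is not the patch itself: it uses <stdatomic.h>
instead of the kernel's atomic_t, all names (lockup_events, report_lockup,
check_and_flush, flush_buffers) are hypothetical, and flush_buffers() is a
stand-in for printk_safe_flush() that only counts calls. Unlike the patch,
this variant reads the counter once and stores that value as the snapshot.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Bumped by the (hypothetical) detector each time it reports a lockup. */
static atomic_int lockup_events;

/* Stand-in for printk_safe_flush(); here it only counts how often it ran. */
static int flush_calls;

static void flush_buffers(void)
{
	flush_calls++;
}

/* Detector side: record one more event after its messages were written. */
static void report_lockup(void)
{
	atomic_fetch_add(&lockup_events, 1);
}

/*
 * Periodic checker: flush if the counter moved since the last snapshot.
 * The snapshot is stored before flushing, so an event that is counted
 * after the load below is picked up by the next call.
 */
static bool check_and_flush(void)
{
	static atomic_int flushed;
	int seen = atomic_load(&lockup_events);

	if (seen == atomic_load(&flushed))
		return false;

	atomic_store(&flushed, seen);
	flush_buffers();
	return true;
}
```

Several events between two checks collapse into a single flush, and an
event that races with a check causes at worst one extra flush on the next
check, which is the "at least once" behaviour described above.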
>
> The single variable pattern is used for similar things there
> as well, for example, see hard_watchdog_warn,
> hardlockup_allcpu_dumped.
Ouch. This works only once and there is no way to rearm it.
Now I see why this thing has never worked for me in recent years =)
Maybe it's better to reset sysctl_hardlockup_all_cpu_backtrace
to let the user set the sysctl back to 1.
Or rearm it in the softlockup watchdog after a timeout.
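
A rough userspace sketch of the one-shot behaviour and of a possible rearm
step, assuming a single flag that is set on first use. The names dumped,
try_dump_once() and rearm_dump() are hypothetical, and C11 atomic_flag is
used here in place of whatever primitive the kernel code actually uses.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* One-shot latch: set by the first caller that wins the race. */
static atomic_flag dumped = ATOMIC_FLAG_INIT;

/* Returns true only for the first caller since the last rearm. */
static bool try_dump_once(void)
{
	return !atomic_flag_test_and_set(&dumped);
}

/*
 * Hypothetical rearm hook, e.g. called after a timeout or when the
 * user writes the sysctl again. Without it the latch stays set forever.
 */
static void rearm_dump(void)
{
	atomic_flag_clear(&dumped);
}
```

Without a call like rearm_dump(), every try_dump_once() after the first
returns false, which matches the "works only once" observation.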
>
> I would call the variable "hardlockup_dump_flush" and
> use the same logic as for hardlockup_allcpu_dumped.
>
> Note that simple READ_ONCE(), WRITE_ONCE() are not enough
> because they do not provide smp barriers.
It's hard to imagine an arch which actually needs barriers here.
Softlockup timers will eventually see the change.
>
> Best Regards,
> Petr
>