[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <46df3838-e1a6-ee95-b398-bef0896d2b03@huawei.com>
Date: Mon, 10 May 2021 16:29:11 +0800
From: luojiaxing <luojiaxing@...wei.com>
To: Petr Mladek <pmladek@...e.com>,
Sergey Senozhatsky <senozhatsky@...omium.org>
CC: <sergey.senozhatsky@...il.com>, <rostedt@...dmis.org>,
<john.ogness@...utronix.de>, <linux-kernel@...r.kernel.org>,
<linuxarm@...wei.com>, <bobo.shaobowang@...wei.com>
Subject: Re: [PATCH] printk: stop spining waiter when console resume to flush
prb
On 2021/5/8 0:13, Petr Mladek wrote:
> On Thu 2021-05-06 23:07:19, Sergey Senozhatsky wrote:
>> Thanks for Cc-ing Petr
>>
>> On (21/05/06 15:39), Petr Mladek wrote:
>>> Many printk messages might get accumulated when consoles were suspended.
>>> They are proceed when console_unlock() is called in resume_console().
>>>
>>> The possibility to pass the console lock owner was added to reduce the risk
>>> of softlockup when too many messages were handled in an atomic context.
>>>
>>> Now, resume_console() is always in a preemptible context that is safe
>>> to handle all accumulated messages. The possibility to pass the console
>>> lock owner actually makes things worse. The new owner might be in an atomic
>>> context and might cause softlockup when processing all messages accumulated
>>> when the console was suspended.
>>>
>>> Create new console_unlock_preemptible() that will not allow to pass
>>> the console lock owner. As a result, all accumulated messages will
>>> be proceed in the safe preemptible process.
>> If we have a lot of pending messages in the logbuf, then there is
>> something chatty - some context (task, irq) or maybe several contexts.
>> And those contexts can continue adding messages, while we print them
>> _exclusively_ from preemptible context only. without ever throttling down
>> printk() callers - something that console_owner spinning and handover
>> does for us. And those printk() callers can even preempt
>> console_unlock_preemptible() and cause delays and lost messages.
> Luo, please, correct me if I am wrong.
Hi, Petr, I reply the test result on your first reply for this patch.
Please check it when you are free.
>
> This patch a about one well defined scenario. The messages are
> accumulated between suspend_console() and resume_console().
> It is "small" part of the system hibernation. And we need
> to get them out now. There might be many if something special
> was debugged.
>
> I am pretty sure that Luo did not see any flood of messages:
>
> + Flood in more contexts would be balanced by switching
> the console_owner.
>
> + Flood in one context would be naturally throttled because
> this context will become the console_owner.
>
> In each case, these messages would be generated after
> console_resume(). Luo's original patch was explicitly talking
> about messages accumulated during the suspend.
>
> Luo, could you please provide some log showing the problem?
Sure, But, it is not easy to find that the printk got such problem in
the user's logs.
So I'm attaching a log of the simulation test. You can clearly see that
the driver thread calling dev_info() is blocked.
My test method is as follows:
Kernel thread A causes the console to enter suspend and then resume it
10 seconds later.
Kernel thread B repeatedly invokes dev_info() for 15 seconds after the
console suspend.
Part of dmesg log I save before:
[ 288.013869] hisi_sas_v3_hw 0000:b4:02.0: print test
[ 288.013870] hisi_sas_v3_hw 0000:b4:02.0: print test
[ 288.013871] hisi_sas_v3_hw 0000:b4:02.0: print test
[ 288.013872] hisi_sas_v3_hw 0000:b4:02.0: print test
[ 288.013873] hisi_sas_v3_hw 0000:b4:02.0: print test
[ 288.013874] hisi_sas_v3_hw 0000:b4:02.0: print test
[ 288.013875] hisi_sas_v3_hw 0000:b4:02.0: print test
[ 288.013876] hisi_sas_v3_hw 0000:b4:02.0: print test
[ 288.013877] hisi_sas_v3_hw 0000:b4:02.0: print test
[ 289.670256] hisi_sas_v3_hw 0000:b4:02.0: print test
[ 299.286325] hisi_sas_v3_hw 0000:b4:02.0: print test
[ 299.291198] hisi_sas_v3_hw 0000:b4:02.0: print test
[ 299.296063] hisi_sas_v3_hw 0000:b4:02.0: print test
[ 299.300924] hisi_sas_v3_hw 0000:b4:02.0: print test
[ 299.305787] hisi_sas_v3_hw 0000:b4:02.0: print test
The log shows that when thread B invokes dev_info, dev_info is blocked
for several seconds.
Thanks
Jiaxing
>
> Best Regards,
> Petr
>
> .
>
Powered by blists - more mailing lists