[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <874krcsquv.fsf@gmx.net>
Date: Mon, 15 Jun 2020 09:58:00 +0200
From: Stephen Berman <stephen.berman@....net>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
linux-kernel@...r.kernel.org, linux-acpi@...r.kernel.org
Subject: Re: power-off delay/hang due to commit 6d25be57 (mainline)
On Sun, 14 Jun 2020 19:10:05 +0200 Sebastian Andrzej Siewior <bigeasy@...utronix.de> wrote:
> On 2020-06-14 14:12:18 [+0200], Stephen Berman wrote:
[...]
>> What am I supposed to do after "echo t > /proc/sysrq-trigger"? Both
>> before and after doing that I get an error trying to open the file:
>>
>> root [ ~ ]# cat /proc/sysrq-trigger
>> cat: /proc/sysrq-trigger: Input/output error
>
> echo "t > /proc/sysrq-trigger"
>
> not cat.
Ok, sorry, I had misunderstood, but now I've looked at the
documentation. I had in fact already done `echo t >
/proc/sysrq-trigger' in an xterm (as root) and there was no output.
Later, after booting kernel 5.1.0 because of the message flooding with
5.6.4, I did `echo t > /proc/sysrq-trigger' in a virtual tty (also as
root) and the only output was: `[ <timestamp>] sysrq: Show State'. Is
this expected? (In /proc/sys/kernel/sysrq there is `1'.) I couldn't
try it in a 5.6.4 virtual tty because of the message flooding (see
below).
[...]
>> I applied this patch to 5.6.4 and recompiled, and on the next boot with
>> that kernel, the kernel buffer (and kernel and system logs) began to get
>> flooded with these messages:
>>
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.943987]
>> acpi_os_execute_deferred(843) Start ffff8fb82c7b6500
>> 000000003edf1e05(ffff8fb82c492990)
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944102] acpi_os_execute(1101)
>> Adding acpi_ev_asynch_enable_gpe+0x0/0x2f ffff8fb82c492990
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944104]
>> acpi_os_execute_deferred(845) End ffff8fb82c7b6500
>> 000000003edf1e05(ffff8fb82c492990)
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944105]
>> acpi_os_execute_deferred(843) Start ffff8fb82b844800
>> 000000002ba560ea(ffff8fb82c492990)
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944124]
>> acpi_os_execute_deferred(845) End ffff8fb82b844800
>> 000000002ba560ea(ffff8fb82c492990)
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944288]
>> acpi_os_execute_deferred(843) Start ffff8fb82c7b6540
>> 000000003edf1e05(ffff8fb82c492990)
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944387] acpi_os_execute(1101)
>> Adding acpi_ev_asynch_enable_gpe+0x0/0x2f ffff8fb82c492990
>>
>> and so on without stopping. I could start X and avoid seeing the
>> messages, but was afraid the logs would fill up the root partition if I
>> let it keep going, so I rebooted with another kernel.
>>
>> Was this message flood because I booted with "ignore_loglevel
>> initcall_debug"? In the logs there are also lots of messages like this:
>
> Is there a acpi_os_execute_* flood? The *few* at what appears to system
> startup and might be normal. If there appear *many* more and are
> constantly printing (check with dmesg) then we might be to something.
The kernel log shows 305 of these messages in the 4 minutes and 17
seconds between the start of klogd and when I rebooted.
[...]
> I attached the updated acpi patch. It limits the prints to the
> kacpi_notify_wq queue which appears to stuck at shutdown.
I applied this patch and rebuilt the kernel. After booting 5.6.4 this
morning there was again message flooding and this time I was unable to
log in because of it, so had to do a hard reboot and booted with 5.1.0.
The kernel log shows 207 acpi_os_execute_deferred_notify messages in 3
minutes and 6 seconds; here are the first:
Jun 15 08:55:22 strobe-jhalfs kernel: [ 5.943848] acpi_os_execute(1109) Adding acpi_ev_asynch_enable_gpe+0x0/0x2f ffff8d7aaba18990 <ffff8d7aab157fc0>
Jun 15 08:55:22 strobe-jhalfs kernel: [ 5.943851] acpi_os_execute_deferred_notify(851) Start ffff8d7aab157fc0 acpi_ev_asynch_enable_gpe+0x0/0x2f(ffff8d7aaba18990)
Jun 15 08:55:22 strobe-jhalfs kernel: [ 5.943865] acpi_os_execute_deferred_notify(853) End ffff8d7aab157fc0 acpi_ev_asynch_enable_gpe+0x0/0x2f(ffff8d7aaba18990)
Jun 15 08:55:22 strobe-jhalfs kernel: [ 5.944104] acpi_os_execute(1109) Adding acpi_ev_asynch_enable_gpe+0x0/0x2f ffff8d7aaba18990 <ffff8d7aab157fc0>
Jun 15 08:55:22 strobe-jhalfs kernel: [ 5.944106] acpi_os_execute_deferred_notify(851) Start ffff8d7aab157fc0 acpi_ev_asynch_enable_gpe+0x0/0x2f(ffff8d7aaba18990)
Jun 15 08:55:22 strobe-jhalfs kernel: [ 5.944115] acpi_os_execute_deferred_notify(853) End ffff8d7aab157fc0 acpi_ev_asynch_enable_gpe+0x0/0x2f(ffff8d7aaba18990)
Jun 15 08:55:22 strobe-jhalfs kernel: [ 5.944406] acpi_os_execute(1109) Adding acpi_ev_asynch_enable_gpe+0x0/0x2f ffff8d7aaba18990 <ffff8d7aab157fc0>
Jun 15 08:55:22 strobe-jhalfs kernel: [ 5.944408] acpi_os_execute_deferred_notify(851) Start ffff8d7aab157fc0 acpi_ev_asynch_enable_gpe+0x0/0x2f(ffff8d7aaba18990)
Jun 15 08:55:22 strobe-jhalfs kernel: [ 5.944417] acpi_os_execute_deferred_notify(853) End ffff8d7aab157fc0 acpi_ev_asynch_enable_gpe+0x0/0x2f(ffff8d7aaba18990)
> The interesting part is to see if there is a acpi_os_execute() adding a
> specific event multiple times which does not complete. Maybe at runtime,
> maybe at shutdown time. If that is the case then ignoring this specific
> event might fix the shutdown problem. With all this information so far,
> I don't see a relation with this problem and the commit…
In those 3 minutes and 8 seconds there were 5 "Adding
acpi_ev_asynch_enable_gpe" messages at the beginning, then 185 "Adding
acpi_ev_notify_dispatch" messages, which kept coming until I rebooted.
Steve Berman
Powered by blists - more mailing lists