lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <874krcsquv.fsf@gmx.net>
Date:   Mon, 15 Jun 2020 09:58:00 +0200
From:   Stephen Berman <stephen.berman@....net>
To:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org, linux-acpi@...r.kernel.org
Subject: Re: power-off delay/hang due to commit 6d25be57 (mainline)

On Sun, 14 Jun 2020 19:10:05 +0200 Sebastian Andrzej Siewior <bigeasy@...utronix.de> wrote:

> On 2020-06-14 14:12:18 [+0200], Stephen Berman wrote:
[...]
>> What am I supposed to do after "echo t > /proc/sysrq-trigger"?  Both
>> before and after doing that I get an error trying to open the file:
>> 
>> root [ ~ ]# cat /proc/sysrq-trigger
>> cat: /proc/sysrq-trigger: Input/output error
>
>  echo "t > /proc/sysrq-trigger"
>
> not cat.

Ok, sorry, I had misunderstood, but now I've looked at the
documentation.  I had in fact already done `echo t >
/proc/sysrq-trigger' in an xterm (as root) and there was no output.
Later, after booting kernel 5.1.0 because of the message flooding with
5.6.4, I did `echo t > /proc/sysrq-trigger' in a virtual tty (also as
root) and the only output was: `[ <timestamp>] sysrq: Show State'.  Is
this expected?  (In /proc/sys/kernel/sysrq there is `1'.)  I couldn't
try it in a 5.6.4 virtual tty because of the message flooding (see
below).

[...]
>> I applied this patch to 5.6.4 and recompiled, and on the next boot with
>> that kernel, the kernel buffer (and kernel and system logs) began to get
>> flooded with these messages:
>> 
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.943987]
>> acpi_os_execute_deferred(843) Start ffff8fb82c7b6500
>> 000000003edf1e05(ffff8fb82c492990)
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944102] acpi_os_execute(1101)
>> Adding acpi_ev_asynch_enable_gpe+0x0/0x2f ffff8fb82c492990
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944104]
>> acpi_os_execute_deferred(845) End ffff8fb82c7b6500
>> 000000003edf1e05(ffff8fb82c492990)
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944105]
>> acpi_os_execute_deferred(843) Start ffff8fb82b844800
>> 000000002ba560ea(ffff8fb82c492990)
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944124]
>> acpi_os_execute_deferred(845) End ffff8fb82b844800
>> 000000002ba560ea(ffff8fb82c492990)
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944288]
>> acpi_os_execute_deferred(843) Start ffff8fb82c7b6540
>> 000000003edf1e05(ffff8fb82c492990)
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944387] acpi_os_execute(1101)
>> Adding acpi_ev_asynch_enable_gpe+0x0/0x2f ffff8fb82c492990
>> 
>> and so on without stopping.  I could start X and avoid seeing the
>> messages, but was afraid the logs would fill up the root partition if I
>> let it keep going, so I rebooted with another kernel.
>> 
>> Was this message flood because I booted with "ignore_loglevel
>> initcall_debug"?  In the logs there are also lots of messages like this:
>
> Is there a acpi_os_execute_* flood? The *few* at what appears to system
> startup and might be normal. If there appear *many* more and are
> constantly printing (check with dmesg) then we might be to something.

The kernel log shows 305 of these messages in the 4 minutes and 17
seconds between the start of klogd and when I rebooted.

[...]
> I attached the updated acpi patch. It limits the prints to the
> kacpi_notify_wq queue which appears to stuck at shutdown.

I applied this patch and rebuilt the kernel.  After booting 5.6.4 this
morning there was again message flooding and this time I was unable to
log in because of it, so had to do a hard reboot and booted with 5.1.0.
The kernel log shows 207 acpi_os_execute_deferred_notify messages in 3
minutes and 6 seconds; here are the first:

Jun 15 08:55:22 strobe-jhalfs kernel: [    5.943848] acpi_os_execute(1109) Adding acpi_ev_asynch_enable_gpe+0x0/0x2f ffff8d7aaba18990 <ffff8d7aab157fc0>
Jun 15 08:55:22 strobe-jhalfs kernel: [    5.943851] acpi_os_execute_deferred_notify(851) Start ffff8d7aab157fc0 acpi_ev_asynch_enable_gpe+0x0/0x2f(ffff8d7aaba18990)
Jun 15 08:55:22 strobe-jhalfs kernel: [    5.943865] acpi_os_execute_deferred_notify(853) End   ffff8d7aab157fc0 acpi_ev_asynch_enable_gpe+0x0/0x2f(ffff8d7aaba18990)
Jun 15 08:55:22 strobe-jhalfs kernel: [    5.944104] acpi_os_execute(1109) Adding acpi_ev_asynch_enable_gpe+0x0/0x2f ffff8d7aaba18990 <ffff8d7aab157fc0>
Jun 15 08:55:22 strobe-jhalfs kernel: [    5.944106] acpi_os_execute_deferred_notify(851) Start ffff8d7aab157fc0 acpi_ev_asynch_enable_gpe+0x0/0x2f(ffff8d7aaba18990)
Jun 15 08:55:22 strobe-jhalfs kernel: [    5.944115] acpi_os_execute_deferred_notify(853) End   ffff8d7aab157fc0 acpi_ev_asynch_enable_gpe+0x0/0x2f(ffff8d7aaba18990)
Jun 15 08:55:22 strobe-jhalfs kernel: [    5.944406] acpi_os_execute(1109) Adding acpi_ev_asynch_enable_gpe+0x0/0x2f ffff8d7aaba18990 <ffff8d7aab157fc0>
Jun 15 08:55:22 strobe-jhalfs kernel: [    5.944408] acpi_os_execute_deferred_notify(851) Start ffff8d7aab157fc0 acpi_ev_asynch_enable_gpe+0x0/0x2f(ffff8d7aaba18990)
Jun 15 08:55:22 strobe-jhalfs kernel: [    5.944417] acpi_os_execute_deferred_notify(853) End   ffff8d7aab157fc0 acpi_ev_asynch_enable_gpe+0x0/0x2f(ffff8d7aaba18990)

> The interesting part is to see if there is a acpi_os_execute() adding a
> specific event multiple times which does not complete. Maybe at runtime,
> maybe at shutdown time. If that is the case then ignoring this specific
> event might fix the shutdown problem. With all this information so far,
> I don't see a relation with this problem and the commit…

In those 3 minutes and 8 seconds there were 5 "Adding
acpi_ev_asynch_enable_gpe" messages at the beginning, then 185 "Adding
acpi_ev_notify_dispatch" messages, which kept coming until I rebooted.

Steve Berman

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ