lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 6 May 2020 23:57:13 +0200
From:   Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To:     Stephen Berman <stephen.berman@....net>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org
Subject: Re: power-off delay/hang due to commit 6d25be57 (mainline)

On 2020-05-01 17:46:48 [+0200], Stephen Berman wrote:
> I'm experiencing a delay or hang in powering off my computer after `halt
> -d -f -i -p' and I've bisected it to this commit in the mainline tree:

You refer to a normal "poweroff" or is this some kind of "shutdown now"
kind of thing? Unless I'm mistaken, the `halt' command above will turn
off the machine without stopping user tasks (as in "reaching shutdown
level").

…
> The delays have varied in length from ~20 seconds to seeminingly
> indefinitely long (the longest I've waited before pressing the start
> button on the machine is 6 minutes).  With kernels prior to this commit,
> my machine powers off within 4 seconds after the halt invocation.

So you say that normally the machine shuts down immediately but now it
takes 20secs+ if it shuts down at all?

…
>   sr 5:0:0:0: tag#21 timing out command, waited 120s

The CD-drive is polled regularly so it notices when a CD us inserted /
removed. It seems that one of the requests wasn't answered and it ended
up in timeout.

…
> minutes.  But after running emacs in a tty (and then killing it), the
> delay was seemingly indefinite, as it is with my usual workflow with X,
> emacs, firefox, etc. running (but I always kill them before invoking
> `shutdown -h now' from a tty).  During the bisection I tested each
> kernel build by running startx, emacs, firefox, killing these, then
> `shutdown -h now'.  I've attached the git log of the bisection augmented
> with power-off timing notes.  (FWIW, when I reboot the machine with
> `shutdown -r now' there is no delay with the problematic kernels.)

So reboot is not affected, just the halt. Could you please check if the
"poweroff" variant is also affected?

…
> Another pointer I got was to the recent commit 62849a96, which fixes a
> bug due to the commit that causes my problem.  I applied 62849a96 to
> 5.6.4, but it did not prevent the delay/hang.

You should see a warning if you were facing the problem described in the
commit.

…
>                                     Please let me know if there's any
> other information I can provide or anything else I can do to help with
> debugging this issue.  I hope you can fix it.

Can you send dmesg output of the system / lspci -k? I'm mainly
interested what drivers are bound to storage devices (you can send it
offlist if you want).

Can you log the output on the serial console?

If the commit you cited is really the problem then it would mean that a
worker isn't scheduled for some reason. Could you please enable
CONFIG_WQ_WATCHDOG to see if workqueue core code notices that a worker
isn't making progress?

> Thanks,
> Steve Berman

Sebastian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ