Date: Fri, 2 Feb 2024 13:31:01 +0100
From: Konrad Dybcio <konrad.dybcio@...aro.org>
To: Tejun Heo <tj@...nel.org>, Konrad Dybcio <konradybcio@...nel.org>
Cc: linux-kernel@...r.kernel.org, Naohiro.Aota@....com, kernel-team@...a.com,
 Bjorn Andersson <andersson@...nel.org>
Subject: Re: Workqueue regression

On 2.02.2024 02:52, Tejun Heo wrote:
> Hello,
> 
> On Thu, Feb 01, 2024 at 09:57:59PM +0100, Konrad Dybcio wrote:
>> So, commit "Implement system-wide nr_active enforcement for unbound workqueues"
>> broke *something* and now performing a suspend-wakeup cycle on a Qualcomm
>> SC8280XP-based (arm64) platform hangs when performing the resume tasks,
>> presumably somewhere near PCIe reinitialization (but that may be a red herring).
>>
>> Reverting the commit (and the ones on top of it due to conflicts) fixes
>> the issue on next-20240130 and later (plus some out-of-tree patches that
>> are largely unrelated).
>>
>> Not sure where to start looking.
> 
> Hmm... sorry about that. Can you please boot with `no_console_suspend` and
> retry? Once the system gets stuck, you can wait for several minutes till the
> workqueue watchdog triggers and dumps the state or, if you can, trigger
> `sysrq-t` which has workqueue state dump at the end.
> 
> If the system doesn't become live enough after suspend/resume cycle to get
> more info, the following might help:

Looks like it's too far gone indeed.

> 
> $ echo test_resume > /sys/power/disk
> $ echo disk > /sys/power/state

Sadly, hibernation is not a thing on this platform. Without going into much
detail about how messy the power management stuff is, you can have either
"on", "off" or "power collapsed" (bound to s2idle). Trying to trigger this
sequence makes the thing lock up and die due to unclocked accesses, with or
without the WQ regression.
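
Whether a platform advertises hibernation at all can be checked from sysfs
before attempting the test_resume sequence quoted above. The following is a
minimal sketch, assuming the standard /sys/power layout; the function name
and its optional root-path argument are hypothetical and exist only for
illustration:

```shell
#!/bin/sh
# Sketch: report whether hibernation ("disk") and its test_resume mode are
# advertised by the kernel. The optional argument (hypothetical, for testing)
# overrides the sysfs directory; it defaults to /sys/power.
check_hibernate_support() {
    pm_root="${1:-/sys/power}"

    # /sys/power/state lists the supported sleep states, e.g. "freeze mem disk".
    if ! grep -qw disk "$pm_root/state" 2>/dev/null; then
        echo "hibernation not supported"
        return 1
    fi

    # /sys/power/disk lists the hibernation modes; the active one is bracketed.
    if ! grep -q test_resume "$pm_root/disk" 2>/dev/null; then
        echo "test_resume mode not available"
        return 1
    fi

    echo "test_resume available"
    return 0
}
```

Calling `check_hibernate_support` with no argument inspects the running
system. This only shows what the kernel advertises; it says nothing about
whether the sequence actually works on a given board.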

Konrad
