Message-ID: <20240308164040.GA683683@bhelgaas>
Date: Fri, 8 Mar 2024 10:40:40 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: Kai-Heng Feng <kai.heng.feng@...onical.com>
Cc: Michael Schaller <michael@...aller.de>, bhelgaas@...gle.com,
	linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
	regressions@...ts.linux.dev, macro@...am.me.uk,
	ajayagarwal@...gle.com, sathyanarayanan.kuppuswamy@...ux.intel.com,
	gregkh@...uxfoundation.org, hkallweit1@...il.com,
	michael.a.bottini@...ux.intel.com, johan+linaro@...nel.org
Subject: Re: [Regression] [PCI/ASPM] [ASUS PN51] Reboot on resume attempt
 (bisect done; commit found)

On Thu, Mar 07, 2024 at 02:51:05PM +0800, Kai-Heng Feng wrote:
> On Wed, Jan 10, 2024 at 8:40 PM Michael Schaller <michael@...aller.de> wrote:
> > On 10.01.24 04:43, Kai-Heng Feng wrote:
> > > On Fri, Jan 5, 2024 at 11:51 PM Bjorn Helgaas <helgaas@...nel.org> wrote:
> > >> On Fri, Jan 05, 2024 at 12:18:32PM +0100, Michael Schaller wrote:
> > >>> On 05.01.24 04:25, Kai-Heng Feng wrote:
> > >>>> Just wondering, does `echo 0 > /sys/power/pm_async` help?
> > >>>
> > >>> Yes, `echo 0 | sudo tee /sys/power/pm_async` does indeed also result in a
> > >>> working resume. I've tested this on kernel 6.6.9 (which still has commit
> > >>> 08d0cc5f3426). I've also attached the relevant dmesg output of the
> > >>> suspend/resume cycle in case this helps.
> > >>
> > >> Thanks for testing that!
> > >>
> > >>> Furthermore does this mean that commit 08d0cc5f3426 isn't at fault but
> > >>> rather that we are dealing with a timing issue?
> > >>
> > >> PCI does have a few software timing requirements, mostly related to
> > >> reset and power state (D0/D3cold).  ASPM has some timing parameters,
> > >> too, but I think they're all requirements on the hardware, not on
> > >> software.
> > >>
> > >> Adding an arbitrary delay anywhere shouldn't break anything, and other
> > >> than those few required situations, it shouldn't fix anything either.
> > >
> > > At least it means 08d0cc5f3426 isn't the culprit?
> > >
> > > Michael, does the issue happen when iwlwifi module is not loaded? It
> > > can be related to iwlwifi firmware.
> > >
> > The issue still happens if the iwlwifi module has been blacklisted and
> > after a reboot. This was again with vanilla kernel 6.6.9 and I've
> > confirmed via dmesg that iwlwifi wasn't loaded.
> 
> Can you please give latest mainline kernel a try? With commit
> f93e71aea6c60ebff8adbd8941e678302d377869 (Revert "PCI/ASPM: Remove
> pcie_aspm_pm_state_change()") reverted.
> 
> Also do you have efi-pstore enabled? Is there anything logged in
> /var/lib/systemd/pstore (assuming systemd is used)?

It seems possible that some recent ASPM fixes could help this issue.
These fixes are not upstream yet, but should appear in v6.9-rc1.

Your (Michael's) bisection identified 08d0cc5f3426 ("PCI/ASPM: Remove
pcie_aspm_pm_state_change()"), which appeared in v6.0.  This was
intended to solve the problem of ASPM config changes made via sysfs
getting lost.

We reverted 08d0cc5f3426 in v6.7 with f93e71aea6c6 ("Revert "PCI/ASPM:
Remove pcie_aspm_pm_state_change()"") to address the reboot-on-resume
problem that you reported.

e4dbf699467e ("PCI/ASPM: Update save_state when configuration
changes") is planned for v6.9-rc1 and should solve the same problem
08d0cc5f3426 tried to solve, but in a different way.

390fd84739c5 ("PCI/ASPM: Save L1 PM Substates Capability for
suspend/resume") is also planned for v6.9-rc1 and fixes some problems
with restoring L1 Substates config during resume.  These substates are
enabled for your 03:00.0 device, so this commit may also be related.
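
For reference, a rough way to confirm whether any L1 substate is
enabled on a device is to look at the L1SubCtl1 line in "lspci -vv"
output.  This is only a sketch: l1ss_enabled is a made-up helper
name, and it assumes the usual pciutils formatting where enabled
substates are marked with a trailing "+".

```shell
# Hypothetical helper (not part of pciutils or the kernel tree).
# Reads "lspci -vv" output on stdin and succeeds if the L1SubCtl1
# line shows at least one substate flagged "+" (enabled).
l1ss_enabled() {
    grep -Eq 'L1SubCtl1:.*\+'
}

# Example use for the device mentioned above (needs root to read
# the extended capabilities):
#   sudo lspci -vv -s 03:00.0 | l1ss_enabled && echo "L1 substates enabled"
```

Comparing that line before suspend and after resume would show
whether the L1 Substates config survived the cycle.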

That's all a long way to say that I think testing v6.9-rc1 or later
(or linux-next as of Mar 7 or later) would be very interesting.
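
If it helps, here is a small sketch for checking that the kernel you
booted is v6.9 or newer before running the suspend/resume test.  The
kernel_at_least name is made up for illustration, and the check only
looks at the major.minor part of "uname -r", so it says nothing about
distro kernels that carry backports or about linux-next snapshots.

```shell
# Hypothetical helper: succeed if the major.minor of version $1 is
# greater than or equal to the major.minor of version $2.
kernel_at_least() {
    kal_have=${1%%-*}               # drop suffixes like "-rc1" or "-generic"
    kal_have_major=${kal_have%%.*}
    kal_have_rest=${kal_have#*.}
    kal_have_minor=${kal_have_rest%%.*}

    kal_want_major=${2%%.*}
    kal_want_rest=${2#*.}
    kal_want_minor=${kal_want_rest%%.*}

    if [ "$kal_have_major" -gt "$kal_want_major" ]; then
        return 0
    fi
    if [ "$kal_have_major" -lt "$kal_want_major" ]; then
        return 1
    fi
    [ "$kal_have_minor" -ge "$kal_want_minor" ]
}

if kernel_at_least "$(uname -r)" 6.9; then
    echo "running kernel is v6.9 or newer"
else
    echo "running kernel is older than v6.9"
fi
```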

> > I've also checked if there is a newer firmware but Ubuntu 23.10 is
> > already using the newest firmware available from
> > https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/log/iwlwifi-8265-36.ucode
> > (version 36.ca7b901d.0 according to dmesg).
> >
> > Michael
> >
> > >>
> > >> Bjorn
