Message-ID: <20250228160133.GA51628@bhelgaas>
Date: Fri, 28 Feb 2025 10:01:33 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: Naveen Kumar P <naveenkumar.parna@...il.com>
Cc: linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
	kernelnewbies <kernelnewbies@...nelnewbies.org>,
	linux-acpi@...r.kernel.org
Subject: Re: PCI: hotplug_event: PCIe PLDA Device BAR Reset

On Wed, Feb 26, 2025 at 06:28:33PM +0530, Naveen Kumar P wrote:
> On Wed, Feb 26, 2025 at 2:08 AM Bjorn Helgaas <helgaas@...nel.org> wrote:
> > On Tue, Feb 25, 2025 at 06:46:02PM +0530, Naveen Kumar P wrote:
> > > On Tue, Feb 25, 2025 at 1:24 AM Bjorn Helgaas <helgaas@...nel.org> wrote:
> > > > On Tue, Feb 25, 2025 at 12:29:00AM +0530, Naveen Kumar P wrote:
> > > > > On Mon, Feb 24, 2025 at 11:03 PM Bjorn Helgaas <helgaas@...nel.org> wrote:
> > > > > > On Mon, Feb 24, 2025 at 05:45:35PM +0530, Naveen Kumar P wrote:
> > > > > > > On Wed, Feb 19, 2025 at 10:36 PM Bjorn Helgaas <helgaas@...nel.org> wrote:
> > > > > > > > On Wed, Feb 19, 2025 at 05:52:47PM +0530, Naveen Kumar P wrote:
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > I am writing to seek assistance with an issue we are
> > > > > > > > > experiencing with a PCIe device (PLDA Device 5555)
> > > > > > > > > connected through PCI Express Root Port 1 to the
> > > > > > > > > host bridge.
> > > > > > > > >
> > > > > > > > > We have observed that after booting the system, the
> > > > > > > > > Base Address Register (BAR0) memory of this device
> > > > > > > > > gets reset to 0x0 after approximately one hour or
> > > > > > > > > more (the timing is inconsistent). This was verified
> > > > > > > > > using the lspci output and the setpci -s 01:00.0
> > > > > > > > > BASE_ADDRESS_0 command.
> > > > > ...

> I have downloaded the 6.13 kernel source and added additional debug
> logs in hotplug_event(), then built the kernel. After that rebooted
> with the new kernel using the following parameters:
> BOOT_IMAGE=/vmlinuz-6.13.0+ root=/dev/mapper/vg00-rootvol ro quiet
> libata.force=noncq pci=nomsi pcie_aspm=off pcie_ports=on "dyndbg=file
> drivers/pci/* +p; file drivers/acpi/* +p"

Why "pci=nomsi"?  I don't think that should make a difference.  Also,
it is one of the reasons Linux doesn't request OS control of several
features that it ordinarily would, so you end up in a somewhat unusual
state (which *should* still work, of course):

  acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig Segments HPX-Type3]
  acpi PNP0A08:00: _OSC: not requesting OS control; OS requires [ExtendedConfig ASPM ClockPM MSI]

Same for "pcie_aspm=off".

Why "pcie_ports=on"?  That's not a valid parameter; "on" is not one of
the values pcie_ports= accepts ("compat", "native", "dpc-native"):
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/pcie/portdrv.c?id=v6.13#n619

> Complete dmesg log and the patch (to get additional debug information)
> are attached to this email.
> 
> Any further guidance on these observations?

I'm out of ideas.  I would instrument the PCI config accessors to log
all the reads and writes to your device (01:00.0) to see what we do to
the device.  Maybe there's some hint:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/access.c?id=v6.13#n35
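
To make the idea concrete, here is a rough user-space sketch of the
filtering and formatting such instrumentation might do.  It is not a
kernel patch: is_target() and format_cfg_access() are hypothetical
helper names, and in the kernel the same check would sit inside the
config accessors and log with a printk-style call instead of
snprintf().  The DEVFN() macro mirrors the kernel's devfn encoding
(slot in bits 7:3, function in bits 2:0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Same devfn encoding the kernel uses: slot in bits 7:3, function in bits 2:0. */
#define DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))

/* Device under investigation: 01:00.0 (bus 0x01, slot 0x00, function 0). */
#define TARGET_BUS   0x01
#define TARGET_DEVFN DEVFN(0x00, 0)

/* Return true only for config accesses aimed at the target device. */
static bool is_target(uint8_t bus, uint8_t devfn)
{
	return bus == TARGET_BUS && devfn == TARGET_DEVFN;
}

/*
 * Format one log line for a config access; 'op' is "read" or "write".
 * Returns the snprintf() result for the target device, or 0 (and an
 * empty string) when the access is for some other device.
 */
static int format_cfg_access(char *buf, size_t len, const char *op,
			     uint8_t bus, uint8_t devfn,
			     int where, int size, uint32_t val)
{
	assert(buf != NULL && len > 0);

	if (!is_target(bus, devfn)) {
		buf[0] = '\0';
		return 0;
	}

	return snprintf(buf, len,
			"cfg %s %02x:%02x.%x reg 0x%02x size %d val 0x%08x",
			op, (unsigned int)bus, (unsigned int)(devfn >> 3),
			(unsigned int)(devfn & 0x07), (unsigned int)where,
			size, (unsigned int)val);
}
```

BAR0 lives at config offset 0x10, so a line such as
"cfg write 01:00.0 reg 0x10 size 4 val 0x00000000" in the log would
point at the access that cleared it.  If no such write ever shows up,
that suggests the BAR is being reset by something other than a config
write from Linux.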

> Additionally, I noticed that the initial bootup logs with the
> "0.000000" timestamp are missing in the dmesg log with this new
> kernel. I'm unsure what might be causing this issue.

Probably overflowed the message buffer.  You can try increasing the
buffer size:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/kernel-parameters.txt?id=v6.13#n3190

You can also experiment with the dyndbg parameter to be more selective
about the ACPI messages if some aren't useful.
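
As a sketch of both suggestions, assuming the parameter meant above is
log_buf_len, the relevant part of the command line might look like
this (shown wrapped; it is a single line on the real command line, and
the other parameters are unchanged):

```
log_buf_len=16M "dyndbg=file drivers/pci/* +p;
file drivers/acpi/pci_root.c +p"
```

The 16M size is an arbitrary choice (the value must be a power of
two), and drivers/acpi/pci_root.c is only an illustrative file to keep
enabled; which ACPI files are worth keeping depends on which messages
turn out to be useful.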

Bjorn
