Message-ID: <02ec93d6-3cf6-410d-a887-1a625fb7be82@yahoo.com>
Date: Tue, 5 Nov 2024 05:24:42 -0600
From: Dullfire <dullfire@...oo.com>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: davem@...emloft.net, sparclinux@...r.kernel.org, netdev@...r.kernel.org,
linux-pci@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>
Subject: Re: Kernel panic with niu module
On 11/4/24 17:44, Bjorn Helgaas wrote:
> [+cc Thomas, author of 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X
> entries")]
>
> On Mon, Nov 04, 2024 at 05:34:42AM -0600, Dullfire wrote:
>> I have also bisected the kernel, and determined that upstream commit
>> 7d5ec3d3612396dc6d4b76366d20ab9fc06f399f revealed this issue. This commit
>> adds read to the mask status before any write to PCI_MSIX_ENTRY_DATA, thus
>> provoking the issue.
>
> 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries") appeared in
> v5.14 in 2021. Surely other drivers use MSI-X and would have been
> tested on sparcv9 since then? Just based on the age of 7d5ec3d36123,
> I would guess some kind of niu issue. But Thomas will know much more.
Yeah, I wasn't very clear: I believe this problem is specific to the niu
module. My suspicion is a hardware erratum and/or an issue in the built-in
hypervisor.
My T5240 has several other PCIe devices, none of which exhibit this issue.
I will have to check later whether any of them use MSI-X.
Speaking of test cases: it is worth pointing out that any write to ENTRY_DATA
appears to be sufficient to allow subsequent reads of that MSI-X table entry
to work. Notably, booting into a pre-7d5ec3d36123 kernel and then rebooting
into a newer kernel will succeed, because the registers were written to under
the old kernel. Once a kernel had successfully initialized the device, I had
to power off the unit to reproduce the issue.
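To make the reproduction conditions easier to reason about, here is a toy model of the behaviour described above. It is only a sketch: it is not kernel code, and it does not claim to describe what the niu hardware or the hypervisor actually does internally. The class and function names (ModelEntry, ReadFault, mask_read_first, program_data_first, survives) are made up for illustration. The register offsets follow the standard MSI-X table entry layout, and the assumption that only a power-off returns an entry to the failing state comes from my observations, not from any documentation.

```python
# Toy model only: NOT kernel code and NOT a description of the real niu hardware.
ENTRY_DATA = 0x8         # message data word of an MSI-X table entry
ENTRY_VECTOR_CTRL = 0xC  # vector control word; bit 0 is the mask bit
CTRL_MASKBIT = 0x1


class ReadFault(Exception):
    """Stands in for whatever failure a too-early read triggers."""


class ModelEntry:
    """One MSI-X table entry with the behaviour reported in this thread."""

    def __init__(self):
        self.power_cycle()

    def power_cycle(self):
        # Assumption: only a full power-off returns the entry to the bad state.
        self.regs = {ENTRY_DATA: 0, ENTRY_VECTOR_CTRL: CTRL_MASKBIT}
        self.data_written = False

    def write(self, offset, value):
        self.regs[offset] = value & 0xFFFFFFFF
        if offset == ENTRY_DATA:
            self.data_written = True

    def read(self, offset):
        if not self.data_written:
            raise ReadFault("entry read before any ENTRY_DATA write")
        return self.regs[offset]


def mask_read_first(entry):
    """Newer-kernel ordering as described above: read the mask state first."""
    ctrl = entry.read(ENTRY_VECTOR_CTRL)
    entry.write(ENTRY_VECTOR_CTRL, ctrl | CTRL_MASKBIT)


def program_data_first(entry, data):
    """Older-kernel ordering: ENTRY_DATA is written before any read."""
    entry.write(ENTRY_DATA, data)
    return entry.read(ENTRY_VECTOR_CTRL)


def survives(entry):
    """Return True if the read-first ordering completes without a fault."""
    try:
        mask_read_first(entry)
        return True
    except ReadFault:
        return False


if __name__ == "__main__":
    e = ModelEntry()
    print(survives(e))          # cold boot into a newer kernel: False
    program_data_first(e, 0x1)  # an older kernel initializes the device
    print(survives(e))          # warm reboot into a newer kernel: True
    e.power_cycle()
    print(survives(e))          # after a power-off: False
```

The point of the model is the test matrix: a warm reboot after any successful initialization hides the problem, so a reproduction attempt needs a power-off first.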
Regards,
Jonathan Currier