Message-ID: <cfe44924-3419-4f31-8ab3-87b769d21a5b@nvidia.com>
Date: Wed, 21 Jan 2026 14:00:06 +0000
From: Jon Hunter <jonathanh@...dia.com>
To: Radu Rendec <rrendec@...hat.com>, Thomas Gleixner <tglx@...utronix.de>,
Manivannan Sadhasivam <mani@...nel.org>
Cc: Daniel Tsai <danielsftsai@...gle.com>, Marek Behún
<kabel@...nel.org>, Krishna Chaitanya Chundru <quic_krichai@...cinc.com>,
Bjorn Helgaas <bhelgaas@...gle.com>, Rob Herring <robh@...nel.org>,
Krzysztof Wilczyński <kwilczynski@...nel.org>,
Lorenzo Pieralisi <lpieralisi@...nel.org>, Jingoo Han
<jingoohan1@...il.com>, Brian Masney <bmasney@...hat.com>,
Eric Chanudet <echanude@...hat.com>,
Alessandro Carminati <acarmina@...hat.com>, Jared Kangas
<jkangas@...hat.com>, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org,
"linux-tegra@...r.kernel.org" <linux-tegra@...r.kernel.org>
Subject: Re: [PATCH v3 3/3] PCI: dwc: Enable MSI affinity support
On 20/01/2026 22:30, Radu Rendec wrote:
...
>> So it looks like it is hanging when disabling the non-boot CPUs. So far
>> it only appears to happen on Tegra194.
>>
>> Let me know if you have any suggestions.
>
> Ouch. I'm afraid this is going to be much harder to figure out than the
> previous one, especially since I can't get access easily to a board to
> test on. I will try to reserve a board and reproduce the bug.
>
> Meanwhile, if you (or someone else in your team) can spare a few cycles,
> could you please try to reproduce the bug again with the debug patch
> below applied, and a few other changes:
> * enable debug messages in kernel/irq/cpuhotplug.c;
> * save the contents of /proc/interrupts to a file before suspending;
> * add "no_console_suspend" to the kernel command line (although it
> looks like you already have it).
>
> It will be much more verbose during suspend but hopefully we can at
> least figure out how far along it goes and how it's related to the MSI
> affinity configuration.

Thanks. I have dumped the boot log with the prints here:

https://pastebin.com/G8c2ssdt

And the dump of /proc/interrupts here:

https://pastebin.com/Wqzxw3r6

Looks like the last thing I see entering suspend is ...

  irq_chip_redirect_set_affinity: irq 162 mask 0x7f

That appears to be a PCIe interrupt. Let me know if there are more tests
I can run.
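
For anyone reading the log line above, the mask value is a hexadecimal CPU bitmask. A minimal sketch of decoding it follows, assuming the usual convention that bit N stands for CPU N. The helper name cpus_in_mask is hypothetical and is not part of the debug patch or the kernel.

```python
def cpus_in_mask(mask: int) -> list[int]:
    """Return the CPU numbers whose bits are set in an affinity bitmask.

    Assumes bit N of the mask corresponds to CPU N.
    """
    return [cpu for cpu in range(mask.bit_length()) if mask & (1 << cpu)]


if __name__ == "__main__":
    # Mask taken from the last log line seen before the hang.
    print(cpus_in_mask(0x7F))
```

For 0x7f this lists CPUs 0 through 6.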
Cheers
Jon
--
nvpublic