linux-kernel - [PATCH] genirq/redirect: Prevent writing MSI message on affinity change


Message-ID: <87tsw6aglz.ffs@tglx>
Date: Tue, 27 Jan 2026 22:30:16 +0100
From: Thomas Gleixner <tglx@...nel.org>
To: Jon Hunter <jonathanh@...dia.com>, Radu Rendec <rrendec@...hat.com>,
 Manivannan Sadhasivam <mani@...nel.org>
Cc: Daniel Tsai <danielsftsai@...gle.com>, Marek Behún
 <kabel@...nel.org>,
 Krishna Chaitanya Chundru <quic_krichai@...cinc.com>, Bjorn Helgaas
 <bhelgaas@...gle.com>, Rob Herring <robh@...nel.org>, Krzysztof
 Wilczyński
 <kwilczynski@...nel.org>, Lorenzo Pieralisi <lpieralisi@...nel.org>,
 Jingoo Han <jingoohan1@...il.com>, Brian Masney <bmasney@...hat.com>, Eric
 Chanudet <echanude@...hat.com>, Alessandro Carminati
 <acarmina@...hat.com>, Jared Kangas <jkangas@...hat.com>,
 linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
 "linux-tegra@...r.kernel.org" <linux-tegra@...r.kernel.org>
Subject: [PATCH] genirq/redirect: Prevent writing MSI message on affinity
 change

The interrupts which are handled by the redirection infrastructure provide
an irq_set_affinity() callback, which solely determines the target CPU for
redirection via irq_work and updates the effective affinity mask.

Contrary to regular MSI interrupts, this affinity setting does not change
the underlying interrupt message, as the message is only created at setup
time to deliver to the demultiplexing interrupt.

Therefore the message write in msi_domain_set_affinity() is a pointless
exercise. In principle the write is harmless, but a Tegra system exposes a
full system hang during suspend due to that write.

It's unclear why the check for the PCI device state PCI_D0 in
pci_msi_domain_write_msg(), which prevents the actual hardware access if
a device is in powered down state, fails on this particular system, but
that's a different problem which needs to be investigated by the Tegra
experts.

The irq_set_affinity() callback can advise msi_domain_set_affinity() not to
write the MSI message by returning IRQ_SET_MASK_OK_DONE instead of
IRQ_SET_MASK_OK. Do exactly that.
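
The effect of the return value can be illustrated with a small userspace
model. This is a sketch, not kernel code: the enum mirrors the one in
include/linux/irq.h, and the condition mirrors the check in
msi_domain_set_affinity() as I understand it. Every model_* name, the
callback typedef and the write counter are hypothetical stand-ins made up
for the illustration.

```c
/* Userspace model only; it mirrors the return code check, nothing else. */
#include <assert.h>
#include <stddef.h>

/* Same names and order as the enum in include/linux/irq.h. */
enum {
	IRQ_SET_MASK_OK = 0,
	IRQ_SET_MASK_OK_NOCOPY,
	IRQ_SET_MASK_OK_DONE,
};

/* Hypothetical stand-in for the parent chip's irq_set_affinity() callback. */
typedef int (*model_set_affinity_fn)(unsigned int target_cpu);

/* Hypothetical counter standing in for the MSI message write to hardware. */
static unsigned int model_msi_writes;

/* Callback behaviour before the patch: the caller still writes the message. */
static int model_redirect_set_affinity_old(unsigned int target_cpu)
{
	(void)target_cpu;
	return IRQ_SET_MASK_OK;
}

/* Callback behaviour after the patch: the caller is told nothing is left to do. */
static int model_redirect_set_affinity_new(unsigned int target_cpu)
{
	(void)target_cpu;
	return IRQ_SET_MASK_OK_DONE;
}

/* Models the decision taken after the parent callback has returned. */
static int model_msi_domain_set_affinity(model_set_affinity_fn cb,
					 unsigned int target_cpu)
{
	int ret;

	assert(cb != NULL);
	ret = cb(target_cpu);

	/* Errors and IRQ_SET_MASK_OK_DONE both skip the message write. */
	if (ret >= 0 && ret != IRQ_SET_MASK_OK_DONE)
		model_msi_writes++;	/* stands for compose + write of the MSI message */

	return ret;
}
```

Calling model_msi_domain_set_affinity() with the old callback bumps the
counter, calling it with the new one leaves the counter untouched, which
is the whole point of the one-line change.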

Just to make it clear again:

This is not a correctness issue of the redirection code as returning
IRQ_SET_MASK_OK in that context is completely correct. From the core
code point of view this is solely an optimization to avoid a redundant
hardware write.

As a byproduct it papers over the underlying problem on the Tegra platform,
which fails to put the PCIe device[s] out of PCI_D0 despite the fact that
the devices and busses have been shut down. The redirect infrastructure
just unearthed the underlying issue, which is likely to show up in quite a
few other code paths which use the PCI_D0 check to prevent hardware access
to powered down devices.
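
Why a guard of this kind stops helping once the recorded power state is
stale can be shown with another userspace model. This is only a sketch
under assumptions: the struct, the enums and the function below are
hypothetical, and the cached field is merely modeled on the power state
the kernel records per PCI device. It does not claim to describe what
actually goes wrong on Tegra.

```c
/* Userspace model only; all names below are made up for the illustration. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of PCI power states. */
enum model_pci_state {
	MODEL_PCI_D0,
	MODEL_PCI_D3COLD,
};

struct model_pci_dev {
	enum model_pci_state current_state;	/* software's cached view */
	bool hw_powered;			/* what the hardware really does */
};

enum model_write_result {
	MODEL_WRITE_SKIPPED,	/* guard prevented the hardware access */
	MODEL_WRITE_OK,		/* access reached a live device */
	MODEL_WRITE_HANG,	/* access reached a device which is shut down */
};

/* Models a message write which is guarded by a check on the cached state. */
static enum model_write_result
model_write_msi_msg(const struct model_pci_dev *dev)
{
	/* The guard only looks at the cached state, not at the hardware. */
	if (dev->current_state != MODEL_PCI_D0)
		return MODEL_WRITE_SKIPPED;

	return dev->hw_powered ? MODEL_WRITE_OK : MODEL_WRITE_HANG;
}
```

A device which is shut down but still recorded as being in D0 passes the
guard and the access goes through to dead hardware. Avoiding the write in
the first place sidesteps that in this one code path, but not in others
which rely on the same kind of check.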

This therefore has neither a 'Fixes:' nor a 'Closes:' tag associated as the
underlying problem, which is outside the scope of the interrupt code, is
still unresolved.

Reported-by: Jon Hunter <jonathanh@...dia.com>
Signed-off-by: Thomas Gleixner <tglx@...nel.org>
Tested-by: Jon Hunter <jonathanh@...dia.com>
Link: https://lore.kernel.org/all/4e5b349c-6599-4871-9e3b-e10352ae0ca0@nvidia.com
---
 kernel/irq/chip.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -1495,7 +1495,7 @@ int irq_chip_redirect_set_affinity(struc
 	WRITE_ONCE(redir->target_cpu, cpumask_first(dest));
 	irq_data_update_effective_affinity(data, dest);

-	return IRQ_SET_MASK_OK;
+	return IRQ_SET_MASK_OK_DONE;
 }
 EXPORT_SYMBOL_GPL(irq_chip_redirect_set_affinity);
 #endif