Message-ID: <144a39f2d19af30961498acc11d6b7475166ccf5.camel@redhat.com>
Date: Tue, 06 Jan 2026 20:13:36 -0500
From: Radu Rendec <rrendec@...hat.com>
To: Jon Hunter <jonathanh@...dia.com>, linux-kernel@...r.kernel.org, 
	linux-tip-commits@...r.kernel.org
Cc: Thomas Gleixner <tglx@...utronix.de>, x86@...nel.org, 
 "linux-tegra@...r.kernel.org"
	 <linux-tegra@...r.kernel.org>
Subject: Re: [tip: irq/msi] PCI: dwc: Enable MSI affinity support

Hi Jon,

On Tue, 2026-01-06 at 10:07 -0500, Radu Rendec wrote:
> On Tue, 2026-01-06 at 09:53 +0000, Jon Hunter wrote:
> > On 15/12/2025 21:34, tip-bot2 for Radu Rendec wrote:
> > > The following commit has been merged into the irq/msi branch of tip:
> > > 
> > > Commit-ID:     eaf290c404f7c39f23292e9ce83b8b5b51ab598a
> > > Gitweb:        https://git.kernel.org/tip/eaf290c404f7c39f23292e9ce83b8b5b51ab598a
> > > Author:        Radu Rendec <rrendec@...hat.com>
> > > AuthorDate:    Fri, 28 Nov 2025 16:20:55 -05:00
> > > Committer:     Thomas Gleixner <tglx@...utronix.de>
> > > CommitterDate: Mon, 15 Dec 2025 22:30:48 +01:00
> > > 
> > > PCI: dwc: Enable MSI affinity support
> > > 
> > > Leverage the interrupt redirection infrastructure to enable CPU affinity
> > > support for MSI interrupts. Since the parent interrupt affinity cannot
> > > be changed, affinity control for the child interrupt (MSI) is achieved
> > > by redirecting the handler to run in IRQ work context on the target CPU.
> > > 
> > > This patch was originally prepared by Thomas Gleixner (see Link tag below)
> > > in a patch series that was never submitted as is, and only parts of that
> > > series have made it upstream so far.
> > > 
> > > Originally-by: Thomas Gleixner <tglx@...utronix.de>
> > > Signed-off-by: Radu Rendec <rrendec@...hat.com>
> > > Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
> > > Link: https://lore.kernel.org/linux-pci/878qpg4o4t.ffs@tglx/
> > > Link: https://patch.msgid.link/20251128212055.1409093-4-rrendec@redhat.com
> > 
> > 
> > With next-20260105 I am observing the following warning on the Tegra194 
> > Jetson AGX platform ...
> > 
> >   WARNING KERN genirq: irq_chip DW-PCI-MSI-0001:01:00.0 did not update
> >    eff. affinity mask of irq 171
> > 
> > Bisect is pointing to this commit. This platform is using the driver 
> > drivers/pci/controller/dwc/pcie-tegra194.c. Is there some default 
> > affinity that we should be setting to avoid this warning?
> 
> Before that patch, affinity control wasn't even possible for PCI MSIs
> exposed by the dw_pci drivers. Without having looked at the code yet,
> I suspect it's just because now that affinity control is enabled,
> something tries to use it.
> 
> I don't think you should set some default affinity. By default, the PCI
> MSIs should be affine to all available CPUs, and that warning shouldn't
> happen in the first place. Let me test on Jetson AGX and see what's
> going on. I'll update the thread with my findings, hopefully later
> today.

I looked at the code and tested. The problem is that the effective
affinity mask is not updated for interrupt redirection. The bug is not
in this patch but in the previous one in the series [1], which adds the
interrupt redirection framework.

The warning is actually triggered when the MSI is set up. This is the
top part of the relevant stack trace:
  irq_do_set_affinity+0x28c/0x300 (P)
  irq_setup_affinity+0x130/0x208
  irq_startup+0x118/0x170
  __setup_irq+0x5b0/0x6a0
  request_threaded_irq+0xb8/0x180
  devm_request_threaded_irq+0x88/0x150
  rtw_pci_probe+0x1e8/0x370 [rtw88_pci]

I don't immediately see an easy way to fix it for the generic case
because the affinity of the demultiplexing IRQ (the "parent" IRQ) can
change after the affinity of the demultiplexed IRQ (the "child" IRQ)
has been set up. But since dw_pcie is currently the only user of the
interrupt redirection infrastructure, and it sets up the demultiplexing
IRQ as a chained IRQ, there is no way its affinity can change other
than through CPU hot(un)plug. In this particular case, something as
simple as this will work:

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index d5c3f6ee24cc2..036641f9534ae 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -1512,8 +1512,11 @@ EXPORT_SYMBOL_GPL(irq_chip_release_resources_parent);
 int irq_chip_redirect_set_affinity(struct irq_data *data, const struct cpumask *dest, bool force)
 {
 	struct irq_redirect *redir = &irq_data_to_desc(data)->redirect;
+	unsigned int target_cpu = cpumask_first(dest);
+
+	WRITE_ONCE(redir->target_cpu, target_cpu);
+	irq_data_update_effective_affinity(data, cpumask_of(target_cpu));
 
-	WRITE_ONCE(redir->target_cpu, cpumask_first(dest));
 	return IRQ_SET_MASK_OK;
 }
 EXPORT_SYMBOL_GPL(irq_chip_redirect_set_affinity);
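
To illustrate why the change above silences the warning, here is a
minimal userspace sketch of the behavior before and after it. This is
not kernel code: every type and function below (model_cpumask,
model_irq, model_set_affinity_old and so on) is a hypothetical stand-in
invented for the example, a 64-bit word plays the role of struct
cpumask, and it assumes the core code complains when the effective mask
is still empty after the chip's set_affinity callback returns.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: bit N set means CPU N. */
typedef uint64_t model_cpumask;

/* Hypothetical stand-in for the per-interrupt state touched by the diff. */
struct model_irq {
	unsigned int target_cpu;	/* models redir->target_cpu */
	model_cpumask effective;	/* models the effective affinity mask */
};

/* Lowest set bit, in the spirit of cpumask_first(); 64 means "empty". */
static unsigned int model_cpumask_first(model_cpumask mask)
{
	unsigned int cpu = 0;

	if (!mask)
		return 64;
	while (!(mask & 1)) {
		mask >>= 1;
		cpu++;
	}
	return cpu;
}

/* Before the change: only the redirection target is recorded. */
static int model_set_affinity_old(struct model_irq *irq, model_cpumask dest)
{
	unsigned int cpu = model_cpumask_first(dest);

	if (cpu >= 64)
		return -1;	/* empty destination mask, nothing to pick */
	irq->target_cpu = cpu;
	return 0;
}

/* After the change: the effective mask is narrowed to the chosen CPU. */
static int model_set_affinity_fixed(struct model_irq *irq, model_cpumask dest)
{
	unsigned int cpu = model_cpumask_first(dest);

	if (cpu >= 64)
		return -1;	/* empty destination mask, nothing to pick */
	irq->target_cpu = cpu;
	irq->effective = (model_cpumask)1 << cpu;
	return 0;
}

/* Models the assumed core check: an empty effective mask gets reported. */
static bool model_effective_mask_missing(const struct model_irq *irq)
{
	return irq->effective == 0;
}
```

With a destination mask covering CPUs 2 and 3, both variants pick CPU 2
as the target, but only the second one leaves a non-empty effective
mask behind, which is the condition the warning is about.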

I will send this as a proper patch tomorrow, and it will fix the
immediate problem and buy some time for a more elaborate fix for the
generic case. Meanwhile, thanks a lot for finding/reporting this!

[1] https://lore.kernel.org/all/20251128212055.1409093-2-rrendec@redhat.com/

-- 
Best regards,
Radu

