Message-ID: <a8d841e870d6dbbabef7eadb774f2a58a96c57c7.camel@redhat.com>
Date: Tue, 20 Jan 2026 17:30:55 -0500
From: Radu Rendec <rrendec@...hat.com>
To: Jon Hunter <jonathanh@...dia.com>, Thomas Gleixner <tglx@...utronix.de>,
Manivannan Sadhasivam <mani@...nel.org>
Cc: Daniel Tsai <danielsftsai@...gle.com>, Marek Behún
<kabel@...nel.org>, Krishna Chaitanya Chundru <quic_krichai@...cinc.com>,
Bjorn Helgaas <bhelgaas@...gle.com>, Rob Herring <robh@...nel.org>,
Krzysztof Wilczyński <kwilczynski@...nel.org>, Lorenzo
Pieralisi <lpieralisi@...nel.org>, Jingoo Han <jingoohan1@...il.com>,
Brian Masney <bmasney@...hat.com>, Eric Chanudet <echanude@...hat.com>,
Alessandro Carminati <acarmina@...hat.com>, Jared Kangas
<jkangas@...hat.com>, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org, "linux-tegra@...r.kernel.org"
<linux-tegra@...r.kernel.org>
Subject: Re: [PATCH v3 3/3] PCI: dwc: Enable MSI affinity support
Hi Jon,
On Tue, 2026-01-20 at 18:01 +0000, Jon Hunter wrote:
> On 28/11/2025 21:20, Radu Rendec wrote:
> > Leverage the interrupt redirection infrastructure to enable CPU affinity
> > support for MSI interrupts. Since the parent interrupt affinity cannot
> > be changed, affinity control for the child interrupt (MSI) is achieved
> > by redirecting the handler to run in IRQ work context on the target CPU.
> >
> > This patch was originally prepared by Thomas Gleixner (see Link tag
> > below) in a patch series that was never submitted as is, and only
> > parts of that series have made it upstream so far.
> >
> > Originally-by: Thomas Gleixner <tglx@...utronix.de>
> > Link: https://lore.kernel.org/linux-pci/878qpg4o4t.ffs@tglx/
> > Signed-off-by: Radu Rendec <rrendec@...hat.com>
> > ---
> > .../pci/controller/dwc/pcie-designware-host.c | 33 ++++++++++++++++---
> > 1 file changed, 28 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> > index aa93acaa579a5..90d9cb45e7842 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > @@ -26,9 +26,27 @@ static struct pci_ops dw_pcie_ops;
> > static struct pci_ops dw_pcie_ecam_ops;
> > static struct pci_ops dw_child_pcie_ops;
> >
> > +#ifdef CONFIG_SMP
> > +static void dw_irq_noop(struct irq_data *d) { }
> > +#endif
> > +
> > +static bool dw_pcie_init_dev_msi_info(struct device *dev, struct irq_domain *domain,
> > + struct irq_domain *real_parent, struct msi_domain_info *info)
> > +{
> > + if (!msi_lib_init_dev_msi_info(dev, domain, real_parent, info))
> > + return false;
> > +
> > +#ifdef CONFIG_SMP
> > + info->chip->irq_ack = dw_irq_noop;
> > + info->chip->irq_pre_redirect = irq_chip_pre_redirect_parent;
> > +#else
> > + info->chip->irq_ack = irq_chip_ack_parent;
> > +#endif
> > + return true;
> > +}
> > +
> > #define DW_PCIE_MSI_FLAGS_REQUIRED (MSI_FLAG_USE_DEF_DOM_OPS | \
> > MSI_FLAG_USE_DEF_CHIP_OPS | \
> > - MSI_FLAG_NO_AFFINITY | \
> > MSI_FLAG_PCI_MSI_MASK_PARENT)
> > #define DW_PCIE_MSI_FLAGS_SUPPORTED (MSI_FLAG_MULTI_PCI_MSI | \
> > MSI_FLAG_PCI_MSIX | \
> > @@ -40,9 +58,8 @@ static const struct msi_parent_ops dw_pcie_msi_parent_ops = {
> > .required_flags = DW_PCIE_MSI_FLAGS_REQUIRED,
> > .supported_flags = DW_PCIE_MSI_FLAGS_SUPPORTED,
> > .bus_select_token = DOMAIN_BUS_PCI_MSI,
> > - .chip_flags = MSI_CHIP_FLAG_SET_ACK,
> > .prefix = "DW-",
> > - .init_dev_msi_info = msi_lib_init_dev_msi_info,
> > + .init_dev_msi_info = dw_pcie_init_dev_msi_info,
> > };
> >
> > /* MSI int handler */
> > @@ -63,7 +80,7 @@ void dw_handle_msi_irq(struct dw_pcie_rp *pp)
> > continue;
> >
> > for_each_set_bit(pos, &status, MAX_MSI_IRQS_PER_CTRL)
> > - generic_handle_domain_irq(pp->irq_domain, irq_off + pos);
> > + generic_handle_demux_domain_irq(pp->irq_domain, irq_off + pos);
> > }
> > }
> >
> > @@ -140,10 +157,16 @@ static void dw_pci_bottom_ack(struct irq_data *d)
> >
> > static struct irq_chip dw_pci_msi_bottom_irq_chip = {
> > .name = "DWPCI-MSI",
> > - .irq_ack = dw_pci_bottom_ack,
> > .irq_compose_msi_msg = dw_pci_setup_msi_msg,
> > .irq_mask = dw_pci_bottom_mask,
> > .irq_unmask = dw_pci_bottom_unmask,
> > +#ifdef CONFIG_SMP
> > + .irq_ack = dw_irq_noop,
> > + .irq_pre_redirect = dw_pci_bottom_ack,
> > + .irq_set_affinity = irq_chip_redirect_set_affinity,
> > +#else
> > + .irq_ack = dw_pci_bottom_ack,
> > +#endif
> > };
> >
> > static int dw_pcie_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
>
>
> I am seeing another issue with this patch. On the Tegra194 AGX Xavier
> platform suspend is failing and reverting this patch fixes the problem.
>
> Unfortunately the logs don't tell me much. In a bad case I see ...
>
> PM: suspend entry (deep)
> Filesystems sync: 0.000 seconds
> Freezing user space processes
> Freezing user space processes completed (elapsed 0.002 seconds)
> OOM killer disabled.
> Freezing remaining freezable tasks
> Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> tegra-xusb 3610000.usb: Firmware timestamp: 2020-09-11 16:55:03 UTC
> dwc-eth-dwmac 2490000.ethernet eth0: Link is Down
> tegra194-pcie 14100000.pcie: Link didn't transition to L2 state
> Disabling non-boot CPUs ...
>
> It appears to hang here. In a good case I see ...
>
> PM: suspend entry (deep)
> Filesystems sync: 0.000 seconds
> Freezing user space processes
> Freezing user space processes completed (elapsed 0.002 seconds)
> OOM killer disabled.
> Freezing remaining freezable tasks
> Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> tegra-xusb 3610000.usb: Firmware timestamp: 2020-09-11 16:55:03 UTC
> dwc-eth-dwmac 2490000.ethernet eth0: Link is Down
> tegra194-pcie 14100000.pcie: Link didn't transition to L2 state
> Disabling non-boot CPUs ...
> psci: CPU7 killed (polled 0 ms)
> psci: CPU6 killed (polled 4 ms)
> psci: CPU5 killed (polled 0 ms)
> psci: CPU4 killed (polled 4 ms)
> psci: CPU3 killed (polled 4 ms)
> psci: CPU2 killed (polled 0 ms)
> psci: CPU1 killed (polled 0 ms)
> ...
> Enabling non-boot CPUs ... (resume starts)
>
> So it looks like it is hanging when disabling the non-boot CPUs. So far
> it only appears to happen on Tegra194.
>
> Let me know if you have any suggestions.
Ouch. I'm afraid this is going to be much harder to figure out than the
previous one, especially since I can't easily get access to a board to
test on. I will try to reserve a board and reproduce the bug.

Meanwhile, if you (or someone else on your team) can spare a few cycles,
could you please try to reproduce the bug again with the debug patch
below applied, plus a few other changes:
* enable debug messages in kernel/irq/cpuhotplug.c;
* save the contents of /proc/interrupts to a file before suspending;
* add "no_console_suspend" to the kernel command line (although it
looks like you already have it).
This will make suspend much more verbose, but hopefully we can at
least figure out how far it gets and how the hang relates to the MSI
affinity configuration.
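For reference, the preparation steps listed above could be scripted roughly
as follows. This is only a sketch under stated assumptions: it assumes the
kernel was built with CONFIG_DYNAMIC_DEBUG and that debugfs is mounted at
/sys/kernel/debug (otherwise the debug messages have to be enabled at build
time instead). The helper names and the snapshot path are hypothetical and
chosen purely for illustration.

```python
from pathlib import Path

# Assumption: debugfs is mounted here and CONFIG_DYNAMIC_DEBUG is enabled.
DYNDBG_CONTROL = Path("/sys/kernel/debug/dynamic_debug/control")


def snapshot_interrupts(dest, src="/proc/interrupts"):
    """Copy the interrupt table to dest; return the number of lines saved."""
    data = Path(src).read_text()
    Path(dest).write_text(data)
    return len(data.splitlines())


def dyndbg_query(source_file):
    """Build a dynamic debug query enabling pr_debug() output for one file."""
    return f"file {source_file} +p"


def enable_debug(source_file, control=DYNDBG_CONTROL):
    """Write the query to the dynamic debug control file (needs root)."""
    Path(control).write_text(dyndbg_query(source_file) + "\n")


def has_cmdline_flag(flag, cmdline_path="/proc/cmdline"):
    """Check whether a bare flag is present on the kernel command line."""
    return flag in Path(cmdline_path).read_text().split()
```

Run as root before suspending, this would amount to calling
enable_debug("kernel/irq/cpuhotplug.c"), then
snapshot_interrupts("/tmp/interrupts.before") (hypothetical path), and
finally checking has_cmdline_flag("no_console_suspend").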
Thanks,
Radu
---
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 84cc4bea773c0..62ae76661f26d 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -1492,6 +1492,8 @@ int irq_chip_redirect_set_affinity(struct irq_data *data, const struct cpumask *
{
struct irq_redirect *redir = &irq_data_to_desc(data)->redirect;
+ pr_info("%s: irq %u mask 0x%*pb\n", __func__, data->irq, cpumask_pr_args(dest));
+
WRITE_ONCE(redir->target_cpu, cpumask_first(dest));
irq_data_update_effective_affinity(data, dest);
diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c
index cd5689e383b00..d8c62547f9d06 100644
--- a/kernel/irq/cpuhotplug.c
+++ b/kernel/irq/cpuhotplug.c
@@ -59,6 +59,8 @@ static bool migrate_one_irq(struct irq_desc *desc)
bool brokeaff = false;
int err;
+ pr_info("%s: irq %u cpu %u\n", __func__, d->irq, smp_processor_id());
+
/*
* IRQ chip might be already torn down, but the irq descriptor is
* still in the radix tree. Also if the chip has no affinity setter,
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 3fe6b0c99f3d8..94bd7ad64c9b7 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -227,6 +227,7 @@ static int multi_cpu_stop(void *data)
stop_machine_yield(cpumask);
newstate = READ_ONCE(msdata->state);
if (newstate != curstate) {
+ pr_info("%s: cpu %d entering state %d\n", __func__, cpu, newstate);
curstate = newstate;
switch (curstate) {
case MULTI_STOP_DISABLE_IRQ: