Message-ID:
<SN6PR02MB41575601F0250A20ABF50F3BD442A@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Fri, 4 Jul 2025 04:58:26 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Nam Cao <namcao@...utronix.de>
CC: Thomas Gleixner <tglx@...utronix.de>, Marc Zyngier <maz@...nel.org>,
Lorenzo Pieralisi <lpieralisi@...nel.org>,
Krzysztof Wilczyński <kwilczynski@...nel.org>, Manivannan
Sadhasivam <mani@...nel.org>, Rob Herring <robh@...nel.org>, Bjorn Helgaas
<bhelgaas@...gle.com>, "linux-pci@...r.kernel.org"
<linux-pci@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, Karthikeyan Mitran
<m.karthikeyan@...iveil.co.in>, Hou Zhiqiang <Zhiqiang.Hou@....com>, Thomas
Petazzoni <thomas.petazzoni@...tlin.com>, Pali Rohár
<pali@...nel.org>, "K . Y . Srinivasan" <kys@...rosoft.com>, Haiyang Zhang
<haiyangz@...rosoft.com>, Wei Liu <wei.liu@...nel.org>, Dexuan Cui
<decui@...rosoft.com>, Joyce Ooi <joyce.ooi@...el.com>, Jim Quinlan
<jim2101024@...il.com>, Nicolas Saenz Julienne <nsaenz@...nel.org>, Florian
Fainelli <florian.fainelli@...adcom.com>, Broadcom internal kernel review
list <bcm-kernel-feedback-list@...adcom.com>, Ray Jui <rjui@...adcom.com>,
Scott Branden <sbranden@...adcom.com>, Ryder Lee <ryder.lee@...iatek.com>,
Jianjun Wang <jianjun.wang@...iatek.com>, Marek Vasut
<marek.vasut+renesas@...il.com>, Yoshihiro Shimoda
<yoshihiro.shimoda.uh@...esas.com>, Michal Simek <michal.simek@....com>,
Daire McNamara <daire.mcnamara@...rochip.com>, Nirmal Patel
<nirmal.patel@...ux.intel.com>, Jonathan Derrick
<jonathan.derrick@...ux.dev>, Matthias Brugger <matthias.bgg@...il.com>,
AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>, "linux-hyperv@...r.kernel.org"
<linux-hyperv@...r.kernel.org>, "linux-rpi-kernel@...ts.infradead.org"
<linux-rpi-kernel@...ts.infradead.org>, "linux-mediatek@...ts.infradead.org"
<linux-mediatek@...ts.infradead.org>, "linux-renesas-soc@...r.kernel.org"
<linux-renesas-soc@...r.kernel.org>
Subject: RE: [PATCH 14/16] PCI: hv: Switch to msi_create_parent_irq_domain()
From: Nam Cao <namcao@...utronix.de> Sent: Thursday, July 3, 2025 9:33 PM
>
> On Fri, Jul 04, 2025 at 02:27:01AM +0000, Michael Kelley wrote:
> > I haven't resolved the conflict. As a shortcut for testing I just
> > removed the conflicting patch since it is for a Microsoft custom NIC
> > ("MANA") that's not in the configuration I'm testing with. I'll have to
> > look more closely to figure out the resolution.
> >
> > Separately, this patch (the switch to msi_create_parent_irq_domain)
> > isn't working for Linux VMs on Hyper-V on ARM64. The initial symptom
> > is that interrupts from the NVMe controller aren't getting handled
> > and everything hangs. Here's the dmesg output:
> >
> > [ 84.463419] hv_vmbus: registering driver hv_pci
> > [ 84.463875] hv_pci abee639e-0b9d-49b7-9a07-c54ba8cd5734: PCI VMBus probing: Using version 0x10004
> > [ 84.464518] hv_pci abee639e-0b9d-49b7-9a07-c54ba8cd5734: PCI host bridge to bus 0b9d:00
> > [ 84.464529] pci_bus 0b9d:00: root bus resource [mem 0xfc0000000-0xfc00fffff window]
> > [ 84.464531] pci_bus 0b9d:00: No busn resource found for root bus, will use [bus 00-ff]
> > [ 84.465211] pci 0b9d:00:00.0: [1414:b111] type 00 class 0x010802 PCIe Endpoint
> > [ 84.466657] pci 0b9d:00:00.0: BAR 0 [mem 0xfc0000000-0xfc00fffff 64bit]
> > [ 84.481923] pci_bus 0b9d:00: busn_res: [bus 00-ff] end is updated to 00
> > [ 84.481936] pci 0b9d:00:00.0: BAR 0 [mem 0xfc0000000-0xfc00fffff 64bit]: assigned
> > [ 84.482413] nvme nvme0: pci function 0b9d:00:00.0
> > [ 84.482513] nvme 0b9d:00:00.0: enabling device (0000 -> 0002)
> > [ 84.556871] irq 17, desc: 00000000e8529819, depth: 0, count: 0, unhandled: 0
> > [ 84.556883] ->handle_irq(): 0000000062fa78bc, handle_bad_irq+0x0/0x270
> > [ 84.556892] ->irq_data.chip(): 00000000ba07832f, 0xffff00011469dc30
> > [ 84.556895] ->action(): 0000000069f160b3
> > [ 84.556896] ->action->handler(): 00000000e15d8191, nvme_irq+0x0/0x3e8
> > [ 172.307920] watchdog: BUG: soft lockup - CPU#6 stuck for 26s! [kworker/6:1H:195]
>
> Thanks for the report.
>
> On arm64, this driver relies on the parent irq domain to set the handler, so
> the driver must not overwrite it with NULL.
>
> This should cure it:
>
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index 3a24fadddb83..f4a435b0456c 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -577,8 +577,6 @@ static void hv_pci_onchannelcallback(void *context);
>
> #ifdef CONFIG_X86
> #define DELIVERY_MODE APIC_DELIVERY_MODE_FIXED
> -#define FLOW_HANDLER handle_edge_irq
> -#define FLOW_NAME "edge"
>
> static int hv_pci_irqchip_init(void)
> {
> @@ -723,8 +721,6 @@ static void hv_arch_irq_unmask(struct irq_data *data)
> #define HV_PCI_MSI_SPI_START 64
> #define HV_PCI_MSI_SPI_NR (1020 - HV_PCI_MSI_SPI_START)
> #define DELIVERY_MODE 0
> -#define FLOW_HANDLER NULL
> -#define FLOW_NAME NULL
> #define hv_msi_prepare NULL
>
> struct hv_pci_chip_data {
> @@ -2162,8 +2158,9 @@ static int hv_pcie_domain_alloc(struct irq_domain *d,
> unsigned int virq, unsigne
> return ret;
>
> for (int i = 0; i < nr_irqs; i++) {
> - irq_domain_set_info(d, virq + i, 0, &hv_msi_irq_chip, NULL, FLOW_HANDLER, NULL,
> - FLOW_NAME);
> + irq_domain_set_hwirq_and_chip(d, virq + i, 0, &hv_msi_irq_chip, NULL);
> + if (IS_ENABLED(CONFIG_X86))
> + __irq_set_handler(virq + i, handle_edge_irq, 0, "edge");
> }
>
> return 0;

Yes, that fixes the problem. Linux now boots with the PCI NIC VF and two
NVMe controllers being visible and operational. Thanks for the fix! It
would have taken me a while to figure it out.

I want to do some additional testing tomorrow, and look more closely at the
code, but now I have something that works well enough to make further
progress.

Michael