Message-ID: <BYAPR21MB1688D11FC50A2B43564D21ECD7969@BYAPR21MB1688.namprd21.prod.outlook.com>
Date: Fri, 7 Apr 2023 16:05:11 +0000
From: "Michael Kelley (LINUX)" <mikelley@...rosoft.com>
To: Dexuan Cui <decui@...rosoft.com>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"edumazet@...gle.com" <edumazet@...gle.com>,
Haiyang Zhang <haiyangz@...rosoft.com>,
Jake Oshins <jakeo@...rosoft.com>,
"kuba@...nel.org" <kuba@...nel.org>, "kw@...ux.com" <kw@...ux.com>,
KY Srinivasan <kys@...rosoft.com>,
"leon@...nel.org" <leon@...nel.org>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"lpieralisi@...nel.org" <lpieralisi@...nel.org>,
"pabeni@...hat.com" <pabeni@...hat.com>,
"robh@...nel.org" <robh@...nel.org>,
"saeedm@...dia.com" <saeedm@...dia.com>,
"wei.liu@...nel.org" <wei.liu@...nel.org>,
Long Li <longli@...rosoft.com>,
"boqun.feng@...il.com" <boqun.feng@...il.com>,
Saurabh Singh Sengar <ssengar@...rosoft.com>,
"helgaas@...nel.org" <helgaas@...nel.org>
CC: "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: RE: [PATCH v2 2/6] PCI: hv: Fix a race condition in hv_irq_unmask()
that can cause panic
From: Dexuan Cui <decui@...rosoft.com> Sent: Monday, April 3, 2023 7:06 PM
>
> When the host tries to remove a PCI device, the host first sends a
> PCI_EJECT message to the guest, and the guest is supposed to gracefully
> remove the PCI device and send a PCI_EJECTION_COMPLETE message to the host;
> the host then sends a VMBus message CHANNELMSG_RESCIND_CHANNELOFFER to
> the guest (when the guest receives this message, the device is already
> unassigned from the guest) and the guest can do some final cleanup work;
> if the guest fails to respond to the PCI_EJECT message within one minute,
> the host sends the VMBus message CHANNELMSG_RESCIND_CHANNELOFFER and
> removes the PCI device forcibly.
>
> In the case of fast device addition/removal, it's possible that the PCI
> device driver is still configuring MSI-X interrupts when the guest receives
> the PCI_EJECT message; the channel callback calls hv_pci_eject_device(),
> which sets hpdev->state to hv_pcichild_ejecting, and schedules a work
> hv_eject_device_work(); if the PCI device driver is calling
> pci_alloc_irq_vectors() -> ... -> hv_compose_msi_msg(), we can break the
> while loop in hv_compose_msi_msg() due to the updated hpdev->state, and
> leave data->chip_data with its default value of NULL; later, when the PCI
> device driver calls request_irq() -> ... -> hv_irq_unmask(), the guest
> crashes in hv_arch_irq_unmask() due to data->chip_data being NULL.
>
> Fix the issue by not testing hpdev->state in the while loop: when the
> guest receives PCI_EJECT, the device is still assigned to the guest, and
> the guest has one minute to finish the device removal gracefully. We don't
> really need to (and we should not) test hpdev->state in the loop.
>
> Fixes: de0aa7b2f97d ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()")
> Signed-off-by: Dexuan Cui <decui@...rosoft.com>
> Cc: stable@...r.kernel.org
> ---
>
> v2:
> Removed the "debug code".
> No change to the patch body.
> Added Cc:stable
>
> drivers/pci/controller/pci-hyperv.c | 11 +++++------
> 1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index b82c7cde19e66..1b11cf7391933 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -643,6 +643,11 @@ static void hv_arch_irq_unmask(struct irq_data *data)
>  	pbus = pdev->bus;
>  	hbus = container_of(pbus->sysdata, struct hv_pcibus_device, sysdata);
>  	int_desc = data->chip_data;
> +	if (!int_desc) {
> +		dev_warn(&hbus->hdev->device, "%s() can not unmask irq %u\n",
> +			 __func__, data->irq);
> +		return;
> +	}
>
>  	spin_lock_irqsave(&hbus->retarget_msi_interrupt_lock, flags);
>
> @@ -1911,12 +1916,6 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
>  		hv_pci_onchannelcallback(hbus);
>  		spin_unlock_irqrestore(&channel->sched_lock, flags);
>
> -		if (hpdev->state == hv_pcichild_ejecting) {
> -			dev_err_once(&hbus->hdev->device,
> -				     "the device is being ejected\n");
> -			goto enable_tasklet;
> -		}
> -
>  		udelay(100);
>  	}
>
> --
> 2.25.1
Reviewed-by: Michael Kelley <mikelley@...rosoft.com>
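
For readers following the thread, the failure mode in the quoted commit message can be sketched as a small user-space model. This is only an illustration under stated assumptions, not kernel code: every name prefixed with model_ is hypothetical, the polling loop is a simplification of the wait in hv_compose_msi_msg(), and the bail_on_eject flag stands in for the hpdev->state test that the patch removes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
enum model_child_state { MODEL_CHILD_RUNNING, MODEL_CHILD_EJECTING };

struct model_int_desc { unsigned int vector; };

struct model_irq {
	struct model_int_desc *chip_data;	/* NULL until the host has replied */
};

/*
 * Poll for the host's reply. polls_until_reply is how many polls the host
 * needs before answering; eject_at_poll is the poll at which an eject
 * request arrives (negative: never). With bail_on_eject set, the loop gives
 * up as soon as it sees the ejecting state, which models the pre-patch
 * behaviour and leaves chip_data NULL.
 */
static void model_compose(struct model_irq *irq, struct model_int_desc *reply,
			  int polls_until_reply, int eject_at_poll,
			  bool bail_on_eject)
{
	enum model_child_state state = MODEL_CHILD_RUNNING;

	assert(irq != NULL && reply != NULL);

	for (int poll = 0; ; poll++) {
		if (poll == eject_at_poll)
			state = MODEL_CHILD_EJECTING;

		if (poll >= polls_until_reply) {
			irq->chip_data = reply;	/* host answered in time */
			return;
		}

		if (bail_on_eject && state == MODEL_CHILD_EJECTING)
			return;			/* chip_data stays NULL */
	}
}

/* Returns true if the unmask could proceed, false if it had to be skipped. */
static bool model_unmask(const struct model_irq *irq)
{
	if (!irq->chip_data)
		return false;	/* models the guard added by the patch */

	return irq->chip_data->vector != 0;
}
```

With an eject arriving before the host's reply, calling model_compose() with bail_on_eject set leaves chip_data NULL, so a later model_unmask() has nothing to work with. With the flag cleared, the loop keeps polling until the reply arrives and the unmask succeeds.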