netdev - RE: [EXTERNAL] [PATCH 1/6] PCI: hv: fix a race condition bug in hv_pci_query

Message-ID: <PUZP153MB0749F39A34DEC9FABE17C615BE889@PUZP153MB0749.APCP153.PROD.OUTLOOK.COM>
Date:   Tue, 28 Mar 2023 05:29:07 +0000
From:   Saurabh Singh Sengar <ssengar@...rosoft.com>
To:     Dexuan Cui <decui@...rosoft.com>,
        "bhelgaas@...gle.com" <bhelgaas@...gle.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        Dexuan Cui <decui@...rosoft.com>,
        "edumazet@...gle.com" <edumazet@...gle.com>,
        Haiyang Zhang <haiyangz@...rosoft.com>,
        Jake Oshins <jakeo@...rosoft.com>,
        "kuba@...nel.org" <kuba@...nel.org>, "kw@...ux.com" <kw@...ux.com>,
        KY Srinivasan <kys@...rosoft.com>,
        "leon@...nel.org" <leon@...nel.org>,
        "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
        "lpieralisi@...nel.org" <lpieralisi@...nel.org>,
        "Michael Kelley (LINUX)" <mikelley@...rosoft.com>,
        "pabeni@...hat.com" <pabeni@...hat.com>,
        "robh@...nel.org" <robh@...nel.org>,
        "saeedm@...dia.com" <saeedm@...dia.com>,
        "wei.liu@...nel.org" <wei.liu@...nel.org>,
        Long Li <longli@...rosoft.com>,
        "boqun.feng@...il.com" <boqun.feng@...il.com>
CC:     "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [EXTERNAL] [PATCH 1/6] PCI: hv: fix a race condition bug in
 hv_pci_query_relations()



> -----Original Message-----
> From: Dexuan Cui <decui@...rosoft.com>
> Sent: Tuesday, March 28, 2023 10:21 AM
> To: bhelgaas@...gle.com; davem@...emloft.net; Dexuan Cui
> <decui@...rosoft.com>; edumazet@...gle.com; Haiyang Zhang
> <haiyangz@...rosoft.com>; Jake Oshins <jakeo@...rosoft.com>;
> kuba@...nel.org; kw@...ux.com; KY Srinivasan <kys@...rosoft.com>;
> leon@...nel.org; linux-pci@...r.kernel.org; lpieralisi@...nel.org; Michael
> Kelley (LINUX) <mikelley@...rosoft.com>; pabeni@...hat.com;
> robh@...nel.org; saeedm@...dia.com; wei.liu@...nel.org; Long Li
> <longli@...rosoft.com>; boqun.feng@...il.com
> Cc: linux-hyperv@...r.kernel.org; linux-kernel@...r.kernel.org;
> linux-rdma@...r.kernel.org; netdev@...r.kernel.org
> Subject: [EXTERNAL] [PATCH 1/6] PCI: hv: fix a race condition bug in
> hv_pci_query_relations()
> 
> Fix the longstanding race between hv_pci_query_relations() and
> survey_child_resources() by flushing the workqueue before we exit from
> hv_pci_query_relations().
> 
> Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
> Signed-off-by: Dexuan Cui <decui@...rosoft.com>
> 
> ---
>  drivers/pci/controller/pci-hyperv.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> With the below debug code:
> 
> @@ -2103,6 +2103,8 @@ static void survey_child_resources(struct hv_pcibus_device *hbus)
>  	}
> 
>  	spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> +	ssleep(15);
> +	printk("%s: completing %px\n", __func__, event);
>  	complete(event);
>  }
> 
> @@ -3305,8 +3307,12 @@ static int hv_pci_query_relations(struct hv_device *hdev)
> 
>  	ret = vmbus_sendpacket(hdev->channel, &message, sizeof(message),
>  			       0, VM_PKT_DATA_INBAND, 0);
> -	if (!ret)
> +	if (!ret) {
> +		ssleep(10); // unassign the PCI device on the host during the 10s
>  		ret = wait_for_response(hdev, &comp);
> +		printk("%s: comp=%px is becoming invalid! ret=%d\n",
> +			__func__, &comp, ret);
> +	}
> 
>  	return ret;
>  }
> @@ -3635,6 +3641,8 @@ static int hv_pci_probe(struct hv_device *hdev,
> 
>  retry:
>  	ret = hv_pci_query_relations(hdev);
> +	printk("hv_pci_query_relations() exited\n");

Can we use pr_*() or an appropriate KERN_<LEVEL> in all of the printk() calls?

> +
>  	if (ret)
>  		goto free_irq_domain;
> 
> I'm able to repro the below hang issue:
> 
> [   74.544744] hv_pci b92a0085-468b-407a-a88a-d33fac8edc75: PCI VMBus probing: Using version 0x10004
> [   76.886944] hv_netvsc 818fe754-b912-4445-af51-1f584812e3c9 eth0: VF slot 1 removed
> [   84.788266] hv_pci b92a0085-468b-407a-a88a-d33fac8edc75: The device is gone.
> [   84.792586] hv_pci_query_relations: comp=ffffa7504012fb58 is becoming invalid! ret=-19
> [   84.797505] hv_pci_query_relations() exited
> [   89.652268] survey_child_resources: completing ffffa7504012fb58
> [  150.392242] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> [  150.398447] rcu:     15-...0: (2 ticks this GP) idle=867c/1/0x4000000000000000 softirq=947/947 fqs=5234
> [  150.405851] rcu:     (detected by 14, t=15004 jiffies, g=2553, q=4833 ncpus=16)
> [  150.410870] Sending NMI from CPU 14 to CPUs 15:
> [  150.414836] NMI backtrace for cpu 15
> [  150.414840] CPU: 15 PID: 10 Comm: kworker/u32:0 Tainted: G        W   E 6.3.0-rc3-decui-dirty #34
> ...
> [  150.414849] Workqueue: hv_pci_468b pci_devices_present_work [pci_hyperv]
> [  150.414866] RIP: 0010:__pv_queued_spin_lock_slowpath+0x10f/0x3c0
> ...
> [  150.414905] Call Trace:
> [  150.414907]  <TASK>
> [  150.414911]  _raw_spin_lock_irqsave+0x40/0x50
> [  150.414917]  complete+0x1d/0x60
> [  150.414924]  pci_devices_present_work+0x5dd/0x680 [pci_hyperv]
> [  150.414946]  process_one_work+0x21f/0x430
> [  150.414952]  worker_thread+0x4a/0x3c0
> 
> With this patch, the hang issue goes away:
> 
> [  186.143612] hv_pci b92a0085-468b-407a-a88a-d33fac8edc75: The device is gone.
> [  186.148034] hv_pci_query_relations: comp=ffffa7cfd0aa3b50 is becoming invalid! ret=-19
> [  191.263611] survey_child_resources: completing ffffa7cfd0aa3b50
> [  191.267732] hv_pci_query_relations() exited
> 
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index f33370b75628..b82c7cde19e6 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -3308,6 +3308,19 @@ static int hv_pci_query_relations(struct hv_device *hdev)
>  	if (!ret)
>  		ret = wait_for_response(hdev, &comp);
> 
> +	/*
> +	 * In the case of fast device addition/removal, it's possible that
> +	 * vmbus_sendpacket() or wait_for_response() returns -ENODEV but we
> +	 * already got a PCI_BUS_RELATIONS* message from the host and the
> +	 * channel callback already scheduled a work to hbus->wq, which can be
> +	 * running survey_child_resources() -> complete(&hbus->survey_event),
> +	 * even after hv_pci_query_relations() exits and the stack variable
> +	 * 'comp' is no longer valid. This can cause a strange hang issue
> +	 * or sometimes a page fault. Flush hbus->wq before we exit from
> +	 * hv_pci_query_relations() to avoid the issues.
> +	 */
> +	flush_workqueue(hbus->wq);
> +
>  	return ret;
>  }
> 
> --
> 2.25.1
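
For readers following the thread, the lifetime problem described in the patch comment can be sketched in userspace. This is only an analogy, not the driver code: a pthread stands in for the work queued on hbus->wq, pthread_join() stands in for flush_workqueue(), and the struct completion, init_completion() and complete() below are simplified userspace stand-ins that merely borrow the kernel names. The functions survey_work() and query_relations_demo() are hypothetical names chosen for illustration.

```c
/*
 * Userspace analogy only. pthreads replace the kernel workqueue, and the
 * completion type here is a simplified stand-in, not the kernel's.
 */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Takes the completion's lock, so the object must still be alive here. */
static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Stand-in for the queued work that ends in complete(event). */
static void *survey_work(void *arg)
{
	struct completion *event = arg;
	struct timespec delay = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000 };

	nanosleep(&delay, NULL);	/* the work runs late */
	complete(event);
	return NULL;
}

/* Returns true if the work had completed by the time this function returns. */
bool query_relations_demo(void)
{
	struct completion comp;		/* lives on this stack frame */
	pthread_t worker;
	bool done;
	int rc;

	init_completion(&comp);
	if (pthread_create(&worker, NULL, survey_work, &comp) != 0)
		return false;

	/*
	 * Pretend the wait failed early (the -ENODEV case), so nothing here
	 * blocks on comp. Returning now would leave the worker holding a
	 * pointer into a dead stack frame.
	 */

	/* Analogue of flush_workqueue(): let the work finish first. */
	rc = pthread_join(worker, NULL);
	assert(rc == 0);
	(void)rc;

	done = comp.done;		/* safe: the worker has finished */
	pthread_cond_destroy(&comp.cond);
	pthread_mutex_destroy(&comp.lock);
	return done;
}
```

Calling query_relations_demo() from a small main() and checking the return value shows the ordering the patch wants: the late work touches comp only while the owning stack frame is still alive. Dropping the join would reproduce the use-after-return pattern behind the hang in the log above.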