Message-ID: <SN6PR2101MB089485D8C070855CD43C1961BF3C9@SN6PR2101MB0894.namprd21.prod.outlook.com>
Date: Thu, 3 Jun 2021 18:04:31 +0000
From: Dexuan Cui <decui@...rosoft.com>
To: Leonid Bloch <leonidb@...cscloud.com>,
KY Srinivasan <kys@...rosoft.com>,
Haiyang Zhang <haiyangz@...rosoft.com>,
Stephen Hemminger <sthemmin@...rosoft.com>,
Wei Liu <wei.liu@...nel.org>, Long Li <longli@...rosoft.com>
CC: "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [BUG] hv_netvsc: Unbind exits before the VFs bound to it are
unregistered
> From: Leonid Bloch <leonidb@...cscloud.com>
> Sent: Thursday, June 3, 2021 5:35 AM
> To: KY Srinivasan <kys@...rosoft.com>; Haiyang Zhang
> <haiyangz@...rosoft.com>; Stephen Hemminger
> <sthemmin@...rosoft.com>; Wei Liu <wei.liu@...nel.org>; Dexuan Cui
> <decui@...rosoft.com>
> Cc: linux-hyperv@...r.kernel.org; netdev@...r.kernel.org
> Subject: [BUG] hv_netvsc: Unbind exits before the VFs bound to it are
> unregistered
>
> Hi,
>
> When I try to unbind a network interface from hv_netvsc and bind it to
> uio_hv_generic, once in a while I get the following kernel panic (please
> note the first two lines: it seems that uio_hv_generic is registered
> before the VF bound to hv_netvsc is unregistered):
>
> [Jun 3 09:04] hv_vmbus: registering driver uio_hv_generic
> [ +0.002215] hv_netvsc 5e089342-8a78-4b76-9729-25c81bd338fc eth2: VF
> unregistering: eth5
> [ +1.088078] BUG: scheduling while atomic: swapper/8/0/0x00010003
> [ +0.000001] BUG: scheduling while atomic: swapper/3/0/0x00010003
> [ +0.000001] BUG: scheduling while atomic: swapper/6/0/0x00010003
> [ +0.000000] BUG: scheduling while atomic: swapper/7/0/0x00010003
> [ +0.000005] Modules linked in:
> [ +0.000001] Modules linked in:
> [ +0.000001] uio_hv_generic
> [ +0.000000] Modules linked in:
> [ +0.000000] Modules linked in:
> [ +0.000001] uio_hv_generic uio
> [ +0.000001] uio
> [ +0.000000] uio_hv_generic
> [ +0.000000] uio_hv_generic
> ...
>
> I run kernel 5.10.27, unmodified, besides RT patch v36, on Azure Stack
> Edge platform, software version 2105 (2.2.1606.3320).
>
> I perform the bind-unbind using the following script (please note the
> comment inline):
>
> net_uuid="f8615163-df3e-46c5-913f-f2d2f965ed0e"
> dev_uuid="$(basename "$(readlink "/sys/class/net/eth1/device")")"
> modprobe uio_hv_generic
> echo "${net_uuid}" > /sys/bus/vmbus/drivers/uio_hv_generic/new_id
> printf "%s" "${dev_uuid}" > /sys/bus/vmbus/drivers/hv_netvsc/unbind
> ### If I insert 'sleep 1' here - all works correctly
> printf "%s" "${dev_uuid}" > /sys/bus/vmbus/drivers/uio_hv_generic/bind
>
>
> Thanks,
> Leonid.
It would be great if you could test the mainline kernel, which I suspect
also has the bug.
It looks like netvsc_remove() -> netvsc_unregister_vf() does the unbinding work
in a synchronous manner, so I don't know why the bug happens.
Right now I don't have a DPDK setup to test this, but I think the bug can
be worked around by unbinding the PCI VF device from the pci-hyperv driver
before unbinding the netvsc device, and re-binding the VF device after binding
the netvsc device to uio_hv_generic.
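
A rough, untested sketch of that ordering, reusing the sysfs writes from the
script quoted above. It assumes the pci-hyperv driver is exposed on the vmbus
bus under the name hv_pci, and that modprobe and the new_id write have already
been done. The function names, the SYSFS variable and the vf_bus_dev argument
(the VMBus instance ID of the PCI pass-through device backing the VF) are
illustrative only; the real ID has to be looked up on the target system.

```shell
#!/bin/sh
# Sketch of the suggested workaround ordering; not verified on a Hyper-V guest.
# SYSFS is an illustrative knob so the sequence can be dry-run against a
# scratch directory; on a real system it stays at /sys.
SYSFS="${SYSFS:-/sys}"

# write_id VALUE FILE: write a device instance ID to a sysfs control file.
write_id() {
    printf "%s" "$1" > "$2"
}

# rebind_netvsc_to_uio NETVSC_DEV VF_BUS_DEV
#   NETVSC_DEV: VMBus instance ID of the synthetic NIC (dev_uuid in the
#               quoted script)
#   VF_BUS_DEV: VMBus instance ID of the PCI pass-through device behind the
#               VF (hypothetical argument, must be supplied by the caller)
rebind_netvsc_to_uio() {
    netvsc_dev="$1"
    vf_bus_dev="$2"
    drivers="${SYSFS}/bus/vmbus/drivers"

    # 1. Detach the VF first, so no VF is attached to the netvsc device
    #    when it is unbound. "hv_pci" as the driver name is an assumption.
    write_id "${vf_bus_dev}" "${drivers}/hv_pci/unbind" || return 1

    # 2. Move the synthetic NIC from hv_netvsc to uio_hv_generic.
    write_id "${netvsc_dev}" "${drivers}/hv_netvsc/unbind" || return 1
    write_id "${netvsc_dev}" "${drivers}/uio_hv_generic/bind" || return 1

    # 3. Re-attach the VF once the netvsc device is owned by uio_hv_generic.
    write_id "${vf_bus_dev}" "${drivers}/hv_pci/bind" || return 1
}
```

It would be called as rebind_netvsc_to_uio "${dev_uuid}" "<VF bus device ID>"
in place of the last two printf lines of the original script.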
Thanks,
-- Dexuan