Message-ID: <d13d685a-ae48-b747-7ecf-357b91c275b2@asocscloud.com>
Date:   Fri, 4 Jun 2021 11:14:16 +0300
From:   Leonid Bloch <leonidb@...cscloud.com>
To:     Dexuan Cui <decui@...rosoft.com>,
        KY Srinivasan <kys@...rosoft.com>,
        Haiyang Zhang <haiyangz@...rosoft.com>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        Wei Liu <wei.liu@...nel.org>, Long Li <longli@...rosoft.com>
Cc:     "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [BUG] hv_netvsc: Unbind exits before the VFs bound to it are
 unregistered

On 6/3/21 9:04 PM, Dexuan Cui wrote:
>> From: Leonid Bloch <leonidb@...cscloud.com>
>> Sent: Thursday, June 3, 2021 5:35 AM
>> To: KY Srinivasan <kys@...rosoft.com>; Haiyang Zhang
>> <haiyangz@...rosoft.com>; Stephen Hemminger
>> <sthemmin@...rosoft.com>; Wei Liu <wei.liu@...nel.org>; Dexuan Cui
>> <decui@...rosoft.com>
>> Cc: linux-hyperv@...r.kernel.org; netdev@...r.kernel.org
>> Subject: [BUG] hv_netvsc: Unbind exits before the VFs bound to it are
>> unregistered
>>
>> Hi,
>>
>> When I try to unbind a network interface from hv_netvsc and bind it to
>> uio_hv_generic, once in a while I get the following kernel panic (please
>> note the first two lines: it seems that uio_hv_generic is registered
>> before the VF bound to hv_netvsc is unregistered):
>>
>> [Jun 3 09:04] hv_vmbus: registering driver uio_hv_generic
>> [  +0.002215] hv_netvsc 5e089342-8a78-4b76-9729-25c81bd338fc eth2: VF
>> unregistering: eth5
>> [  +1.088078] BUG: scheduling while atomic: swapper/8/0/0x00010003
>> [  +0.000001] BUG: scheduling while atomic: swapper/3/0/0x00010003
>> [  +0.000001] BUG: scheduling while atomic: swapper/6/0/0x00010003
>> [  +0.000000] BUG: scheduling while atomic: swapper/7/0/0x00010003
>> [  +0.000005] Modules linked in:
>> [  +0.000001] Modules linked in:
>> [  +0.000001]  uio_hv_generic
>> [  +0.000000] Modules linked in:
>> [  +0.000000] Modules linked in:
>> [  +0.000001]  uio_hv_generic uio
>> [  +0.000001]  uio
>> [  +0.000000]  uio_hv_generic
>> [  +0.000000]  uio_hv_generic
>> ...
>>
>> I run kernel 5.10.27, unmodified apart from RT patch v36, on the Azure
>> Stack Edge platform, software version 2105 (2.2.1606.3320).
>>
>> I perform the bind-unbind using the following script (please note the
>> comment inline):
>>
>> net_uuid="f8615163-df3e-46c5-913f-f2d2f965ed0e"
>> dev_uuid="$(basename "$(readlink "/sys/class/net/eth1/device")")"
>> modprobe uio_hv_generic
>> echo "${net_uuid}" > /sys/bus/vmbus/drivers/uio_hv_generic/new_id
>> printf "%s" "${dev_uuid}" > /sys/bus/vmbus/drivers/hv_netvsc/unbind
>> ### If I insert 'sleep 1' here - all works correctly
>> printf "%s" "${dev_uuid}" > /sys/bus/vmbus/drivers/uio_hv_generic/bind
>>
>>
>> Thanks,
>> Leonid.
> 
> It would be great if you can test the mainline kernel, which I suspect also
> has the bug.
> 
> It looks like netvsc_remove() -> netvsc_unregister_vf() does the unbinding work
> in a synchronous manner. I don't know why the bug happens.
> 
> Right now I don't have a DPDK setup to test this, but I think the bug can
> be worked around by unbinding the PCI VF device from the pci-hyperv driver
> before unbinding the netvsc device, and re-binding the VF device after binding
> the netvsc device to uio_hv_generic.
> 
> Thanks,
> -- Dexuan
> 

Hi Dexuan,

Thanks for your reply. I can only check this myself next week, as I am
out of the office now, but do you think the cause might be the use of
cancel_delayed_work_sync() instead of cancel_delayed_work() in
netvsc_unregister_vf()?

And if the above is not correct, can you please advise on a way of 
finding the corresponding VF device from userspace, given the kernel 
name of the parent device? I did not find it in sysfs so far.

Thanks,
Leonid.
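
P.S. On the userspace lookup question above, one possible approach is
sketched here. It rests on an assumption that is not confirmed anywhere in
this thread: that the VF netdev carries the same MAC address as the
synthetic (netvsc) interface it is paired with. The function name
find_vf_candidates and its optional second argument are hypothetical, used
only for illustration. It prints every other interface whose MAC matches
the parent's, so the result is a list of candidates, not a guaranteed VF.

```shell
#!/bin/sh
# Hypothetical helper: list interfaces that share a MAC address with PARENT.
# Assumption: the VF has the same MAC as its synthetic parent interface.
# Usage: find_vf_candidates PARENT [NET_DIR]
#   NET_DIR defaults to /sys/class/net; it is a parameter only so that the
#   function can be exercised against a fake directory tree.
find_vf_candidates() {
    parent="$1"
    net_dir="${2:-/sys/class/net}"

    # Fail early if the parent interface or its address file is missing.
    parent_mac="$(cat "${net_dir}/${parent}/address")" || return 1

    for path in "${net_dir}"/*; do
        name="$(basename "${path}")"
        # Skip the parent itself and entries without a readable address.
        [ "${name}" = "${parent}" ] && continue
        [ -r "${path}/address" ] || continue
        if [ "$(cat "${path}/address")" = "${parent_mac}" ]; then
            printf '%s\n' "${name}"
        fi
    done
}
```

Calling find_vf_candidates eth1 would print the matching interface names,
one per line. The same readlink on /sys/class/net/<name>/device that the
script above uses for eth1 can then be applied to a candidate to get its
underlying device.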
