Message-ID: <58b84867-086e-44f7-86b1-c1759bbc2009@gmail.com>
Date: Wed, 9 Apr 2025 13:55:45 +0700
From: Bui Quang Minh <minhquangbui99@...il.com>
To: Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>
Cc: Jason Wang <jasowang@...hat.com>, Xuan Zhuo <xuanzhuo@...ux.alibaba.com>,
 "Michael S . Tsirkin" <mst@...hat.com>, Andrew Lunn <andrew+netdev@...n.ch>,
 Eric Dumazet <edumazet@...gle.com>, Alexei Starovoitov <ast@...nel.org>,
 Daniel Borkmann <daniel@...earbox.net>,
 Jesper Dangaard Brouer <hawk@...nel.org>,
 John Fastabend <john.fastabend@...il.com>,
 Eugenio Pérez <eperezma@...hat.com>,
 "David S . Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
 linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
 virtualization@...ts.linux.dev
Subject: Re: [PATCH] virtio-net: disable delayed refill when pausing rx

On 4/8/25 21:37, Jakub Kicinski wrote:
> On Tue, 8 Apr 2025 11:28:54 +0200 Paolo Abeni wrote:
>>>> When napi_disable is called on an already disabled napi, it will sleep
>>>> in napi_disable_locked while still holding the netdev_lock. As a result,
>>>> later napi_enable gets stuck too as it cannot acquire the netdev_lock.
>>>> This leads to refill_work and the pause-then-resume tx path being stuck altogether.
>>> This needs to be added to the changelog. And it looks like this is a fix for
>>>
>>> commit 413f0271f3966e0c73d4937963f19335af19e628
>>> Author: Jakub Kicinski <kuba@...nel.org>
>>> Date:   Tue Jan 14 19:53:14 2025 -0800
>>>
>>>      net: protect NAPI enablement with netdev_lock()
>>>
>>> ?
>>>
>>> I wonder if it's simpler to just hold the netdev lock in resize or xsk
>>> binding instead of this.
>> Setting:
>>
>> 	dev->request_ops_lock = true;
>>
>> in virtnet_probe() before calling register_netdevice() should achieve
>> the above. Could you please have a try?
> Can we do that or do we need a more tailored fix? request_ops_lock only
> appeared in 6.15 and the bug AFAIU dates back to 6.14. We don't normally
> worry about backports given the stream of fixes, but request_ops_lock is
> a bit risky. Jason's suggestion AFAIU is just to wrap the disable/enable
> pairs in the lock. We can try request_ops_lock in -next?
>
> Bui Quang Minh, could you add a selftest for this problem?
> See tools/testing/selftests/drivers/net/virtio_net/
> You can re-use / extend the XSK helper from
> tools/testing/selftests/drivers/net/xdp_helper.c ?
> (move it to tools/testing/selftests/net/lib for easier access)
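
If I read Paolo's suggestion above correctly, it amounts to roughly the
following in virtnet_probe() (completely untested sketch, surrounding
context elided):

	/* opt in to the core taking the instance lock around ops, as suggested above */
	dev->request_ops_lock = true;

	err = register_netdevice(dev);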

I've just tried the current selftests and found out that queues.py can
trigger the deadlock.

Running the test below a few times on a guest with a virtio-net interface
reproduces it with quite high probability:

~ NETIF=enp0s2 ./ksft-net-drv/run_kselftest.sh -t drivers/net:queues.py

The test hangs.

I got the following hung task warning:

[  363.841549]  #0: ffff88800015c758 ((wq_completion)events){+.+.}-{0:0}, at: process_scheduled_works+0x1a9/0x3c9
[  363.841560]  #1: ffffc90000043e28 ((work_completion)(&pool->work)){+.+.}-{0:0}, at: process_scheduled_works+0x1c2/0x3c9
[  363.841570]  #2: ffffffff81f5a150 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock+0x1b/0x1d
[  363.841579]  #3: ffff8880035e49f0 (&dev->lock){+.+.}-{4:4}, at: netdev_lock+0x12/0x14
[  363.841587] 3 locks held by kworker/2:0/26:
[  363.841589]  #0: ffff88800015c758 ((wq_completion)events){+.+.}-{0:0}, at: process_scheduled_works+0x1a9/0x3c9
[  363.841598]  #1: ffffc900000f7e28 ((work_completion)(&(&vi->refill)->work)){+.+.}-{0:0}, at: process_scheduled_works+0x1c2/0x3c9
[  363.841608]  #2: ffff8880035e49f0 (&dev->lock){+.+.}-{4:4}, at: netdev_lock+0x12/0x14

~ cat /proc/7/stack
[<0>] netdev_lock+0x12/0x14
[<0>] napi_enable+0x1a/0x35
[<0>] virtnet_napi_do_enable+0x1a/0x40
[<0>] virtnet_napi_enable+0x32/0x4a
[<0>] virtnet_rx_resume+0x4f/0x56
[<0>] virtnet_rq_bind_xsk_pool+0xd8/0xfd
[<0>] virtnet_xdp+0x6d3/0x72d
[<0>] xp_disable_drv_zc+0x5a/0x63
[<0>] xp_clear_dev+0x1d/0x4a
[<0>] xp_release_deferred+0x24/0x75
[<0>] process_scheduled_works+0x23e/0x3c9
[<0>] worker_thread+0x13f/0x1ca
[<0>] kthread+0x1b2/0x1ba
[<0>] ret_from_fork+0x2b/0x40
[<0>] ret_from_fork_asm+0x11/0x20

~ cat /proc/26/stack
[<0>] napi_disable_locked+0x41/0xaf
[<0>] napi_disable+0x22/0x35
[<0>] refill_work+0x4b/0x8f
[<0>] process_scheduled_works+0x23e/0x3c9
[<0>] worker_thread+0x13f/0x1ca
[<0>] kthread+0x1b2/0x1ba
[<0>] ret_from_fork+0x2b/0x40
[<0>] ret_from_fork_asm+0x11/0x20

The deadlock is caused by a race between xsk unbind and refill_work: the
xsk unbind path pauses rx (disabling the napi), refill_work then calls
napi_disable() on the already-disabled napi and sleeps in
napi_disable_locked() while still holding netdev_lock, so the napi_enable()
in virtnet_rx_resume() can never acquire the lock.
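
In userspace terms the lock ordering looks roughly like the sketch below
(a minimal pthread mock-up; the names only mimic the kernel APIs, it is
not kernel code):

/*
 * The "refill" side takes the instance lock and then waits for a napi
 * that was already disabled, while the "xsk unbind" side needs the same
 * lock to re-enable it, so neither side can make progress.
 */
#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

static pthread_mutex_t netdev_lock = PTHREAD_MUTEX_INITIALIZER; /* ~ dev->lock */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t state_cv = PTHREAD_COND_INITIALIZER;
static bool napi_enabled;	/* false: rx already paused by the xsk path */

static void *refill_side(void *arg)
{
	pthread_mutex_lock(&netdev_lock);	/* napi_disable() takes dev->lock */
	pthread_mutex_lock(&state_lock);
	while (!napi_enabled)			/* napi_disable_locked() waits... */
		pthread_cond_wait(&state_cv, &state_lock);	/* ...forever */
	napi_enabled = false;
	pthread_mutex_unlock(&state_lock);
	pthread_mutex_unlock(&netdev_lock);	/* never reached */
	return NULL;
}

static void *xsk_unbind_side(void *arg)
{
	sleep(1);				/* let the refill side win the race */
	pthread_mutex_lock(&netdev_lock);	/* napi_enable() blocks here forever */
	pthread_mutex_lock(&state_lock);
	napi_enabled = true;
	pthread_cond_broadcast(&state_cv);
	pthread_mutex_unlock(&state_lock);
	pthread_mutex_unlock(&netdev_lock);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, refill_side, NULL);
	pthread_create(&b, NULL, xsk_unbind_side, NULL);
	pthread_join(a, NULL);			/* hangs */
	pthread_join(b, NULL);
	return 0;
}

Built with cc -pthread, this hangs the same way: the enable side can never
run because the side waiting for it already holds the lock it needs.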


Thanks,
Quang Minh.

