[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d5ad1b10-f485-4939-b9de-918b378362b9@bytedance.com>
Date: Tue, 15 Jul 2025 18:59:51 +0800
From: Zigit Zo <zuozhijie@...edance.com>
To: Jason Wang <jasowang@...hat.com>
Cc: mst@...hat.com, xuanzhuo@...ux.alibaba.com, eperezma@...hat.com,
virtualization@...ts.linux.dev, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: Re: [PATCH net v2] virtio-net: fix a rtnl_lock() deadlock during
probing
On 7/15/25 5:31 PM, Jason Wang wrote:
> On Wed, Jul 2, 2025 at 6:37 PM Zigit Zo <zuozhijie@...edance.com> wrote:
>>
>> This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while
>> the virtio-net driver is still probing with rtnl_lock() hold, this will
>> cause a recursive mutex in netdev_notify_peers().
>>
>> Fix it by temporarily save the announce status while probing, and then in
>> virtnet_open(), if it sees a delayed announce work is there, it starts to
>> schedule the virtnet_config_changed_work().
>>
>> Another possible solution is to directly check whether rtnl_is_locked()
>> and call __netdev_notify_peers(), but in that way means we need to relies
>> on netdev_queue to schedule the arp packets after ndo_open(), which we
>> thought is not very intuitive.
>>
>> We've observed a softlockup with Ubuntu 24.04, and can be reproduced with
>> QEMU sending the announce_self rapidly while booting.
>>
>> [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds.
>> [ 494.167667] Not tainted 6.8.0-57-generic #59-Ubuntu
>> [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000
>> [ 494.168260] Call Trace:
>> [ 494.168329] <TASK>
>> [ 494.168389] __schedule+0x27c/0x6b0
>> [ 494.168495] schedule+0x33/0x110
>> [ 494.168585] schedule_preempt_disabled+0x15/0x30
>> [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740
>> [ 494.168835] __mutex_lock_slowpath+0x13/0x20
>> [ 494.168949] mutex_lock+0x3c/0x50
>> [ 494.169039] rtnl_lock+0x15/0x20
>> [ 494.169128] netdev_notify_peers+0x12/0x30
>> [ 494.169240] virtnet_config_changed_work+0x152/0x1a0
>> [ 494.169377] virtnet_probe+0xa48/0xe00
>> [ 494.169484] ? vp_get+0x4d/0x100
>> [ 494.169574] virtio_dev_probe+0x1e9/0x310
>> [ 494.169682] really_probe+0x1c7/0x410
>> [ 494.169783] __driver_probe_device+0x8c/0x180
>> [ 494.169901] driver_probe_device+0x24/0xd0
>> [ 494.170011] __driver_attach+0x10b/0x210
>> [ 494.170117] ? __pfx___driver_attach+0x10/0x10
>> [ 494.170237] bus_for_each_dev+0x8d/0xf0
>> [ 494.170341] driver_attach+0x1e/0x30
>> [ 494.170440] bus_add_driver+0x14e/0x290
>> [ 494.170548] driver_register+0x5e/0x130
>> [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10
>> [ 494.170788] register_virtio_driver+0x20/0x40
>> [ 494.170905] virtio_net_driver_init+0x97/0xb0
>> [ 494.171022] do_one_initcall+0x5e/0x340
>> [ 494.171128] do_initcalls+0x107/0x230
>> [ 494.171228] ? __pfx_kernel_init+0x10/0x10
>> [ 494.171340] kernel_init_freeable+0x134/0x210
>> [ 494.171462] kernel_init+0x1b/0x200
>> [ 494.171560] ret_from_fork+0x47/0x70
>> [ 494.171659] ? __pfx_kernel_init+0x10/0x10
>> [ 494.171769] ret_from_fork_asm+0x1b/0x30
>> [ 494.171875] </TASK>
>>
>> Fixes: df28de7b0050 ("virtio-net: synchronize operstate with admin state on up/down")
>> Signed-off-by: Zigit Zo <zuozhijie@...edance.com>
>> ---
>> v1 -> v2:
>> - Check vi->status in virtnet_open().
>> v1:
>> - https://lore.kernel.org/netdev/20250630095109.214013-1-zuozhijie@bytedance.com/
>> ---
>> drivers/net/virtio_net.c | 43 ++++++++++++++++++++++++----------------
>> 1 file changed, 26 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index e53ba600605a..859add98909b 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -3151,6 +3151,10 @@ static int virtnet_open(struct net_device *dev)
>> if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) {
>> if (vi->status & VIRTIO_NET_S_LINK_UP)
>> netif_carrier_on(vi->dev);
>> + if (vi->status & VIRTIO_NET_S_ANNOUNCE) {
>> + vi->status &= ~VIRTIO_NET_S_ANNOUNCE;
>> + schedule_work(&vi->config_work);
>> + }
>> virtio_config_driver_enable(vi->vdev);
>
> Instead of doing tricks like this.
>
> I wonder if the fix is as simple as calling
> virtio_config_driver_disable() before init_vqs()?
>
> Thanks
>
That might not work as the device like QEMU will set the VIRTIO_NET_S_ANNOUNCE
regardless of most of the driver status, QEMU only checks whether the driver has
finalized it's features with VIRTIO_NET_F_GUEST_ANNOUNCE & VIRTIO_NET_F_CTRL_VQ.
We've made a little patch to verify, don't know if it matches your thought, but
it does not seem to work :(
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index e53ba600605a..f309ce3fe243 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -6903,6 +6903,9 @@ static int virtnet_probe(struct virtio_device *vdev)
vi->curr_queue_pairs = num_online_cpus();
vi->max_queue_pairs = max_queue_pairs;
+ /* Disable config change notification until ndo_open. */
+ virtio_config_driver_disable(vi->vdev);
+
/* Allocate/initialize the rx/tx queues, and invoke find_vqs */
err = init_vqs(vi);
if (err)
@@ -6965,9 +6968,6 @@ static int virtnet_probe(struct virtio_device *vdev)
goto free_failover;
}
- /* Disable config change notification until ndo_open. */
- virtio_config_driver_disable(vi->vdev);
-
virtio_device_ready(vdev);
if (vi->has_rss || vi->has_rss_hash_report) {
For reproduce details,
1. Spawn qemu with monitor, like `-monitor unix:qemu.sock,server`
2. In another window, run `while true; echo "announce_self"; end | socat - unix-connect:qemu.sock > /dev/null`
3. The boot up will get hanged when probing the virtio_net
The simplest version we've made is to revert the usage of
`virtnet_config_changed_work()` back to the `schedule_work()`, but as in v1,
we're still trying to understand the impact, making sure that it won't break
other things.
Regards,
Powered by blists - more mailing lists