Message-ID: <9a2b5887-3874-4d3d-bbd8-0b5a25d3e605@bytedance.com>
Date: Wed, 16 Jul 2025 19:20:25 +0800
From: Zigit Zo <zuozhijie@...edance.com>
To: Jason Wang <jasowang@...hat.com>
Cc: mst@...hat.com, xuanzhuo@...ux.alibaba.com, eperezma@...hat.com,
virtualization@...ts.linux.dev, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, zuozhijie@...edance.com
Subject: Re: Re: Re: [PATCH net v2] virtio-net: fix a rtnl_lock() deadlock
during probing
On 7/16/25 10:47 AM, Jason Wang wrote:
> Hi Zigit:
>
> On Tue, Jul 15, 2025 at 7:00 PM Zigit Zo <zuozhijie@...edance.com> wrote:
>>
>> On 7/15/25 5:31 PM, Jason Wang wrote:
>>> On Wed, Jul 2, 2025 at 6:37 PM Zigit Zo <zuozhijie@...edance.com> wrote:
>>>>
>>>> This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while
>>>> the virtio-net driver is still probing with rtnl_lock() held; this causes
>>>> a recursive mutex lock in netdev_notify_peers().
>>>>
>>>> Fix it by temporarily saving the announce status while probing; then, in
>>>> virtnet_open(), if a delayed announce is pending, schedule
>>>> virtnet_config_changed_work().
>>>>
>>>> Another possible solution is to directly check rtnl_is_locked() and call
>>>> __netdev_notify_peers(), but that way we would need to rely on
>>>> netdev_queue to schedule the ARP packets after ndo_open(), which we
>>>> thought was not very intuitive.
>>>>
>>>> We've observed a softlockup on Ubuntu 24.04; it can be reproduced by
>>>> having QEMU send announce_self rapidly while booting.
>>>>
>>>> [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds.
>>>> [ 494.167667] Not tainted 6.8.0-57-generic #59-Ubuntu
>>>> [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000
>>>> [ 494.168260] Call Trace:
>>>> [ 494.168329] <TASK>
>>>> [ 494.168389] __schedule+0x27c/0x6b0
>>>> [ 494.168495] schedule+0x33/0x110
>>>> [ 494.168585] schedule_preempt_disabled+0x15/0x30
>>>> [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740
>>>> [ 494.168835] __mutex_lock_slowpath+0x13/0x20
>>>> [ 494.168949] mutex_lock+0x3c/0x50
>>>> [ 494.169039] rtnl_lock+0x15/0x20
>>>> [ 494.169128] netdev_notify_peers+0x12/0x30
>>>> [ 494.169240] virtnet_config_changed_work+0x152/0x1a0
>>>> [ 494.169377] virtnet_probe+0xa48/0xe00
>>>> [ 494.169484] ? vp_get+0x4d/0x100
>>>> [ 494.169574] virtio_dev_probe+0x1e9/0x310
>>>> [ 494.169682] really_probe+0x1c7/0x410
>>>> [ 494.169783] __driver_probe_device+0x8c/0x180
>>>> [ 494.169901] driver_probe_device+0x24/0xd0
>>>> [ 494.170011] __driver_attach+0x10b/0x210
>>>> [ 494.170117] ? __pfx___driver_attach+0x10/0x10
>>>> [ 494.170237] bus_for_each_dev+0x8d/0xf0
>>>> [ 494.170341] driver_attach+0x1e/0x30
>>>> [ 494.170440] bus_add_driver+0x14e/0x290
>>>> [ 494.170548] driver_register+0x5e/0x130
>>>> [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10
>>>> [ 494.170788] register_virtio_driver+0x20/0x40
>>>> [ 494.170905] virtio_net_driver_init+0x97/0xb0
>>>> [ 494.171022] do_one_initcall+0x5e/0x340
>>>> [ 494.171128] do_initcalls+0x107/0x230
>>>> [ 494.171228] ? __pfx_kernel_init+0x10/0x10
>>>> [ 494.171340] kernel_init_freeable+0x134/0x210
>>>> [ 494.171462] kernel_init+0x1b/0x200
>>>> [ 494.171560] ret_from_fork+0x47/0x70
>>>> [ 494.171659] ? __pfx_kernel_init+0x10/0x10
>>>> [ 494.171769] ret_from_fork_asm+0x1b/0x30
>>>> [ 494.171875] </TASK>
>>>>
>>>> Fixes: df28de7b0050 ("virtio-net: synchronize operstate with admin state on up/down")
>>>> Signed-off-by: Zigit Zo <zuozhijie@...edance.com>
>>>> ---
>>>> v1 -> v2:
>>>> - Check vi->status in virtnet_open().
>>>> v1:
>>>> - https://lore.kernel.org/netdev/20250630095109.214013-1-zuozhijie@bytedance.com/
>>>> ---
>>>> drivers/net/virtio_net.c | 43 ++++++++++++++++++++++++----------------
>>>> 1 file changed, 26 insertions(+), 17 deletions(-)
>>>>
>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>> index e53ba600605a..859add98909b 100644
>>>> --- a/drivers/net/virtio_net.c
>>>> +++ b/drivers/net/virtio_net.c
>>>> @@ -3151,6 +3151,10 @@ static int virtnet_open(struct net_device *dev)
>>>> if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) {
>>>> if (vi->status & VIRTIO_NET_S_LINK_UP)
>>>> netif_carrier_on(vi->dev);
>>>> + if (vi->status & VIRTIO_NET_S_ANNOUNCE) {
>>>> + vi->status &= ~VIRTIO_NET_S_ANNOUNCE;
>>>> + schedule_work(&vi->config_work);
>>>> + }
>>>> virtio_config_driver_enable(vi->vdev);
>>>
>>> Instead of doing tricks like this.
>>>
>>> I wonder if the fix is as simple as calling
>>> virtio_config_driver_disable() before init_vqs()?
>>>
>>> Thanks
>>>
>>
>> That might not work, as a device like QEMU will set VIRTIO_NET_S_ANNOUNCE
>> regardless of most of the driver state; QEMU only checks whether the driver
>> has finalized its features with VIRTIO_NET_F_GUEST_ANNOUNCE & VIRTIO_NET_F_CTRL_VQ.
>>
>> We made a small patch to verify this. We don't know if it matches your
>> thought, but it does not seem to work :(
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index e53ba600605a..f309ce3fe243 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -6903,6 +6903,9 @@ static int virtnet_probe(struct virtio_device *vdev)
>> vi->curr_queue_pairs = num_online_cpus();
>> vi->max_queue_pairs = max_queue_pairs;
>>
>> + /* Disable config change notification until ndo_open. */
>> + virtio_config_driver_disable(vi->vdev);
>> +
>> /* Allocate/initialize the rx/tx queues, and invoke find_vqs */
>> err = init_vqs(vi);
>> if (err)
>> @@ -6965,9 +6968,6 @@ static int virtnet_probe(struct virtio_device *vdev)
>> goto free_failover;
>> }
>>
>> - /* Disable config change notification until ndo_open. */
>> - virtio_config_driver_disable(vi->vdev);
>> -
>> virtio_device_ready(vdev);
>>
>> if (vi->has_rss || vi->has_rss_hash_report) {
>>
>> To reproduce:
>>
>> 1. Spawn QEMU with a monitor, e.g. `-monitor unix:qemu.sock,server`
>> 2. In another window, run `while true; do echo "announce_self"; done | socat - unix-connect:qemu.sock > /dev/null`
>> 3. Boot-up hangs while probing virtio_net
>>
>> The simplest version we've made is to revert the direct call to
>> `virtnet_config_changed_work()` back to `schedule_work()`, but, as in v1,
>> we're still trying to understand the impact and make sure it won't break
>> other things.
>>
>> Regards,
>>
>
> Thanks for the clarification. Now I see the issue.
>
> It looks like the root cause is calling virtnet_config_changed_work()
> directly during probe().
>
> Let's switch to using virtio_config_changed() instead so that we can
> properly check config_driver_disabled.
>
> Thanks
>
Yes, we were just wondering why commit df28de7b0050 switched to calling
`virtnet_config_changed_work()` directly, and therefore tried to keep that
behavior as much as possible, to avoid breaking something.
Anyway, v3 will be sent soon. Thanks for the review!
Regards,