[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1717058986.0899282-2-hengqi@linux.alibaba.com>
Date: Thu, 30 May 2024 16:49:46 +0800
From: Heng Qi <hengqi@...ux.alibaba.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: Jason Wang <jasowang@...hat.com>,
"Michael S. Tsirkin" <mst@...hat.com>,
Xuan Zhuo <xuanzhuo@...ux.alibaba.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Jiri Pirko <jiri@...nulli.us>,
Daniel Jurgens <danielj@...dia.com>,
netdev@...r.kernel.org,
virtualization@...ts.linux.dev
Subject: Re: [PATCH net v3 2/2] virtio_net: fix a spurious deadlock issue
On Thu, 30 May 2024 10:34:07 +0200, Paolo Abeni <pabeni@...hat.com> wrote:
> On Tue, 2024-05-28 at 21:41 +0800, Heng Qi wrote:
> > When the following snippet is run, lockdep will report a deadlock[1].
> >
> > /* Acquire all queues dim_locks */
> > for (i = 0; i < vi->max_queue_pairs; i++)
> > mutex_lock(&vi->rq[i].dim_lock);
> >
> > There's no deadlock here because the vq locks are always taken
> > in the same order, but lockdep can not figure it out. So refactoring
> > the code to alleviate the problem.
> >
> > [1]
> > ========================================================
> > WARNING: possible recursive locking detected
> > 6.9.0-rc7+ #319 Not tainted
> > --------------------------------------------
> > ethtool/962 is trying to acquire lock:
> >
> > but task is already holding lock:
> >
> > other info that might help us debug this:
> > Possible unsafe locking scenario:
> >
> > CPU0
> > ----
> > lock(&vi->rq[i].dim_lock);
> > lock(&vi->rq[i].dim_lock);
> >
> > *** DEADLOCK ***
> >
> > May be due to missing lock nesting notation
> >
> > 3 locks held by ethtool/962:
> > #0: ffffffff82dbaab0 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40
> > #1: ffffffff82dad0a8 (rtnl_mutex){+.+.}-{3:3}, at:
> > ethnl_default_set_doit+0xbe/0x1e0
> >
> > stack backtrace:
> > CPU: 6 PID: 962 Comm: ethtool Not tainted 6.9.0-rc7+ #319
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > Call Trace:
> > <TASK>
> > dump_stack_lvl+0x79/0xb0
> > check_deadlock+0x130/0x220
> > __lock_acquire+0x861/0x990
> > lock_acquire.part.0+0x72/0x1d0
> > ? lock_acquire+0xf8/0x130
> > __mutex_lock+0x71/0xd50
> > virtnet_set_coalesce+0x151/0x190
> > __ethnl_set_coalesce.isra.0+0x3f8/0x4d0
> > ethnl_set_coalesce+0x34/0x90
> > ethnl_default_set_doit+0xdd/0x1e0
> > genl_family_rcv_msg_doit+0xdc/0x130
> > genl_family_rcv_msg+0x154/0x230
> > ? __pfx_ethnl_default_set_doit+0x10/0x10
> > genl_rcv_msg+0x4b/0xa0
> > ? __pfx_genl_rcv_msg+0x10/0x10
> > netlink_rcv_skb+0x5a/0x110
> > genl_rcv+0x28/0x40
> > netlink_unicast+0x1af/0x280
> > netlink_sendmsg+0x20e/0x460
> > __sys_sendto+0x1fe/0x210
> > ? find_held_lock+0x2b/0x80
> > ? do_user_addr_fault+0x3a2/0x8a0
> > ? __lock_release+0x5e/0x160
> > ? do_user_addr_fault+0x3a2/0x8a0
> > ? lock_release+0x72/0x140
> > ? do_user_addr_fault+0x3a7/0x8a0
> > __x64_sys_sendto+0x29/0x30
> > do_syscall_64+0x78/0x180
> > entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >
> > Fixes: 4d4ac2ececd3 ("virtio_net: Add a lock for per queue RX coalesce")
> > Signed-off-by: Heng Qi <hengqi@...ux.alibaba.com>
>
> This would have deserved a changelog after the commit message.
I declared the changelog in the cover-letter, but I can initiate a new
RESEND version with a changelog in this patch if you want :)
>
> The patch LGTM (for obvious reasons ;), but it deserves an explicit ack
> from Jason and/or Michael
Thanks.
>
> Cheers,
>
> Paolo
>
Powered by blists - more mailing lists