lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e6411cb5ec7e5ecc211faded7af4843647c6143a.camel@redhat.com>
Date: Thu, 30 May 2024 10:34:07 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Heng Qi <hengqi@...ux.alibaba.com>, netdev@...r.kernel.org, 
	virtualization@...ts.linux.dev
Cc: Jason Wang <jasowang@...hat.com>, "Michael S. Tsirkin" <mst@...hat.com>,
  Xuan Zhuo <xuanzhuo@...ux.alibaba.com>, "David S. Miller"
 <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski
 <kuba@...nel.org>, Jiri Pirko <jiri@...nulli.us>, Daniel Jurgens
 <danielj@...dia.com>
Subject: Re: [PATCH net v3 2/2] virtio_net: fix a spurious deadlock issue

On Tue, 2024-05-28 at 21:41 +0800, Heng Qi wrote:
> When the following snippet is run, lockdep will report a deadlock[1].
> 
>   /* Acquire all queues dim_locks */
>   for (i = 0; i < vi->max_queue_pairs; i++)
>           mutex_lock(&vi->rq[i].dim_lock);
> 
> There's no deadlock here because the vq locks are always taken
> in the same order, but lockdep can not figure it out. So refactoring
> the code to alleviate the problem.
> 
> [1]
> ========================================================
> WARNING: possible recursive locking detected
> 6.9.0-rc7+ #319 Not tainted
> --------------------------------------------
> ethtool/962 is trying to acquire lock:
> 
> but task is already holding lock:
> 
> other info that might help us debug this:
> Possible unsafe locking scenario:
> 
>       CPU0
>       ----
>  lock(&vi->rq[i].dim_lock);
>  lock(&vi->rq[i].dim_lock);
> 
> *** DEADLOCK ***
> 
>  May be due to missing lock nesting notation
> 
> 3 locks held by ethtool/962:
>  #0: ffffffff82dbaab0 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40
>  #1: ffffffff82dad0a8 (rtnl_mutex){+.+.}-{3:3}, at:
> 				ethnl_default_set_doit+0xbe/0x1e0
> 
> stack backtrace:
> CPU: 6 PID: 962 Comm: ethtool Not tainted 6.9.0-rc7+ #319
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 	   rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> Call Trace:
>  <TASK>
>  dump_stack_lvl+0x79/0xb0
>  check_deadlock+0x130/0x220
>  __lock_acquire+0x861/0x990
>  lock_acquire.part.0+0x72/0x1d0
>  ? lock_acquire+0xf8/0x130
>  __mutex_lock+0x71/0xd50
>  virtnet_set_coalesce+0x151/0x190
>  __ethnl_set_coalesce.isra.0+0x3f8/0x4d0
>  ethnl_set_coalesce+0x34/0x90
>  ethnl_default_set_doit+0xdd/0x1e0
>  genl_family_rcv_msg_doit+0xdc/0x130
>  genl_family_rcv_msg+0x154/0x230
>  ? __pfx_ethnl_default_set_doit+0x10/0x10
>  genl_rcv_msg+0x4b/0xa0
>  ? __pfx_genl_rcv_msg+0x10/0x10
>  netlink_rcv_skb+0x5a/0x110
>  genl_rcv+0x28/0x40
>  netlink_unicast+0x1af/0x280
>  netlink_sendmsg+0x20e/0x460
>  __sys_sendto+0x1fe/0x210
>  ? find_held_lock+0x2b/0x80
>  ? do_user_addr_fault+0x3a2/0x8a0
>  ? __lock_release+0x5e/0x160
>  ? do_user_addr_fault+0x3a2/0x8a0
>  ? lock_release+0x72/0x140
>  ? do_user_addr_fault+0x3a7/0x8a0
>  __x64_sys_sendto+0x29/0x30
>  do_syscall_64+0x78/0x180
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> Fixes: 4d4ac2ececd3 ("virtio_net: Add a lock for per queue RX coalesce")
> Signed-off-by: Heng Qi <hengqi@...ux.alibaba.com>

This would have deserved a changelog after the commit message.

The patch LGTM (for obvious reasons ;), but it deserves an explicit ack
from Jason and/or Michael

Cheers,

Paolo


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ