Message-ID: <CAGxU2F6KaqmmK7qP55Rs8YHLOX62HyT77wY-RU1qPFpjhgV4jg@mail.gmail.com>
Date: Tue, 21 Oct 2025 14:19:40 +0200
From: Stefano Garzarella <sgarzare@...hat.com>
To: syzbot <syzbot+10e35716f8e4929681fa@...kaller.appspotmail.com>,
Michal Luczaj <mhal@...x.co>
Cc: davem@...emloft.net, edumazet@...gle.com, horms@...nel.org,
kuba@...nel.org, linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
pabeni@...hat.com, syzkaller-bugs@...glegroups.com,
virtualization@...ts.linux.dev
Subject: Re: [syzbot] [virt?] [net?] possible deadlock in vsock_linger
On Tue, 21 Oct 2025 at 12:48, Stefano Garzarella <sgarzare@...hat.com> wrote:
>
> On Tue, 21 Oct 2025 at 10:27, Stefano Garzarella <sgarzare@...hat.com> wrote:
> >
> > Hi Michal,
> >
> > On Mon, Oct 20, 2025 at 05:02:56PM -0700, syzbot wrote:
> > >Hello,
> > >
> > >syzbot found the following issue on:
> > >
> > >HEAD commit: d9043c79ba68 Merge tag 'sched_urgent_for_v6.18_rc2' of git..
> > >git tree: upstream
> > >console output: https://syzkaller.appspot.com/x/log.txt?x=130983cd980000
> > >kernel config: https://syzkaller.appspot.com/x/.config?x=f3e7b5a3627a90dd
> > >dashboard link: https://syzkaller.appspot.com/bug?extid=10e35716f8e4929681fa
> > >compiler: gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > >syz repro: https://syzkaller.appspot.com/x/repro.syz?x=17f0f52f980000
> > >C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11ea9734580000
> > >
> > >Downloadable assets:
> > >disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-d9043c79.raw.xz
> > >vmlinux: https://storage.googleapis.com/syzbot-assets/0546b6eaf1aa/vmlinux-d9043c79.xz
> > >kernel image: https://storage.googleapis.com/syzbot-assets/81285b4ada51/bzImage-d9043c79.xz
> > >
> > >IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > >Reported-by: syzbot+10e35716f8e4929681fa@...kaller.appspotmail.com
> > >
> > >======================================================
> > >WARNING: possible circular locking dependency detected
> > >syzkaller #0 Not tainted
> > >------------------------------------------------------
> > >syz.0.17/6098 is trying to acquire lock:
> > >ffff8880363b8258 (sk_lock-AF_VSOCK){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1679 [inline]
> > >ffff8880363b8258 (sk_lock-AF_VSOCK){+.+.}-{0:0}, at: vsock_linger+0x25e/0x4d0 net/vmw_vsock/af_vsock.c:1066
> >
> > Could this be related to our recent work on linger in vsock?
> >
> > >
> > >but task is already holding lock:
> > >ffffffff906260a8 (vsock_register_mutex){+.+.}-{4:4}, at: vsock_assign_transport+0xf2/0x900 net/vmw_vsock/af_vsock.c:469
> > >
> > >which lock already depends on the new lock.
> > >
> > >
> > >the existing dependency chain (in reverse order) is:
> > >
> > >-> #1 (vsock_register_mutex){+.+.}-{4:4}:
> > > __mutex_lock_common kernel/locking/mutex.c:598 [inline]
> > > __mutex_lock+0x193/0x1060 kernel/locking/mutex.c:760
> > > vsock_registered_transport_cid net/vmw_vsock/af_vsock.c:560 [inline]
> >
> > Ah, no maybe this is related to commit 209fd720838a ("vsock:
> > Fix transport_{g2h,h2g} TOCTOU") where we added locking in
> > vsock_find_cid().
> >
> > Maybe we can just move the checks on top of __vsock_bind() to the
> > caller. I mean:
> >
> > /* First ensure this socket isn't already bound. */
> > if (vsock_addr_bound(&vsk->local_addr))
> > return -EINVAL;
> >
> > /* Now bind to the provided address or select appropriate values if
> > * none are provided (VMADDR_CID_ANY and VMADDR_PORT_ANY). Note that
> > * like AF_INET prevents binding to a non-local IP address (in most
> > * cases), we only allow binding to a local CID.
> > */
> > if (addr->svm_cid != VMADDR_CID_ANY && !vsock_find_cid(addr->svm_cid))
> > return -EADDRNOTAVAIL;
> >
> > We have 2 callers: vsock_auto_bind() and vsock_bind().
> >
> > vsock_auto_bind() is already checking if the socket is already bound,
> > if not is setting VMADDR_CID_ANY, so we can skip those checks.
> >
> > In vsock_bind() we can do the checks before lock_sock(sk), at least the
> > checks on vm_addr, calling vsock_find_cid().
> >
> > I'm preparing a patch to do this.
>
> mmm, no, this is more related to vsock_linger() where sk_wait_event()
> releases and locks again the sk_lock.
> So, it should be related to commit 687aa0c5581b ("vsock: Fix
> transport_* TOCTOU") where we take vsock_register_mutex in
> vsock_assign_transport() while calling vsk->transport->release().
>
> So, maybe we need to move the release and vsock_deassign_transport()
> after unlocking vsock_register_mutex.

I implemented this here:
https://lore.kernel.org/netdev/20251021121718.137668-1-sgarzare@redhat.com/

syzbot successfully tested it.

Stefano
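
To make the lock ordering discussed above concrete, here is a minimal userspace sketch. It is not the patch linked above and not kernel code: the two pthread mutexes only stand in for vsock_register_mutex and sk_lock, and every function name in it (release_with_linger(), assign_transport_old(), assign_transport_new(), run_old_ordering(), run_new_ordering()) is hypothetical. It assumes the caller already holds the socket lock, as described for vsock_assign_transport(), and counts how often the socket lock is taken while the register mutex is held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Stand-ins for vsock_register_mutex and the socket lock (assumption:
 * plain pthread mutexes, exercised from a single thread only). */
static pthread_mutex_t register_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t sk_lock = PTHREAD_MUTEX_INITIALIZER;

static bool register_held; /* true while register_mutex is held */
static bool sk_held;       /* true while sk_lock is held */
static int inversions;     /* sk_lock taken under register_mutex */

static void lock_register(void)
{
    pthread_mutex_lock(&register_mutex);
    register_held = true;
}

static void unlock_register(void)
{
    register_held = false;
    pthread_mutex_unlock(&register_mutex);
}

static void lock_sk(void)
{
    /* Record the ordering a lock validator would complain about. */
    if (register_held)
        inversions++;
    pthread_mutex_lock(&sk_lock);
    sk_held = true;
}

static void unlock_sk(void)
{
    sk_held = false;
    pthread_mutex_unlock(&sk_lock);
}

/* Hypothetical stand-in for a release path that lingers: like
 * sk_wait_event(), it drops the socket lock and takes it again. */
static void release_with_linger(void)
{
    assert(sk_held);
    unlock_sk();
    lock_sk();
}

/* Old ordering: release runs while register_mutex is still held. */
static void assign_transport_old(void)
{
    lock_register();
    release_with_linger();
    /* ... the new transport would be looked up here ... */
    unlock_register();
}

/* Proposed ordering: release runs after register_mutex is dropped. */
static void assign_transport_new(void)
{
    lock_register();
    /* ... the new transport would be looked up here ... */
    unlock_register();
    release_with_linger();
}

static int run(void (*assign)(void))
{
    inversions = 0;
    lock_sk(); /* the caller holds the socket lock on entry */
    assign();
    unlock_sk();
    return inversions;
}

int run_old_ordering(void) { return run(assign_transport_old); }
int run_new_ordering(void) { return run(assign_transport_new); }
```

In this sketch run_old_ordering() reports one acquisition of the socket lock under the register mutex, which is the reverse of the sk_lock -> vsock_register_mutex order seen on entry, while run_new_ordering() reports none. The real fix may differ in how it keeps the transport valid after the mutex is dropped; that part is not modeled here.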