linux-kernel - Re: [syzbot] [virt?] [net?] possible deadlock in vsock

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <9a0daed1-fd18-4fef-99e9-4058fd4abe18@rbox.co>
Date: Mon, 17 Nov 2025 22:00:18 +0100
From: Michal Luczaj <mhal@...x.co>
To: Stefano Garzarella <sgarzare@...hat.com>
Cc: syzbot <syzbot+10e35716f8e4929681fa@...kaller.appspotmail.com>,
 davem@...emloft.net, edumazet@...gle.com, horms@...nel.org, kuba@...nel.org,
 linux-kernel@...r.kernel.org, netdev@...r.kernel.org, pabeni@...hat.com,
 syzkaller-bugs@...glegroups.com, virtualization@...ts.linux.dev
Subject: Re: [syzbot] [virt?] [net?] possible deadlock in vsock_linger

On 11/17/25 10:57, Stefano Garzarella wrote:
> On Sat, Nov 15, 2025 at 05:00:28PM +0100, Michal Luczaj wrote:
>> On 10/21/25 14:19, Stefano Garzarella wrote:
>>> On Tue, 21 Oct 2025 at 12:48, Stefano Garzarella <sgarzare@...hat.com> wrote:
>>>>
>>>> On Tue, 21 Oct 2025 at 10:27, Stefano Garzarella <sgarzare@...hat.com> wrote:
>>>>>
>>>>> Hi Michal,
>>>>>
>>>>> On Mon, Oct 20, 2025 at 05:02:56PM -0700, syzbot wrote:
>>>>>> Hello,
>>>>>>
>>>>>> syzbot found the following issue on:
>>>>>>
>>>>>> HEAD commit:    d9043c79ba68 Merge tag 'sched_urgent_for_v6.18_rc2' of git..
>>>>>> git tree:       upstream
>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=130983cd980000
>>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=f3e7b5a3627a90dd
>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=10e35716f8e4929681fa
>>>>>> compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
>>>>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17f0f52f980000
>>>>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11ea9734580000
>>>>>>
>>>>>> Downloadable assets:
>>>>>> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-d9043c79.raw.xz
>>>>>> vmlinux: https://storage.googleapis.com/syzbot-assets/0546b6eaf1aa/vmlinux-d9043c79.xz
>>>>>> kernel image: https://storage.googleapis.com/syzbot-assets/81285b4ada51/bzImage-d9043c79.xz
>>>>>>
>>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>>>> Reported-by: syzbot+10e35716f8e4929681fa@...kaller.appspotmail.com
>>>>>>
>>>>>> ======================================================
>>>>>> WARNING: possible circular locking dependency detected
>>>>>> syzkaller #0 Not tainted
>>>>>> ------------------------------------------------------
>>>>>> syz.0.17/6098 is trying to acquire lock:
>>>>>> ffff8880363b8258 (sk_lock-AF_VSOCK){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1679 [inline]
>>>>>> ffff8880363b8258 (sk_lock-AF_VSOCK){+.+.}-{0:0}, at: vsock_linger+0x25e/0x4d0 net/vmw_vsock/af_vsock.c:1066
>>>>>
>>>>> Could this be related to our recent work on linger in vsock?
>>>>>
>>>>>>
>>>>>> but task is already holding lock:
>>>>>> ffffffff906260a8 (vsock_register_mutex){+.+.}-{4:4}, at: vsock_assign_transport+0xf2/0x900 net/vmw_vsock/af_vsock.c:469
>>>>>>
>>>>>> which lock already depends on the new lock.
>>>>>>
>>>>>>
>>>>>> the existing dependency chain (in reverse order) is:
>>>>>>
>>>>>> -> #1 (vsock_register_mutex){+.+.}-{4:4}:
>>>>>>       __mutex_lock_common kernel/locking/mutex.c:598 [inline]
>>>>>>       __mutex_lock+0x193/0x1060 kernel/locking/mutex.c:760
>>>>>>       vsock_registered_transport_cid net/vmw_vsock/af_vsock.c:560 [inline]
>>>>>
>>>>> Ah, no maybe this is related to commit 209fd720838a ("vsock:
>>>>> Fix transport_{g2h,h2g} TOCTOU") where we added locking in
>>>>> vsock_find_cid().
>>>>>
>>>>> Maybe we can just move the checks on top of __vsock_bind() to the
>>>>> caller. I mean:
>>>>>
>>>>>         /* First ensure this socket isn't already bound. */
>>>>>         if (vsock_addr_bound(&vsk->local_addr))
>>>>>                 return -EINVAL;
>>>>>
>>>>>         /* Now bind to the provided address or select appropriate values if
>>>>>          * none are provided (VMADDR_CID_ANY and VMADDR_PORT_ANY).  Note that
>>>>>          * like AF_INET prevents binding to a non-local IP address (in most
>>>>>          * cases), we only allow binding to a local CID.
>>>>>          */
>>>>>         if (addr->svm_cid != VMADDR_CID_ANY && !vsock_find_cid(addr->svm_cid))
>>>>>                 return -EADDRNOTAVAIL;
>>>>>
>>>>> We have 2 callers: vsock_auto_bind() and vsock_bind().
>>>>>
>>>>> vsock_auto_bind() is already checking if the socket is already bound,
>>>>> if not is setting VMADDR_CID_ANY, so we can skip those checks.
>>>>>
>>>>> In vsock_bind() we can do the checks before lock_sock(sk), at least the
>>>>> checks on vm_addr, calling vsock_find_cid().
>>>>>
>>>>> I'm preparing a patch to do this.
>>>>
>>>> mmm, no, this is more related to vsock_linger() where sk_wait_event()
>>>> releases and locks again the sk_lock.
>>>> So, it should be related to commit 687aa0c5581b ("vsock: Fix
>>>> transport_* TOCTOU") where we take vsock_register_mutex in
>>>> vsock_assign_transport() while calling vsk->transport->release().
>>>>
>>>> So, maybe we need to move the release and vsock_deassign_transport()
>>>> after unlocking vsock_register_mutex.
>>>
>>> I implemented this here:
>>> https://lore.kernel.org/netdev/20251021121718.137668-1-sgarzare@redhat.com/
>>>
>>> sysbot successfully tested it.
>>>
>>> Stefano
>>
>> Hi Stefano
> 
> Hi Michal!
> 
>>
>> Apologies for missing this, I was away for a couple of weeks.
> 
> Don't worry at all!
> 
>>
>> Turns out it's vsock_connect()'s reset-on-signal that strikes again. While
>> you've fixed the lock order inversion (thank you), being able to reset an
>> established socket, combined with SO_LINGER's lock-release-lock dance,
>> still leads to crashes.
> 
> Yeah, I see!
> 
>>
>> I think it goes like this: if user hits connect() with a signal right after
>> connection is established (which implies an assigned transport), `sk_state`
>> gets set to TCP_CLOSING and `state` to SS_UNCONNECTED. SS_UNCONNECTED means
>> connect() can be retried. If re-connect() is for a different CID, transport
>> reassignment takes place. That involves transport->release() of the old
>> transport. Because `sk_state == TCP_CLOSING`, vsock_linger() is called.
>> Lingering temporarily releases socket lock. Which can be raced by another
>> thread doing connect(). Basically thread-1 can release resources from under
>> thread-0. That breaks the assumptions, e.g. virtio_transport_unsent_bytes()
>> does not expect a disappearing transport.
> 
> Makes sense to me!
> 
>>
>> BUG: KASAN: slab-use-after-free in _raw_spin_lock_bh+0x34/0x40
>> Read of size 1 at addr ffff888107c99420 by task a.out/1385
>> CPU: 6 UID: 1000 PID: 1385 Comm: a.out Tainted: G            E
>> 6.18.0-rc5+ #241 PREEMPT(voluntary)
>> Call Trace:
>> dump_stack_lvl+0x7e/0xc0
>> print_report+0x170/0x4de
>> kasan_report+0xc2/0x180
>> __kasan_check_byte+0x3a/0x50
>> lock_acquire+0xb2/0x300
>> _raw_spin_lock_bh+0x34/0x40
>> virtio_transport_unsent_bytes+0x3b/0x80
>> vsock_linger+0x263/0x370
>> virtio_transport_release+0x3ff/0x510
>> vsock_assign_transport+0x358/0x780
>> vsock_connect+0x5a2/0xc40
>> __sys_connect+0xde/0x110
>> __x64_sys_connect+0x6e/0xc0
>> do_syscall_64+0x94/0xbb0
>> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>>
>> Allocated by task 1384:
>> kasan_save_stack+0x1c/0x40
>> kasan_save_track+0x10/0x30
>> __kasan_kmalloc+0x92/0xa0
>> virtio_transport_do_socket_init+0x48/0x320
>> vsock_assign_transport+0x4ff/0x780
>> vsock_connect+0x5a2/0xc40
>> __sys_connect+0xde/0x110
>> __x64_sys_connect+0x6e/0xc0
>> do_syscall_64+0x94/0xbb0
>> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>>
>> Freed by task 1384:
>> kasan_save_stack+0x1c/0x40
>> kasan_save_track+0x10/0x30
>> __kasan_save_free_info+0x37/0x50
>> __kasan_slab_free+0x63/0x80
>> kfree+0x142/0x6a0
>> virtio_transport_destruct+0x86/0x170
>> vsock_assign_transport+0x3a8/0x780
>> vsock_connect+0x5a2/0xc40
>> __sys_connect+0xde/0x110
>> __x64_sys_connect+0x6e/0xc0
>> do_syscall_64+0x94/0xbb0
>> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>>
>> I suppose there are many ways this chain of events can be stopped, but I
>> see it as yet another reason to simplify vsock_connect(): do not let it
>> "reset" an already established socket. I guess that would do the trick.
>> What do you think?
> 
> I agree, we should do that. Do you have time to take a look?

Sure, here's a patch:
https://lore.kernel.org/netdev/20251117-vsock-interrupted-connect-v1-1-bc021e907c3f@rbox.co/

Michal