lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e97b5cae-f6ef-4221-98e1-6efd7fdc6676@rbox.co>
Date: Sun, 29 Jun 2025 23:26:12 +0200
From: Michal Luczaj <mhal@...x.co>
To: Stefano Garzarella <sgarzare@...hat.com>
Cc: "David S. Miller" <davem@...emloft.net>,
 Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
 Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
 Stefan Hajnoczi <stefanha@...hat.com>, virtualization@...ts.linux.dev,
 netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC net v2 1/3] vsock: Fix transport_{g2h,h2g} TOCTOU

On 6/27/25 10:02, Stefano Garzarella wrote:
> On Wed, Jun 25, 2025 at 11:23:30PM +0200, Michal Luczaj wrote:
>> On 6/25/25 10:43, Stefano Garzarella wrote:
>>> On Fri, Jun 20, 2025 at 09:52:43PM +0200, Michal Luczaj wrote:
>>>> vsock_find_cid() and vsock_dev_do_ioctl() may race with module unload.
>>>> transport_{g2h,h2g} may become NULL after the NULL check.
>>>>
>>>> Introduce vsock_transport_local_cid() to protect from a potential
>>>> null-ptr-deref.
>>>>
>>>> KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
>>>> RIP: 0010:vsock_find_cid+0x47/0x90
>>>> Call Trace:
>>>> __vsock_bind+0x4b2/0x720
>>>> vsock_bind+0x90/0xe0
>>>> __sys_bind+0x14d/0x1e0
>>>> __x64_sys_bind+0x6e/0xc0
>>>> do_syscall_64+0x92/0x1c0
>>>> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>>>>
>>>> KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
>>>> RIP: 0010:vsock_dev_do_ioctl.isra.0+0x58/0xf0
>>>> Call Trace:
>>>> __x64_sys_ioctl+0x12d/0x190
>>>> do_syscall_64+0x92/0x1c0
>>>> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>>>>
>>>> Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
>>>> Suggested-by: Stefano Garzarella <sgarzare@...hat.com>
>>>> Signed-off-by: Michal Luczaj <mhal@...x.co>
>>>> ---
>>>> net/vmw_vsock/af_vsock.c | 23 +++++++++++++++++------
>>>> 1 file changed, 17 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>>>> index 2e7a3034e965db30b6ee295370d866e6d8b1c341..63a920af5bfe6960306a3e5eeae0cbf30648985e 100644
>>>> --- a/net/vmw_vsock/af_vsock.c
>>>> +++ b/net/vmw_vsock/af_vsock.c
>>>> @@ -531,9 +531,21 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
>>>> }
>>>> EXPORT_SYMBOL_GPL(vsock_assign_transport);
>>>>
>>>> +static u32 vsock_transport_local_cid(const struct vsock_transport **transport)
>>>
>>> Why we need double pointer?
>>
>> Because of a possible race. If @transport is `struct vsock_transport*` and
>> we pass `transport_g2h`, the passed non-NULL pointer value may immediately
>> become stale (due to module unload). But if it's `vsock_transport**` and we
>> pass `&transport_g2h`, then we can take the mutex, check `*transport` for
>> NULL and safely go ahead.
>>
>> Or are you saying this could be simplified?
> 
> Nope, you're right! I was still thinking about my old version where we 
> had the switch inside...
> 
> BTW I'd like to change the name, `vsock_transport_local` prefix is 
> confusing IMO, since it seems related only to the `transport_local`.
> 
> Another thing I'm worried about is that we'll then start using it on 
> `vsk->transport` when this is only to be used on registered transports 
> (i.e. `static ...`), though, I don't think there's a way to force type 
> checking from the compiler (unless you wrap it in a struct). (...)

I've found (on SO[1]) this somewhat hackish compile-time `static`-checking:

static u32 __vsock_registered_transport_cid(const struct vsock_transport
**transport)
{
	u32 cid = VMADDR_CID_ANY;

	mutex_lock(&vsock_register_mutex);
	if (*transport)
		cid = (*transport)->get_local_cid();
	mutex_unlock(&vsock_register_mutex);

	return cid;
}

#define ASSERT_REGISTERED_TRANSPORT(t)					\
	__always_unused static void *__UNIQUE_ID(vsock) = (t)

#define vsock_registered_transport_cid(transport)			\
({									\
	ASSERT_REGISTERED_TRANSPORT(transport);				\
	__vsock_registered_transport_cid(transport);			\
})

It does the trick, compilation fails on
vsock_registered_transport_cid(&vsk->transport):

net/vmw_vsock/af_vsock.c: In function ‘vsock_send_shutdown’:
net/vmw_vsock/af_vsock.c:565:59: error: initializer element is not constant
  565 |         __always_unused static void *__UNIQUE_ID(vsock) = (t)
      |                                                           ^
net/vmw_vsock/af_vsock.c:569:9: note: in expansion of macro
‘ASSERT_REGISTERED_TRANSPORT’
  569 |         ASSERT_REGISTERED_TRANSPORT(transport);
    \
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~
net/vmw_vsock/af_vsock.c:626:9: note: in expansion of macro
‘vsock_registered_transport_cid’
  626 |         vsock_registered_transport_cid(&vsk->transport);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

But perhaps adding a comment wouldn't hurt either, e.g.

/* Provide safe access to static transport_{h2g,g2h,dgram,local} callbacks.
 * Otherwise we may race with module removal. Do not use on
 * `vsk->transport`.
 */

? ...which begs another question: do we stick to the netdev special comment
style? See commit 82b8000c28b5 ("net: drop special comment style").

Oh, and come to think of it, we don't really need that (easily contended?)
mutex here. Same can be done with RCU. Which should speed up vsock_bind()
-> __vsock_bind() -> vsock_find_cid(), right? This is what I mean, roughly:

+static u32 vsock_registered_transport_cid(const struct vsock_transport
__rcu **trans_ptr)
+{
+	const struct vsock_transport *transport;
+	u32 cid = VMADDR_CID_ANY;
+
+	rcu_read_lock();
+	transport = rcu_dereference(*trans_ptr);
+	if (transport)
+		cid = transport->get_local_cid();
+	rcu_read_unlock();
+
+	return cid;
+}
...
@@ -2713,6 +2726,7 @@ void vsock_core_unregister(const struct
vsock_transport *t)
 		transport_local = NULL;

 	mutex_unlock(&vsock_register_mutex);
+	synchronize_rcu();
 }

I've realized I'm throwing multiple unrelated ideas/questions, so let me
summarise:
1. Hackish macro can be used to guard against calling
vsock_registered_transport_cid() on a non-static variable.
2. We can comment the function to add some context and avoid confusion.
3. Instead of taking mutex in vsock_registered_transport_cid() we can use RCU.

> So, if we can't do much, I'd add a comment and make the function name 
> more clear. e.g. vsock_registered_transport_cid() ? or something 
> similar.

Sure, will do.

Thanks!

[1]:
https://stackoverflow.com/questions/5645695/how-can-i-add-a-static-assert-to-check-if-a-variable-is-static/5672637#5672637

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ