lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4vsrtxs3uttx6w2zyk6rxescpwvrikypiw6tvjheplht6yzonc@ch6k3xlftikw>
Date: Mon, 30 Jun 2025 11:05:44 +0200
From: Stefano Garzarella <sgarzare@...hat.com>
To: Michal Luczaj <mhal@...x.co>
Cc: "David S. Miller" <davem@...emloft.net>, 
	Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, 
	Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>, 
	Stefan Hajnoczi <stefanha@...hat.com>, virtualization@...ts.linux.dev, netdev@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC net v2 1/3] vsock: Fix transport_{g2h,h2g} TOCTOU

On Sun, Jun 29, 2025 at 11:26:12PM +0200, Michal Luczaj wrote:
>On 6/27/25 10:02, Stefano Garzarella wrote:
>> On Wed, Jun 25, 2025 at 11:23:30PM +0200, Michal Luczaj wrote:
>>> On 6/25/25 10:43, Stefano Garzarella wrote:
>>>> On Fri, Jun 20, 2025 at 09:52:43PM +0200, Michal Luczaj wrote:
>>>>> vsock_find_cid() and vsock_dev_do_ioctl() may race with module unload.
>>>>> transport_{g2h,h2g} may become NULL after the NULL check.
>>>>>
>>>>> Introduce vsock_transport_local_cid() to protect from a potential
>>>>> null-ptr-deref.
>>>>>
>>>>> KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
>>>>> RIP: 0010:vsock_find_cid+0x47/0x90
>>>>> Call Trace:
>>>>> __vsock_bind+0x4b2/0x720
>>>>> vsock_bind+0x90/0xe0
>>>>> __sys_bind+0x14d/0x1e0
>>>>> __x64_sys_bind+0x6e/0xc0
>>>>> do_syscall_64+0x92/0x1c0
>>>>> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>>>>>
>>>>> KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
>>>>> RIP: 0010:vsock_dev_do_ioctl.isra.0+0x58/0xf0
>>>>> Call Trace:
>>>>> __x64_sys_ioctl+0x12d/0x190
>>>>> do_syscall_64+0x92/0x1c0
>>>>> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>>>>>
>>>>> Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
>>>>> Suggested-by: Stefano Garzarella <sgarzare@...hat.com>
>>>>> Signed-off-by: Michal Luczaj <mhal@...x.co>
>>>>> ---
>>>>> net/vmw_vsock/af_vsock.c | 23 +++++++++++++++++------
>>>>> 1 file changed, 17 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>>>>> index 2e7a3034e965db30b6ee295370d866e6d8b1c341..63a920af5bfe6960306a3e5eeae0cbf30648985e 100644
>>>>> --- a/net/vmw_vsock/af_vsock.c
>>>>> +++ b/net/vmw_vsock/af_vsock.c
>>>>> @@ -531,9 +531,21 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
>>>>> }
>>>>> EXPORT_SYMBOL_GPL(vsock_assign_transport);
>>>>>
>>>>> +static u32 vsock_transport_local_cid(const struct vsock_transport **transport)
>>>>
>>>> Why we need double pointer?
>>>
>>> Because of a possible race. If @transport is `struct vsock_transport*` and
>>> we pass `transport_g2h`, the passed non-NULL pointer value may immediately
>>> become stale (due to module unload). But if it's `vsock_transport**` and we
>>> pass `&transport_g2h`, then we can take the mutex, check `*transport` for
>>> NULL and safely go ahead.
>>>
>>> Or are you saying this could be simplified?
>>
>> Nope, you're right! I was still thinking about my old version where we
>> had the switch inside...
>>
>> BTW I'd like to change the name, `vsock_transport_local` prefix is
>> confusing IMO, since it seems related only to the `transport_local`.
>>
>> Another thing I'm worried about is that we'll then start using it on
>> `vsk->transport` when this is only to be used on registered transports
>> (i.e. `static ...`), though, I don't think there's a way to force type
>> checking from the compiler (unless you wrap it in a struct). (...)
>
>I've found (on SO[1]) this somewhat hackish compile-time `static`-checking:
>
>static u32 __vsock_registered_transport_cid(const struct vsock_transport
>**transport)
>{
>	u32 cid = VMADDR_CID_ANY;
>
>	mutex_lock(&vsock_register_mutex);
>	if (*transport)
>		cid = (*transport)->get_local_cid();
>	mutex_unlock(&vsock_register_mutex);
>
>	return cid;
>}
>
>#define ASSERT_REGISTERED_TRANSPORT(t)					\
>	__always_unused static void *__UNIQUE_ID(vsock) = (t)
>
>#define vsock_registered_transport_cid(transport)			\
>({									\
>	ASSERT_REGISTERED_TRANSPORT(transport);				\
>	__vsock_registered_transport_cid(transport);			\
>})
>
>It does the trick, compilation fails on
>vsock_registered_transport_cid(&vsk->transport):
>
>net/vmw_vsock/af_vsock.c: In function ‘vsock_send_shutdown’:
>net/vmw_vsock/af_vsock.c:565:59: error: initializer element is not constant
>  565 |         __always_unused static void *__UNIQUE_ID(vsock) = (t)
>      |                                                           ^
>net/vmw_vsock/af_vsock.c:569:9: note: in expansion of macro
>‘ASSERT_REGISTERED_TRANSPORT’
>  569 |         ASSERT_REGISTERED_TRANSPORT(transport);
>    \
>      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~
>net/vmw_vsock/af_vsock.c:626:9: note: in expansion of macro
>‘vsock_registered_transport_cid’
>  626 |         vsock_registered_transport_cid(&vsk->transport);
>      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
>But perhaps adding a comment wouldn't hurt either, e.g.
>
>/* Provide safe access to static transport_{h2g,g2h,dgram,local} callbacks.
> * Otherwise we may race with module removal. Do not use on
> * `vsk->transport`.
> */

Yeah, I'd just go with the comment, without introduce complex macros.
Also because in the worst case we don't do anything wrong.

BTW if we have some macros already defined in the kernel that we can 
re-use, it's fine.

>
>? ...which begs another question: do we stick to the netdev special comment
>style? See commit 82b8000c28b5 ("net: drop special comment style").

If checkpatch is fine, I'm fine :-)

>
>Oh, and come to think of it, we don't really need that (easily contended?)
>mutex here. Same can be done with RCU. Which should speed up vsock_bind()
>-> __vsock_bind() -> vsock_find_cid(), right? This is what I mean, roughly:
>
>+static u32 vsock_registered_transport_cid(const struct vsock_transport
>__rcu **trans_ptr)
>+{
>+	const struct vsock_transport *transport;
>+	u32 cid = VMADDR_CID_ANY;
>+
>+	rcu_read_lock();
>+	transport = rcu_dereference(*trans_ptr);
>+	if (transport)
>+		cid = transport->get_local_cid();
>+	rcu_read_unlock();
>+
>+	return cid;
>+}
>...
>@@ -2713,6 +2726,7 @@ void vsock_core_unregister(const struct
>vsock_transport *t)
> 		transport_local = NULL;
>
> 	mutex_unlock(&vsock_register_mutex);
>+	synchronize_rcu();
> }
>
>I've realized I'm throwing multiple unrelated ideas/questions, so let me
>summarise:
>1. Hackish macro can be used to guard against calling
>vsock_registered_transport_cid() on a non-static variable.
>2. We can comment the function to add some context and avoid confusion.

I'd go with 2.

>3. Instead of taking mutex in vsock_registered_transport_cid() we can use RCU.

Since the vsock_bind() is not in the hot path, maybe a mutex is fine.
WDYT?

Thanks,
Stefano

>
>> So, if we can't do much, I'd add a comment and make the function name
>> more clear. e.g. vsock_registered_transport_cid() ? or something
>> similar.
>
>Sure, will do.
>
>Thanks!
>
>[1]:
>https://stackoverflow.com/questions/5645695/how-can-i-add-a-static-assert-to-check-if-a-variable-is-static/5672637#5672637
>


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ