Message-ID: <jkhr2v5zjebxnckmhn3f3dvv5zdzbldkyxbe5kx5m7vzvw6kzi@nrqipygyhlix>
Date: Fri, 20 Dec 2024 11:49:32 +0100
From: Stefano Garzarella <sgarzare@...hat.com>
To: Michal Luczaj <mhal@...x.co>
Cc: Hyunwoo Kim <v4bel@...ori.io>, "David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
	Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
	Jason Wang <jasowang@...hat.com>, "Michael S. Tsirkin" <mst@...hat.com>,
	virtualization@...ts.linux.dev, netdev@...r.kernel.org, qwerty@...ori.io
Subject: Re: [PATCH] vsock/virtio: Fix null-ptr-deref in vsock_stream_has_data

On Thu, Dec 19, 2024 at 05:09:42PM +0100, Michal Luczaj wrote:
>On 12/19/24 16:12, Stefano Garzarella wrote:
>> On Thu, 19 Dec 2024 at 16:05, Michal Luczaj <mhal@...x.co> wrote:
>>>
>>> On 12/19/24 15:48, Stefano Garzarella wrote:
>>>> On Thu, 19 Dec 2024 at 15:36, Michal Luczaj <mhal@...x.co> wrote:
>>>>>
>>>>> On 12/19/24 09:19, Stefano Garzarella wrote:
>>>>>> ...
>>>>>> I think the best thing though is to better understand how to handle
>>>>>> deassign, rather than checking everywhere that it's not null, also
>>>>>> because in some cases (like the one in virtio-vsock), it's also
>>>>>> important that the transport is the same.
>>>>>
>>>>> My vote would be to apply your virtio_transport_recv_pkt() patch *and* make
>>>>> it impossible-by-design to switch ->transport from non-NULL to NULL in
>>>>> vsock_assign_transport().
>>>>
>>>> I don't know if that's enough, in this case the problem is that some
>>>> response packets are intended for a socket, where the transport has
>>>> changed. So whether it's null or assigned but different, it's still a
>>>> problem we have to handle.
>>>>
>>>> So making it impossible for the transport to be null, but allowing it
>>>> to be different (we can't prevent it from changing), doesn't solve the
>>>> problem for us, it only shifts it.
>>>
>>> Got it. I assumed this issue would be solved by `vsk->transport !=
>>> &t->transport` in the critical place(s).
>>>
>>> (Note that BPF doesn't care if transport has changed; BPF just expects to
>>> have _a_ transport.)
>>>
>>>>> If I'm not mistaken, that would require rewriting vsock_assign_transport()
>>>>> so that a new transport is assigned only once fully initialized, otherwise
>>>>> keep the old one (still unhurt and functional) and return error. Because
>>>>> failing connect() should not change anything under the hood, right?
>>>>>
>>>>
>>>> Nope, connect should be able to change the transport.
>>>>
>>>> Because a user can do an initial connect() that requires a specific
>>>> transport, this one fails maybe because there's no peer with that cid.
>>>> Then the user can redo the connect() to a different cid that requires
>>>> a different transport.
>>>
>>> But the initial connect() failing does not change anything under the hood
>>> (transport should/could stay NULL).
>>
>> Nope, isn't null, it's assigned to a transport, because for example it
>> has to send a packet to connect to the remote CID and wait back for a
>> response that for example says the CID doesn't exist.
>
>Ahh, I think I get it. So the initial connect() passed
>vsock_assign_transport() successfully and then failed deeper in
>vsock_connect(), right? That's fine. Let the socket have a useless
>transport (a valid pointer nevertheless).

Just to be clear, it's not useless, since it's used to make the
connection. We only know that it's useless once we report that the
connection failed, so maybe we should de-assign it when we set
`sock->state = SS_UNCONNECTED`.

>Sure, upcoming connect() can
>assign a new (possibly useless just as well) transport, but there's no
>reason to allow ->transport becoming NULL.

I'm not sure about this. In the connection failure case, when we set
`sock->state = SS_UNCONNECTED`, we're returning the socket to a
pre-connect state, so it might make sense to also reset the transport to
NULL, so that we have exactly the same conditions.
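A rough sketch of what I mean, as a toy userspace model (not kernel code;
every `toy_*` name and the CID-to-transport rule are made up for
illustration). A failed connect leaves the socket exactly as it was before
the attempt, and a later connect to another CID can pick a different
transport:

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model; none of these names exist in the kernel. */
enum toy_state { TOY_UNCONNECTED, TOY_CONNECTING, TOY_CONNECTED };

struct toy_transport {
	const char *name;
	/* Returns 0 if a peer answers for this CID, -1 otherwise. */
	int (*connect)(unsigned int cid);
};

struct toy_sock {
	enum toy_state state;
	const struct toy_transport *transport;
};

static int toy_no_peer(unsigned int cid) { (void)cid; return -1; }
static int toy_peer_ok(unsigned int cid) { (void)cid; return 0; }

static const struct toy_transport toy_g2h = { "g2h", toy_no_peer };
static const struct toy_transport toy_loopback = { "loopback", toy_peer_ok };

/* Hypothetical rule: CID 1 is served locally, everything else goes g2h. */
static const struct toy_transport *toy_pick_transport(unsigned int cid)
{
	return cid == 1 ? &toy_loopback : &toy_g2h;
}

static int toy_connect(struct toy_sock *sk, unsigned int cid)
{
	assert(sk != NULL);

	/* The transport is needed to even attempt the connection. */
	sk->transport = toy_pick_transport(cid);
	sk->state = TOY_CONNECTING;

	if (sk->transport->connect(cid) < 0) {
		/* Failure: go back to the exact pre-connect conditions. */
		sk->state = TOY_UNCONNECTED;
		sk->transport = NULL;
		return -1;
	}

	sk->state = TOY_CONNECTED;
	return 0;
}
```

In this model, calling toy_connect() with CID 42 fails and leaves
`transport == NULL`, and a following toy_connect() with CID 1 succeeds on
the loopback transport.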
>
>And a pre-connect socket (where ->transport==NULL) is not an issue, because
>BPF won't let it in any sockmap, so vsock_bpf_recvmsg() won't be reachable.
>
>Anyway, thanks for explaining,
>Michal
>
>PS. Or ignore the above and remove the socket from the sockmap at every
>reconnect? Possible unhash abuse:

I should take a closer look at unhash, but it might make sense!

>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 5cf8109f672a..8a65153ee186 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -483,6 +483,10 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
> 	if (vsk->transport == new_transport)
> 		return 0;
>
>+	const struct proto *prot = READ_ONCE(sk->sk_prot);
>+	if (prot->unhash)
>+		prot->unhash(sk);
>+
> 	/* transport->release() must be called with sock lock acquired.
> 	 * This path can only be taken during vsock_connect(), where we
> 	 * have already held the sock lock. In the other cases, this
>diff --git a/net/vmw_vsock/vsock_bpf.c b/net/vmw_vsock/vsock_bpf.c
>index 4aa6e74ec295..80deb4d70aea 100644
>--- a/net/vmw_vsock/vsock_bpf.c
>+++ b/net/vmw_vsock/vsock_bpf.c
>@@ -119,6 +119,7 @@ static void vsock_bpf_rebuild_protos(struct proto *prot, const struct proto *bas
> 	*prot = *base;
> 	prot->close = sock_map_close;
> 	prot->recvmsg = vsock_bpf_recvmsg;
>+	prot->unhash = sock_map_unhash;
> 	prot->sock_is_readable = sk_msg_is_readable;
> }
>
>>> Then a successful re-connect assigns
>>> the transport (NULL -> non-NULL). And it's all good because all I wanted to
>>> avoid (because of BPF) was non-NULL -> NULL. Anyway, that's my possibly
>>> shallow understanding :)
>

Note that non-NULL -> NULL should only occur before a connection is
established, so before any data is passed. Is this a problem for BPF?
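To make the question concrete, here is a toy userspace model (again not
kernel code; the `toy_*` names and the -ENODEV return value are
assumptions for illustration only). It assumes a NULL transport can only be
seen on a socket that is not established, and shows a reader that checks
the pointer instead of dereferencing it blindly:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_sock;

/* Toy transport with a single callback, loosely modelled on has_data. */
struct toy_transport {
	long (*stream_has_data)(const struct toy_sock *sk);
};

struct toy_sock {
	int established;	/* non-zero once the connection is up */
	long queued_bytes;	/* bytes waiting to be read */
	const struct toy_transport *transport;
};

static long toy_queued(const struct toy_sock *sk)
{
	return sk->queued_bytes;
}

static const struct toy_transport toy_t = { toy_queued };

/* Assumed invariant: an established socket always has a transport. */
static int toy_invariant_holds(const struct toy_sock *sk)
{
	return !sk->established || sk->transport != NULL;
}

static long toy_has_data(const struct toy_sock *sk)
{
	assert(toy_invariant_holds(sk));

	/* Pre-connect (or failed-connect) socket: nothing to ask. */
	if (!sk->transport)
		return -ENODEV;

	return sk->transport->stream_has_data(sk);
}
```

If a reader like vsock_bpf_recvmsg() can only be reached on an established
socket, the NULL branch is never taken in this model; if it can be reached
earlier, the check is what keeps it from crashing.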
Thanks,
Stefano