[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220310110036.fgy323c4hvk3mziq@sgarzare-redhat>
Date: Thu, 10 Mar 2022 12:00:36 +0100
From: Stefano Garzarella <sgarzare@...hat.com>
To: Jiyong Park <jiyong@...gle.com>
Cc: Stefan Hajnoczi <stefanha@...hat.com>,
"Michael S. Tsirkin" <mst@...hat.com>,
Jason Wang <jasowang@...hat.com>, adelva@...gle.com,
kvm@...r.kernel.org, virtualization@...ts.linux-foundation.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] vhost/vsock: reset only the h2g connections upon release
On Thu, Mar 10, 2022 at 07:41:54PM +0900, Jiyong Park wrote:
>Hi Stefano,
>
>On Thu, Mar 10, 2022 at 5:59 PM Stefano Garzarella <sgarzare@...hat.com> wrote:
>>
>> Hi Jiyong,
>>
>> On Thu, Mar 10, 2022 at 05:18:54PM +0900, Jiyong Park wrote:
>> >Filtering non-h2g connections out when determining orphaned connections.
>> >Otherwise, in a nested VM configuration, destroying the nested VM (which
>> >often involves the closing of /dev/vhost-vsock if there was h2g
>> >connections to the nested VM) kills not only the h2g connections, but
>> >also all existing g2h connections to the (outmost) host which are
>> >totally unrelated.
>> >
>> >Tested: Executed the following steps on Cuttlefish (Android running on a
>> >VM) [1]: (1) Enter into an `adb shell` session - to have a g2h
>> >connection inside the VM, (2) open and then close /dev/vhost-vsock by
>> >`exec 3< /dev/vhost-vsock && exec 3<&-`, (3) observe that the adb
>> >session is not reset.
>> >
>> >[1] https://android.googlesource.com/device/google/cuttlefish/
>> >
>> >Signed-off-by: Jiyong Park <jiyong@...gle.com>
>> >---
>> > drivers/vhost/vsock.c | 4 ++++
>> > 1 file changed, 4 insertions(+)
>> >
>> >diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
>> >index 37f0b4274113..2f6d5d66f5ed 100644
>> >--- a/drivers/vhost/vsock.c
>> >+++ b/drivers/vhost/vsock.c
>> >@@ -722,6 +722,10 @@ static void vhost_vsock_reset_orphans(struct sock *sk)
>> > * executing.
>> > */
>> >
>> >+ /* Only the h2g connections are reset */
>> >+ if (vsk->transport != &vhost_transport.transport)
>> >+ return;
>> >+
>> > /* If the peer is still valid, no need to reset connection */
>> > if (vhost_vsock_get(vsk->remote_addr.svm_cid))
>> > return;
>> >--
>> >2.35.1.723.g4982287a31-goog
>> >
>>
>> Thanks for your patch!
>>
>> Yes, I see the problem and I think I introduced it with the
>> multi-transports support (ooops).
>>
>> So we should add this fixes tag:
>>
>> Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
>>
>>
>> IIUC the problem is for all transports that should only cycle on their
>> own sockets. Indeed I think there is the same problem if the g2h driver
>> will be unloaded (or a reset event is received after a VM migration), it
>> will close all sockets of the nested h2g.
>>
>> So I suggest a more generic solution, modifying
>> vsock_for_each_connected_socket() like this (not tested):
>>
>> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>> index 38baeb189d4e..f04abf662ec6 100644
>> --- a/net/vmw_vsock/af_vsock.c
>> +++ b/net/vmw_vsock/af_vsock.c
>> @@ -334,7 +334,8 @@ void vsock_remove_sock(struct vsock_sock *vsk)
>> }
>> EXPORT_SYMBOL_GPL(vsock_remove_sock);
>>
>> -void vsock_for_each_connected_socket(void (*fn)(struct sock *sk))
>> +void vsock_for_each_connected_socket(struct vsock_transport *transport,
>> + void (*fn)(struct sock *sk))
>> {
>> int i;
>>
>> @@ -343,8 +344,12 @@ void vsock_for_each_connected_socket(void (*fn)(struct sock *sk))
>> for (i = 0; i < ARRAY_SIZE(vsock_connected_table); i++) {
>> struct vsock_sock *vsk;
>> list_for_each_entry(vsk, &vsock_connected_table[i],
>> - connected_table)
>> + connected_table) {
>> + if (vsk->transport != transport)
>> + continue;
>> +
>> fn(sk_vsock(vsk));
>> + }
>> }
>>
>>
>> And all transports that call it.
>>
>> Thanks,
>> Stefano
>>
>
>Thanks for the suggestion, which looks much better. It actually worked well.
Thanks for trying this!
>
>By the way, the suggested change will alter the kernel-module interface (KMI),
>which will make it difficult to land the change on older releases where we'd
>like to keep the KMI stable [1]. Would it be OK if we let the supplied function
>(fn) be responsible for checking the transport? I think that there, in
>the future,
>might be a case where one needs to cycle over all sockets for inspection or so.
>I admit that this would be prone to error, though.
>
>Please let me know what you think. I don't have a strong preference. I will
>submit a revision as you want.
I see your point, and it makes sense to keep KMI on stable branches, but
mainline I would like to have the proposed approach since all transports
use it to cycle on their sockets.
So we could do a series with 2 patches:
- Patch 1 fixes the problem in all transports by checking the transport
in the callback (here we insert the fixes tag so we backport it to the
stable branches)
- Patch 2 moves the check in vsock_for_each_connected_socket() removing
it from callbacks so it is less prone to errors and we merge it only
mainline
What do you think?
Thanks,
Stefano
Powered by blists - more mailing lists