Message-ID: <ZK8pxrbkrH2bEgw7@bullseye>
Date: Wed, 12 Jul 2023 22:31:34 +0000
From: Bobby Eshleman <bobbyeshleman@...il.com>
To: Arseniy Krasnov <AVKrasnov@...rdevices.ru>
Cc: Stefan Hajnoczi <stefanha@...hat.com>,
Stefano Garzarella <sgarzare@...hat.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
"Michael S. Tsirkin" <mst@...hat.com>,
Jason Wang <jasowang@...hat.com>,
Bobby Eshleman <bobby.eshleman@...edance.com>,
kvm@...r.kernel.org, virtualization@...ts.linux-foundation.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
kernel@...rdevices.ru, oxffffaa@...il.com
Subject: Re: [RFC PATCH v5 13/17] vsock: enable setting SO_ZEROCOPY
On Sat, Jul 01, 2023 at 09:39:43AM +0300, Arseniy Krasnov wrote:
> For AF_VSOCK, zerocopy tx mode depends on transport, so this option must
> be set in AF_VSOCK implementation where transport is accessible (if
> transport is not set during setting SO_ZEROCOPY: for example socket is
> not connected, then SO_ZEROCOPY will be enabled, but once transport will
> be assigned, support of this type of transmission will be checked).
>
> To handle SO_ZEROCOPY, AF_VSOCK implementation uses SOCK_CUSTOM_SOCKOPT
> bit, thus handling SOL_SOCKET option operations, but all of them except
> SO_ZEROCOPY will be forwarded to the generic handler by calling
> 'sock_setsockopt()'.
>
> Signed-off-by: Arseniy Krasnov <AVKrasnov@...rdevices.ru>
> ---
> Changelog:
> v4 -> v5:
> * This patch is totally reworked. Previous version added check for
> PF_VSOCK directly to 'net/core/sock.c', thus allowing to set
> SO_ZEROCOPY for AF_VSOCK type of socket. This new version catches
> attempt to set SO_ZEROCOPY in 'af_vsock.c'. All other options
> except SO_ZEROCOPY are forwarded to generic handler. Only this
> option is processed in 'af_vsock.c'. Handling this option includes
> access to transport to check that MSG_ZEROCOPY transmission is
> supported by the current transport (if it is set, if not - transport
> will be checked during 'connect()').
>
> net/vmw_vsock/af_vsock.c | 44 ++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 42 insertions(+), 2 deletions(-)
>
> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> index da22ae0ef477..8acc77981d01 100644
> --- a/net/vmw_vsock/af_vsock.c
> +++ b/net/vmw_vsock/af_vsock.c
> @@ -1406,8 +1406,18 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
> goto out;
> }
>
> - if (vsock_msgzerocopy_allow(transport))
> + if (!vsock_msgzerocopy_allow(transport)) {
> + /* If this option was set before 'connect()',
> + * when transport was unknown, check that this
> + * feature is supported here.
> + */
> + if (sock_flag(sk, SOCK_ZEROCOPY)) {
> + err = -EOPNOTSUPP;
> + goto out;
> + }
> + } else {
> set_bit(SOCK_SUPPORT_ZC, &sk->sk_socket->flags);
> + }
>
> err = vsock_auto_bind(vsk);
> if (err)
> @@ -1643,7 +1653,7 @@ static int vsock_connectible_setsockopt(struct socket *sock,
> const struct vsock_transport *transport;
> u64 val;
>
> - if (level != AF_VSOCK)
> + if (level != AF_VSOCK && level != SOL_SOCKET)
> return -ENOPROTOOPT;
>
> #define COPY_IN(_v) \
> @@ -1666,6 +1676,34 @@ static int vsock_connectible_setsockopt(struct socket *sock,
>
> transport = vsk->transport;
>
> + if (level == SOL_SOCKET) {
> + if (optname == SO_ZEROCOPY) {
> + int zc_val;
> +
> + /* Use 'int' type here, because variable to
> + * set this option usually has this type.
> + */
> + COPY_IN(zc_val);
> +
> + if (zc_val < 0 || zc_val > 1) {
> + err = -EINVAL;
> + goto exit;
> + }
> +
> + if (transport && !vsock_msgzerocopy_allow(transport)) {
> + err = -EOPNOTSUPP;
> + goto exit;
> + }
> +
> + sock_valbool_flag(sk, SOCK_ZEROCOPY,
> + zc_val ? true : false);
> + goto exit;
> + }
> +
> + release_sock(sk);
> + return sock_setsockopt(sock, level, optname, optval, optlen);
> + }
> +
> switch (optname) {
> case SO_VM_SOCKETS_BUFFER_SIZE:
> COPY_IN(val);
> @@ -2321,6 +2359,8 @@ static int vsock_create(struct net *net, struct socket *sock,
> }
> }
>
> + set_bit(SOCK_CUSTOM_SOCKOPT, &sk->sk_socket->flags);
> +

I found that, because datagram sockets have !ops->setsockopt, setting
this bit causes setsockopt() to fail for them (the related logic is in
__sys_setsockopt()). Maybe we should only set it for connectibles?

Best,
Bobby

> vsock_insert_unbound(vsk);
>
> return 0;
> --
> 2.25.1
>
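
For readers following the reply above, here is a minimal user-space sketch
of the failure mode and of the suggested fix. It is not kernel code: every
model_* name and MODEL_SOL_SOCKET are hypothetical stand-ins, and the
dispatch order is a simplified reading of __sys_setsockopt(), which should
be checked against the actual tree. The gate_on_type parameter is only an
illustration of "set the bit for connectibles only".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_SOL_SOCKET 1	/* hypothetical stand-in for SOL_SOCKET */

struct model_sock;

/* Per-family handler; NULL models the "!ops->setsockopt" case. */
typedef int (*model_setsockopt_fn)(struct model_sock *sk, int level,
				   int optname);

struct model_sock {
	bool custom_sockopt;		/* models the SOCK_CUSTOM_SOCKOPT bit */
	model_setsockopt_fn setsockopt;	/* models sock->ops->setsockopt */
};

/* Stand-in for the generic SOL_SOCKET handler, sock_setsockopt(). */
static int model_generic_setsockopt(struct model_sock *sk, int level,
				    int optname)
{
	(void)sk; (void)level; (void)optname;
	return 0;
}

/* Stand-in for a family handler such as vsock_connectible_setsockopt(). */
static int model_connectible_setsockopt(struct model_sock *sk, int level,
					int optname)
{
	(void)sk; (void)level; (void)optname;
	return 0;
}

/* Simplified dispatch: SOL_SOCKET goes to the generic handler unless the
 * custom bit is set; otherwise a missing family handler is an error.
 */
static int model_sys_setsockopt(struct model_sock *sk, int level,
				int optname)
{
	if (level == MODEL_SOL_SOCKET && !sk->custom_sockopt)
		return model_generic_setsockopt(sk, level, optname);
	if (!sk->setsockopt)
		return -EOPNOTSUPP;
	return sk->setsockopt(sk, level, optname);
}

/* Hypothetical create step. With gate_on_type == false the bit is set
 * unconditionally, as in the quoted hunk; with true it is set only for
 * connectible sockets, as proposed in the reply.
 */
static void model_create(struct model_sock *sk, bool connectible,
			 bool gate_on_type)
{
	sk->setsockopt = connectible ? model_connectible_setsockopt : NULL;
	sk->custom_sockopt = gate_on_type ? connectible : true;
}

/* Usage: a datagram socket fails only when the bit is set unconditionally. */
static void model_demo(void)
{
	struct model_sock dgram, stream;

	model_create(&dgram, false, false);
	assert(model_sys_setsockopt(&dgram, MODEL_SOL_SOCKET, 0) == -EOPNOTSUPP);

	model_create(&dgram, false, true);
	assert(model_sys_setsockopt(&dgram, MODEL_SOL_SOCKET, 0) == 0);

	model_create(&stream, true, true);
	assert(stream.custom_sockopt);
	assert(model_sys_setsockopt(&stream, MODEL_SOL_SOCKET, 0) == 0);
}
```

In the real code the equivalent change would presumably be a type check
around the set_bit() call in vsock_create(); whether that is the right
place is for the patch author to decide.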