Message-ID: <17a96448-b458-6c92-3d8b-c82f2fb399ed@suse.de>
Date: Mon, 27 Feb 2023 10:24:29 +0100
From: Hannes Reinecke <hare@...e.de>
To: Chuck Lever <cel@...nel.org>, kuba@...nel.org, pabeni@...hat.com,
edumazet@...gle.com
Cc: netdev@...r.kernel.org, kernel-tls-handshake@...ts.linux.dev
Subject: Re: [PATCH v5 1/2] net/handshake: Create a NETLINK service for
handling handshake requests
On 2/24/23 20:19, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@...cle.com>
>
> When a kernel consumer needs a transport layer security session, it
> first needs a handshake to negotiate and establish a session. This
> negotiation can be done in user space via one of the several
> existing library implementations, or it can be done in the kernel.
>
> No in-kernel handshake implementations yet exist. In their absence,
> we add a netlink service that can:
>
> a. Notify a user space daemon that a handshake is needed.
>
> b. Once notified, the daemon calls the kernel back via this
> netlink service to get the handshake parameters, including an
> open socket on which to establish the session.
>
> c. Once the handshake is complete, the daemon reports the
> session status and other information via a second netlink
> operation. This operation marks that it is safe for the
> kernel to use the open socket and the security session
> established there.
>
> The notification service uses a multicast group. Each handshake
> mechanism (eg, tlshd) adopts its own group number so that the
> handshake services are completely independent of one another. The
> kernel can then tell via netlink_has_listeners() whether a handshake
> service is active and prepared to handle a handshake request.
>
> A new netlink operation, ACCEPT, acts like accept(2) in that it
> instantiates a file descriptor in the user space daemon's fd table.
> If this operation is successful, the reply carries the fd number,
> which can be treated as an open and ready file descriptor.
>
> While user space is performing the handshake, the kernel keeps its
> muddy paws off the open socket. A second new netlink operation,
> DONE, indicates that the user space daemon is finished with the
> socket and it is safe for the kernel to use again. The operation
> also indicates whether a session was established successfully.
>
> Signed-off-by: Chuck Lever <chuck.lever@...cle.com>
> ---
> Documentation/netlink/specs/handshake.yaml | 134 +++++++++++
> include/net/handshake.h | 45 ++++
> include/net/net_namespace.h | 5
> include/net/sock.h | 1
> include/trace/events/handshake.h | 159 +++++++++++++
> include/uapi/linux/handshake.h | 63 +++++
> net/Makefile | 1
> net/handshake/Makefile | 11 +
> net/handshake/handshake.h | 41 +++
> net/handshake/netlink.c | 340 ++++++++++++++++++++++++++++
> net/handshake/request.c | 246 ++++++++++++++++++++
> net/handshake/trace.c | 17 +
> 12 files changed, 1063 insertions(+)
> create mode 100644 Documentation/netlink/specs/handshake.yaml
> create mode 100644 include/net/handshake.h
> create mode 100644 include/trace/events/handshake.h
> create mode 100644 include/uapi/linux/handshake.h
> create mode 100644 net/handshake/Makefile
> create mode 100644 net/handshake/handshake.h
> create mode 100644 net/handshake/netlink.c
> create mode 100644 net/handshake/request.c
> create mode 100644 net/handshake/trace.c
>
> diff --git a/Documentation/netlink/specs/handshake.yaml b/Documentation/netlink/specs/handshake.yaml
> new file mode 100644
> index 000000000000..683a8f2df0a7
> --- /dev/null
> +++ b/Documentation/netlink/specs/handshake.yaml
> @@ -0,0 +1,134 @@
> +# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
> +#
> +# GENL HANDSHAKE service.
> +#
> +# Author: Chuck Lever <chuck.lever@...cle.com>
> +#
> +# Copyright (c) 2023, Oracle and/or its affiliates.
> +#
> +
> +name: handshake
> +
> +protocol: genetlink-c
> +
> +doc: Netlink protocol to request a transport layer security handshake.
> +
> +uapi-header: linux/net/handshake.h
> +
> +definitions:
> + -
> + type: enum
> + name: handler-class
> + enum-name:
> + value-start: 0
> + entries: [ none ]
> + -
> + type: enum
> + name: msg-type
> + enum-name:
> + value-start: 0
> + entries: [ unspec, clienthello, serverhello ]
> + -
> + type: enum
> + name: auth
> + enum-name:
> + value-start: 0
> + entries: [ unspec, unauth, x509, psk ]
> +
> +attribute-sets:
> + -
> + name: accept
> + attributes:
> + -
> + name: status
> + doc: Status of this accept operation
> + type: u32
> + value: 1
> + -
> + name: sockfd
> + doc: File descriptor of socket to use
> + type: u32
> + -
> + name: handler-class
> + doc: Which type of handler is responding
> + type: u32
> + enum: handler-class
> + -
> + name: message-type
> + doc: Handshake message type
> + type: u32
> + enum: msg-type
> + -
> + name: auth
> + doc: Authentication mode
> + type: u32
> + enum: auth
> + -
> + name: gnutls-priorities
> + doc: GnuTLS priority string
> + type: string
> + -
> + name: my-peerid
> + doc: Serial no of key containing local identity
> + type: u32
> + -
> + name: my-privkey
> + doc: Serial no of key containing optional private key
> + type: u32
> + -
> + name: done
> + attributes:
> + -
> + name: status
> + doc: Session status
> + type: u32
> + value: 1
> + -
> + name: sockfd
> + doc: File descriptor of socket that has completed
> + type: u32
> + -
> + name: remote-peerid
> + doc: Serial no of keys containing identities of remote peer
> + type: u32
> +
> +operations:
> + list:
> + -
> + name: ready
> + doc: Notify handlers that a new handshake request is waiting
> + value: 1
> + notify: accept
> + -
> + name: accept
> + doc: Handler retrieves next queued handshake request
> + attribute-set: accept
> + flags: [ admin-perm ]
> + do:
> + request:
> + attributes:
> + - handler-class
> + reply:
> + attributes:
> + - status
> + - sockfd
> + - message-type
> + - auth
> + - gnutls-priorities
> + - my-peerid
> + - my-privkey
> + -
> + name: done
> + doc: Handler reports handshake completion
> + attribute-set: done
> + do:
> + request:
> + attributes:
> + - status
> + - sockfd
> + - remote-peerid
> +
> +mcast-groups:
> + list:
> + -
> + name: none
> diff --git a/include/net/handshake.h b/include/net/handshake.h
> new file mode 100644
> index 000000000000..08f859237936
> --- /dev/null
> +++ b/include/net/handshake.h
> @@ -0,0 +1,45 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Generic HANDSHAKE service.
> + *
> + * Author: Chuck Lever <chuck.lever@...cle.com>
> + *
> + * Copyright (c) 2023, Oracle and/or its affiliates.
> + */
> +
> +/*
> + * Data structures and functions that are visible only within the
> + * kernel are declared here.
> + */
> +
> +#ifndef _NET_HANDSHAKE_H
> +#define _NET_HANDSHAKE_H
> +
> +struct handshake_req;
> +
> +/*
> + * Invariants for all handshake requests for one transport layer
> + * security protocol
> + */
> +struct handshake_proto {
> + int hp_handler_class;
> + size_t hp_privsize;
> +
> + int (*hp_accept)(struct handshake_req *req,
> + struct genl_info *gi, int fd);
> + void (*hp_done)(struct handshake_req *req,
> + int status, struct nlattr **tb);
> + void (*hp_destroy)(struct handshake_req *req);
> +};
> +
> +extern struct handshake_req *
> +handshake_req_alloc(struct socket *sock, const struct handshake_proto *proto,
> + gfp_t flags);
> +extern void *handshake_req_private(struct handshake_req *req);
> +extern int handshake_req_submit(struct handshake_req *req, gfp_t flags);
> +extern int handshake_req_cancel(struct socket *sock);
> +
> +extern struct nlmsghdr *handshake_genl_put(struct sk_buff *msg,
> + struct genl_info *gi);
> +
> +#endif /* _NET_HANDSHAKE_H */
> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
> index 78beaa765c73..a0ce9de4dab1 100644
> --- a/include/net/net_namespace.h
> +++ b/include/net/net_namespace.h
> @@ -188,6 +188,11 @@ struct net {
> #if IS_ENABLED(CONFIG_SMC)
> struct netns_smc smc;
> #endif
> +
> + /* transport layer security handshake requests */
> + spinlock_t hs_lock;
> + struct list_head hs_requests;
> + int hs_pending;
> } __randomize_layout;
>
> #include <linux/seq_file_net.h>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 573f2bf7e0de..2a7345ce2540 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -519,6 +519,7 @@ struct sock {
>
> struct socket *sk_socket;
> void *sk_user_data;
> + void *sk_handshake_req;
> #ifdef CONFIG_SECURITY
> void *sk_security;
> #endif
> diff --git a/include/trace/events/handshake.h b/include/trace/events/handshake.h
> new file mode 100644
> index 000000000000..feffcd1d6256
> --- /dev/null
> +++ b/include/trace/events/handshake.h
> @@ -0,0 +1,159 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM handshake
> +
> +#if !defined(_TRACE_HANDSHAKE_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_HANDSHAKE_H
> +
> +#include <linux/net.h>
> +#include <linux/tracepoint.h>
> +
> +DECLARE_EVENT_CLASS(handshake_event_class,
> + TP_PROTO(
> + const struct net *net,
> + const struct handshake_req *req,
> + const struct socket *sock
> + ),
> + TP_ARGS(net, req, sock),
> + TP_STRUCT__entry(
> + __field(const void *, req)
> + __field(const void *, sock)
> + __field(unsigned int, netns_ino)
> + ),
> + TP_fast_assign(
> + __entry->req = req;
> + __entry->sock = sock;
> + __entry->netns_ino = net->ns.inum;
> + ),
> + TP_printk("req=%p sock=%p",
> + __entry->req, __entry->sock
> + )
> +);
> +#define DEFINE_HANDSHAKE_EVENT(name) \
> + DEFINE_EVENT(handshake_event_class, name, \
> + TP_PROTO( \
> + const struct net *net, \
> + const struct handshake_req *req, \
> + const struct socket *sock \
> + ), \
> + TP_ARGS(net, req, sock))
> +
> +DECLARE_EVENT_CLASS(handshake_fd_class,
> + TP_PROTO(
> + const struct net *net,
> + const struct handshake_req *req,
> + const struct socket *sock,
> + int fd
> + ),
> + TP_ARGS(net, req, sock, fd),
> + TP_STRUCT__entry(
> + __field(const void *, req)
> + __field(const void *, sock)
> + __field(int, fd)
> + __field(unsigned int, netns_ino)
> + ),
> + TP_fast_assign(
> + __entry->req = req;
> + __entry->sock = req->hr_sock;
> + __entry->fd = fd;
> + __entry->netns_ino = net->ns.inum;
> + ),
> + TP_printk("req=%p sock=%p fd=%d",
> + __entry->req, __entry->sock, __entry->fd
> + )
> +);
> +#define DEFINE_HANDSHAKE_FD_EVENT(name) \
> + DEFINE_EVENT(handshake_fd_class, name, \
> + TP_PROTO( \
> + const struct net *net, \
> + const struct handshake_req *req, \
> + const struct socket *sock, \
> + int fd \
> + ), \
> + TP_ARGS(net, req, sock, fd))
> +
> +DECLARE_EVENT_CLASS(handshake_error_class,
> + TP_PROTO(
> + const struct net *net,
> + const struct handshake_req *req,
> + const struct socket *sock,
> + int err
> + ),
> + TP_ARGS(net, req, sock, err),
> + TP_STRUCT__entry(
> + __field(const void *, req)
> + __field(const void *, sock)
> + __field(int, err)
> + __field(unsigned int, netns_ino)
> + ),
> + TP_fast_assign(
> + __entry->req = req;
> + __entry->sock = sock;
> + __entry->err = err;
> + __entry->netns_ino = net->ns.inum;
> + ),
> + TP_printk("req=%p sock=%p err=%d",
> + __entry->req, __entry->sock, __entry->err
> + )
> +);
> +#define DEFINE_HANDSHAKE_ERROR(name) \
> + DEFINE_EVENT(handshake_error_class, name, \
> + TP_PROTO( \
> + const struct net *net, \
> + const struct handshake_req *req, \
> + const struct socket *sock, \
> + int err \
> + ), \
> + TP_ARGS(net, req, sock, err))
> +
> +
> +/**
> + ** Request lifetime events
> + **/
> +
> +DEFINE_HANDSHAKE_EVENT(handshake_submit);
> +DEFINE_HANDSHAKE_ERROR(handshake_submit_err);
> +DEFINE_HANDSHAKE_EVENT(handshake_cancel);
> +DEFINE_HANDSHAKE_EVENT(handshake_cancel_none);
> +DEFINE_HANDSHAKE_EVENT(handshake_cancel_busy);
> +DEFINE_HANDSHAKE_EVENT(handshake_destruct);
> +
> +
> +TRACE_EVENT(handshake_complete,
> + TP_PROTO(
> + const struct net *net,
> + const struct handshake_req *req,
> + const struct socket *sock,
> + int status
> + ),
> + TP_ARGS(net, req, sock, status),
> + TP_STRUCT__entry(
> + __field(const void *, req)
> + __field(const void *, sock)
> + __field(int, status)
> + __field(unsigned int, netns_ino)
> + ),
> + TP_fast_assign(
> + __entry->req = req;
> + __entry->sock = sock;
> + __entry->status = status;
> + __entry->netns_ino = net->ns.inum;
> + ),
> + TP_printk("req=%p sock=%p status=%d",
> + __entry->req, __entry->sock, __entry->status
> + )
> +);
> +
> +/**
> + ** Netlink events
> + **/
> +
> +DEFINE_HANDSHAKE_ERROR(handshake_notify_err);
> +DEFINE_HANDSHAKE_FD_EVENT(handshake_cmd_accept);
> +DEFINE_HANDSHAKE_ERROR(handshake_cmd_accept_err);
> +DEFINE_HANDSHAKE_FD_EVENT(handshake_cmd_done);
> +DEFINE_HANDSHAKE_ERROR(handshake_cmd_done_err);
> +
> +#endif /* _TRACE_HANDSHAKE_H */
> +
> +#include <trace/define_trace.h>
> diff --git a/include/uapi/linux/handshake.h b/include/uapi/linux/handshake.h
> new file mode 100644
> index 000000000000..09fd7c37cba4
> --- /dev/null
> +++ b/include/uapi/linux/handshake.h
> @@ -0,0 +1,63 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/* Do not edit directly, auto-generated from: */
> +/* Documentation/netlink/specs/handshake.yaml */
> +/* YNL-GEN uapi header */
> +
> +#ifndef _UAPI_LINUX_HANDSHAKE_H
> +#define _UAPI_LINUX_HANDSHAKE_H
> +
> +#define HANDSHAKE_FAMILY_NAME "handshake"
> +#define HANDSHAKE_FAMILY_VERSION 1
> +
> +enum {
> + HANDSHAKE_HANDLER_CLASS_NONE,
> +};
> +
> +enum {
> + HANDSHAKE_MSG_TYPE_UNSPEC,
> + HANDSHAKE_MSG_TYPE_CLIENTHELLO,
> + HANDSHAKE_MSG_TYPE_SERVERHELLO,
> +};
> +
> +enum {
> + HANDSHAKE_AUTH_UNSPEC,
> + HANDSHAKE_AUTH_UNAUTH,
> + HANDSHAKE_AUTH_X509,
> + HANDSHAKE_AUTH_PSK,
> +};
> +
> +enum {
> + HANDSHAKE_A_ACCEPT_STATUS = 1,
> + HANDSHAKE_A_ACCEPT_SOCKFD,
> + HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
> + HANDSHAKE_A_ACCEPT_MESSAGE_TYPE,
> + HANDSHAKE_A_ACCEPT_AUTH,
> + HANDSHAKE_A_ACCEPT_GNUTLS_PRIORITIES,
> + HANDSHAKE_A_ACCEPT_MY_PEERID,
> + HANDSHAKE_A_ACCEPT_MY_PRIVKEY,
> +
> + __HANDSHAKE_A_ACCEPT_MAX,
> + HANDSHAKE_A_ACCEPT_MAX = (__HANDSHAKE_A_ACCEPT_MAX - 1)
> +};
> +
> +enum {
> + HANDSHAKE_A_DONE_STATUS = 1,
> + HANDSHAKE_A_DONE_SOCKFD,
> + HANDSHAKE_A_DONE_REMOTE_PEERID,
> +
> + __HANDSHAKE_A_DONE_MAX,
> + HANDSHAKE_A_DONE_MAX = (__HANDSHAKE_A_DONE_MAX - 1)
> +};
> +
> +enum {
> + HANDSHAKE_CMD_READY = 1,
> + HANDSHAKE_CMD_ACCEPT,
> + HANDSHAKE_CMD_DONE,
> +
> + __HANDSHAKE_CMD_MAX,
> + HANDSHAKE_CMD_MAX = (__HANDSHAKE_CMD_MAX - 1)
> +};
> +
> +#define HANDSHAKE_MCGRP_NONE "none"
> +
> +#endif /* _UAPI_LINUX_HANDSHAKE_H */
> diff --git a/net/Makefile b/net/Makefile
> index 0914bea9c335..adbb64277601 100644
> --- a/net/Makefile
> +++ b/net/Makefile
> @@ -79,3 +79,4 @@ obj-$(CONFIG_NET_NCSI) += ncsi/
> obj-$(CONFIG_XDP_SOCKETS) += xdp/
> obj-$(CONFIG_MPTCP) += mptcp/
> obj-$(CONFIG_MCTP) += mctp/
> +obj-y += handshake/
> diff --git a/net/handshake/Makefile b/net/handshake/Makefile
> new file mode 100644
> index 000000000000..a41b03f4837b
> --- /dev/null
> +++ b/net/handshake/Makefile
> @@ -0,0 +1,11 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +#
> +# Makefile for the Generic HANDSHAKE service
> +#
> +# Author: Chuck Lever <chuck.lever@...cle.com>
> +#
> +# Copyright (c) 2023, Oracle and/or its affiliates.
> +#
> +
> +obj-y += handshake.o
> +handshake-y := netlink.o request.o trace.o
> diff --git a/net/handshake/handshake.h b/net/handshake/handshake.h
> new file mode 100644
> index 000000000000..366c7659ec09
> --- /dev/null
> +++ b/net/handshake/handshake.h
> @@ -0,0 +1,41 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Generic netlink handshake service
> + *
> + * Author: Chuck Lever <chuck.lever@...cle.com>
> + *
> + * Copyright (c) 2023, Oracle and/or its affiliates.
> + */
> +
> +/*
> + * Data structures and functions that are visible only within the
> + * handshake module are declared here.
> + */
> +
> +#ifndef _INTERNAL_HANDSHAKE_H
> +#define _INTERNAL_HANDSHAKE_H
> +
> +/*
> + * One handshake request
> + */
> +struct handshake_req {
> + struct list_head hr_list;
> + unsigned long hr_flags;
> + const struct handshake_proto *hr_proto;
> + struct socket *hr_sock;
> +
> + void (*hr_saved_destruct)(struct sock *sk);
> +};
> +
> +#define HANDSHAKE_F_COMPLETED BIT(0)
> +
> +/* netlink.c */
> +extern bool handshake_genl_inited;
> +int handshake_genl_notify(struct net *net, int handler_class, gfp_t flags);
> +
> +/* request.c */
> +void __remove_pending_locked(struct net *net, struct handshake_req *req);
> +void handshake_complete(struct handshake_req *req, int status,
> + struct nlattr **tb);
> +
> +#endif /* _INTERNAL_HANDSHAKE_H */
> diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
> new file mode 100644
> index 000000000000..581e382236cf
> --- /dev/null
> +++ b/net/handshake/netlink.c
> @@ -0,0 +1,340 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Generic netlink handshake service
> + *
> + * Author: Chuck Lever <chuck.lever@...cle.com>
> + *
> + * Copyright (c) 2023, Oracle and/or its affiliates.
> + */
> +
> +#include <linux/types.h>
> +#include <linux/socket.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/skbuff.h>
> +#include <linux/inet.h>
> +
> +#include <net/sock.h>
> +#include <net/genetlink.h>
> +#include <net/handshake.h>
> +
> +#include <uapi/linux/handshake.h>
> +#include <trace/events/handshake.h>
> +#include "handshake.h"
> +
> +static struct genl_family __ro_after_init handshake_genl_family;
> +bool handshake_genl_inited;
> +
> +/**
> + * handshake_genl_notify - Notify handlers that a request is waiting
> + * @net: target network namespace
> + * @handler_class: target handler
> + * @flags: memory allocation control flags
> + *
> + * Returns zero on success or a negative errno if notification failed.
> + */
> +int handshake_genl_notify(struct net *net, int handler_class, gfp_t flags)
> +{
> + struct sk_buff *msg;
> + void *hdr;
> +
> + if (!genl_has_listeners(&handshake_genl_family, net, handler_class))
> + return -ESRCH;
> +
> + msg = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> + if (!msg)
> + return -ENOMEM;
> +
> + hdr = genlmsg_put(msg, 0, 0, &handshake_genl_family, 0,
> + HANDSHAKE_CMD_READY);
> + if (!hdr)
> + goto out_free;
> +
> + if (nla_put_u32(msg, HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
> + handler_class) < 0) {
> + genlmsg_cancel(msg, hdr);
> + goto out_free;
> + }
> +
> + genlmsg_end(msg, hdr);
> + return genlmsg_multicast_netns(&handshake_genl_family, net, msg,
> + 0, handler_class, flags);
> +
> +out_free:
> + nlmsg_free(msg);
> + return -EMSGSIZE;
> +}
> +
> +/**
> + * handshake_genl_put - Create a generic netlink message header
> + * @msg: buffer in which to create the header
> + * @gi: generic netlink message context
> + *
> + * Returns a ready-to-use header, or NULL.
> + */
> +struct nlmsghdr *handshake_genl_put(struct sk_buff *msg, struct genl_info *gi)
> +{
> + return genlmsg_put(msg, gi->snd_portid, gi->snd_seq,
> + &handshake_genl_family, 0, gi->genlhdr->cmd);
> +}
> +EXPORT_SYMBOL(handshake_genl_put);
> +
> +static int handshake_status_reply(struct sk_buff *skb, struct genl_info *gi,
> + int status)
> +{
> + struct nlmsghdr *hdr;
> + struct sk_buff *msg;
> + int ret;
> +
> + ret = -ENOMEM;
> + msg = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> + if (!msg)
> + goto out;
> + hdr = handshake_genl_put(msg, gi);
> + if (!hdr)
> + goto out_free;
> +
> + ret = -EMSGSIZE;
> + ret = nla_put_u32(msg, HANDSHAKE_A_ACCEPT_STATUS, status);
> + if (ret < 0)
> + goto out_free;
> +
> + genlmsg_end(msg, hdr);
> + return genlmsg_reply(msg, gi);
> +
> +out_free:
> + genlmsg_cancel(msg, hdr);
> +out:
> + return ret;
> +}
> +
> +/*
> + * dup() a kernel socket for use as a user space file descriptor
> + * in the current process.
> + *
> + * Implicit argument: "current()"
> + */
> +static int handshake_dup(struct socket *kernsock)
> +{
> + struct file *file = get_file(kernsock->file);
> + int newfd;
> +
> + newfd = get_unused_fd_flags(O_CLOEXEC);
> + if (newfd < 0) {
> + fput(file);
> + return newfd;
> + }
> +
> + fd_install(newfd, file);
> + return newfd;
> +}
> +
> +static const struct nla_policy
> +handshake_accept_nl_policy[HANDSHAKE_A_ACCEPT_HANDLER_CLASS + 1] = {
> + [HANDSHAKE_A_ACCEPT_HANDLER_CLASS] = { .type = NLA_U32, },
> +};
> +
> +static int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *gi)
> +{
> + struct nlattr *tb[HANDSHAKE_A_ACCEPT_MAX + 1];
> + struct net *net = sock_net(skb->sk);
> + struct handshake_req *pos, *req;
> + int fd, err;
> +
> + err = -EINVAL;
> + if (genlmsg_parse(nlmsg_hdr(skb), &handshake_genl_family, tb,
> + HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
> + handshake_accept_nl_policy, NULL))
> + goto out_status;
> + if (!tb[HANDSHAKE_A_ACCEPT_HANDLER_CLASS])
> + goto out_status;
> +
> + req = NULL;
> + spin_lock(&net->hs_lock);
> + list_for_each_entry(pos, &net->hs_requests, hr_list) {
> + if (pos->hr_proto->hp_handler_class !=
> + nla_get_u32(tb[HANDSHAKE_A_ACCEPT_HANDLER_CLASS]))
> + continue;
> + __remove_pending_locked(net, pos);
> + req = pos;
> + break;
> + }
> + spin_unlock(&net->hs_lock);
> + if (!req)
> + goto out_status;
> +
> + fd = handshake_dup(req->hr_sock);
> + if (fd < 0) {
> + err = fd;
> + goto out_complete;
> + }
> + err = req->hr_proto->hp_accept(req, gi, fd);
> + if (err)
> + goto out_complete;
> +
> + trace_handshake_cmd_accept(net, req, req->hr_sock, fd);
> + return 0;
> +
> +out_complete:
> + handshake_complete(req, -EIO, NULL);
> + fput(req->hr_sock->file);
> +out_status:
> + trace_handshake_cmd_accept_err(net, req, NULL, err);
> + return handshake_status_reply(skb, gi, err);
> +}
> +
> +static const struct nla_policy
> +handshake_done_nl_policy[HANDSHAKE_A_DONE_MAX + 1] = {
> + [HANDSHAKE_A_DONE_SOCKFD] = { .type = NLA_U32, },
> + [HANDSHAKE_A_DONE_STATUS] = { .type = NLA_U32, },
> + [HANDSHAKE_A_DONE_REMOTE_PEERID] = { .type = NLA_U32, },
> +};
> +
> +static int handshake_nl_done_doit(struct sk_buff *skb, struct genl_info *gi)
> +{
> + struct nlattr *tb[HANDSHAKE_A_DONE_MAX + 1];
> + struct net *net = sock_net(skb->sk);
> + struct socket *sock = NULL;
> + struct handshake_req *req;
> + int fd, status, err;
> +
> + err = genlmsg_parse(nlmsg_hdr(skb), &handshake_genl_family, tb,
> + HANDSHAKE_A_DONE_MAX, handshake_done_nl_policy,
> + NULL);
> + if (err || !tb[HANDSHAKE_A_DONE_SOCKFD]) {
> + err = -EINVAL;
> + goto out_status;
> + }
> +
> + fd = nla_get_u32(tb[HANDSHAKE_A_DONE_SOCKFD]);
> +
> + err = 0;
> + sock = sockfd_lookup(fd, &err);
> + if (err) {
> + err = -EBADF;
> + goto out_status;
> + }
> +
> + req = sock->sk->sk_handshake_req;
> + if (!req) {
> + err = -EBUSY;
> + goto out_status;
> + }
> +
> + trace_handshake_cmd_done(net, req, sock, fd);
> +
> + status = -EIO;
> + if (tb[HANDSHAKE_A_DONE_STATUS])
> + status = nla_get_u32(tb[HANDSHAKE_A_DONE_STATUS]);
> +
And this makes me ever so slightly uneasy.
As 'status' is a netlink u32 attribute, it is inevitably unsigned.
Yet we assume that 'status' carries a negative number, leaving us
_technically_ in uncharted territory.
And that is notwithstanding the problem that we haven't even defined
_what_ should be in the status attribute.
Reading the code I assume that it's either '0' for success or a negative
number (i.e. the error code) on failure.
Which implicitly means that we _never_ set a positive number here.
So what would we lose if we declared 'status' to carry the _positive_
error number instead?
It would bring us in line with the actual netlink attribute definition,
and we wouldn't need to worry about possible integer overflows, yadda yadda...
Hmm?
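A minimal sketch of that idea in plain C (it compiles in user space and is not kernel code): user space would put 0 or a positive errno into the u32 attribute, and the receiving side would range-check the value and negate it before handing it on. The helper name status_from_wire() and the 4095 bound (borrowed from the kernel's MAX_ERRNO) are assumptions made for illustration; neither is part of the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed upper bound for errno values accepted from user space. */
#define STATUS_MAX_ERRNO 4095u

/*
 * Convert a wire status (0 on success, positive errno on failure)
 * into the in-kernel convention (0 on success, negative errno).
 * Anything out of range is mapped to a generic -EIO, so a sender
 * that still stuffs a negative errno into the u32 attribute (which
 * arrives as a huge unsigned value) is treated as a plain failure
 * rather than being reinterpreted as a signed integer.
 */
static int status_from_wire(uint32_t wire)
{
	if (wire > STATUS_MAX_ERRNO)
		return -EIO;
	return -(int)wire;
}
```

With something like this, the DONE handler could pass status_from_wire(nla_get_u32(...)) to handshake_complete() instead of storing the u32 directly in an int.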
> + handshake_complete(req, status, tb);
> + fput(sock->file);
> + return 0;
> +
> +out_status:
> + trace_handshake_cmd_done_err(net, req, sock, err);
> + return handshake_status_reply(skb, gi, err);
> +}
> +
> +static const struct genl_split_ops handshake_nl_ops[] = {
> + {
> + .cmd = HANDSHAKE_CMD_ACCEPT,
> + .doit = handshake_nl_accept_doit,
> + .policy = handshake_accept_nl_policy,
> + .maxattr = HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
> + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
> + },
> + {
> + .cmd = HANDSHAKE_CMD_DONE,
> + .doit = handshake_nl_done_doit,
> + .policy = handshake_done_nl_policy,
> + .maxattr = HANDSHAKE_A_DONE_REMOTE_PEERID,
> + .flags = GENL_CMD_CAP_DO,
> + },
> +};
> +
> +static const struct genl_multicast_group handshake_nl_mcgrps[] = {
> + [HANDSHAKE_HANDLER_CLASS_NONE] = { .name = HANDSHAKE_MCGRP_NONE, },
> +};
> +
> +static struct genl_family __ro_after_init handshake_genl_family = {
> + .hdrsize = 0,
> + .name = HANDSHAKE_FAMILY_NAME,
> + .version = HANDSHAKE_FAMILY_VERSION,
> + .netnsok = true,
> + .parallel_ops = true,
> + .n_mcgrps = ARRAY_SIZE(handshake_nl_mcgrps),
> + .n_split_ops = ARRAY_SIZE(handshake_nl_ops),
> + .split_ops = handshake_nl_ops,
> + .mcgrps = handshake_nl_mcgrps,
> + .module = THIS_MODULE,
> +};
> +
> +static int __net_init handshake_net_init(struct net *net)
> +{
> + spin_lock_init(&net->hs_lock);
> + INIT_LIST_HEAD(&net->hs_requests);
> + net->hs_pending = 0;
> + return 0;
> +}
> +
> +static void __net_exit handshake_net_exit(struct net *net)
> +{
> + struct handshake_req *req;
> + LIST_HEAD(requests);
> +
> + /*
> + * This drains the net's pending list. Requests that
> + * have been accepted and are in progress will be
> + * destroyed when the socket is closed.
> + */
> + spin_lock(&net->hs_lock);
> + list_splice_init(&requests, &net->hs_requests);
> + spin_unlock(&net->hs_lock);
> +
> + while (!list_empty(&requests)) {
> + req = list_first_entry(&requests, struct handshake_req, hr_list);
> + list_del(&req->hr_list);
> +
> + /*
> + * Requests on this list have not yet been
> + * accepted, so they do not have an fd to put.
> + */
> +
> + handshake_complete(req, -ETIMEDOUT, NULL);
> + }
> +}
> +
> +static struct pernet_operations handshake_genl_net_ops = {
> + .init = handshake_net_init,
> + .exit = handshake_net_exit,
> +};
> +
> +static int __init handshake_init(void)
> +{
> + int ret;
> +
> + ret = genl_register_family(&handshake_genl_family);
> + if (ret) {
> + pr_warn("handshake: netlink registration failed (%d)\n", ret);
> + return ret;
> + }
> +
> + ret = register_pernet_subsys(&handshake_genl_net_ops);
> + if (ret) {
> + pr_warn("handshake: pernet registration failed (%d)\n", ret);
> + genl_unregister_family(&handshake_genl_family);
> + }
> +
> + handshake_genl_inited = true;
> + return ret;
> +}
> +
> +static void __exit handshake_exit(void)
> +{
> + unregister_pernet_subsys(&handshake_genl_net_ops);
> + genl_unregister_family(&handshake_genl_family);
> +}
> +
> +module_init(handshake_init);
> +module_exit(handshake_exit);
> diff --git a/net/handshake/request.c b/net/handshake/request.c
> new file mode 100644
> index 000000000000..1d3b8e76dd2c
> --- /dev/null
> +++ b/net/handshake/request.c
> @@ -0,0 +1,246 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Handshake request lifetime events
> + *
> + * Author: Chuck Lever <chuck.lever@...cle.com>
> + *
> + * Copyright (c) 2023, Oracle and/or its affiliates.
> + */
> +
> +#include <linux/types.h>
> +#include <linux/socket.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/skbuff.h>
> +#include <linux/inet.h>
> +#include <linux/fdtable.h>
> +
> +#include <net/sock.h>
> +#include <net/genetlink.h>
> +#include <net/handshake.h>
> +
> +#include <uapi/linux/handshake.h>
> +#include <trace/events/handshake.h>
> +#include "handshake.h"
> +
> +/*
> + * This limit is to prevent slow remotes from causing denial of service.
> + * A ulimit-style tunable might be used instead.
> + */
> +#define HANDSHAKE_PENDING_MAX (10)
> +
> +static void __add_pending_locked(struct net *net, struct handshake_req *req)
> +{
> + net->hs_pending++;
> + list_add_tail(&req->hr_list, &net->hs_requests);
> +}
> +
> +void __remove_pending_locked(struct net *net, struct handshake_req *req)
> +{
> + net->hs_pending--;
> + list_del_init(&req->hr_list);
> +}
> +
> +/*
> + * Return values:
> + * %true - the request was found on @net's pending list
> + * %false - the request was not found on @net's pending list
> + *
> + * If @req was on a pending list, it has not yet been accepted.
> + */
> +static bool remove_pending(struct net *net, struct handshake_req *req)
> +{
> + bool ret;
> +
> + ret = false;
> +
> + spin_lock(&net->hs_lock);
> + if (!list_empty(&req->hr_list)) {
> + __remove_pending_locked(net, req);
> + ret = true;
> + }
> + spin_unlock(&net->hs_lock);
> +
> + return ret;
> +}
> +
> +static void handshake_req_destroy(struct handshake_req *req, struct sock *sk)
> +{
> + req->hr_proto->hp_destroy(req);
> + sk->sk_handshake_req = NULL;
> + kfree(req);
> +}
> +
> +static void handshake_sk_destruct(struct sock *sk)
> +{
> + struct handshake_req *req = sk->sk_handshake_req;
> +
> + if (req) {
> + trace_handshake_destruct(sock_net(sk), req, req->hr_sock);
> + handshake_req_destroy(req, sk);
> + }
> +}
> +
> +/**
> + * handshake_req_alloc - consumer API to allocate a request
> + * @sock: open socket on which to perform a handshake
> + * @proto: security protocol
> + * @flags: memory allocation flags
> + *
> + * Returns an initialized handshake_req or NULL.
> + */
> +struct handshake_req *handshake_req_alloc(struct socket *sock,
> + const struct handshake_proto *proto,
> + gfp_t flags)
> +{
> + struct handshake_req *req;
> +
> + /* Avoid accessing uninitialized global variables later on */
> + if (!handshake_genl_inited)
> + return NULL;
> +
> + req = kzalloc(sizeof(*req) + proto->hp_privsize, flags);
> + if (!req)
> + return NULL;
> +
> + sock_hold(sock->sk);
> +
> + INIT_LIST_HEAD(&req->hr_list);
> + req->hr_sock = sock;
> + req->hr_proto = proto;
> + return req;
> +}
> +EXPORT_SYMBOL(handshake_req_alloc);
> +
> +/**
> + * handshake_req_private - consumer API to return per-handshake private data
> + * @req: handshake arguments
> + *
> + */
> +void *handshake_req_private(struct handshake_req *req)
> +{
> + return (void *)(req + 1);
> +}
> +EXPORT_SYMBOL(handshake_req_private);
> +
> +/**
> + * handshake_req_submit - consumer API to submit a handshake request
> + * @req: handshake arguments
> + * @flags: memory allocation flags
> + *
> + * Return values:
> + * %0: Request queued
> + * %-EBUSY: A handshake is already under way for this socket
> + * %-ESRCH: No handshake agent is available
> + * %-EAGAIN: Too many pending handshake requests
> + * %-ENOMEM: Failed to allocate memory
> + * %-EMSGSIZE: Failed to construct notification message
> + *
> + * A zero return value from handshake_request() means that
> + * exactly one subsequent completion callback is guaranteed.
> + *
> + * A negative return value from handshake_request() means that
> + * no completion callback will be done and that @req is
> + * destroyed.
> + */
> +int handshake_req_submit(struct handshake_req *req, gfp_t flags)
> +{
> + struct socket *sock = req->hr_sock;
> + struct sock *sk = sock->sk;
> + struct net *net = sock_net(sk);
> + int ret;
> +
> + ret = -EAGAIN;
> + if (READ_ONCE(net->hs_pending) >= HANDSHAKE_PENDING_MAX)
> + goto out_err;
> +
> + ret = -EBUSY;
> + spin_lock(&net->hs_lock);
> + if (sk->sk_handshake_req || !list_empty(&req->hr_list)) {
> + spin_unlock(&net->hs_lock);
> + goto out_err;
> + }
> + req->hr_saved_destruct = sk->sk_destruct;
> + sk->sk_destruct = handshake_sk_destruct;
> + sk->sk_handshake_req = req;
> + __add_pending_locked(net, req);
> + spin_unlock(&net->hs_lock);
> +
> + ret = handshake_genl_notify(net, req->hr_proto->hp_handler_class,
> + flags);
> + if (ret) {
> + trace_handshake_notify_err(net, req, sock, ret);
> + if (remove_pending(net, req))
> + goto out_err;
> + }
> +
> + trace_handshake_submit(net, req, sock);
> + return 0;
> +
> +out_err:
> + trace_handshake_submit_err(net, req, sock, ret);
> + handshake_req_destroy(req, sk);
> + return ret;
> +}
> +EXPORT_SYMBOL(handshake_req_submit);
> +
> +void handshake_complete(struct handshake_req *req, int status,
> + struct nlattr **tb)
> +{
> + struct socket *sock = req->hr_sock;
> + struct net *net = sock_net(sock->sk);
> +
> + if (!test_and_set_bit(HANDSHAKE_F_COMPLETED, &req->hr_flags)) {
> + trace_handshake_complete(net, req, sock, status);
> + req->hr_proto->hp_done(req, status, tb);
> + __sock_put(sock->sk);
> + }
> +}
> +
> +/**
> + * handshake_req_cancel - consumer API to cancel an in-progress handshake
> + * @sock: socket on which there is an ongoing handshake
> + *
> + * XXX: Perhaps killing the user space agent might also be necessary?
I thought we had agreed that we would be sending a signal to the
userspace process?
Ideally we would send a SIGHUP, wait some time for the userspace
process to respond with a 'done' message, and send a SIGKILL
if we haven't received one.
Note: sending a SIGKILL would imply that userspace is able to cope
with children dying. Which pretty much excludes pthreads, I would think.
Guess I'll have to consult Stevens :-)
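For illustration only, here is a user-space analogue of that escalation: send SIGHUP, give the process a grace period, then fall back to SIGKILL. It is a sketch under assumptions, not the proposed kernel implementation: the kernel side would wait for the DONE message rather than for process exit, and stop_worker(), its return convention, and the 10 ms poll interval are all hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

/*
 * Hypothetical helper: ask the worker @pid to stop with SIGHUP, poll
 * for its exit for up to @grace_ms milliseconds, then fall back to
 * SIGKILL. @pid must be a child of the caller so waitpid() can reap it.
 *
 * Returns 0 if the worker exited within the grace period, 1 if it had
 * to be killed, or -1 with errno set on failure.
 */
static int stop_worker(pid_t pid, int grace_ms)
{
	const struct timespec tick = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };
	int waited_ms;
	pid_t ret;

	if (kill(pid, SIGHUP) < 0)
		return -1;

	/* Poll without blocking so the grace period stays bounded. */
	for (waited_ms = 0; waited_ms < grace_ms; waited_ms += 10) {
		ret = waitpid(pid, NULL, WNOHANG);
		if (ret == pid)
			return 0;
		if (ret < 0)
			return -1;
		nanosleep(&tick, NULL);
	}

	/* Grace period expired: SIGKILL cannot be caught or ignored. */
	if (kill(pid, SIGKILL) < 0)
		return -1;
	if (waitpid(pid, NULL, 0) < 0)
		return -1;
	return 1;
}
```

The parent here has to reap the child itself, which is the "cope with children dying" requirement in miniature; a threaded daemon has no equivalent way to kill and reap a single worker thread.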
> + *
> + * Request cancellation races with request completion. To determine
> + * who won, callers examine the return value from this function.
> + *
> + * Return values:
> + * %0 - Uncompleted handshake request was canceled or not found
> + * %-EBUSY - Handshake request already completed
EBUSY? Wouldn't EAGAIN be more appropriate?
After all, the request is everything _but_ busy...
> + */
> +int handshake_req_cancel(struct socket *sock)
> +{
> + struct handshake_req *req;
> + struct sock *sk;
> + struct net *net;
> +
> + if (!sock)
> + return 0;
> +
> + sk = sock->sk;
> + req = sk->sk_handshake_req;
> + net = sock_net(sk);
> +
> + if (!req) {
> + trace_handshake_cancel_none(net, req, sock);
> + return 0;
> + }
> +
> + if (remove_pending(net, req)) {
> + /* Request hadn't been accepted */
> + trace_handshake_cancel(net, req, sock);
> + return 0;
> + }
> + if (test_and_set_bit(HANDSHAKE_F_COMPLETED, &req->hr_flags)) {
> + /* Request already completed */
> + trace_handshake_cancel_busy(net, req, sock);
> + return -EBUSY;
> + }
> +
> + __sock_put(sk);
> + trace_handshake_cancel(net, req, sock);
> + return 0;
> +}
> +EXPORT_SYMBOL(handshake_req_cancel);
> diff --git a/net/handshake/trace.c b/net/handshake/trace.c
> new file mode 100644
> index 000000000000..3a5b6f29a2b8
> --- /dev/null
> +++ b/net/handshake/trace.c
> @@ -0,0 +1,17 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Trace points for transport security layer handshakes.
> + *
> + * Author: Chuck Lever <chuck.lever@...cle.com>
> + *
> + * Copyright (c) 2023, Oracle and/or its affiliates.
> + */
> +
> +#include <linux/types.h>
> +#include <net/sock.h>
> +
> +#include "handshake.h"
> +
> +#define CREATE_TRACE_POINTS
> +
> +#include <trace/events/handshake.h>
>
Cheers,
Hannes