Date:   Mon, 27 Feb 2023 10:24:29 +0100
From:   Hannes Reinecke <hare@...e.de>
To:     Chuck Lever <cel@...nel.org>, kuba@...nel.org, pabeni@...hat.com,
        edumazet@...gle.com
Cc:     netdev@...r.kernel.org, kernel-tls-handshake@...ts.linux.dev
Subject: Re: [PATCH v5 1/2] net/handshake: Create a NETLINK service for
 handling handshake requests

On 2/24/23 20:19, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@...cle.com>
> 
> When a kernel consumer needs a transport layer security session, it
> first needs a handshake to negotiate and establish a session. This
> negotiation can be done in user space via one of the several
> existing library implementations, or it can be done in the kernel.
> 
> No in-kernel handshake implementations yet exist. In their absence,
> we add a netlink service that can:
> 
> a. Notify a user space daemon that a handshake is needed.
> 
> b. Once notified, the daemon calls the kernel back via this
>     netlink service to get the handshake parameters, including an
>     open socket on which to establish the session.
> 
> c. Once the handshake is complete, the daemon reports the
>     session status and other information via a second netlink
>     operation. This operation marks that it is safe for the
>     kernel to use the open socket and the security session
>     established there.
> 
> The notification service uses a multicast group. Each handshake
> mechanism (eg, tlshd) adopts its own group number so that the
> handshake services are completely independent of one another. The
> kernel can then tell via netlink_has_listeners() whether a handshake
> service is active and prepared to handle a handshake request.
> 
> A new netlink operation, ACCEPT, acts like accept(2) in that it
> instantiates a file descriptor in the user space daemon's fd table.
> If this operation is successful, the reply carries the fd number,
> which can be treated as an open and ready file descriptor.
> 
> While user space is performing the handshake, the kernel keeps its
> muddy paws off the open socket. A second new netlink operation,
> DONE, indicates that the user space daemon is finished with the
> socket and it is safe for the kernel to use again. The operation
> also indicates whether a session was established successfully.
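
To make the socket hand-off described above easier to follow, here is a toy model of it in plain C. It is only a sketch: the names (hs_model, hs_submit and so on) are invented for illustration and are not part of the patch, and the error codes are picked to loosely mirror what the patch returns in similar situations.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one socket's handshake life cycle. */
enum hs_state {
	HS_IDLE,	/* no handshake requested, kernel owns the socket */
	HS_PENDING,	/* request queued, READY multicast to the daemon */
	HS_ACCEPTED,	/* daemon holds an fd and runs the handshake */
	HS_COMPLETED,	/* daemon sent DONE, kernel owns the socket again */
};

struct hs_model {
	enum hs_state state;
	int session_status;	/* only meaningful in HS_COMPLETED */
};

/* Kernel consumer asks for a handshake (step a). */
static int hs_submit(struct hs_model *m, bool have_listener)
{
	if (m->state != HS_IDLE)
		return -EBUSY;	/* a handshake is already under way */
	if (!have_listener)
		return -ESRCH;	/* nobody subscribed to the multicast group */
	m->state = HS_PENDING;
	return 0;
}

/* Daemon issues ACCEPT and receives the socket (step b). */
static int hs_accept(struct hs_model *m)
{
	if (m->state != HS_PENDING)
		return -EINVAL;	/* nothing queued for this handler */
	m->state = HS_ACCEPTED;
	return 0;
}

/* Daemon issues DONE and reports the session status (step c). */
static int hs_done(struct hs_model *m, int status)
{
	if (m->state != HS_ACCEPTED)
		return -EBUSY;	/* no handshake in progress on this socket */
	m->session_status = status;
	m->state = HS_COMPLETED;
	return 0;
}

/* The kernel keeps off the socket between submit and DONE. */
static bool hs_kernel_may_use_socket(const struct hs_model *m)
{
	return m->state == HS_IDLE || m->state == HS_COMPLETED;
}
```

In this model a DONE that arrives without a preceding ACCEPT is rejected, and the kernel only regains the socket once DONE has been processed.
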
> 
> Signed-off-by: Chuck Lever <chuck.lever@...cle.com>
> ---
>   Documentation/netlink/specs/handshake.yaml |  134 +++++++++++
>   include/net/handshake.h                    |   45 ++++
>   include/net/net_namespace.h                |    5
>   include/net/sock.h                         |    1
>   include/trace/events/handshake.h           |  159 +++++++++++++
>   include/uapi/linux/handshake.h             |   63 +++++
>   net/Makefile                               |    1
>   net/handshake/Makefile                     |   11 +
>   net/handshake/handshake.h                  |   41 +++
>   net/handshake/netlink.c                    |  340 ++++++++++++++++++++++++++++
>   net/handshake/request.c                    |  246 ++++++++++++++++++++
>   net/handshake/trace.c                      |   17 +
>   12 files changed, 1063 insertions(+)
>   create mode 100644 Documentation/netlink/specs/handshake.yaml
>   create mode 100644 include/net/handshake.h
>   create mode 100644 include/trace/events/handshake.h
>   create mode 100644 include/uapi/linux/handshake.h
>   create mode 100644 net/handshake/Makefile
>   create mode 100644 net/handshake/handshake.h
>   create mode 100644 net/handshake/netlink.c
>   create mode 100644 net/handshake/request.c
>   create mode 100644 net/handshake/trace.c
> 
> diff --git a/Documentation/netlink/specs/handshake.yaml b/Documentation/netlink/specs/handshake.yaml
> new file mode 100644
> index 000000000000..683a8f2df0a7
> --- /dev/null
> +++ b/Documentation/netlink/specs/handshake.yaml
> @@ -0,0 +1,134 @@
> +# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
> +#
> +# GENL HANDSHAKE service.
> +#
> +# Author: Chuck Lever <chuck.lever@...cle.com>
> +#
> +# Copyright (c) 2023, Oracle and/or its affiliates.
> +#
> +
> +name: handshake
> +
> +protocol: genetlink-c
> +
> +doc: Netlink protocol to request a transport layer security handshake.
> +
> +uapi-header: linux/net/handshake.h
> +
> +definitions:
> +  -
> +    type: enum
> +    name: handler-class
> +    enum-name:
> +    value-start: 0
> +    entries: [ none ]
> +  -
> +    type: enum
> +    name: msg-type
> +    enum-name:
> +    value-start: 0
> +    entries: [ unspec, clienthello, serverhello ]
> +  -
> +    type: enum
> +    name: auth
> +    enum-name:
> +    value-start: 0
> +    entries: [ unspec, unauth, x509, psk ]
> +
> +attribute-sets:
> +  -
> +    name: accept
> +    attributes:
> +      -
> +        name: status
> +        doc: Status of this accept operation
> +        type: u32
> +        value: 1
> +      -
> +        name: sockfd
> +        doc: File descriptor of socket to use
> +        type: u32
> +      -
> +        name: handler-class
> +        doc: Which type of handler is responding
> +        type: u32
> +        enum: handler-class
> +      -
> +        name: message-type
> +        doc: Handshake message type
> +        type: u32
> +        enum: msg-type
> +      -
> +        name: auth
> +        doc: Authentication mode
> +        type: u32
> +        enum: auth
> +      -
> +        name: gnutls-priorities
> +        doc: GnuTLS priority string
> +        type: string
> +      -
> +        name: my-peerid
> +        doc: Serial no of key containing local identity
> +        type: u32
> +      -
> +        name: my-privkey
> +        doc: Serial no of key containing optional private key
> +        type: u32
> +  -
> +    name: done
> +    attributes:
> +      -
> +        name: status
> +        doc: Session status
> +        type: u32
> +        value: 1
> +      -
> +        name: sockfd
> +        doc: File descriptor of socket that has completed
> +        type: u32
> +      -
> +        name: remote-peerid
> +        doc: Serial no of keys containing identities of remote peer
> +        type: u32
> +
> +operations:
> +  list:
> +    -
> +      name: ready
> +      doc: Notify handlers that a new handshake request is waiting
> +      value: 1
> +      notify: accept
> +    -
> +      name: accept
> +      doc: Handler retrieves next queued handshake request
> +      attribute-set: accept
> +      flags: [ admin-perm ]
> +      do:
> +        request:
> +          attributes:
> +            - handler-class
> +        reply:
> +          attributes:
> +            - status
> +            - sockfd
> +            - message-type
> +            - auth
> +            - gnutls-priorities
> +            - my-peerid
> +            - my-privkey
> +    -
> +      name: done
> +      doc: Handler reports handshake completion
> +      attribute-set: done
> +      do:
> +        request:
> +          attributes:
> +            - status
> +            - sockfd
> +            - remote-peerid
> +
> +mcast-groups:
> +  list:
> +    -
> +      name: none
> diff --git a/include/net/handshake.h b/include/net/handshake.h
> new file mode 100644
> index 000000000000..08f859237936
> --- /dev/null
> +++ b/include/net/handshake.h
> @@ -0,0 +1,45 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Generic HANDSHAKE service.
> + *
> + * Author: Chuck Lever <chuck.lever@...cle.com>
> + *
> + * Copyright (c) 2023, Oracle and/or its affiliates.
> + */
> +
> +/*
> + * Data structures and functions that are visible only within the
> + * kernel are declared here.
> + */
> +
> +#ifndef _NET_HANDSHAKE_H
> +#define _NET_HANDSHAKE_H
> +
> +struct handshake_req;
> +
> +/*
> + * Invariants for all handshake requests for one transport layer
> + * security protocol
> + */
> +struct handshake_proto {
> +	int			hp_handler_class;
> +	size_t			hp_privsize;
> +
> +	int			(*hp_accept)(struct handshake_req *req,
> +					     struct genl_info *gi, int fd);
> +	void			(*hp_done)(struct handshake_req *req,
> +					   int status, struct nlattr **tb);
> +	void			(*hp_destroy)(struct handshake_req *req);
> +};
> +
> +extern struct handshake_req *
> +handshake_req_alloc(struct socket *sock, const struct handshake_proto *proto,
> +		    gfp_t flags);
> +extern void *handshake_req_private(struct handshake_req *req);
> +extern int handshake_req_submit(struct handshake_req *req, gfp_t flags);
> +extern int handshake_req_cancel(struct socket *sock);
> +
> +extern struct nlmsghdr *handshake_genl_put(struct sk_buff *msg,
> +					   struct genl_info *gi);
> +
> +#endif /* _NET_HANDSHAKE_H */
> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
> index 78beaa765c73..a0ce9de4dab1 100644
> --- a/include/net/net_namespace.h
> +++ b/include/net/net_namespace.h
> @@ -188,6 +188,11 @@ struct net {
>   #if IS_ENABLED(CONFIG_SMC)
>   	struct netns_smc	smc;
>   #endif
> +
> +	/* transport layer security handshake requests */
> +	spinlock_t		hs_lock;
> +	struct list_head	hs_requests;
> +	int			hs_pending;
>   } __randomize_layout;
>   
>   #include <linux/seq_file_net.h>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 573f2bf7e0de..2a7345ce2540 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -519,6 +519,7 @@ struct sock {
>   
>   	struct socket		*sk_socket;
>   	void			*sk_user_data;
> +	void			*sk_handshake_req;
>   #ifdef CONFIG_SECURITY
>   	void			*sk_security;
>   #endif
> diff --git a/include/trace/events/handshake.h b/include/trace/events/handshake.h
> new file mode 100644
> index 000000000000..feffcd1d6256
> --- /dev/null
> +++ b/include/trace/events/handshake.h
> @@ -0,0 +1,159 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM handshake
> +
> +#if !defined(_TRACE_HANDSHAKE_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_HANDSHAKE_H
> +
> +#include <linux/net.h>
> +#include <linux/tracepoint.h>
> +
> +DECLARE_EVENT_CLASS(handshake_event_class,
> +	TP_PROTO(
> +		const struct net *net,
> +		const struct handshake_req *req,
> +		const struct socket *sock
> +	),
> +	TP_ARGS(net, req, sock),
> +	TP_STRUCT__entry(
> +		__field(const void *, req)
> +		__field(const void *, sock)
> +		__field(unsigned int, netns_ino)
> +	),
> +	TP_fast_assign(
> +		__entry->req = req;
> +		__entry->sock = sock;
> +		__entry->netns_ino = net->ns.inum;
> +	),
> +	TP_printk("req=%p sock=%p",
> +		__entry->req, __entry->sock
> +	)
> +);
> +#define DEFINE_HANDSHAKE_EVENT(name)				\
> +	DEFINE_EVENT(handshake_event_class, name,		\
> +		TP_PROTO(					\
> +			const struct net *net,			\
> +			const struct handshake_req *req,	\
> +			const struct socket *sock		\
> +		),						\
> +		TP_ARGS(net, req, sock))
> +
> +DECLARE_EVENT_CLASS(handshake_fd_class,
> +	TP_PROTO(
> +		const struct net *net,
> +		const struct handshake_req *req,
> +		const struct socket *sock,
> +		int fd
> +	),
> +	TP_ARGS(net, req, sock, fd),
> +	TP_STRUCT__entry(
> +		__field(const void *, req)
> +		__field(const void *, sock)
> +		__field(int, fd)
> +		__field(unsigned int, netns_ino)
> +	),
> +	TP_fast_assign(
> +		__entry->req = req;
> +		__entry->sock = req->hr_sock;
> +		__entry->fd = fd;
> +		__entry->netns_ino = net->ns.inum;
> +	),
> +	TP_printk("req=%p sock=%p fd=%d",
> +		__entry->req, __entry->sock, __entry->fd
> +	)
> +);
> +#define DEFINE_HANDSHAKE_FD_EVENT(name)				\
> +	DEFINE_EVENT(handshake_fd_class, name,			\
> +		TP_PROTO(					\
> +			const struct net *net,			\
> +			const struct handshake_req *req,	\
> +			const struct socket *sock,		\
> +			int fd					\
> +		),						\
> +		TP_ARGS(net, req, sock, fd))
> +
> +DECLARE_EVENT_CLASS(handshake_error_class,
> +	TP_PROTO(
> +		const struct net *net,
> +		const struct handshake_req *req,
> +		const struct socket *sock,
> +		int err
> +	),
> +	TP_ARGS(net, req, sock, err),
> +	TP_STRUCT__entry(
> +		__field(const void *, req)
> +		__field(const void *, sock)
> +		__field(int, err)
> +		__field(unsigned int, netns_ino)
> +	),
> +	TP_fast_assign(
> +		__entry->req = req;
> +		__entry->sock = sock;
> +		__entry->err = err;
> +		__entry->netns_ino = net->ns.inum;
> +	),
> +	TP_printk("req=%p sock=%p err=%d",
> +		__entry->req, __entry->sock, __entry->err
> +	)
> +);
> +#define DEFINE_HANDSHAKE_ERROR(name)				\
> +	DEFINE_EVENT(handshake_error_class, name,		\
> +		TP_PROTO(					\
> +			const struct net *net,			\
> +			const struct handshake_req *req,	\
> +			const struct socket *sock,		\
> +			int err					\
> +		),						\
> +		TP_ARGS(net, req, sock, err))
> +
> +
> +/**
> + ** Request lifetime events
> + **/
> +
> +DEFINE_HANDSHAKE_EVENT(handshake_submit);
> +DEFINE_HANDSHAKE_ERROR(handshake_submit_err);
> +DEFINE_HANDSHAKE_EVENT(handshake_cancel);
> +DEFINE_HANDSHAKE_EVENT(handshake_cancel_none);
> +DEFINE_HANDSHAKE_EVENT(handshake_cancel_busy);
> +DEFINE_HANDSHAKE_EVENT(handshake_destruct);
> +
> +
> +TRACE_EVENT(handshake_complete,
> +	TP_PROTO(
> +		const struct net *net,
> +		const struct handshake_req *req,
> +		const struct socket *sock,
> +		int status
> +	),
> +	TP_ARGS(net, req, sock, status),
> +	TP_STRUCT__entry(
> +		__field(const void *, req)
> +		__field(const void *, sock)
> +		__field(int, status)
> +		__field(unsigned int, netns_ino)
> +	),
> +	TP_fast_assign(
> +		__entry->req = req;
> +		__entry->sock = sock;
> +		__entry->status = status;
> +		__entry->netns_ino = net->ns.inum;
> +	),
> +	TP_printk("req=%p sock=%p status=%d",
> +		__entry->req, __entry->sock, __entry->status
> +	)
> +);
> +
> +/**
> + ** Netlink events
> + **/
> +
> +DEFINE_HANDSHAKE_ERROR(handshake_notify_err);
> +DEFINE_HANDSHAKE_FD_EVENT(handshake_cmd_accept);
> +DEFINE_HANDSHAKE_ERROR(handshake_cmd_accept_err);
> +DEFINE_HANDSHAKE_FD_EVENT(handshake_cmd_done);
> +DEFINE_HANDSHAKE_ERROR(handshake_cmd_done_err);
> +
> +#endif /* _TRACE_HANDSHAKE_H */
> +
> +#include <trace/define_trace.h>
> diff --git a/include/uapi/linux/handshake.h b/include/uapi/linux/handshake.h
> new file mode 100644
> index 000000000000..09fd7c37cba4
> --- /dev/null
> +++ b/include/uapi/linux/handshake.h
> @@ -0,0 +1,63 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/* Do not edit directly, auto-generated from: */
> +/*	Documentation/netlink/specs/handshake.yaml */
> +/* YNL-GEN uapi header */
> +
> +#ifndef _UAPI_LINUX_HANDSHAKE_H
> +#define _UAPI_LINUX_HANDSHAKE_H
> +
> +#define HANDSHAKE_FAMILY_NAME		"handshake"
> +#define HANDSHAKE_FAMILY_VERSION	1
> +
> +enum {
> +	HANDSHAKE_HANDLER_CLASS_NONE,
> +};
> +
> +enum {
> +	HANDSHAKE_MSG_TYPE_UNSPEC,
> +	HANDSHAKE_MSG_TYPE_CLIENTHELLO,
> +	HANDSHAKE_MSG_TYPE_SERVERHELLO,
> +};
> +
> +enum {
> +	HANDSHAKE_AUTH_UNSPEC,
> +	HANDSHAKE_AUTH_UNAUTH,
> +	HANDSHAKE_AUTH_X509,
> +	HANDSHAKE_AUTH_PSK,
> +};
> +
> +enum {
> +	HANDSHAKE_A_ACCEPT_STATUS = 1,
> +	HANDSHAKE_A_ACCEPT_SOCKFD,
> +	HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
> +	HANDSHAKE_A_ACCEPT_MESSAGE_TYPE,
> +	HANDSHAKE_A_ACCEPT_AUTH,
> +	HANDSHAKE_A_ACCEPT_GNUTLS_PRIORITIES,
> +	HANDSHAKE_A_ACCEPT_MY_PEERID,
> +	HANDSHAKE_A_ACCEPT_MY_PRIVKEY,
> +
> +	__HANDSHAKE_A_ACCEPT_MAX,
> +	HANDSHAKE_A_ACCEPT_MAX = (__HANDSHAKE_A_ACCEPT_MAX - 1)
> +};
> +
> +enum {
> +	HANDSHAKE_A_DONE_STATUS = 1,
> +	HANDSHAKE_A_DONE_SOCKFD,
> +	HANDSHAKE_A_DONE_REMOTE_PEERID,
> +
> +	__HANDSHAKE_A_DONE_MAX,
> +	HANDSHAKE_A_DONE_MAX = (__HANDSHAKE_A_DONE_MAX - 1)
> +};
> +
> +enum {
> +	HANDSHAKE_CMD_READY = 1,
> +	HANDSHAKE_CMD_ACCEPT,
> +	HANDSHAKE_CMD_DONE,
> +
> +	__HANDSHAKE_CMD_MAX,
> +	HANDSHAKE_CMD_MAX = (__HANDSHAKE_CMD_MAX - 1)
> +};
> +
> +#define HANDSHAKE_MCGRP_NONE	"none"
> +
> +#endif /* _UAPI_LINUX_HANDSHAKE_H */
> diff --git a/net/Makefile b/net/Makefile
> index 0914bea9c335..adbb64277601 100644
> --- a/net/Makefile
> +++ b/net/Makefile
> @@ -79,3 +79,4 @@ obj-$(CONFIG_NET_NCSI)		+= ncsi/
>   obj-$(CONFIG_XDP_SOCKETS)	+= xdp/
>   obj-$(CONFIG_MPTCP)		+= mptcp/
>   obj-$(CONFIG_MCTP)		+= mctp/
> +obj-y				+= handshake/
> diff --git a/net/handshake/Makefile b/net/handshake/Makefile
> new file mode 100644
> index 000000000000..a41b03f4837b
> --- /dev/null
> +++ b/net/handshake/Makefile
> @@ -0,0 +1,11 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +#
> +# Makefile for the Generic HANDSHAKE service
> +#
> +# Author: Chuck Lever <chuck.lever@...cle.com>
> +#
> +# Copyright (c) 2023, Oracle and/or its affiliates.
> +#
> +
> +obj-y += handshake.o
> +handshake-y := netlink.o request.o trace.o
> diff --git a/net/handshake/handshake.h b/net/handshake/handshake.h
> new file mode 100644
> index 000000000000..366c7659ec09
> --- /dev/null
> +++ b/net/handshake/handshake.h
> @@ -0,0 +1,41 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Generic netlink handshake service
> + *
> + * Author: Chuck Lever <chuck.lever@...cle.com>
> + *
> + * Copyright (c) 2023, Oracle and/or its affiliates.
> + */
> +
> +/*
> + * Data structures and functions that are visible only within the
> + * handshake module are declared here.
> + */
> +
> +#ifndef _INTERNAL_HANDSHAKE_H
> +#define _INTERNAL_HANDSHAKE_H
> +
> +/*
> + * One handshake request
> + */
> +struct handshake_req {
> +	struct list_head		hr_list;
> +	unsigned long			hr_flags;
> +	const struct handshake_proto	*hr_proto;
> +	struct socket			*hr_sock;
> +
> +	void				(*hr_saved_destruct)(struct sock *sk);
> +};
> +
> +#define HANDSHAKE_F_COMPLETED	BIT(0)
> +
> +/* netlink.c */
> +extern bool handshake_genl_inited;
> +int handshake_genl_notify(struct net *net, int handler_class, gfp_t flags);
> +
> +/* request.c */
> +void __remove_pending_locked(struct net *net, struct handshake_req *req);
> +void handshake_complete(struct handshake_req *req, int status,
> +			struct nlattr **tb);
> +
> +#endif /* _INTERNAL_HANDSHAKE_H */
> diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
> new file mode 100644
> index 000000000000..581e382236cf
> --- /dev/null
> +++ b/net/handshake/netlink.c
> @@ -0,0 +1,340 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Generic netlink handshake service
> + *
> + * Author: Chuck Lever <chuck.lever@...cle.com>
> + *
> + * Copyright (c) 2023, Oracle and/or its affiliates.
> + */
> +
> +#include <linux/types.h>
> +#include <linux/socket.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/skbuff.h>
> +#include <linux/inet.h>
> +
> +#include <net/sock.h>
> +#include <net/genetlink.h>
> +#include <net/handshake.h>
> +
> +#include <uapi/linux/handshake.h>
> +#include <trace/events/handshake.h>
> +#include "handshake.h"
> +
> +static struct genl_family __ro_after_init handshake_genl_family;
> +bool handshake_genl_inited;
> +
> +/**
> + * handshake_genl_notify - Notify handlers that a request is waiting
> + * @net: target network namespace
> + * @handler_class: target handler
> + * @flags: memory allocation control flags
> + *
> + * Returns zero on success or a negative errno if notification failed.
> + */
> +int handshake_genl_notify(struct net *net, int handler_class, gfp_t flags)
> +{
> +	struct sk_buff *msg;
> +	void *hdr;
> +
> +	if (!genl_has_listeners(&handshake_genl_family, net, handler_class))
> +		return -ESRCH;
> +
> +	msg = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> +	if (!msg)
> +		return -ENOMEM;
> +
> +	hdr = genlmsg_put(msg, 0, 0, &handshake_genl_family, 0,
> +			  HANDSHAKE_CMD_READY);
> +	if (!hdr)
> +		goto out_free;
> +
> +	if (nla_put_u32(msg, HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
> +			handler_class) < 0) {
> +		genlmsg_cancel(msg, hdr);
> +		goto out_free;
> +	}
> +
> +	genlmsg_end(msg, hdr);
> +	return genlmsg_multicast_netns(&handshake_genl_family, net, msg,
> +				       0, handler_class, flags);
> +
> +out_free:
> +	nlmsg_free(msg);
> +	return -EMSGSIZE;
> +}
> +
> +/**
> + * handshake_genl_put - Create a generic netlink message header
> + * @msg: buffer in which to create the header
> + * @gi: generic netlink message context
> + *
> + * Returns a ready-to-use header, or NULL.
> + */
> +struct nlmsghdr *handshake_genl_put(struct sk_buff *msg, struct genl_info *gi)
> +{
> +	return genlmsg_put(msg, gi->snd_portid, gi->snd_seq,
> +			   &handshake_genl_family, 0, gi->genlhdr->cmd);
> +}
> +EXPORT_SYMBOL(handshake_genl_put);
> +
> +static int handshake_status_reply(struct sk_buff *skb, struct genl_info *gi,
> +				  int status)
> +{
> +	struct nlmsghdr *hdr;
> +	struct sk_buff *msg;
> +	int ret;
> +
> +	ret = -ENOMEM;
> +	msg = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> +	if (!msg)
> +		goto out;
> +	hdr = handshake_genl_put(msg, gi);
> +	if (!hdr)
> +		goto out_free;
> +
> +	ret = -EMSGSIZE;
> +	ret = nla_put_u32(msg, HANDSHAKE_A_ACCEPT_STATUS, status);
> +	if (ret < 0)
> +		goto out_free;
> +
> +	genlmsg_end(msg, hdr);
> +	return genlmsg_reply(msg, gi);
> +
> +out_free:
> +	genlmsg_cancel(msg, hdr);
> +out:
> +	return ret;
> +}
> +
> +/*
> + * dup() a kernel socket for use as a user space file descriptor
> + * in the current process.
> + *
> + * Implicit argument: "current()"
> + */
> +static int handshake_dup(struct socket *kernsock)
> +{
> +	struct file *file = get_file(kernsock->file);
> +	int newfd;
> +
> +	newfd = get_unused_fd_flags(O_CLOEXEC);
> +	if (newfd < 0) {
> +		fput(file);
> +		return newfd;
> +	}
> +
> +	fd_install(newfd, file);
> +	return newfd;
> +}
> +
> +static const struct nla_policy
> +handshake_accept_nl_policy[HANDSHAKE_A_ACCEPT_HANDLER_CLASS + 1] = {
> +	[HANDSHAKE_A_ACCEPT_HANDLER_CLASS] = { .type = NLA_U32, },
> +};
> +
> +static int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *gi)
> +{
> +	struct nlattr *tb[HANDSHAKE_A_ACCEPT_MAX + 1];
> +	struct net *net = sock_net(skb->sk);
> +	struct handshake_req *pos, *req;
> +	int fd, err;
> +
> +	err = -EINVAL;
> +	if (genlmsg_parse(nlmsg_hdr(skb), &handshake_genl_family, tb,
> +			  HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
> +			  handshake_accept_nl_policy, NULL))
> +		goto out_status;
> +	if (!tb[HANDSHAKE_A_ACCEPT_HANDLER_CLASS])
> +		goto out_status;
> +
> +	req = NULL;
> +	spin_lock(&net->hs_lock);
> +	list_for_each_entry(pos, &net->hs_requests, hr_list) {
> +		if (pos->hr_proto->hp_handler_class !=
> +		    nla_get_u32(tb[HANDSHAKE_A_ACCEPT_HANDLER_CLASS]))
> +			continue;
> +		__remove_pending_locked(net, pos);
> +		req = pos;
> +		break;
> +	}
> +	spin_unlock(&net->hs_lock);
> +	if (!req)
> +		goto out_status;
> +
> +	fd = handshake_dup(req->hr_sock);
> +	if (fd < 0) {
> +		err = fd;
> +		goto out_complete;
> +	}
> +	err = req->hr_proto->hp_accept(req, gi, fd);
> +	if (err)
> +		goto out_complete;
> +
> +	trace_handshake_cmd_accept(net, req, req->hr_sock, fd);
> +	return 0;
> +
> +out_complete:
> +	handshake_complete(req, -EIO, NULL);
> +	fput(req->hr_sock->file);
> +out_status:
> +	trace_handshake_cmd_accept_err(net, req, NULL, err);
> +	return handshake_status_reply(skb, gi, err);
> +}
> +
> +static const struct nla_policy
> +handshake_done_nl_policy[HANDSHAKE_A_DONE_MAX + 1] = {
> +	[HANDSHAKE_A_DONE_SOCKFD] = { .type = NLA_U32, },
> +	[HANDSHAKE_A_DONE_STATUS] = { .type = NLA_U32, },
> +	[HANDSHAKE_A_DONE_REMOTE_PEERID] = { .type = NLA_U32, },
> +};
> +
> +static int handshake_nl_done_doit(struct sk_buff *skb, struct genl_info *gi)
> +{
> +	struct nlattr *tb[HANDSHAKE_A_DONE_MAX + 1];
> +	struct net *net = sock_net(skb->sk);
> +	struct socket *sock = NULL;
> +	struct handshake_req *req;
> +	int fd, status, err;
> +
> +	err = genlmsg_parse(nlmsg_hdr(skb), &handshake_genl_family, tb,
> +			    HANDSHAKE_A_DONE_MAX, handshake_done_nl_policy,
> +			    NULL);
> +	if (err || !tb[HANDSHAKE_A_DONE_SOCKFD]) {
> +		err = -EINVAL;
> +		goto out_status;
> +	}
> +
> +	fd = nla_get_u32(tb[HANDSHAKE_A_DONE_SOCKFD]);
> +
> +	err = 0;
> +	sock = sockfd_lookup(fd, &err);
> +	if (err) {
> +		err = -EBADF;
> +		goto out_status;
> +	}
> +
> +	req = sock->sk->sk_handshake_req;
> +	if (!req) {
> +		err = -EBUSY;
> +		goto out_status;
> +	}
> +
> +	trace_handshake_cmd_done(net, req, sock, fd);
> +
> +	status = -EIO;
> +	if (tb[HANDSHAKE_A_DONE_STATUS])
> +		status = nla_get_u32(tb[HANDSHAKE_A_DONE_STATUS]);
> +
And this makes me ever so slightly uneasy.

As 'status' is a u32 netlink attribute (NLA_U32 in the policy), it's
inevitably unsigned.
Yet we assume that 'status' is a negative number, leaving us
_technically_ in uncharted territory.

And that is notwithstanding the problem that we haven't even defined 
_what_ should be in the status attribute.

Reading the code I assume that it's either '0' for success or a negative
number (ie the error code) on failure.
Which implicitly means that we _never_ set a positive number here.
So what would we lose if we declare 'status' to carry the _positive_
error number instead?
It would bring us in line with the actual netlink attribute definition,
and we wouldn't need to worry about possible integer overflows, yadda yadda...

Hmm?
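
To illustrate what I mean, here is a stand-alone user space sketch of both conventions. None of these helpers exist in the patch; the names are made up, and HS_MAX_ERRNO is an assumption that simply mirrors the kernel's MAX_ERRNO (4095).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed bound, mirroring the kernel's MAX_ERRNO. */
#define HS_MAX_ERRNO 4095u

/*
 * Current convention: a negative errno is pushed through a u32.
 * The conversion to unsigned is well defined (modulo 2^32), but
 * converting a value above INT_MAX back to int is implementation-
 * defined in C, so this only works because compilers wrap.
 */
static uint32_t encode_negative(int status)
{
	return (uint32_t)status;
}

static int decode_negative(uint32_t wire)
{
	return (int)wire;
}

/*
 * Suggested convention: the attribute carries 0 or the positive
 * errno, so the value on the wire is a small unsigned number.
 */
static uint32_t encode_positive(int status)
{
	/* Unsigned negation avoids any signed overflow. */
	return status < 0 ? 0u - (uint32_t)status : (uint32_t)status;
}

static int decode_positive(uint32_t wire)
{
	/* Anything that is not a plausible errno becomes -EIO. */
	if (wire > HS_MAX_ERRNO)
		return -EIO;
	return -(int)wire;
}
```

With the positive convention the receiver can range-check the value before it ever becomes a signed int, which is not possible when the wire value is a wrapped negative number.
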

> +	handshake_complete(req, status, tb);
> +	fput(sock->file);
> +	return 0;
> +
> +out_status:
> +	trace_handshake_cmd_done_err(net, req, sock, err);
> +	return handshake_status_reply(skb, gi, err);
> +}
> +
> +static const struct genl_split_ops handshake_nl_ops[] = {
> +	{
> +		.cmd		= HANDSHAKE_CMD_ACCEPT,
> +		.doit		= handshake_nl_accept_doit,
> +		.policy		= handshake_accept_nl_policy,
> +		.maxattr	= HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
> +		.flags		= GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
> +	},
> +	{
> +		.cmd		= HANDSHAKE_CMD_DONE,
> +		.doit		= handshake_nl_done_doit,
> +		.policy		= handshake_done_nl_policy,
> +		.maxattr	= HANDSHAKE_A_DONE_REMOTE_PEERID,
> +		.flags		= GENL_CMD_CAP_DO,
> +	},
> +};
> +
> +static const struct genl_multicast_group handshake_nl_mcgrps[] = {
> +	[HANDSHAKE_HANDLER_CLASS_NONE] = { .name = HANDSHAKE_MCGRP_NONE, },
> +};
> +
> +static struct genl_family __ro_after_init handshake_genl_family = {
> +	.hdrsize		= 0,
> +	.name			= HANDSHAKE_FAMILY_NAME,
> +	.version		= HANDSHAKE_FAMILY_VERSION,
> +	.netnsok		= true,
> +	.parallel_ops		= true,
> +	.n_mcgrps		= ARRAY_SIZE(handshake_nl_mcgrps),
> +	.n_split_ops		= ARRAY_SIZE(handshake_nl_ops),
> +	.split_ops		= handshake_nl_ops,
> +	.mcgrps			= handshake_nl_mcgrps,
> +	.module			= THIS_MODULE,
> +};
> +
> +static int __net_init handshake_net_init(struct net *net)
> +{
> +	spin_lock_init(&net->hs_lock);
> +	INIT_LIST_HEAD(&net->hs_requests);
> +	net->hs_pending	= 0;
> +	return 0;
> +}
> +
> +static void __net_exit handshake_net_exit(struct net *net)
> +{
> +	struct handshake_req *req;
> +	LIST_HEAD(requests);
> +
> +	/*
> +	 * This drains the net's pending list. Requests that
> +	 * have been accepted and are in progress will be
> +	 * destroyed when the socket is closed.
> +	 */
> +	spin_lock(&net->hs_lock);
> +	list_splice_init(&requests, &net->hs_requests);
> +	spin_unlock(&net->hs_lock);
> +
> +	while (!list_empty(&requests)) {
> +		req = list_first_entry(&requests, struct handshake_req, hr_list);
> +		list_del(&req->hr_list);
> +
> +		/*
> +		 * Requests on this list have not yet been
> +		 * accepted, so they do not have an fd to put.
> +		 */
> +
> +		handshake_complete(req, -ETIMEDOUT, NULL);
> +	}
> +}
> +
> +static struct pernet_operations handshake_genl_net_ops = {
> +	.init		= handshake_net_init,
> +	.exit		= handshake_net_exit,
> +};
> +
> +static int __init handshake_init(void)
> +{
> +	int ret;
> +
> +	ret = genl_register_family(&handshake_genl_family);
> +	if (ret) {
> +		pr_warn("handshake: netlink registration failed (%d)\n", ret);
> +		return ret;
> +	}
> +
> +	ret = register_pernet_subsys(&handshake_genl_net_ops);
> +	if (ret) {
> +		pr_warn("handshake: pernet registration failed (%d)\n", ret);
> +		genl_unregister_family(&handshake_genl_family);
> +	}
> +
> +	handshake_genl_inited = true;
> +	return ret;
> +}
> +
> +static void __exit handshake_exit(void)
> +{
> +	unregister_pernet_subsys(&handshake_genl_net_ops);
> +	genl_unregister_family(&handshake_genl_family);
> +}
> +
> +module_init(handshake_init);
> +module_exit(handshake_exit);
> diff --git a/net/handshake/request.c b/net/handshake/request.c
> new file mode 100644
> index 000000000000..1d3b8e76dd2c
> --- /dev/null
> +++ b/net/handshake/request.c
> @@ -0,0 +1,246 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Handshake request lifetime events
> + *
> + * Author: Chuck Lever <chuck.lever@...cle.com>
> + *
> + * Copyright (c) 2023, Oracle and/or its affiliates.
> + */
> +
> +#include <linux/types.h>
> +#include <linux/socket.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/skbuff.h>
> +#include <linux/inet.h>
> +#include <linux/fdtable.h>
> +
> +#include <net/sock.h>
> +#include <net/genetlink.h>
> +#include <net/handshake.h>
> +
> +#include <uapi/linux/handshake.h>
> +#include <trace/events/handshake.h>
> +#include "handshake.h"
> +
> +/*
> + * This limit is to prevent slow remotes from causing denial of service.
> + * A ulimit-style tunable might be used instead.
> + */
> +#define HANDSHAKE_PENDING_MAX (10)
> +
> +static void __add_pending_locked(struct net *net, struct handshake_req *req)
> +{
> +	net->hs_pending++;
> +	list_add_tail(&req->hr_list, &net->hs_requests);
> +}
> +
> +void __remove_pending_locked(struct net *net, struct handshake_req *req)
> +{
> +	net->hs_pending--;
> +	list_del_init(&req->hr_list);
> +}
> +
> +/*
> + * Return values:
> + *   %true - the request was found on @net's pending list
> + *   %false - the request was not found on @net's pending list
> + *
> + * If @req was on a pending list, it has not yet been accepted.
> + */
> +static bool remove_pending(struct net *net, struct handshake_req *req)
> +{
> +	bool ret;
> +
> +	ret = false;
> +
> +	spin_lock(&net->hs_lock);
> +	if (!list_empty(&req->hr_list)) {
> +		__remove_pending_locked(net, req);
> +		ret = true;
> +	}
> +	spin_unlock(&net->hs_lock);
> +
> +	return ret;
> +}
> +
> +static void handshake_req_destroy(struct handshake_req *req, struct sock *sk)
> +{
> +	req->hr_proto->hp_destroy(req);
> +	sk->sk_handshake_req = NULL;
> +	kfree(req);
> +}
> +
> +static void handshake_sk_destruct(struct sock *sk)
> +{
> +	struct handshake_req *req = sk->sk_handshake_req;
> +
> +	if (req) {
> +		trace_handshake_destruct(sock_net(sk), req, req->hr_sock);
> +		handshake_req_destroy(req, sk);
> +	}
> +}
> +
> +/**
> + * handshake_req_alloc - consumer API to allocate a request
> + * @sock: open socket on which to perform a handshake
> + * @proto: security protocol
> + * @flags: memory allocation flags
> + *
> + * Returns an initialized handshake_req or NULL.
> + */
> +struct handshake_req *handshake_req_alloc(struct socket *sock,
> +					  const struct handshake_proto *proto,
> +					  gfp_t flags)
> +{
> +	struct handshake_req *req;
> +
> +	/* Avoid accessing uninitialized global variables later on */
> +	if (!handshake_genl_inited)
> +		return NULL;
> +
> +	req = kzalloc(sizeof(*req) + proto->hp_privsize, flags);
> +	if (!req)
> +		return NULL;
> +
> +	sock_hold(sock->sk);
> +
> +	INIT_LIST_HEAD(&req->hr_list);
> +	req->hr_sock = sock;
> +	req->hr_proto = proto;
> +	return req;
> +}
> +EXPORT_SYMBOL(handshake_req_alloc);
> +
> +/**
> + * handshake_req_private - consumer API to return per-handshake private data
> + * @req: handshake arguments
> + *
> + */
> +void *handshake_req_private(struct handshake_req *req)
> +{
> +	return (void *)(req + 1);
> +}
> +EXPORT_SYMBOL(handshake_req_private);
> +
> +/**
> + * handshake_req_submit - consumer API to submit a handshake request
> + * @req: handshake arguments
> + * @flags: memory allocation flags
> + *
> + * Return values:
> + *   %0: Request queued
> + *   %-EBUSY: A handshake is already under way for this socket
> + *   %-ESRCH: No handshake agent is available
> + *   %-EAGAIN: Too many pending handshake requests
> + *   %-ENOMEM: Failed to allocate memory
> + *   %-EMSGSIZE: Failed to construct notification message
> + *
> + * A zero return value from handshake_request() means that
> + * exactly one subsequent completion callback is guaranteed.
> + *
> + * A negative return value from handshake_request() means that
> + * no completion callback will be done and that @req is
> + * destroyed.
> + */
> +int handshake_req_submit(struct handshake_req *req, gfp_t flags)
> +{
> +	struct socket *sock = req->hr_sock;
> +	struct sock *sk = sock->sk;
> +	struct net *net = sock_net(sk);
> +	int ret;
> +
> +	ret = -EAGAIN;
> +	if (READ_ONCE(net->hs_pending) >= HANDSHAKE_PENDING_MAX)
> +		goto out_err;
> +
> +	ret = -EBUSY;
> +	spin_lock(&net->hs_lock);
> +	if (sk->sk_handshake_req || !list_empty(&req->hr_list)) {
> +		spin_unlock(&net->hs_lock);
> +		goto out_err;
> +	}
> +	req->hr_saved_destruct = sk->sk_destruct;
> +	sk->sk_destruct = handshake_sk_destruct;
> +	sk->sk_handshake_req = req;
> +	__add_pending_locked(net, req);
> +	spin_unlock(&net->hs_lock);
> +
> +	ret = handshake_genl_notify(net, req->hr_proto->hp_handler_class,
> +				    flags);
> +	if (ret) {
> +		trace_handshake_notify_err(net, req, sock, ret);
> +		if (remove_pending(net, req))
> +			goto out_err;
> +	}
> +
> +	trace_handshake_submit(net, req, sock);
> +	return 0;
> +
> +out_err:
> +	trace_handshake_submit_err(net, req, sock, ret);
> +	handshake_req_destroy(req, sk);
> +	return ret;
> +}
> +EXPORT_SYMBOL(handshake_req_submit);
> +
> +void handshake_complete(struct handshake_req *req, int status,
> +			struct nlattr **tb)
> +{
> +	struct socket *sock = req->hr_sock;
> +	struct net *net = sock_net(sock->sk);
> +
> +	if (!test_and_set_bit(HANDSHAKE_F_COMPLETED, &req->hr_flags)) {
> +		trace_handshake_complete(net, req, sock, status);
> +		req->hr_proto->hp_done(req, status, tb);
> +		__sock_put(sock->sk);
> +	}
> +}
> +
> +/**
> + * handshake_req_cancel - consumer API to cancel an in-progress handshake
> + * @sock: socket on which there is an ongoing handshake
> + *
> + * XXX: Perhaps killing the user space agent might also be necessary?

I thought we had agreed that we would be sending a signal to the 
userspace process?
Ideally we would send a SIGHUP, wait some time for the userspace
process to respond with a 'done' message, and send a KILL signal
if we haven't received one.

Note: sending a KILL signal would imply that userspace is able to cope
with children dying, which pretty much excludes pthreads, I would think.

Guess I'll have to consult Stevens :-)
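
To make the SIGHUP-then-KILL idea concrete, here is a rough user-space
sketch of the escalation. It is only an analogy: it shows a parent
process stopping its own child, not the kernel signalling the handshake
agent, and it treats the child exiting as the 'done' indication instead
of a netlink message. stop_child() and spawn_child() are made-up names
for illustration, and the 10ms polling interval is arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: send SIGHUP, poll for up to grace_ms milliseconds
 * for the child to exit, then fall back to SIGKILL.
 * Returns 0 if the child exited within the grace period, 1 if SIGKILL
 * was needed, -1 on error.
 */
static int stop_child(pid_t pid, int grace_ms)
{
	struct timespec tick = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };
	int status, waited;

	assert(grace_ms >= 0);

	if (kill(pid, SIGHUP) < 0)
		return -1;

	for (waited = 0; waited <= grace_ms; waited += 10) {
		pid_t r = waitpid(pid, &status, WNOHANG);

		if (r == pid)
			return 0;		/* child went away on its own */
		if (r < 0 && errno != EINTR)
			return -1;
		nanosleep(&tick, NULL);		/* r == 0: still running */
	}

	/* Grace period expired: SIGKILL cannot be caught or ignored. */
	if (kill(pid, SIGKILL) < 0)
		return -1;
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return 1;
}

/*
 * Hypothetical test helper: fork a child that just sleeps. The SIGHUP
 * disposition is set before fork() so the child inherits it without
 * racing against the parent's kill().
 */
static pid_t spawn_child(int ignore_hup)
{
	void (*old)(int) = signal(SIGHUP, ignore_hup ? SIG_IGN : SIG_DFL);
	pid_t pid = fork();

	if (pid == 0) {
		for (;;)
			pause();
	}
	signal(SIGHUP, old);
	return pid;
}
```

Note that kill() targets a whole process, so this only works if each
handshake runs in its own child; a worker thread cannot be singled out
this way.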

> + *
> + * Request cancellation races with request completion. To determine
> + * who won, callers examine the return value from this function.
> + *
> + * Return values:
> + *   %0 - Uncompleted handshake request was canceled or not found
> + *   %-EBUSY - Handshake request already completed

EBUSY? Wouldn't EAGAIN be more appropriate?
After all, the request is everything _but_ busy...
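
For reference, the way I read the race: whoever sets the COMPLETED flag
first wins, and the loser must not touch the request. A minimal
user-space model of that, assuming C11 atomic_flag as a stand-in for
test_and_set_bit() (struct fake_req, fake_complete() and fake_cancel()
are made-up names, and the pending-list and refcount handling from the
patch is left out):

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct handshake_req; only the fields the race needs. */
struct fake_req {
	atomic_flag completed;	/* models HANDSHAKE_F_COMPLETED */
	int done_calls;		/* how often the done callback ran */
};

/* Models handshake_complete(): run the callback only if we set the flag. */
static bool fake_complete(struct fake_req *req)
{
	assert(req != NULL);
	if (atomic_flag_test_and_set(&req->completed))
		return false;	/* cancel got there first */
	req->done_calls++;
	return true;
}

/* Models the tail of handshake_req_cancel(): 0 if we won, -EBUSY if not. */
static int fake_cancel(struct fake_req *req)
{
	assert(req != NULL);
	if (atomic_flag_test_and_set(&req->completed))
		return -EBUSY;	/* completion already ran */
	return 0;
}
```

So the error return really means "too late, the completion callback has
run (or is running)", which is why I find EBUSY a confusing name for it.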

> + */
> +int handshake_req_cancel(struct socket *sock)
> +{
> +	struct handshake_req *req;
> +	struct sock *sk;
> +	struct net *net;
> +
> +	if (!sock)
> +		return 0;
> +
> +	sk = sock->sk;
> +	req = sk->sk_handshake_req;
> +	net = sock_net(sk);
> +
> +	if (!req) {
> +		trace_handshake_cancel_none(net, req, sock);
> +		return 0;
> +	}
> +
> +	if (remove_pending(net, req)) {
> +		/* Request hadn't been accepted */
> +		trace_handshake_cancel(net, req, sock);
> +		return 0;
> +	}
> +	if (test_and_set_bit(HANDSHAKE_F_COMPLETED, &req->hr_flags)) {
> +		/* Request already completed */
> +		trace_handshake_cancel_busy(net, req, sock);
> +		return -EBUSY;
> +	}
> +
> +	__sock_put(sk);
> +	trace_handshake_cancel(net, req, sock);
> +	return 0;
> +}
> +EXPORT_SYMBOL(handshake_req_cancel);
> diff --git a/net/handshake/trace.c b/net/handshake/trace.c
> new file mode 100644
> index 000000000000..3a5b6f29a2b8
> --- /dev/null
> +++ b/net/handshake/trace.c
> @@ -0,0 +1,17 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Trace points for transport security layer handshakes.
> + *
> + * Author: Chuck Lever <chuck.lever@...cle.com>
> + *
> + * Copyright (c) 2023, Oracle and/or its affiliates.
> + */
> +
> +#include <linux/types.h>
> +#include <net/sock.h>
> +
> +#include "handshake.h"
> +
> +#define CREATE_TRACE_POINTS
> +
> +#include <trace/events/handshake.h>
> 
Cheers,

Hannes

