lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1B595556-0236-49F3-A8B0-ECF2332450D4@oracle.com>
Date:   Mon, 27 Feb 2023 14:59:40 +0000
From:   Chuck Lever III <chuck.lever@...cle.com>
To:     Hannes Reinecke <hare@...e.de>
CC:     Chuck Lever <cel@...nel.org>, "kuba@...nel.org" <kuba@...nel.org>,
        "pabeni@...hat.com" <pabeni@...hat.com>,
        "edumazet@...gle.com" <edumazet@...gle.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "kernel-tls-handshake@...ts.linux.dev" 
        <kernel-tls-handshake@...ts.linux.dev>
Subject: Re: [PATCH v5 1/2] net/handshake: Create a NETLINK service for
 handling handshake requests



> On Feb 27, 2023, at 4:24 AM, Hannes Reinecke <hare@...e.de> wrote:
> 
> On 2/24/23 20:19, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@...cle.com>
>> When a kernel consumer needs a transport layer security session, it
>> first needs a handshake to negotiate and establish a session. This
>> negotiation can be done in user space via one of the several
>> existing library implementations, or it can be done in the kernel.
>> No in-kernel handshake implementations yet exist. In their absence,
>> we add a netlink service that can:
>> a. Notify a user space daemon that a handshake is needed.
>> b. Once notified, the daemon calls the kernel back via this
>>    netlink service to get the handshake parameters, including an
>>    open socket on which to establish the session.
>> c. Once the handshake is complete, the daemon reports the
>>    session status and other information via a second netlink
>>    operation. This operation marks that it is safe for the
>>    kernel to use the open socket and the security session
>>    established there.
>> The notification service uses a multicast group. Each handshake
>> mechanism (eg, tlshd) adopts its own group number so that the
>> handshake services are completely independent of one another. The
>> kernel can then tell via netlink_has_listeners() whether a handshake
>> service is active and prepared to handle a handshake request.
>> A new netlink operation, ACCEPT, acts like accept(2) in that it
>> instantiates a file descriptor in the user space daemon's fd table.
>> If this operation is successful, the reply carries the fd number,
>> which can be treated as an open and ready file descriptor.
>> While user space is performing the handshake, the kernel keeps its
>> muddy paws off the open socket. A second new netlink operation,
>> DONE, indicates that the user space daemon is finished with the
>> socket and it is safe for the kernel to use again. The operation
>> also indicates whether a session was established successfully.
>> Signed-off-by: Chuck Lever <chuck.lever@...cle.com>
>> ---
>>  Documentation/netlink/specs/handshake.yaml |  134 +++++++++++
>>  include/net/handshake.h                    |   45 ++++
>>  include/net/net_namespace.h                |    5
>>  include/net/sock.h                         |    1
>>  include/trace/events/handshake.h           |  159 +++++++++++++
>>  include/uapi/linux/handshake.h             |   63 +++++
>>  net/Makefile                               |    1
>>  net/handshake/Makefile                     |   11 +
>>  net/handshake/handshake.h                  |   41 +++
>>  net/handshake/netlink.c                    |  340 ++++++++++++++++++++++++++++
>>  net/handshake/request.c                    |  246 ++++++++++++++++++++
>>  net/handshake/trace.c                      |   17 +
>>  12 files changed, 1063 insertions(+)
>>  create mode 100644 Documentation/netlink/specs/handshake.yaml
>>  create mode 100644 include/net/handshake.h
>>  create mode 100644 include/trace/events/handshake.h
>>  create mode 100644 include/uapi/linux/handshake.h
>>  create mode 100644 net/handshake/Makefile
>>  create mode 100644 net/handshake/handshake.h
>>  create mode 100644 net/handshake/netlink.c
>>  create mode 100644 net/handshake/request.c
>>  create mode 100644 net/handshake/trace.c
>> diff --git a/Documentation/netlink/specs/handshake.yaml b/Documentation/netlink/specs/handshake.yaml
>> new file mode 100644
>> index 000000000000..683a8f2df0a7
>> --- /dev/null
>> +++ b/Documentation/netlink/specs/handshake.yaml
>> @@ -0,0 +1,134 @@
>> +# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
>> +#
>> +# GENL HANDSHAKE service.
>> +#
>> +# Author: Chuck Lever <chuck.lever@...cle.com>
>> +#
>> +# Copyright (c) 2023, Oracle and/or its affiliates.
>> +#
>> +
>> +name: handshake
>> +
>> +protocol: genetlink-c
>> +
>> +doc: Netlink protocol to request a transport layer security handshake.
>> +
>> +uapi-header: linux/net/handshake.h
>> +
>> +definitions:
>> +  -
>> +    type: enum
>> +    name: handler-class
>> +    enum-name:
>> +    value-start: 0
>> +    entries: [ none ]
>> +  -
>> +    type: enum
>> +    name: msg-type
>> +    enum-name:
>> +    value-start: 0
>> +    entries: [ unspec, clienthello, serverhello ]
>> +  -
>> +    type: enum
>> +    name: auth
>> +    enum-name:
>> +    value-start: 0
>> +    entries: [ unspec, unauth, x509, psk ]
>> +
>> +attribute-sets:
>> +  -
>> +    name: accept
>> +    attributes:
>> +      -
>> +        name: status
>> +        doc: Status of this accept operation
>> +        type: u32
>> +        value: 1
>> +      -
>> +        name: sockfd
>> +        doc: File descriptor of socket to use
>> +        type: u32
>> +      -
>> +        name: handler-class
>> +        doc: Which type of handler is responding
>> +        type: u32
>> +        enum: handler-class
>> +      -
>> +        name: message-type
>> +        doc: Handshake message type
>> +        type: u32
>> +        enum: msg-type
>> +      -
>> +        name: auth
>> +        doc: Authentication mode
>> +        type: u32
>> +        enum: auth
>> +      -
>> +        name: gnutls-priorities
>> +        doc: GnuTLS priority string
>> +        type: string
>> +      -
>> +        name: my-peerid
>> +        doc: Serial no of key containing local identity
>> +        type: u32
>> +      -
>> +        name: my-privkey
>> +        doc: Serial no of key containing optional private key
>> +        type: u32
>> +  -
>> +    name: done
>> +    attributes:
>> +      -
>> +        name: status
>> +        doc: Session status
>> +        type: u32
>> +        value: 1
>> +      -
>> +        name: sockfd
>> +        doc: File descriptor of socket that has completed
>> +        type: u32
>> +      -
>> +        name: remote-peerid
>> +        doc: Serial no of keys containing identities of remote peer
>> +        type: u32
>> +
>> +operations:
>> +  list:
>> +    -
>> +      name: ready
>> +      doc: Notify handlers that a new handshake request is waiting
>> +      value: 1
>> +      notify: accept
>> +    -
>> +      name: accept
>> +      doc: Handler retrieves next queued handshake request
>> +      attribute-set: accept
>> +      flags: [ admin-perm ]
>> +      do:
>> +        request:
>> +          attributes:
>> +            - handler-class
>> +        reply:
>> +          attributes:
>> +            - status
>> +            - sockfd
>> +            - message-type
>> +            - auth
>> +            - gnutls-priorities
>> +            - my-peerid
>> +            - my-privkey
>> +    -
>> +      name: done
>> +      doc: Handler reports handshake completion
>> +      attribute-set: done
>> +      do:
>> +        request:
>> +          attributes:
>> +            - status
>> +            - sockfd
>> +            - remote-peerid
>> +
>> +mcast-groups:
>> +  list:
>> +    -
>> +      name: none
>> diff --git a/include/net/handshake.h b/include/net/handshake.h
>> new file mode 100644
>> index 000000000000..08f859237936
>> --- /dev/null
>> +++ b/include/net/handshake.h
>> @@ -0,0 +1,45 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Generic HANDSHAKE service.
>> + *
>> + * Author: Chuck Lever <chuck.lever@...cle.com>
>> + *
>> + * Copyright (c) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +/*
>> + * Data structures and functions that are visible only within the
>> + * kernel are declared here.
>> + */
>> +
>> +#ifndef _NET_HANDSHAKE_H
>> +#define _NET_HANDSHAKE_H
>> +
>> +struct handshake_req;
>> +
>> +/*
>> + * Invariants for all handshake requests for one transport layer
>> + * security protocol
>> + */
>> +struct handshake_proto {
>> +	int			hp_handler_class;
>> +	size_t			hp_privsize;
>> +
>> +	int			(*hp_accept)(struct handshake_req *req,
>> +					     struct genl_info *gi, int fd);
>> +	void			(*hp_done)(struct handshake_req *req,
>> +					   int status, struct nlattr **tb);
>> +	void			(*hp_destroy)(struct handshake_req *req);
>> +};
>> +
>> +extern struct handshake_req *
>> +handshake_req_alloc(struct socket *sock, const struct handshake_proto *proto,
>> +		    gfp_t flags);
>> +extern void *handshake_req_private(struct handshake_req *req);
>> +extern int handshake_req_submit(struct handshake_req *req, gfp_t flags);
>> +extern int handshake_req_cancel(struct socket *sock);
>> +
>> +extern struct nlmsghdr *handshake_genl_put(struct sk_buff *msg,
>> +					   struct genl_info *gi);
>> +
>> +#endif /* _NET_HANDSHAKE_H */
>> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
>> index 78beaa765c73..a0ce9de4dab1 100644
>> --- a/include/net/net_namespace.h
>> +++ b/include/net/net_namespace.h
>> @@ -188,6 +188,11 @@ struct net {
>>  #if IS_ENABLED(CONFIG_SMC)
>>  	struct netns_smc	smc;
>>  #endif
>> +
>> +	/* transport layer security handshake requests */
>> +	spinlock_t		hs_lock;
>> +	struct list_head	hs_requests;
>> +	int			hs_pending;
>>  } __randomize_layout;
>>    #include <linux/seq_file_net.h>
>> diff --git a/include/net/sock.h b/include/net/sock.h
>> index 573f2bf7e0de..2a7345ce2540 100644
>> --- a/include/net/sock.h
>> +++ b/include/net/sock.h
>> @@ -519,6 +519,7 @@ struct sock {
>>    	struct socket		*sk_socket;
>>  	void			*sk_user_data;
>> +	void			*sk_handshake_req;
>>  #ifdef CONFIG_SECURITY
>>  	void			*sk_security;
>>  #endif
>> diff --git a/include/trace/events/handshake.h b/include/trace/events/handshake.h
>> new file mode 100644
>> index 000000000000..feffcd1d6256
>> --- /dev/null
>> +++ b/include/trace/events/handshake.h
>> @@ -0,0 +1,159 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#undef TRACE_SYSTEM
>> +#define TRACE_SYSTEM handshake
>> +
>> +#if !defined(_TRACE_HANDSHAKE_H) || defined(TRACE_HEADER_MULTI_READ)
>> +#define _TRACE_HANDSHAKE_H
>> +
>> +#include <linux/net.h>
>> +#include <linux/tracepoint.h>
>> +
>> +DECLARE_EVENT_CLASS(handshake_event_class,
>> +	TP_PROTO(
>> +		const struct net *net,
>> +		const struct handshake_req *req,
>> +		const struct socket *sock
>> +	),
>> +	TP_ARGS(net, req, sock),
>> +	TP_STRUCT__entry(
>> +		__field(const void *, req)
>> +		__field(const void *, sock)
>> +		__field(unsigned int, netns_ino)
>> +	),
>> +	TP_fast_assign(
>> +		__entry->req = req;
>> +		__entry->sock = sock;
>> +		__entry->netns_ino = net->ns.inum;
>> +	),
>> +	TP_printk("req=%p sock=%p",
>> +		__entry->req, __entry->sock
>> +	)
>> +);
>> +#define DEFINE_HANDSHAKE_EVENT(name)				\
>> +	DEFINE_EVENT(handshake_event_class, name,		\
>> +		TP_PROTO(					\
>> +			const struct net *net,			\
>> +			const struct handshake_req *req,	\
>> +			const struct socket *sock		\
>> +		),						\
>> +		TP_ARGS(net, req, sock))
>> +
>> +DECLARE_EVENT_CLASS(handshake_fd_class,
>> +	TP_PROTO(
>> +		const struct net *net,
>> +		const struct handshake_req *req,
>> +		const struct socket *sock,
>> +		int fd
>> +	),
>> +	TP_ARGS(net, req, sock, fd),
>> +	TP_STRUCT__entry(
>> +		__field(const void *, req)
>> +		__field(const void *, sock)
>> +		__field(int, fd)
>> +		__field(unsigned int, netns_ino)
>> +	),
>> +	TP_fast_assign(
>> +		__entry->req = req;
>> +		__entry->sock = req->hr_sock;
>> +		__entry->fd = fd;
>> +		__entry->netns_ino = net->ns.inum;
>> +	),
>> +	TP_printk("req=%p sock=%p fd=%d",
>> +		__entry->req, __entry->sock, __entry->fd
>> +	)
>> +);
>> +#define DEFINE_HANDSHAKE_FD_EVENT(name)				\
>> +	DEFINE_EVENT(handshake_fd_class, name,			\
>> +		TP_PROTO(					\
>> +			const struct net *net,			\
>> +			const struct handshake_req *req,	\
>> +			const struct socket *sock,		\
>> +			int fd					\
>> +		),						\
>> +		TP_ARGS(net, req, sock, fd))
>> +
>> +DECLARE_EVENT_CLASS(handshake_error_class,
>> +	TP_PROTO(
>> +		const struct net *net,
>> +		const struct handshake_req *req,
>> +		const struct socket *sock,
>> +		int err
>> +	),
>> +	TP_ARGS(net, req, sock, err),
>> +	TP_STRUCT__entry(
>> +		__field(const void *, req)
>> +		__field(const void *, sock)
>> +		__field(int, err)
>> +		__field(unsigned int, netns_ino)
>> +	),
>> +	TP_fast_assign(
>> +		__entry->req = req;
>> +		__entry->sock = sock;
>> +		__entry->err = err;
>> +		__entry->netns_ino = net->ns.inum;
>> +	),
>> +	TP_printk("req=%p sock=%p err=%d",
>> +		__entry->req, __entry->sock, __entry->err
>> +	)
>> +);
>> +#define DEFINE_HANDSHAKE_ERROR(name)				\
>> +	DEFINE_EVENT(handshake_error_class, name,		\
>> +		TP_PROTO(					\
>> +			const struct net *net,			\
>> +			const struct handshake_req *req,	\
>> +			const struct socket *sock,		\
>> +			int err					\
>> +		),						\
>> +		TP_ARGS(net, req, sock, err))
>> +
>> +
>> +/**
>> + ** Request lifetime events
>> + **/
>> +
>> +DEFINE_HANDSHAKE_EVENT(handshake_submit);
>> +DEFINE_HANDSHAKE_ERROR(handshake_submit_err);
>> +DEFINE_HANDSHAKE_EVENT(handshake_cancel);
>> +DEFINE_HANDSHAKE_EVENT(handshake_cancel_none);
>> +DEFINE_HANDSHAKE_EVENT(handshake_cancel_busy);
>> +DEFINE_HANDSHAKE_EVENT(handshake_destruct);
>> +
>> +
>> +TRACE_EVENT(handshake_complete,
>> +	TP_PROTO(
>> +		const struct net *net,
>> +		const struct handshake_req *req,
>> +		const struct socket *sock,
>> +		int status
>> +	),
>> +	TP_ARGS(net, req, sock, status),
>> +	TP_STRUCT__entry(
>> +		__field(const void *, req)
>> +		__field(const void *, sock)
>> +		__field(int, status)
>> +		__field(unsigned int, netns_ino)
>> +	),
>> +	TP_fast_assign(
>> +		__entry->req = req;
>> +		__entry->sock = sock;
>> +		__entry->status = status;
>> +		__entry->netns_ino = net->ns.inum;
>> +	),
>> +	TP_printk("req=%p sock=%p status=%d",
>> +		__entry->req, __entry->sock, __entry->status
>> +	)
>> +);
>> +
>> +/**
>> + ** Netlink events
>> + **/
>> +
>> +DEFINE_HANDSHAKE_ERROR(handshake_notify_err);
>> +DEFINE_HANDSHAKE_FD_EVENT(handshake_cmd_accept);
>> +DEFINE_HANDSHAKE_ERROR(handshake_cmd_accept_err);
>> +DEFINE_HANDSHAKE_FD_EVENT(handshake_cmd_done);
>> +DEFINE_HANDSHAKE_ERROR(handshake_cmd_done_err);
>> +
>> +#endif /* _TRACE_HANDSHAKE_H */
>> +
>> +#include <trace/define_trace.h>
>> diff --git a/include/uapi/linux/handshake.h b/include/uapi/linux/handshake.h
>> new file mode 100644
>> index 000000000000..09fd7c37cba4
>> --- /dev/null
>> +++ b/include/uapi/linux/handshake.h
>> @@ -0,0 +1,63 @@
>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
>> +/* Do not edit directly, auto-generated from: */
>> +/*	Documentation/netlink/specs/handshake.yaml */
>> +/* YNL-GEN uapi header */
>> +
>> +#ifndef _UAPI_LINUX_HANDSHAKE_H
>> +#define _UAPI_LINUX_HANDSHAKE_H
>> +
>> +#define HANDSHAKE_FAMILY_NAME		"handshake"
>> +#define HANDSHAKE_FAMILY_VERSION	1
>> +
>> +enum {
>> +	HANDSHAKE_HANDLER_CLASS_NONE,
>> +};
>> +
>> +enum {
>> +	HANDSHAKE_MSG_TYPE_UNSPEC,
>> +	HANDSHAKE_MSG_TYPE_CLIENTHELLO,
>> +	HANDSHAKE_MSG_TYPE_SERVERHELLO,
>> +};
>> +
>> +enum {
>> +	HANDSHAKE_AUTH_UNSPEC,
>> +	HANDSHAKE_AUTH_UNAUTH,
>> +	HANDSHAKE_AUTH_X509,
>> +	HANDSHAKE_AUTH_PSK,
>> +};
>> +
>> +enum {
>> +	HANDSHAKE_A_ACCEPT_STATUS = 1,
>> +	HANDSHAKE_A_ACCEPT_SOCKFD,
>> +	HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
>> +	HANDSHAKE_A_ACCEPT_MESSAGE_TYPE,
>> +	HANDSHAKE_A_ACCEPT_AUTH,
>> +	HANDSHAKE_A_ACCEPT_GNUTLS_PRIORITIES,
>> +	HANDSHAKE_A_ACCEPT_MY_PEERID,
>> +	HANDSHAKE_A_ACCEPT_MY_PRIVKEY,
>> +
>> +	__HANDSHAKE_A_ACCEPT_MAX,
>> +	HANDSHAKE_A_ACCEPT_MAX = (__HANDSHAKE_A_ACCEPT_MAX - 1)
>> +};
>> +
>> +enum {
>> +	HANDSHAKE_A_DONE_STATUS = 1,
>> +	HANDSHAKE_A_DONE_SOCKFD,
>> +	HANDSHAKE_A_DONE_REMOTE_PEERID,
>> +
>> +	__HANDSHAKE_A_DONE_MAX,
>> +	HANDSHAKE_A_DONE_MAX = (__HANDSHAKE_A_DONE_MAX - 1)
>> +};
>> +
>> +enum {
>> +	HANDSHAKE_CMD_READY = 1,
>> +	HANDSHAKE_CMD_ACCEPT,
>> +	HANDSHAKE_CMD_DONE,
>> +
>> +	__HANDSHAKE_CMD_MAX,
>> +	HANDSHAKE_CMD_MAX = (__HANDSHAKE_CMD_MAX - 1)
>> +};
>> +
>> +#define HANDSHAKE_MCGRP_NONE	"none"
>> +
>> +#endif /* _UAPI_LINUX_HANDSHAKE_H */
>> diff --git a/net/Makefile b/net/Makefile
>> index 0914bea9c335..adbb64277601 100644
>> --- a/net/Makefile
>> +++ b/net/Makefile
>> @@ -79,3 +79,4 @@ obj-$(CONFIG_NET_NCSI)		+= ncsi/
>>  obj-$(CONFIG_XDP_SOCKETS)	+= xdp/
>>  obj-$(CONFIG_MPTCP)		+= mptcp/
>>  obj-$(CONFIG_MCTP)		+= mctp/
>> +obj-y				+= handshake/
>> diff --git a/net/handshake/Makefile b/net/handshake/Makefile
>> new file mode 100644
>> index 000000000000..a41b03f4837b
>> --- /dev/null
>> +++ b/net/handshake/Makefile
>> @@ -0,0 +1,11 @@
>> +# SPDX-License-Identifier: GPL-2.0-only
>> +#
>> +# Makefile for the Generic HANDSHAKE service
>> +#
>> +# Author: Chuck Lever <chuck.lever@...cle.com>
>> +#
>> +# Copyright (c) 2023, Oracle and/or its affiliates.
>> +#
>> +
>> +obj-y += handshake.o
>> +handshake-y := netlink.o request.o trace.o
>> diff --git a/net/handshake/handshake.h b/net/handshake/handshake.h
>> new file mode 100644
>> index 000000000000..366c7659ec09
>> --- /dev/null
>> +++ b/net/handshake/handshake.h
>> @@ -0,0 +1,41 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Generic netlink handshake service
>> + *
>> + * Author: Chuck Lever <chuck.lever@...cle.com>
>> + *
>> + * Copyright (c) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +/*
>> + * Data structures and functions that are visible only within the
>> + * handshake module are declared here.
>> + */
>> +
>> +#ifndef _INTERNAL_HANDSHAKE_H
>> +#define _INTERNAL_HANDSHAKE_H
>> +
>> +/*
>> + * One handshake request
>> + */
>> +struct handshake_req {
>> +	struct list_head		hr_list;
>> +	unsigned long			hr_flags;
>> +	const struct handshake_proto	*hr_proto;
>> +	struct socket			*hr_sock;
>> +
>> +	void				(*hr_saved_destruct)(struct sock *sk);
>> +};
>> +
>> +#define HANDSHAKE_F_COMPLETED	BIT(0)
>> +
>> +/* netlink.c */
>> +extern bool handshake_genl_inited;
>> +int handshake_genl_notify(struct net *net, int handler_class, gfp_t flags);
>> +
>> +/* request.c */
>> +void __remove_pending_locked(struct net *net, struct handshake_req *req);
>> +void handshake_complete(struct handshake_req *req, int status,
>> +			struct nlattr **tb);
>> +
>> +#endif /* _INTERNAL_HANDSHAKE_H */
>> diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
>> new file mode 100644
>> index 000000000000..581e382236cf
>> --- /dev/null
>> +++ b/net/handshake/netlink.c
>> @@ -0,0 +1,340 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Generic netlink handshake service
>> + *
>> + * Author: Chuck Lever <chuck.lever@...cle.com>
>> + *
>> + * Copyright (c) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#include <linux/types.h>
>> +#include <linux/socket.h>
>> +#include <linux/kernel.h>
>> +#include <linux/module.h>
>> +#include <linux/skbuff.h>
>> +#include <linux/inet.h>
>> +
>> +#include <net/sock.h>
>> +#include <net/genetlink.h>
>> +#include <net/handshake.h>
>> +
>> +#include <uapi/linux/handshake.h>
>> +#include <trace/events/handshake.h>
>> +#include "handshake.h"
>> +
>> +static struct genl_family __ro_after_init handshake_genl_family;
>> +bool handshake_genl_inited;
>> +
>> +/**
>> + * handshake_genl_notify - Notify handlers that a request is waiting
>> + * @net: target network namespace
>> + * @handler_class: target handler
>> + * @flags: memory allocation control flags
>> + *
>> + * Returns zero on success or a negative errno if notification failed.
>> + */
>> +int handshake_genl_notify(struct net *net, int handler_class, gfp_t flags)
>> +{
>> +	struct sk_buff *msg;
>> +	void *hdr;
>> +
>> +	if (!genl_has_listeners(&handshake_genl_family, net, handler_class))
>> +		return -ESRCH;
>> +
>> +	msg = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
>> +	if (!msg)
>> +		return -ENOMEM;
>> +
>> +	hdr = genlmsg_put(msg, 0, 0, &handshake_genl_family, 0,
>> +			  HANDSHAKE_CMD_READY);
>> +	if (!hdr)
>> +		goto out_free;
>> +
>> +	if (nla_put_u32(msg, HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
>> +			handler_class) < 0) {
>> +		genlmsg_cancel(msg, hdr);
>> +		goto out_free;
>> +	}
>> +
>> +	genlmsg_end(msg, hdr);
>> +	return genlmsg_multicast_netns(&handshake_genl_family, net, msg,
>> +				       0, handler_class, flags);
>> +
>> +out_free:
>> +	nlmsg_free(msg);
>> +	return -EMSGSIZE;
>> +}
>> +
>> +/**
>> + * handshake_genl_put - Create a generic netlink message header
>> + * @msg: buffer in which to create the header
>> + * @gi: generic netlink message context
>> + *
>> + * Returns a ready-to-use header, or NULL.
>> + */
>> +struct nlmsghdr *handshake_genl_put(struct sk_buff *msg, struct genl_info *gi)
>> +{
>> +	return genlmsg_put(msg, gi->snd_portid, gi->snd_seq,
>> +			   &handshake_genl_family, 0, gi->genlhdr->cmd);
>> +}
>> +EXPORT_SYMBOL(handshake_genl_put);
>> +
>> +static int handshake_status_reply(struct sk_buff *skb, struct genl_info *gi,
>> +				  int status)
>> +{
>> +	struct nlmsghdr *hdr;
>> +	struct sk_buff *msg;
>> +	int ret;
>> +
>> +	ret = -ENOMEM;
>> +	msg = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
>> +	if (!msg)
>> +		goto out;
>> +	hdr = handshake_genl_put(msg, gi);
>> +	if (!hdr)
>> +		goto out_free;
>> +
>> +	ret = -EMSGSIZE;
>> +	ret = nla_put_u32(msg, HANDSHAKE_A_ACCEPT_STATUS, status);
>> +	if (ret < 0)
>> +		goto out_free;
>> +
>> +	genlmsg_end(msg, hdr);
>> +	return genlmsg_reply(msg, gi);
>> +
>> +out_free:
>> +	genlmsg_cancel(msg, hdr);
>> +out:
>> +	return ret;
>> +}
>> +
>> +/*
>> + * dup() a kernel socket for use as a user space file descriptor
>> + * in the current process.
>> + *
>> + * Implicit argument: "current()"
>> + */
>> +static int handshake_dup(struct socket *kernsock)
>> +{
>> +	struct file *file = get_file(kernsock->file);
>> +	int newfd;
>> +
>> +	newfd = get_unused_fd_flags(O_CLOEXEC);
>> +	if (newfd < 0) {
>> +		fput(file);
>> +		return newfd;
>> +	}
>> +
>> +	fd_install(newfd, file);
>> +	return newfd;
>> +}
>> +
>> +static const struct nla_policy
>> +handshake_accept_nl_policy[HANDSHAKE_A_ACCEPT_HANDLER_CLASS + 1] = {
>> +	[HANDSHAKE_A_ACCEPT_HANDLER_CLASS] = { .type = NLA_U32, },
>> +};
>> +
>> +static int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *gi)
>> +{
>> +	struct nlattr *tb[HANDSHAKE_A_ACCEPT_MAX + 1];
>> +	struct net *net = sock_net(skb->sk);
>> +	struct handshake_req *pos, *req;
>> +	int fd, err;
>> +
>> +	err = -EINVAL;
>> +	if (genlmsg_parse(nlmsg_hdr(skb), &handshake_genl_family, tb,
>> +			  HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
>> +			  handshake_accept_nl_policy, NULL))
>> +		goto out_status;
>> +	if (!tb[HANDSHAKE_A_ACCEPT_HANDLER_CLASS])
>> +		goto out_status;
>> +
>> +	req = NULL;
>> +	spin_lock(&net->hs_lock);
>> +	list_for_each_entry(pos, &net->hs_requests, hr_list) {
>> +		if (pos->hr_proto->hp_handler_class !=
>> +		    nla_get_u32(tb[HANDSHAKE_A_ACCEPT_HANDLER_CLASS]))
>> +			continue;
>> +		__remove_pending_locked(net, pos);
>> +		req = pos;
>> +		break;
>> +	}
>> +	spin_unlock(&net->hs_lock);
>> +	if (!req)
>> +		goto out_status;
>> +
>> +	fd = handshake_dup(req->hr_sock);
>> +	if (fd < 0) {
>> +		err = fd;
>> +		goto out_complete;
>> +	}
>> +	err = req->hr_proto->hp_accept(req, gi, fd);
>> +	if (err)
>> +		goto out_complete;
>> +
>> +	trace_handshake_cmd_accept(net, req, req->hr_sock, fd);
>> +	return 0;
>> +
>> +out_complete:
>> +	handshake_complete(req, -EIO, NULL);
>> +	fput(req->hr_sock->file);
>> +out_status:
>> +	trace_handshake_cmd_accept_err(net, req, NULL, err);
>> +	return handshake_status_reply(skb, gi, err);
>> +}
>> +
>> +static const struct nla_policy
>> +handshake_done_nl_policy[HANDSHAKE_A_DONE_MAX + 1] = {
>> +	[HANDSHAKE_A_DONE_SOCKFD] = { .type = NLA_U32, },
>> +	[HANDSHAKE_A_DONE_STATUS] = { .type = NLA_U32, },
>> +	[HANDSHAKE_A_DONE_REMOTE_PEERID] = { .type = NLA_U32, },
>> +};
>> +
>> +static int handshake_nl_done_doit(struct sk_buff *skb, struct genl_info *gi)
>> +{
>> +	struct nlattr *tb[HANDSHAKE_A_DONE_MAX + 1];
>> +	struct net *net = sock_net(skb->sk);
>> +	struct socket *sock = NULL;
>> +	struct handshake_req *req;
>> +	int fd, status, err;
>> +
>> +	err = genlmsg_parse(nlmsg_hdr(skb), &handshake_genl_family, tb,
>> +			    HANDSHAKE_A_DONE_MAX, handshake_done_nl_policy,
>> +			    NULL);
>> +	if (err || !tb[HANDSHAKE_A_DONE_SOCKFD]) {
>> +		err = -EINVAL;
>> +		goto out_status;
>> +	}
>> +
>> +	fd = nla_get_u32(tb[HANDSHAKE_A_DONE_SOCKFD]);
>> +
>> +	err = 0;
>> +	sock = sockfd_lookup(fd, &err);
>> +	if (err) {
>> +		err = -EBADF;
>> +		goto out_status;
>> +	}
>> +
>> +	req = sock->sk->sk_handshake_req;
>> +	if (!req) {
>> +		err = -EBUSY;
>> +		goto out_status;
>> +	}
>> +
>> +	trace_handshake_cmd_done(net, req, sock, fd);
>> +
>> +	status = -EIO;
>> +	if (tb[HANDSHAKE_A_DONE_STATUS])
>> +		status = nla_get_u32(tb[HANDSHAKE_A_DONE_STATUS]);
>> +
> And this makes me ever so slightly uneasy.
> 
> As 'status' is a netlink attribute it's inevitably defined as 'unsigned'.
> Yet we assume that 'status' is a negative number, leaving us _technically_ in unchartered territory.

Ah, that's an oversight.


> And that is notwithstanding the problem that we haven't even defined _what_ should be in the status attribute.

It's now an errno value.


> Reading the code I assume that it's either '0' for success or a negative number (ie the error code) on failure.
> Which implicitely means that we _never_ set a positive number here.
> So what would we lose if we declare 'status' to carry the _positive_ error number instead?
> It would bring us in-line with the actual netlink attribute definition, we wouldn't need to worry about possible integer overflows, yadda yadda...
> 
> Hmm?

It can also be argued that errnos in user space are positive-valued,
therefore, this user space visible protocol should use a positive
errno.


>> +	handshake_complete(req, status, tb);
>> +	fput(sock->file);
>> +	return 0;
>> +
>> +out_status:
>> +	trace_handshake_cmd_done_err(net, req, sock, err);
>> +	return handshake_status_reply(skb, gi, err);
>> +}
>> +
>> +static const struct genl_split_ops handshake_nl_ops[] = {
>> +	{
>> +		.cmd		= HANDSHAKE_CMD_ACCEPT,
>> +		.doit		= handshake_nl_accept_doit,
>> +		.policy		= handshake_accept_nl_policy,
>> +		.maxattr	= HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
>> +		.flags		= GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
>> +	},
>> +	{
>> +		.cmd		= HANDSHAKE_CMD_DONE,
>> +		.doit		= handshake_nl_done_doit,
>> +		.policy		= handshake_done_nl_policy,
>> +		.maxattr	= HANDSHAKE_A_DONE_REMOTE_PEERID,
>> +		.flags		= GENL_CMD_CAP_DO,
>> +	},
>> +};
>> +
>> +static const struct genl_multicast_group handshake_nl_mcgrps[] = {
>> +	[HANDSHAKE_HANDLER_CLASS_NONE] = { .name = HANDSHAKE_MCGRP_NONE, },
>> +};
>> +
>> +static struct genl_family __ro_after_init handshake_genl_family = {
>> +	.hdrsize		= 0,
>> +	.name			= HANDSHAKE_FAMILY_NAME,
>> +	.version		= HANDSHAKE_FAMILY_VERSION,
>> +	.netnsok		= true,
>> +	.parallel_ops		= true,
>> +	.n_mcgrps		= ARRAY_SIZE(handshake_nl_mcgrps),
>> +	.n_split_ops		= ARRAY_SIZE(handshake_nl_ops),
>> +	.split_ops		= handshake_nl_ops,
>> +	.mcgrps			= handshake_nl_mcgrps,
>> +	.module			= THIS_MODULE,
>> +};
>> +
>> +static int __net_init handshake_net_init(struct net *net)
>> +{
>> +	spin_lock_init(&net->hs_lock);
>> +	INIT_LIST_HEAD(&net->hs_requests);
>> +	net->hs_pending	= 0;
>> +	return 0;
>> +}
>> +
>> +static void __net_exit handshake_net_exit(struct net *net)
>> +{
>> +	struct handshake_req *req;
>> +	LIST_HEAD(requests);
>> +
>> +	/*
>> +	 * This drains the net's pending list. Requests that
>> +	 * have been accepted and are in progress will be
>> +	 * destroyed when the socket is closed.
>> +	 */
>> +	spin_lock(&net->hs_lock);
>> +	list_splice_init(&requests, &net->hs_requests);
>> +	spin_unlock(&net->hs_lock);
>> +
>> +	while (!list_empty(&requests)) {
>> +		req = list_first_entry(&requests, struct handshake_req, hr_list);
>> +		list_del(&req->hr_list);
>> +
>> +		/*
>> +		 * Requests on this list have not yet been
>> +		 * accepted, so they do not have an fd to put.
>> +		 */
>> +
>> +		handshake_complete(req, -ETIMEDOUT, NULL);
>> +	}
>> +}
>> +
>> +static struct pernet_operations handshake_genl_net_ops = {
>> +	.init		= handshake_net_init,
>> +	.exit		= handshake_net_exit,
>> +};
>> +
>> +static int __init handshake_init(void)
>> +{
>> +	int ret;
>> +
>> +	ret = genl_register_family(&handshake_genl_family);
>> +	if (ret) {
>> +		pr_warn("handshake: netlink registration failed (%d)\n", ret);
>> +		return ret;
>> +	}
>> +
>> +	ret = register_pernet_subsys(&handshake_genl_net_ops);
>> +	if (ret) {
>> +		pr_warn("handshake: pernet registration failed (%d)\n", ret);
>> +		genl_unregister_family(&handshake_genl_family);
>> +	}
>> +
>> +	handshake_genl_inited = true;
>> +	return ret;
>> +}
>> +
>> +static void __exit handshake_exit(void)
>> +{
>> +	unregister_pernet_subsys(&handshake_genl_net_ops);
>> +	genl_unregister_family(&handshake_genl_family);
>> +}
>> +
>> +module_init(handshake_init);
>> +module_exit(handshake_exit);
>> diff --git a/net/handshake/request.c b/net/handshake/request.c
>> new file mode 100644
>> index 000000000000..1d3b8e76dd2c
>> --- /dev/null
>> +++ b/net/handshake/request.c
>> @@ -0,0 +1,246 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Handshake request lifetime events
>> + *
>> + * Author: Chuck Lever <chuck.lever@...cle.com>
>> + *
>> + * Copyright (c) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#include <linux/types.h>
>> +#include <linux/socket.h>
>> +#include <linux/kernel.h>
>> +#include <linux/module.h>
>> +#include <linux/skbuff.h>
>> +#include <linux/inet.h>
>> +#include <linux/fdtable.h>
>> +
>> +#include <net/sock.h>
>> +#include <net/genetlink.h>
>> +#include <net/handshake.h>
>> +
>> +#include <uapi/linux/handshake.h>
>> +#include <trace/events/handshake.h>
>> +#include "handshake.h"
>> +
>> +/*
>> + * This limit is to prevent slow remotes from causing denial of service.
>> + * A ulimit-style tunable might be used instead.
>> + */
>> +#define HANDSHAKE_PENDING_MAX (10)
>> +
>> +static void __add_pending_locked(struct net *net, struct handshake_req *req)
>> +{
>> +	net->hs_pending++;
>> +	list_add_tail(&req->hr_list, &net->hs_requests);
>> +}
>> +
>> +void __remove_pending_locked(struct net *net, struct handshake_req *req)
>> +{
>> +	net->hs_pending--;
>> +	list_del_init(&req->hr_list);
>> +}
>> +
>> +/*
>> + * Return values:
>> + *   %true - the request was found on @net's pending list
>> + *   %false - the request was not found on @net's pending list
>> + *
>> + * If @req was on a pending list, it has not yet been accepted.
>> + */
>> +static bool remove_pending(struct net *net, struct handshake_req *req)
>> +{
>> +	bool ret;
>> +
>> +	ret = false;
>> +
>> +	spin_lock(&net->hs_lock);
>> +	if (!list_empty(&req->hr_list)) {
>> +		__remove_pending_locked(net, req);
>> +		ret = true;
>> +	}
>> +	spin_unlock(&net->hs_lock);
>> +
>> +	return ret;
>> +}
>> +
>> +static void handshake_req_destroy(struct handshake_req *req, struct sock *sk)
>> +{
>> +	req->hr_proto->hp_destroy(req);
>> +	sk->sk_handshake_req = NULL;
>> +	kfree(req);
>> +}
>> +
>> +static void handshake_sk_destruct(struct sock *sk)
>> +{
>> +	struct handshake_req *req = sk->sk_handshake_req;
>> +
>> +	if (req) {
>> +		trace_handshake_destruct(sock_net(sk), req, req->hr_sock);
>> +		handshake_req_destroy(req, sk);
>> +	}
>> +}
>> +
>> +/**
>> + * handshake_req_alloc - consumer API to allocate a request
>> + * @sock: open socket on which to perform a handshake
>> + * @proto: security protocol
>> + * @flags: memory allocation flags
>> + *
>> + * Returns an initialized handshake_req or NULL.
>> + */
>> +struct handshake_req *handshake_req_alloc(struct socket *sock,
>> +					  const struct handshake_proto *proto,
>> +					  gfp_t flags)
>> +{
>> +	struct handshake_req *req;
>> +
>> +	/* Avoid accessing uninitialized global variables later on */
>> +	if (!handshake_genl_inited)
>> +		return NULL;
>> +
>> +	req = kzalloc(sizeof(*req) + proto->hp_privsize, flags);
>> +	if (!req)
>> +		return NULL;
>> +
>> +	sock_hold(sock->sk);
>> +
>> +	INIT_LIST_HEAD(&req->hr_list);
>> +	req->hr_sock = sock;
>> +	req->hr_proto = proto;
>> +	return req;
>> +}
>> +EXPORT_SYMBOL(handshake_req_alloc);
>> +
>> +/**
>> + * handshake_req_private - consumer API to return per-handshake private data
>> + * @req: handshake arguments
>> + *
>> + */
>> +void *handshake_req_private(struct handshake_req *req)
>> +{
>> +	return (void *)(req + 1);
>> +}
>> +EXPORT_SYMBOL(handshake_req_private);
>> +
>> +/**
>> + * handshake_req_submit - consumer API to submit a handshake request
>> + * @req: handshake arguments
>> + * @flags: memory allocation flags
>> + *
>> + * Return values:
>> + *   %0: Request queued
>> + *   %-EBUSY: A handshake is already under way for this socket
>> + *   %-ESRCH: No handshake agent is available
>> + *   %-EAGAIN: Too many pending handshake requests
>> + *   %-ENOMEM: Failed to allocate memory
>> + *   %-EMSGSIZE: Failed to construct notification message
>> + *
>> + * A zero return value from handshake_request() means that
>> + * exactly one subsequent completion callback is guaranteed.
>> + *
>> + * A negative return value from handshake_request() means that
>> + * no completion callback will be done and that @req is
>> + * destroyed.
>> + */
>> +int handshake_req_submit(struct handshake_req *req, gfp_t flags)
>> +{
>> +	struct socket *sock = req->hr_sock;
>> +	struct sock *sk = sock->sk;
>> +	struct net *net = sock_net(sk);
>> +	int ret;
>> +
>> +	ret = -EAGAIN;
>> +	if (READ_ONCE(net->hs_pending) >= HANDSHAKE_PENDING_MAX)
>> +		goto out_err;
>> +
>> +	ret = -EBUSY;
>> +	spin_lock(&net->hs_lock);
>> +	if (sk->sk_handshake_req || !list_empty(&req->hr_list)) {
>> +		spin_unlock(&net->hs_lock);
>> +		goto out_err;
>> +	}
>> +	req->hr_saved_destruct = sk->sk_destruct;
>> +	sk->sk_destruct = handshake_sk_destruct;
>> +	sk->sk_handshake_req = req;
>> +	__add_pending_locked(net, req);
>> +	spin_unlock(&net->hs_lock);
>> +
>> +	ret = handshake_genl_notify(net, req->hr_proto->hp_handler_class,
>> +				    flags);
>> +	if (ret) {
>> +		trace_handshake_notify_err(net, req, sock, ret);
>> +		if (remove_pending(net, req))
>> +			goto out_err;
>> +	}
>> +
>> +	trace_handshake_submit(net, req, sock);
>> +	return 0;
>> +
>> +out_err:
>> +	trace_handshake_submit_err(net, req, sock, ret);
>> +	handshake_req_destroy(req, sk);
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL(handshake_req_submit);
>> +
>> +void handshake_complete(struct handshake_req *req, int status,
>> +			struct nlattr **tb)
>> +{
>> +	struct socket *sock = req->hr_sock;
>> +	struct net *net = sock_net(sock->sk);
>> +
>> +	if (!test_and_set_bit(HANDSHAKE_F_COMPLETED, &req->hr_flags)) {
>> +		trace_handshake_complete(net, req, sock, status);
>> +		req->hr_proto->hp_done(req, status, tb);
>> +		__sock_put(sock->sk);
>> +	}
>> +}
>> +
>> +/**
>> + * handshake_req_cancel - consumer API to cancel an in-progress handshake
>> + * @sock: socket on which there is an ongoing handshake
>> + *
>> + * XXX: Perhaps killing the user space agent might also be necessary?
> 
> I thought we had agreed that we would be sending a signal to the userspace process?

We had discussed killing the handler, but I don't think it's necessary.
I'd rather not do something that drastic unless we have no other choice.
So far my testing hasn't shown a need for killing the child process.

I'm also concerned that the kernel could reuse the handler's process ID.
handshake_req_cancel would kill something that is not a handshake agent.


> Ideally we would be sending a SIGHUP, wait for some time on the userspace process to respond with a 'done' message, and send a 'KILL' signal if we haven't received one.
> 
> Obs: Sending a KILL signal would imply that userspace is able to cope with children dying. Which pretty much excludes pthreads, I would think.
> 
> Guess I'll have to consult Stevens :-)

Basically what cancel does is atomically disarm the "done" callback.

The socket belongs to the kernel, so it will live until the kernel is
good and through with it.


>> + *
>> + * Request cancellation races with request completion. To determine
>> + * who won, callers examine the return value from this function.
>> + *
>> + * Return values:
>> + *   %0 - Uncompleted handshake request was canceled or not found
>> + *   %-EBUSY - Handshake request already completed
> 
> EBUSY? Wouldn't be EAGAIN more approriate?

I don't think EAGAIN would be appropriate at all. The situation
is that the handshake completed, so there's no need to call cancel
again. It's synonym, EWOULDBLOCK, is also not a good semantic fit.


> After all, the request is everything _but_ busy...

I'm open to suggestion.

One option is to use a boolean return value instead of an errno.


>> + */
>> +int handshake_req_cancel(struct socket *sock)
>> +{
>> +	struct handshake_req *req;
>> +	struct sock *sk;
>> +	struct net *net;
>> +
>> +	if (!sock)
>> +		return 0;
>> +
>> +	sk = sock->sk;
>> +	req = sk->sk_handshake_req;
>> +	net = sock_net(sk);
>> +
>> +	if (!req) {
>> +		trace_handshake_cancel_none(net, req, sock);
>> +		return 0;
>> +	}
>> +
>> +	if (remove_pending(net, req)) {
>> +		/* Request hadn't been accepted */
>> +		trace_handshake_cancel(net, req, sock);
>> +		return 0;
>> +	}
>> +	if (test_and_set_bit(HANDSHAKE_F_COMPLETED, &req->hr_flags)) {
>> +		/* Request already completed */
>> +		trace_handshake_cancel_busy(net, req, sock);
>> +		return -EBUSY;
>> +	}
>> +
>> +	__sock_put(sk);
>> +	trace_handshake_cancel(net, req, sock);
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL(handshake_req_cancel);
>> diff --git a/net/handshake/trace.c b/net/handshake/trace.c
>> new file mode 100644
>> index 000000000000..3a5b6f29a2b8
>> --- /dev/null
>> +++ b/net/handshake/trace.c
>> @@ -0,0 +1,17 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Trace points for transport security layer handshakes.
>> + *
>> + * Author: Chuck Lever <chuck.lever@...cle.com>
>> + *
>> + * Copyright (c) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#include <linux/types.h>
>> +#include <net/sock.h>
>> +
>> +#include "handshake.h"
>> +
>> +#define CREATE_TRACE_POINTS
>> +
>> +#include <trace/events/handshake.h>
> Cheers,
> 
> Hannes
> 
> 
> 

--
Chuck Lever



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ