[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1B595556-0236-49F3-A8B0-ECF2332450D4@oracle.com>
Date: Mon, 27 Feb 2023 14:59:40 +0000
From: Chuck Lever III <chuck.lever@...cle.com>
To: Hannes Reinecke <hare@...e.de>
CC: Chuck Lever <cel@...nel.org>, "kuba@...nel.org" <kuba@...nel.org>,
"pabeni@...hat.com" <pabeni@...hat.com>,
"edumazet@...gle.com" <edumazet@...gle.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"kernel-tls-handshake@...ts.linux.dev"
<kernel-tls-handshake@...ts.linux.dev>
Subject: Re: [PATCH v5 1/2] net/handshake: Create a NETLINK service for
handling handshake requests
> On Feb 27, 2023, at 4:24 AM, Hannes Reinecke <hare@...e.de> wrote:
>
> On 2/24/23 20:19, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@...cle.com>
>> When a kernel consumer needs a transport layer security session, it
>> first needs a handshake to negotiate and establish a session. This
>> negotiation can be done in user space via one of the several
>> existing library implementations, or it can be done in the kernel.
>> No in-kernel handshake implementations yet exist. In their absence,
>> we add a netlink service that can:
>> a. Notify a user space daemon that a handshake is needed.
>> b. Once notified, the daemon calls the kernel back via this
>> netlink service to get the handshake parameters, including an
>> open socket on which to establish the session.
>> c. Once the handshake is complete, the daemon reports the
>> session status and other information via a second netlink
>> operation. This operation marks that it is safe for the
>> kernel to use the open socket and the security session
>> established there.
>> The notification service uses a multicast group. Each handshake
>> mechanism (eg, tlshd) adopts its own group number so that the
>> handshake services are completely independent of one another. The
>> kernel can then tell via netlink_has_listeners() whether a handshake
>> service is active and prepared to handle a handshake request.
>> A new netlink operation, ACCEPT, acts like accept(2) in that it
>> instantiates a file descriptor in the user space daemon's fd table.
>> If this operation is successful, the reply carries the fd number,
>> which can be treated as an open and ready file descriptor.
>> While user space is performing the handshake, the kernel keeps its
>> muddy paws off the open socket. A second new netlink operation,
>> DONE, indicates that the user space daemon is finished with the
>> socket and it is safe for the kernel to use again. The operation
>> also indicates whether a session was established successfully.
>> Signed-off-by: Chuck Lever <chuck.lever@...cle.com>
>> ---
>> Documentation/netlink/specs/handshake.yaml | 134 +++++++++++
>> include/net/handshake.h | 45 ++++
>> include/net/net_namespace.h | 5
>> include/net/sock.h | 1
>> include/trace/events/handshake.h | 159 +++++++++++++
>> include/uapi/linux/handshake.h | 63 +++++
>> net/Makefile | 1
>> net/handshake/Makefile | 11 +
>> net/handshake/handshake.h | 41 +++
>> net/handshake/netlink.c | 340 ++++++++++++++++++++++++++++
>> net/handshake/request.c | 246 ++++++++++++++++++++
>> net/handshake/trace.c | 17 +
>> 12 files changed, 1063 insertions(+)
>> create mode 100644 Documentation/netlink/specs/handshake.yaml
>> create mode 100644 include/net/handshake.h
>> create mode 100644 include/trace/events/handshake.h
>> create mode 100644 include/uapi/linux/handshake.h
>> create mode 100644 net/handshake/Makefile
>> create mode 100644 net/handshake/handshake.h
>> create mode 100644 net/handshake/netlink.c
>> create mode 100644 net/handshake/request.c
>> create mode 100644 net/handshake/trace.c
>> diff --git a/Documentation/netlink/specs/handshake.yaml b/Documentation/netlink/specs/handshake.yaml
>> new file mode 100644
>> index 000000000000..683a8f2df0a7
>> --- /dev/null
>> +++ b/Documentation/netlink/specs/handshake.yaml
>> @@ -0,0 +1,134 @@
>> +# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
>> +#
>> +# GENL HANDSHAKE service.
>> +#
>> +# Author: Chuck Lever <chuck.lever@...cle.com>
>> +#
>> +# Copyright (c) 2023, Oracle and/or its affiliates.
>> +#
>> +
>> +name: handshake
>> +
>> +protocol: genetlink-c
>> +
>> +doc: Netlink protocol to request a transport layer security handshake.
>> +
>> +uapi-header: linux/net/handshake.h
>> +
>> +definitions:
>> + -
>> + type: enum
>> + name: handler-class
>> + enum-name:
>> + value-start: 0
>> + entries: [ none ]
>> + -
>> + type: enum
>> + name: msg-type
>> + enum-name:
>> + value-start: 0
>> + entries: [ unspec, clienthello, serverhello ]
>> + -
>> + type: enum
>> + name: auth
>> + enum-name:
>> + value-start: 0
>> + entries: [ unspec, unauth, x509, psk ]
>> +
>> +attribute-sets:
>> + -
>> + name: accept
>> + attributes:
>> + -
>> + name: status
>> + doc: Status of this accept operation
>> + type: u32
>> + value: 1
>> + -
>> + name: sockfd
>> + doc: File descriptor of socket to use
>> + type: u32
>> + -
>> + name: handler-class
>> + doc: Which type of handler is responding
>> + type: u32
>> + enum: handler-class
>> + -
>> + name: message-type
>> + doc: Handshake message type
>> + type: u32
>> + enum: msg-type
>> + -
>> + name: auth
>> + doc: Authentication mode
>> + type: u32
>> + enum: auth
>> + -
>> + name: gnutls-priorities
>> + doc: GnuTLS priority string
>> + type: string
>> + -
>> + name: my-peerid
>> + doc: Serial no of key containing local identity
>> + type: u32
>> + -
>> + name: my-privkey
>> + doc: Serial no of key containing optional private key
>> + type: u32
>> + -
>> + name: done
>> + attributes:
>> + -
>> + name: status
>> + doc: Session status
>> + type: u32
>> + value: 1
>> + -
>> + name: sockfd
>> + doc: File descriptor of socket that has completed
>> + type: u32
>> + -
>> + name: remote-peerid
>> + doc: Serial no of keys containing identities of remote peer
>> + type: u32
>> +
>> +operations:
>> + list:
>> + -
>> + name: ready
>> + doc: Notify handlers that a new handshake request is waiting
>> + value: 1
>> + notify: accept
>> + -
>> + name: accept
>> + doc: Handler retrieves next queued handshake request
>> + attribute-set: accept
>> + flags: [ admin-perm ]
>> + do:
>> + request:
>> + attributes:
>> + - handler-class
>> + reply:
>> + attributes:
>> + - status
>> + - sockfd
>> + - message-type
>> + - auth
>> + - gnutls-priorities
>> + - my-peerid
>> + - my-privkey
>> + -
>> + name: done
>> + doc: Handler reports handshake completion
>> + attribute-set: done
>> + do:
>> + request:
>> + attributes:
>> + - status
>> + - sockfd
>> + - remote-peerid
>> +
>> +mcast-groups:
>> + list:
>> + -
>> + name: none
>> diff --git a/include/net/handshake.h b/include/net/handshake.h
>> new file mode 100644
>> index 000000000000..08f859237936
>> --- /dev/null
>> +++ b/include/net/handshake.h
>> @@ -0,0 +1,45 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Generic HANDSHAKE service.
>> + *
>> + * Author: Chuck Lever <chuck.lever@...cle.com>
>> + *
>> + * Copyright (c) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +/*
>> + * Data structures and functions that are visible only within the
>> + * kernel are declared here.
>> + */
>> +
>> +#ifndef _NET_HANDSHAKE_H
>> +#define _NET_HANDSHAKE_H
>> +
>> +struct handshake_req;
>> +
>> +/*
>> + * Invariants for all handshake requests for one transport layer
>> + * security protocol
>> + */
>> +struct handshake_proto {
>> + int hp_handler_class;
>> + size_t hp_privsize;
>> +
>> + int (*hp_accept)(struct handshake_req *req,
>> + struct genl_info *gi, int fd);
>> + void (*hp_done)(struct handshake_req *req,
>> + int status, struct nlattr **tb);
>> + void (*hp_destroy)(struct handshake_req *req);
>> +};
>> +
>> +extern struct handshake_req *
>> +handshake_req_alloc(struct socket *sock, const struct handshake_proto *proto,
>> + gfp_t flags);
>> +extern void *handshake_req_private(struct handshake_req *req);
>> +extern int handshake_req_submit(struct handshake_req *req, gfp_t flags);
>> +extern int handshake_req_cancel(struct socket *sock);
>> +
>> +extern struct nlmsghdr *handshake_genl_put(struct sk_buff *msg,
>> + struct genl_info *gi);
>> +
>> +#endif /* _NET_HANDSHAKE_H */
>> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
>> index 78beaa765c73..a0ce9de4dab1 100644
>> --- a/include/net/net_namespace.h
>> +++ b/include/net/net_namespace.h
>> @@ -188,6 +188,11 @@ struct net {
>> #if IS_ENABLED(CONFIG_SMC)
>> struct netns_smc smc;
>> #endif
>> +
>> + /* transport layer security handshake requests */
>> + spinlock_t hs_lock;
>> + struct list_head hs_requests;
>> + int hs_pending;
>> } __randomize_layout;
>> #include <linux/seq_file_net.h>
>> diff --git a/include/net/sock.h b/include/net/sock.h
>> index 573f2bf7e0de..2a7345ce2540 100644
>> --- a/include/net/sock.h
>> +++ b/include/net/sock.h
>> @@ -519,6 +519,7 @@ struct sock {
>> struct socket *sk_socket;
>> void *sk_user_data;
>> + void *sk_handshake_req;
>> #ifdef CONFIG_SECURITY
>> void *sk_security;
>> #endif
>> diff --git a/include/trace/events/handshake.h b/include/trace/events/handshake.h
>> new file mode 100644
>> index 000000000000..feffcd1d6256
>> --- /dev/null
>> +++ b/include/trace/events/handshake.h
>> @@ -0,0 +1,159 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#undef TRACE_SYSTEM
>> +#define TRACE_SYSTEM handshake
>> +
>> +#if !defined(_TRACE_HANDSHAKE_H) || defined(TRACE_HEADER_MULTI_READ)
>> +#define _TRACE_HANDSHAKE_H
>> +
>> +#include <linux/net.h>
>> +#include <linux/tracepoint.h>
>> +
>> +DECLARE_EVENT_CLASS(handshake_event_class,
>> + TP_PROTO(
>> + const struct net *net,
>> + const struct handshake_req *req,
>> + const struct socket *sock
>> + ),
>> + TP_ARGS(net, req, sock),
>> + TP_STRUCT__entry(
>> + __field(const void *, req)
>> + __field(const void *, sock)
>> + __field(unsigned int, netns_ino)
>> + ),
>> + TP_fast_assign(
>> + __entry->req = req;
>> + __entry->sock = sock;
>> + __entry->netns_ino = net->ns.inum;
>> + ),
>> + TP_printk("req=%p sock=%p",
>> + __entry->req, __entry->sock
>> + )
>> +);
>> +#define DEFINE_HANDSHAKE_EVENT(name) \
>> + DEFINE_EVENT(handshake_event_class, name, \
>> + TP_PROTO( \
>> + const struct net *net, \
>> + const struct handshake_req *req, \
>> + const struct socket *sock \
>> + ), \
>> + TP_ARGS(net, req, sock))
>> +
>> +DECLARE_EVENT_CLASS(handshake_fd_class,
>> + TP_PROTO(
>> + const struct net *net,
>> + const struct handshake_req *req,
>> + const struct socket *sock,
>> + int fd
>> + ),
>> + TP_ARGS(net, req, sock, fd),
>> + TP_STRUCT__entry(
>> + __field(const void *, req)
>> + __field(const void *, sock)
>> + __field(int, fd)
>> + __field(unsigned int, netns_ino)
>> + ),
>> + TP_fast_assign(
>> + __entry->req = req;
>> + __entry->sock = req->hr_sock;
>> + __entry->fd = fd;
>> + __entry->netns_ino = net->ns.inum;
>> + ),
>> + TP_printk("req=%p sock=%p fd=%d",
>> + __entry->req, __entry->sock, __entry->fd
>> + )
>> +);
>> +#define DEFINE_HANDSHAKE_FD_EVENT(name) \
>> + DEFINE_EVENT(handshake_fd_class, name, \
>> + TP_PROTO( \
>> + const struct net *net, \
>> + const struct handshake_req *req, \
>> + const struct socket *sock, \
>> + int fd \
>> + ), \
>> + TP_ARGS(net, req, sock, fd))
>> +
>> +DECLARE_EVENT_CLASS(handshake_error_class,
>> + TP_PROTO(
>> + const struct net *net,
>> + const struct handshake_req *req,
>> + const struct socket *sock,
>> + int err
>> + ),
>> + TP_ARGS(net, req, sock, err),
>> + TP_STRUCT__entry(
>> + __field(const void *, req)
>> + __field(const void *, sock)
>> + __field(int, err)
>> + __field(unsigned int, netns_ino)
>> + ),
>> + TP_fast_assign(
>> + __entry->req = req;
>> + __entry->sock = sock;
>> + __entry->err = err;
>> + __entry->netns_ino = net->ns.inum;
>> + ),
>> + TP_printk("req=%p sock=%p err=%d",
>> + __entry->req, __entry->sock, __entry->err
>> + )
>> +);
>> +#define DEFINE_HANDSHAKE_ERROR(name) \
>> + DEFINE_EVENT(handshake_error_class, name, \
>> + TP_PROTO( \
>> + const struct net *net, \
>> + const struct handshake_req *req, \
>> + const struct socket *sock, \
>> + int err \
>> + ), \
>> + TP_ARGS(net, req, sock, err))
>> +
>> +
>> +/**
>> + ** Request lifetime events
>> + **/
>> +
>> +DEFINE_HANDSHAKE_EVENT(handshake_submit);
>> +DEFINE_HANDSHAKE_ERROR(handshake_submit_err);
>> +DEFINE_HANDSHAKE_EVENT(handshake_cancel);
>> +DEFINE_HANDSHAKE_EVENT(handshake_cancel_none);
>> +DEFINE_HANDSHAKE_EVENT(handshake_cancel_busy);
>> +DEFINE_HANDSHAKE_EVENT(handshake_destruct);
>> +
>> +
>> +TRACE_EVENT(handshake_complete,
>> + TP_PROTO(
>> + const struct net *net,
>> + const struct handshake_req *req,
>> + const struct socket *sock,
>> + int status
>> + ),
>> + TP_ARGS(net, req, sock, status),
>> + TP_STRUCT__entry(
>> + __field(const void *, req)
>> + __field(const void *, sock)
>> + __field(int, status)
>> + __field(unsigned int, netns_ino)
>> + ),
>> + TP_fast_assign(
>> + __entry->req = req;
>> + __entry->sock = sock;
>> + __entry->status = status;
>> + __entry->netns_ino = net->ns.inum;
>> + ),
>> + TP_printk("req=%p sock=%p status=%d",
>> + __entry->req, __entry->sock, __entry->status
>> + )
>> +);
>> +
>> +/**
>> + ** Netlink events
>> + **/
>> +
>> +DEFINE_HANDSHAKE_ERROR(handshake_notify_err);
>> +DEFINE_HANDSHAKE_FD_EVENT(handshake_cmd_accept);
>> +DEFINE_HANDSHAKE_ERROR(handshake_cmd_accept_err);
>> +DEFINE_HANDSHAKE_FD_EVENT(handshake_cmd_done);
>> +DEFINE_HANDSHAKE_ERROR(handshake_cmd_done_err);
>> +
>> +#endif /* _TRACE_HANDSHAKE_H */
>> +
>> +#include <trace/define_trace.h>
>> diff --git a/include/uapi/linux/handshake.h b/include/uapi/linux/handshake.h
>> new file mode 100644
>> index 000000000000..09fd7c37cba4
>> --- /dev/null
>> +++ b/include/uapi/linux/handshake.h
>> @@ -0,0 +1,63 @@
>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
>> +/* Do not edit directly, auto-generated from: */
>> +/* Documentation/netlink/specs/handshake.yaml */
>> +/* YNL-GEN uapi header */
>> +
>> +#ifndef _UAPI_LINUX_HANDSHAKE_H
>> +#define _UAPI_LINUX_HANDSHAKE_H
>> +
>> +#define HANDSHAKE_FAMILY_NAME "handshake"
>> +#define HANDSHAKE_FAMILY_VERSION 1
>> +
>> +enum {
>> + HANDSHAKE_HANDLER_CLASS_NONE,
>> +};
>> +
>> +enum {
>> + HANDSHAKE_MSG_TYPE_UNSPEC,
>> + HANDSHAKE_MSG_TYPE_CLIENTHELLO,
>> + HANDSHAKE_MSG_TYPE_SERVERHELLO,
>> +};
>> +
>> +enum {
>> + HANDSHAKE_AUTH_UNSPEC,
>> + HANDSHAKE_AUTH_UNAUTH,
>> + HANDSHAKE_AUTH_X509,
>> + HANDSHAKE_AUTH_PSK,
>> +};
>> +
>> +enum {
>> + HANDSHAKE_A_ACCEPT_STATUS = 1,
>> + HANDSHAKE_A_ACCEPT_SOCKFD,
>> + HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
>> + HANDSHAKE_A_ACCEPT_MESSAGE_TYPE,
>> + HANDSHAKE_A_ACCEPT_AUTH,
>> + HANDSHAKE_A_ACCEPT_GNUTLS_PRIORITIES,
>> + HANDSHAKE_A_ACCEPT_MY_PEERID,
>> + HANDSHAKE_A_ACCEPT_MY_PRIVKEY,
>> +
>> + __HANDSHAKE_A_ACCEPT_MAX,
>> + HANDSHAKE_A_ACCEPT_MAX = (__HANDSHAKE_A_ACCEPT_MAX - 1)
>> +};
>> +
>> +enum {
>> + HANDSHAKE_A_DONE_STATUS = 1,
>> + HANDSHAKE_A_DONE_SOCKFD,
>> + HANDSHAKE_A_DONE_REMOTE_PEERID,
>> +
>> + __HANDSHAKE_A_DONE_MAX,
>> + HANDSHAKE_A_DONE_MAX = (__HANDSHAKE_A_DONE_MAX - 1)
>> +};
>> +
>> +enum {
>> + HANDSHAKE_CMD_READY = 1,
>> + HANDSHAKE_CMD_ACCEPT,
>> + HANDSHAKE_CMD_DONE,
>> +
>> + __HANDSHAKE_CMD_MAX,
>> + HANDSHAKE_CMD_MAX = (__HANDSHAKE_CMD_MAX - 1)
>> +};
>> +
>> +#define HANDSHAKE_MCGRP_NONE "none"
>> +
>> +#endif /* _UAPI_LINUX_HANDSHAKE_H */
>> diff --git a/net/Makefile b/net/Makefile
>> index 0914bea9c335..adbb64277601 100644
>> --- a/net/Makefile
>> +++ b/net/Makefile
>> @@ -79,3 +79,4 @@ obj-$(CONFIG_NET_NCSI) += ncsi/
>> obj-$(CONFIG_XDP_SOCKETS) += xdp/
>> obj-$(CONFIG_MPTCP) += mptcp/
>> obj-$(CONFIG_MCTP) += mctp/
>> +obj-y += handshake/
>> diff --git a/net/handshake/Makefile b/net/handshake/Makefile
>> new file mode 100644
>> index 000000000000..a41b03f4837b
>> --- /dev/null
>> +++ b/net/handshake/Makefile
>> @@ -0,0 +1,11 @@
>> +# SPDX-License-Identifier: GPL-2.0-only
>> +#
>> +# Makefile for the Generic HANDSHAKE service
>> +#
>> +# Author: Chuck Lever <chuck.lever@...cle.com>
>> +#
>> +# Copyright (c) 2023, Oracle and/or its affiliates.
>> +#
>> +
>> +obj-y += handshake.o
>> +handshake-y := netlink.o request.o trace.o
>> diff --git a/net/handshake/handshake.h b/net/handshake/handshake.h
>> new file mode 100644
>> index 000000000000..366c7659ec09
>> --- /dev/null
>> +++ b/net/handshake/handshake.h
>> @@ -0,0 +1,41 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Generic netlink handshake service
>> + *
>> + * Author: Chuck Lever <chuck.lever@...cle.com>
>> + *
>> + * Copyright (c) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +/*
>> + * Data structures and functions that are visible only within the
>> + * handshake module are declared here.
>> + */
>> +
>> +#ifndef _INTERNAL_HANDSHAKE_H
>> +#define _INTERNAL_HANDSHAKE_H
>> +
>> +/*
>> + * One handshake request
>> + */
>> +struct handshake_req {
>> + struct list_head hr_list;
>> + unsigned long hr_flags;
>> + const struct handshake_proto *hr_proto;
>> + struct socket *hr_sock;
>> +
>> + void (*hr_saved_destruct)(struct sock *sk);
>> +};
>> +
>> +#define HANDSHAKE_F_COMPLETED BIT(0)
>> +
>> +/* netlink.c */
>> +extern bool handshake_genl_inited;
>> +int handshake_genl_notify(struct net *net, int handler_class, gfp_t flags);
>> +
>> +/* request.c */
>> +void __remove_pending_locked(struct net *net, struct handshake_req *req);
>> +void handshake_complete(struct handshake_req *req, int status,
>> + struct nlattr **tb);
>> +
>> +#endif /* _INTERNAL_HANDSHAKE_H */
>> diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
>> new file mode 100644
>> index 000000000000..581e382236cf
>> --- /dev/null
>> +++ b/net/handshake/netlink.c
>> @@ -0,0 +1,340 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Generic netlink handshake service
>> + *
>> + * Author: Chuck Lever <chuck.lever@...cle.com>
>> + *
>> + * Copyright (c) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#include <linux/types.h>
>> +#include <linux/socket.h>
>> +#include <linux/kernel.h>
>> +#include <linux/module.h>
>> +#include <linux/skbuff.h>
>> +#include <linux/inet.h>
>> +
>> +#include <net/sock.h>
>> +#include <net/genetlink.h>
>> +#include <net/handshake.h>
>> +
>> +#include <uapi/linux/handshake.h>
>> +#include <trace/events/handshake.h>
>> +#include "handshake.h"
>> +
>> +static struct genl_family __ro_after_init handshake_genl_family;
>> +bool handshake_genl_inited;
>> +
>> +/**
>> + * handshake_genl_notify - Notify handlers that a request is waiting
>> + * @net: target network namespace
>> + * @handler_class: target handler
>> + * @flags: memory allocation control flags
>> + *
>> + * Returns zero on success or a negative errno if notification failed.
>> + */
>> +int handshake_genl_notify(struct net *net, int handler_class, gfp_t flags)
>> +{
>> + struct sk_buff *msg;
>> + void *hdr;
>> +
>> + if (!genl_has_listeners(&handshake_genl_family, net, handler_class))
>> + return -ESRCH;
>> +
>> + msg = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
>> + if (!msg)
>> + return -ENOMEM;
>> +
>> + hdr = genlmsg_put(msg, 0, 0, &handshake_genl_family, 0,
>> + HANDSHAKE_CMD_READY);
>> + if (!hdr)
>> + goto out_free;
>> +
>> + if (nla_put_u32(msg, HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
>> + handler_class) < 0) {
>> + genlmsg_cancel(msg, hdr);
>> + goto out_free;
>> + }
>> +
>> + genlmsg_end(msg, hdr);
>> + return genlmsg_multicast_netns(&handshake_genl_family, net, msg,
>> + 0, handler_class, flags);
>> +
>> +out_free:
>> + nlmsg_free(msg);
>> + return -EMSGSIZE;
>> +}
>> +
>> +/**
>> + * handshake_genl_put - Create a generic netlink message header
>> + * @msg: buffer in which to create the header
>> + * @gi: generic netlink message context
>> + *
>> + * Returns a ready-to-use header, or NULL.
>> + */
>> +struct nlmsghdr *handshake_genl_put(struct sk_buff *msg, struct genl_info *gi)
>> +{
>> + return genlmsg_put(msg, gi->snd_portid, gi->snd_seq,
>> + &handshake_genl_family, 0, gi->genlhdr->cmd);
>> +}
>> +EXPORT_SYMBOL(handshake_genl_put);
>> +
>> +static int handshake_status_reply(struct sk_buff *skb, struct genl_info *gi,
>> + int status)
>> +{
>> + struct nlmsghdr *hdr;
>> + struct sk_buff *msg;
>> + int ret;
>> +
>> + ret = -ENOMEM;
>> + msg = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
>> + if (!msg)
>> + goto out;
>> + hdr = handshake_genl_put(msg, gi);
>> + if (!hdr)
>> + goto out_free;
>> +
>> + ret = -EMSGSIZE;
>> + ret = nla_put_u32(msg, HANDSHAKE_A_ACCEPT_STATUS, status);
>> + if (ret < 0)
>> + goto out_free;
>> +
>> + genlmsg_end(msg, hdr);
>> + return genlmsg_reply(msg, gi);
>> +
>> +out_free:
>> + genlmsg_cancel(msg, hdr);
>> +out:
>> + return ret;
>> +}
>> +
>> +/*
>> + * dup() a kernel socket for use as a user space file descriptor
>> + * in the current process.
>> + *
>> + * Implicit argument: "current()"
>> + */
>> +static int handshake_dup(struct socket *kernsock)
>> +{
>> + struct file *file = get_file(kernsock->file);
>> + int newfd;
>> +
>> + newfd = get_unused_fd_flags(O_CLOEXEC);
>> + if (newfd < 0) {
>> + fput(file);
>> + return newfd;
>> + }
>> +
>> + fd_install(newfd, file);
>> + return newfd;
>> +}
>> +
>> +static const struct nla_policy
>> +handshake_accept_nl_policy[HANDSHAKE_A_ACCEPT_HANDLER_CLASS + 1] = {
>> + [HANDSHAKE_A_ACCEPT_HANDLER_CLASS] = { .type = NLA_U32, },
>> +};
>> +
>> +static int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *gi)
>> +{
>> + struct nlattr *tb[HANDSHAKE_A_ACCEPT_MAX + 1];
>> + struct net *net = sock_net(skb->sk);
>> + struct handshake_req *pos, *req;
>> + int fd, err;
>> +
>> + err = -EINVAL;
>> + if (genlmsg_parse(nlmsg_hdr(skb), &handshake_genl_family, tb,
>> + HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
>> + handshake_accept_nl_policy, NULL))
>> + goto out_status;
>> + if (!tb[HANDSHAKE_A_ACCEPT_HANDLER_CLASS])
>> + goto out_status;
>> +
>> + req = NULL;
>> + spin_lock(&net->hs_lock);
>> + list_for_each_entry(pos, &net->hs_requests, hr_list) {
>> + if (pos->hr_proto->hp_handler_class !=
>> + nla_get_u32(tb[HANDSHAKE_A_ACCEPT_HANDLER_CLASS]))
>> + continue;
>> + __remove_pending_locked(net, pos);
>> + req = pos;
>> + break;
>> + }
>> + spin_unlock(&net->hs_lock);
>> + if (!req)
>> + goto out_status;
>> +
>> + fd = handshake_dup(req->hr_sock);
>> + if (fd < 0) {
>> + err = fd;
>> + goto out_complete;
>> + }
>> + err = req->hr_proto->hp_accept(req, gi, fd);
>> + if (err)
>> + goto out_complete;
>> +
>> + trace_handshake_cmd_accept(net, req, req->hr_sock, fd);
>> + return 0;
>> +
>> +out_complete:
>> + handshake_complete(req, -EIO, NULL);
>> + fput(req->hr_sock->file);
>> +out_status:
>> + trace_handshake_cmd_accept_err(net, req, NULL, err);
>> + return handshake_status_reply(skb, gi, err);
>> +}
>> +
>> +static const struct nla_policy
>> +handshake_done_nl_policy[HANDSHAKE_A_DONE_MAX + 1] = {
>> + [HANDSHAKE_A_DONE_SOCKFD] = { .type = NLA_U32, },
>> + [HANDSHAKE_A_DONE_STATUS] = { .type = NLA_U32, },
>> + [HANDSHAKE_A_DONE_REMOTE_PEERID] = { .type = NLA_U32, },
>> +};
>> +
>> +static int handshake_nl_done_doit(struct sk_buff *skb, struct genl_info *gi)
>> +{
>> + struct nlattr *tb[HANDSHAKE_A_DONE_MAX + 1];
>> + struct net *net = sock_net(skb->sk);
>> + struct socket *sock = NULL;
>> + struct handshake_req *req;
>> + int fd, status, err;
>> +
>> + err = genlmsg_parse(nlmsg_hdr(skb), &handshake_genl_family, tb,
>> + HANDSHAKE_A_DONE_MAX, handshake_done_nl_policy,
>> + NULL);
>> + if (err || !tb[HANDSHAKE_A_DONE_SOCKFD]) {
>> + err = -EINVAL;
>> + goto out_status;
>> + }
>> +
>> + fd = nla_get_u32(tb[HANDSHAKE_A_DONE_SOCKFD]);
>> +
>> + err = 0;
>> + sock = sockfd_lookup(fd, &err);
>> + if (err) {
>> + err = -EBADF;
>> + goto out_status;
>> + }
>> +
>> + req = sock->sk->sk_handshake_req;
>> + if (!req) {
>> + err = -EBUSY;
>> + goto out_status;
>> + }
>> +
>> + trace_handshake_cmd_done(net, req, sock, fd);
>> +
>> + status = -EIO;
>> + if (tb[HANDSHAKE_A_DONE_STATUS])
>> + status = nla_get_u32(tb[HANDSHAKE_A_DONE_STATUS]);
>> +
> And this makes me ever so slightly uneasy.
>
> As 'status' is a netlink attribute it's inevitably defined as 'unsigned'.
> Yet we assume that 'status' is a negative number, leaving us _technically_ in unchartered territory.
Ah, that's an oversight.
> And that is notwithstanding the problem that we haven't even defined _what_ should be in the status attribute.
It's now an errno value.
> Reading the code I assume that it's either '0' for success or a negative number (ie the error code) on failure.
> Which implicitely means that we _never_ set a positive number here.
> So what would we lose if we declare 'status' to carry the _positive_ error number instead?
> It would bring us in-line with the actual netlink attribute definition, we wouldn't need to worry about possible integer overflows, yadda yadda...
>
> Hmm?
It can also be argued that errnos in user space are positive-valued,
therefore, this user space visible protocol should use a positive
errno.
>> + handshake_complete(req, status, tb);
>> + fput(sock->file);
>> + return 0;
>> +
>> +out_status:
>> + trace_handshake_cmd_done_err(net, req, sock, err);
>> + return handshake_status_reply(skb, gi, err);
>> +}
>> +
>> +static const struct genl_split_ops handshake_nl_ops[] = {
>> + {
>> + .cmd = HANDSHAKE_CMD_ACCEPT,
>> + .doit = handshake_nl_accept_doit,
>> + .policy = handshake_accept_nl_policy,
>> + .maxattr = HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
>> + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
>> + },
>> + {
>> + .cmd = HANDSHAKE_CMD_DONE,
>> + .doit = handshake_nl_done_doit,
>> + .policy = handshake_done_nl_policy,
>> + .maxattr = HANDSHAKE_A_DONE_REMOTE_PEERID,
>> + .flags = GENL_CMD_CAP_DO,
>> + },
>> +};
>> +
>> +static const struct genl_multicast_group handshake_nl_mcgrps[] = {
>> + [HANDSHAKE_HANDLER_CLASS_NONE] = { .name = HANDSHAKE_MCGRP_NONE, },
>> +};
>> +
>> +static struct genl_family __ro_after_init handshake_genl_family = {
>> + .hdrsize = 0,
>> + .name = HANDSHAKE_FAMILY_NAME,
>> + .version = HANDSHAKE_FAMILY_VERSION,
>> + .netnsok = true,
>> + .parallel_ops = true,
>> + .n_mcgrps = ARRAY_SIZE(handshake_nl_mcgrps),
>> + .n_split_ops = ARRAY_SIZE(handshake_nl_ops),
>> + .split_ops = handshake_nl_ops,
>> + .mcgrps = handshake_nl_mcgrps,
>> + .module = THIS_MODULE,
>> +};
>> +
>> +static int __net_init handshake_net_init(struct net *net)
>> +{
>> + spin_lock_init(&net->hs_lock);
>> + INIT_LIST_HEAD(&net->hs_requests);
>> + net->hs_pending = 0;
>> + return 0;
>> +}
>> +
>> +static void __net_exit handshake_net_exit(struct net *net)
>> +{
>> + struct handshake_req *req;
>> + LIST_HEAD(requests);
>> +
>> + /*
>> + * This drains the net's pending list. Requests that
>> + * have been accepted and are in progress will be
>> + * destroyed when the socket is closed.
>> + */
>> + spin_lock(&net->hs_lock);
>> + list_splice_init(&requests, &net->hs_requests);
>> + spin_unlock(&net->hs_lock);
>> +
>> + while (!list_empty(&requests)) {
>> + req = list_first_entry(&requests, struct handshake_req, hr_list);
>> + list_del(&req->hr_list);
>> +
>> + /*
>> + * Requests on this list have not yet been
>> + * accepted, so they do not have an fd to put.
>> + */
>> +
>> + handshake_complete(req, -ETIMEDOUT, NULL);
>> + }
>> +}
>> +
>> +static struct pernet_operations handshake_genl_net_ops = {
>> + .init = handshake_net_init,
>> + .exit = handshake_net_exit,
>> +};
>> +
>> +static int __init handshake_init(void)
>> +{
>> + int ret;
>> +
>> + ret = genl_register_family(&handshake_genl_family);
>> + if (ret) {
>> + pr_warn("handshake: netlink registration failed (%d)\n", ret);
>> + return ret;
>> + }
>> +
>> + ret = register_pernet_subsys(&handshake_genl_net_ops);
>> + if (ret) {
>> + pr_warn("handshake: pernet registration failed (%d)\n", ret);
>> + genl_unregister_family(&handshake_genl_family);
>> + }
>> +
>> + handshake_genl_inited = true;
>> + return ret;
>> +}
>> +
>> +static void __exit handshake_exit(void)
>> +{
>> + unregister_pernet_subsys(&handshake_genl_net_ops);
>> + genl_unregister_family(&handshake_genl_family);
>> +}
>> +
>> +module_init(handshake_init);
>> +module_exit(handshake_exit);
>> diff --git a/net/handshake/request.c b/net/handshake/request.c
>> new file mode 100644
>> index 000000000000..1d3b8e76dd2c
>> --- /dev/null
>> +++ b/net/handshake/request.c
>> @@ -0,0 +1,246 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Handshake request lifetime events
>> + *
>> + * Author: Chuck Lever <chuck.lever@...cle.com>
>> + *
>> + * Copyright (c) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#include <linux/types.h>
>> +#include <linux/socket.h>
>> +#include <linux/kernel.h>
>> +#include <linux/module.h>
>> +#include <linux/skbuff.h>
>> +#include <linux/inet.h>
>> +#include <linux/fdtable.h>
>> +
>> +#include <net/sock.h>
>> +#include <net/genetlink.h>
>> +#include <net/handshake.h>
>> +
>> +#include <uapi/linux/handshake.h>
>> +#include <trace/events/handshake.h>
>> +#include "handshake.h"
>> +
>> +/*
>> + * This limit is to prevent slow remotes from causing denial of service.
>> + * A ulimit-style tunable might be used instead.
>> + */
>> +#define HANDSHAKE_PENDING_MAX (10)
>> +
>> +static void __add_pending_locked(struct net *net, struct handshake_req *req)
>> +{
>> + net->hs_pending++;
>> + list_add_tail(&req->hr_list, &net->hs_requests);
>> +}
>> +
>> +void __remove_pending_locked(struct net *net, struct handshake_req *req)
>> +{
>> + net->hs_pending--;
>> + list_del_init(&req->hr_list);
>> +}
>> +
>> +/*
>> + * Return values:
>> + * %true - the request was found on @net's pending list
>> + * %false - the request was not found on @net's pending list
>> + *
>> + * If @req was on a pending list, it has not yet been accepted.
>> + */
>> +static bool remove_pending(struct net *net, struct handshake_req *req)
>> +{
>> + bool ret;
>> +
>> + ret = false;
>> +
>> + spin_lock(&net->hs_lock);
>> + if (!list_empty(&req->hr_list)) {
>> + __remove_pending_locked(net, req);
>> + ret = true;
>> + }
>> + spin_unlock(&net->hs_lock);
>> +
>> + return ret;
>> +}
>> +
>> +static void handshake_req_destroy(struct handshake_req *req, struct sock *sk)
>> +{
>> + req->hr_proto->hp_destroy(req);
>> + sk->sk_handshake_req = NULL;
>> + kfree(req);
>> +}
>> +
>> +static void handshake_sk_destruct(struct sock *sk)
>> +{
>> + struct handshake_req *req = sk->sk_handshake_req;
>> +
>> + if (req) {
>> + trace_handshake_destruct(sock_net(sk), req, req->hr_sock);
>> + handshake_req_destroy(req, sk);
>> + }
>> +}
>> +
>> +/**
>> + * handshake_req_alloc - consumer API to allocate a request
>> + * @sock: open socket on which to perform a handshake
>> + * @proto: security protocol
>> + * @flags: memory allocation flags
>> + *
>> + * Returns an initialized handshake_req or NULL.
>> + */
>> +struct handshake_req *handshake_req_alloc(struct socket *sock,
>> + const struct handshake_proto *proto,
>> + gfp_t flags)
>> +{
>> + struct handshake_req *req;
>> +
>> + /* Avoid accessing uninitialized global variables later on */
>> + if (!handshake_genl_inited)
>> + return NULL;
>> +
>> + req = kzalloc(sizeof(*req) + proto->hp_privsize, flags);
>> + if (!req)
>> + return NULL;
>> +
>> + sock_hold(sock->sk);
>> +
>> + INIT_LIST_HEAD(&req->hr_list);
>> + req->hr_sock = sock;
>> + req->hr_proto = proto;
>> + return req;
>> +}
>> +EXPORT_SYMBOL(handshake_req_alloc);
>> +
>> +/**
>> + * handshake_req_private - consumer API to return per-handshake private data
>> + * @req: handshake arguments
>> + *
>> + */
>> +void *handshake_req_private(struct handshake_req *req)
>> +{
>> + return (void *)(req + 1);
>> +}
>> +EXPORT_SYMBOL(handshake_req_private);
>> +
>> +/**
>> + * handshake_req_submit - consumer API to submit a handshake request
>> + * @req: handshake arguments
>> + * @flags: memory allocation flags
>> + *
>> + * Return values:
>> + * %0: Request queued
>> + * %-EBUSY: A handshake is already under way for this socket
>> + * %-ESRCH: No handshake agent is available
>> + * %-EAGAIN: Too many pending handshake requests
>> + * %-ENOMEM: Failed to allocate memory
>> + * %-EMSGSIZE: Failed to construct notification message
>> + *
>> + * A zero return value from handshake_request() means that
>> + * exactly one subsequent completion callback is guaranteed.
>> + *
>> + * A negative return value from handshake_request() means that
>> + * no completion callback will be done and that @req is
>> + * destroyed.
>> + */
>> +int handshake_req_submit(struct handshake_req *req, gfp_t flags)
>> +{
>> + struct socket *sock = req->hr_sock;
>> + struct sock *sk = sock->sk;
>> + struct net *net = sock_net(sk);
>> + int ret;
>> +
>> + ret = -EAGAIN;
>> + if (READ_ONCE(net->hs_pending) >= HANDSHAKE_PENDING_MAX)
>> + goto out_err;
>> +
>> + ret = -EBUSY;
>> + spin_lock(&net->hs_lock);
>> + if (sk->sk_handshake_req || !list_empty(&req->hr_list)) {
>> + spin_unlock(&net->hs_lock);
>> + goto out_err;
>> + }
>> + req->hr_saved_destruct = sk->sk_destruct;
>> + sk->sk_destruct = handshake_sk_destruct;
>> + sk->sk_handshake_req = req;
>> + __add_pending_locked(net, req);
>> + spin_unlock(&net->hs_lock);
>> +
>> + ret = handshake_genl_notify(net, req->hr_proto->hp_handler_class,
>> + flags);
>> + if (ret) {
>> + trace_handshake_notify_err(net, req, sock, ret);
>> + if (remove_pending(net, req))
>> + goto out_err;
>> + }
>> +
>> + trace_handshake_submit(net, req, sock);
>> + return 0;
>> +
>> +out_err:
>> + trace_handshake_submit_err(net, req, sock, ret);
>> + handshake_req_destroy(req, sk);
>> + return ret;
>> +}
>> +EXPORT_SYMBOL(handshake_req_submit);
>> +
>> +void handshake_complete(struct handshake_req *req, int status,
>> + struct nlattr **tb)
>> +{
>> + struct socket *sock = req->hr_sock;
>> + struct net *net = sock_net(sock->sk);
>> +
>> + if (!test_and_set_bit(HANDSHAKE_F_COMPLETED, &req->hr_flags)) {
>> + trace_handshake_complete(net, req, sock, status);
>> + req->hr_proto->hp_done(req, status, tb);
>> + __sock_put(sock->sk);
>> + }
>> +}
>> +
>> +/**
>> + * handshake_req_cancel - consumer API to cancel an in-progress handshake
>> + * @sock: socket on which there is an ongoing handshake
>> + *
>> + * XXX: Perhaps killing the user space agent might also be necessary?
>
> I thought we had agreed that we would be sending a signal to the userspace process?
We had discussed killing the handler, but I don't think it's necessary.
I'd rather not do something that drastic unless we have no other choice.
So far my testing hasn't shown a need for killing the child process.
I'm also concerned that the kernel could reuse the handler's process ID.
handshake_req_cancel would kill something that is not a handshake agent.
> Ideally we would be sending a SIGHUP, wait for some time on the userspace process to respond with a 'done' message, and send a 'KILL' signal if we haven't received one.
>
> Obs: Sending a KILL signal would imply that userspace is able to cope with children dying. Which pretty much excludes pthreads, I would think.
>
> Guess I'll have to consult Stevens :-)
Basically what cancel does is atomically disarm the "done" callback.
The socket belongs to the kernel, so it will live until the kernel is
good and through with it.
>> + *
>> + * Request cancellation races with request completion. To determine
>> + * who won, callers examine the return value from this function.
>> + *
>> + * Return values:
>> + * %0 - Uncompleted handshake request was canceled or not found
>> + * %-EBUSY - Handshake request already completed
>
> EBUSY? Wouldn't be EAGAIN more approriate?
I don't think EAGAIN would be appropriate at all. The situation
is that the handshake completed, so there's no need to call cancel
again. It's synonym, EWOULDBLOCK, is also not a good semantic fit.
> After all, the request is everything _but_ busy...
I'm open to suggestion.
One option is to use a boolean return value instead of an errno.
>> + */
>> +int handshake_req_cancel(struct socket *sock)
>> +{
>> + struct handshake_req *req;
>> + struct sock *sk;
>> + struct net *net;
>> +
>> + if (!sock)
>> + return 0;
>> +
>> + sk = sock->sk;
>> + req = sk->sk_handshake_req;
>> + net = sock_net(sk);
>> +
>> + if (!req) {
>> + trace_handshake_cancel_none(net, req, sock);
>> + return 0;
>> + }
>> +
>> + if (remove_pending(net, req)) {
>> + /* Request hadn't been accepted */
>> + trace_handshake_cancel(net, req, sock);
>> + return 0;
>> + }
>> + if (test_and_set_bit(HANDSHAKE_F_COMPLETED, &req->hr_flags)) {
>> + /* Request already completed */
>> + trace_handshake_cancel_busy(net, req, sock);
>> + return -EBUSY;
>> + }
>> +
>> + __sock_put(sk);
>> + trace_handshake_cancel(net, req, sock);
>> + return 0;
>> +}
>> +EXPORT_SYMBOL(handshake_req_cancel);
>> diff --git a/net/handshake/trace.c b/net/handshake/trace.c
>> new file mode 100644
>> index 000000000000..3a5b6f29a2b8
>> --- /dev/null
>> +++ b/net/handshake/trace.c
>> @@ -0,0 +1,17 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Trace points for transport security layer handshakes.
>> + *
>> + * Author: Chuck Lever <chuck.lever@...cle.com>
>> + *
>> + * Copyright (c) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#include <linux/types.h>
>> +#include <net/sock.h>
>> +
>> +#include "handshake.h"
>> +
>> +#define CREATE_TRACE_POINTS
>> +
>> +#include <trace/events/handshake.h>
> Cheers,
>
> Hannes
>
>
>
--
Chuck Lever
Powered by blists - more mailing lists