Message-ID: <8154E3F3-C8A2-4BCF-8DFC-E00EA7B9CF80@oracle.com>
Date: Tue, 28 Mar 2023 18:19:22 +0000
From: Chuck Lever III <chuck.lever@...cle.com>
To: Jeff Layton <jlayton@...hat.com>
CC: Chuck Lever <cel@...nel.org>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Eric Dumazet <edumazet@...gle.com>,
"open list:NETWORKING [GENERAL]" <netdev@...r.kernel.org>,
"kernel-tls-handshake@...ts.linux.dev"
<kernel-tls-handshake@...ts.linux.dev>,
John Haxby <john.haxby@...cle.com>
Subject: Re: [PATCH v7 1/2] net/handshake: Create a NETLINK service for handling handshake requests

> On Mar 28, 2023, at 2:14 PM, Jeff Layton <jlayton@...hat.com> wrote:
>
> On Sat, 2023-03-18 at 12:18 -0400, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@...cle.com>
>>
>> When a kernel consumer needs a transport layer security session, it
>> first needs a handshake to negotiate and establish a session. This
>> negotiation can be done in user space via one of the several
>> existing library implementations, or it can be done in the kernel.
>>
>> No in-kernel handshake implementations yet exist. In their absence,
>> we add a netlink service that can:
>>
>> a. Notify a user space daemon that a handshake is needed.
>>
>> b. Once notified, the daemon calls the kernel back via this
>> netlink service to get the handshake parameters, including an
>> open socket on which to establish the session.
>>
>> c. Once the handshake is complete, the daemon reports the
>> session status and other information via a second netlink
>> operation. This operation marks that it is safe for the
>> kernel to use the open socket and the security session
>> established there.
>>
>> The notification service uses a multicast group. Each handshake
>> mechanism (eg, tlshd) adopts its own group number so that the
>> handshake services are completely independent of one another. The
>> kernel can then tell via netlink_has_listeners() whether a handshake
>> service is active and prepared to handle a handshake request.
>>
>> A new netlink operation, ACCEPT, acts like accept(2) in that it
>> instantiates a file descriptor in the user space daemon's fd table.
>> If this operation is successful, the reply carries the fd number,
>> which can be treated as an open and ready file descriptor.
>>
>> While user space is performing the handshake, the kernel keeps its
>> muddy paws off the open socket. A second new netlink operation,
>> DONE, indicates that the user space daemon is finished with the
>> socket and it is safe for the kernel to use again. The operation
>> also indicates whether a session was established successfully.
>>
>> Signed-off-by: Chuck Lever <chuck.lever@...cle.com>
>> ---
>> Documentation/netlink/specs/handshake.yaml | 122 +++++++++++
>> MAINTAINERS | 8 +
>> include/trace/events/handshake.h | 159 ++++++++++++++
>> include/uapi/linux/handshake.h | 70 ++++++
>> net/Kconfig | 5
>> net/Makefile | 1
>> net/handshake/Makefile | 11 +
>> net/handshake/genl.c | 57 +++++
>> net/handshake/genl.h | 23 ++
>> net/handshake/handshake.h | 82 +++++++
>> net/handshake/netlink.c | 316 ++++++++++++++++++++++++++++
>> net/handshake/request.c | 307 +++++++++++++++++++++++++++
>> net/handshake/trace.c | 20 ++
>> 13 files changed, 1181 insertions(+)
>> create mode 100644 Documentation/netlink/specs/handshake.yaml
>> create mode 100644 include/trace/events/handshake.h
>> create mode 100644 include/uapi/linux/handshake.h
>> create mode 100644 net/handshake/Makefile
>> create mode 100644 net/handshake/genl.c
>> create mode 100644 net/handshake/genl.h
>> create mode 100644 net/handshake/handshake.h
>> create mode 100644 net/handshake/netlink.c
>> create mode 100644 net/handshake/request.c
>> create mode 100644 net/handshake/trace.c
>>
>>
>
> [...]
>
>> diff --git a/net/handshake/request.c b/net/handshake/request.c
>> new file mode 100644
>> index 000000000000..3f8ae9e990d2
>> --- /dev/null
>> +++ b/net/handshake/request.c
>> @@ -0,0 +1,307 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Handshake request lifetime events
>> + *
>> + * Author: Chuck Lever <chuck.lever@...cle.com>
>> + *
>> + * Copyright (c) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#include <linux/types.h>
>> +#include <linux/socket.h>
>> +#include <linux/kernel.h>
>> +#include <linux/module.h>
>> +#include <linux/skbuff.h>
>> +#include <linux/inet.h>
>> +#include <linux/fdtable.h>
>> +#include <linux/rhashtable.h>
>> +
>> +#include <net/sock.h>
>> +#include <net/genetlink.h>
>> +#include <net/netns/generic.h>
>> +
>> +#include <uapi/linux/handshake.h>
>> +#include "handshake.h"
>> +
>> +#include <trace/events/handshake.h>
>> +
>> +/*
>> + * We need both a handshake_req -> sock mapping, and a sock ->
>> + * handshake_req mapping. Both are one-to-one.
>> + *
>> + * To avoid adding another pointer field to struct sock, net/handshake
>> + * maintains a hash table, indexed by the memory address of @sock, to
>> + * find the struct handshake_req outstanding for that socket. The
>> + * reverse direction uses a simple pointer field in the handshake_req
>> + * struct.
>> + */
>> +
>> +static struct rhashtable handshake_rhashtbl ____cacheline_aligned_in_smp;
>> +
>> +static const struct rhashtable_params handshake_rhash_params = {
>> + .key_len = sizeof_field(struct handshake_req, hr_sk),
>> + .key_offset = offsetof(struct handshake_req, hr_sk),
>> + .head_offset = offsetof(struct handshake_req, hr_rhash),
>> + .automatic_shrinking = true,
>> +};
>> +
>> +int handshake_req_hash_init(void)
>> +{
>> + return rhashtable_init(&handshake_rhashtbl, &handshake_rhash_params);
>> +}
>> +
>> +void handshake_req_hash_destroy(void)
>> +{
>> + rhashtable_destroy(&handshake_rhashtbl);
>> +}
>> +
>> +struct handshake_req *handshake_req_hash_lookup(struct sock *sk)
>> +{
>> + return rhashtable_lookup_fast(&handshake_rhashtbl, &sk,
>
> Is this correct? It seems like we should be searching for the struct
> sock pointer value, not on the pointer to the pointer (which will be a
> stack var), right?

I copied this from the nfsd_file and nfs4_file code we added recently.
rhashtable_lookup_fast() takes a pointer to the key. Here the key is
the struct sock pointer stored in hr_sk, so passing the address of the
local sk (a pointer to a pointer) should be correct in this case.

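To illustrate why the stack address itself does not matter, here is a
minimal user-space sketch. It is not the kernel implementation: the
stub types and the lookup() helper are hypothetical stand-ins that
assume the default fixed-length key behavior, where key_len bytes read
through the key pointer are compared with key_len bytes at key_offset
in each object.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct sock and struct handshake_req. */
struct sock_stub { int id; };

struct req_stub {
	int cookie;
	struct sock_stub *hr_sk;	/* the key is this pointer value */
};

/*
 * Hypothetical linear "table" lookup. It mimics a fixed-length key
 * comparison: the bytes that @key points to are compared with the
 * bytes stored at key_offset inside each object.
 */
static struct req_stub *lookup(struct req_stub **tbl, size_t n,
			       const void *key)
{
	const size_t key_len = sizeof(struct sock_stub *);
	const size_t key_offset = offsetof(struct req_stub, hr_sk);
	size_t i;

	for (i = 0; i < n; i++) {
		const char *obj = (const char *)tbl[i];

		if (memcmp(obj + key_offset, key, key_len) == 0)
			return tbl[i];
	}
	return NULL;
}

/*
 * Same shape as handshake_req_hash_lookup(): @sk is a local copy of
 * the pointer, and &sk is only used to read the pointer value.
 */
struct req_stub *lookup_by_sock(struct req_stub **tbl, size_t n,
				struct sock_stub *sk)
{
	return lookup(tbl, n, &sk);
}
```

In this sketch, two callers with different stack frames still find the
same request, because only the bytes read through &sk take part in the
comparison.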
>> + handshake_rhash_params);
>> +}
>> +
>> +static noinline bool handshake_req_hash_add(struct handshake_req *req)
>> +{
>> + int ret;
>> +
>> + ret = rhashtable_lookup_insert_fast(&handshake_rhashtbl,
>> + &req->hr_rhash,
>> + handshake_rhash_params);
>> + return ret == 0;
>> +}
>> +
>> +static noinline void handshake_req_destroy(struct handshake_req *req)
>> +{
>> + if (req->hr_proto->hp_destroy)
>> + req->hr_proto->hp_destroy(req);
>> + rhashtable_remove_fast(&handshake_rhashtbl, &req->hr_rhash,
>> + handshake_rhash_params);
>> + kfree(req);
>> +}
>> +
>> +static void handshake_sk_destruct(struct sock *sk)
>> +{
>> + void (*sk_destruct)(struct sock *sk);
>> + struct handshake_req *req;
>> +
>> + req = handshake_req_hash_lookup(sk);
>> + if (!req)
>> + return;
>> +
>> + trace_handshake_destruct(sock_net(sk), req, sk);
>> + sk_destruct = req->hr_odestruct;
>> + handshake_req_destroy(req);
>> + if (sk_destruct)
>> + sk_destruct(sk);
>> +}
>> +
>> +/**
>> + * handshake_req_alloc - consumer API to allocate a request
>> + * @sock: open socket on which to perform a handshake
>> + * @proto: security protocol
>> + * @flags: memory allocation flags
>> + *
>> + * Returns an initialized handshake_req or NULL.
>> + */
>> +struct handshake_req *handshake_req_alloc(struct socket *sock,
>> + const struct handshake_proto *proto,
>> + gfp_t flags)
>> +{
>> + struct sock *sk = sock->sk;
>> + struct net *net = sock_net(sk);
>> + struct handshake_net *hn = handshake_pernet(net);
>> + struct handshake_req *req;
>> +
>> + if (!hn)
>> + return NULL;
>> +
>> + req = kzalloc(struct_size(req, hr_priv, proto->hp_privsize), flags);
>> + if (!req)
>> + return NULL;
>> +
>> + sock_hold(sk);
>> +
>> + INIT_LIST_HEAD(&req->hr_list);
>> + req->hr_sk = sk;
>> + req->hr_proto = proto;
>> + return req;
>> +}
>> +EXPORT_SYMBOL(handshake_req_alloc);
>> +
>> +/**
>> + * handshake_req_private - consumer API to return per-handshake private data
>> + * @req: handshake arguments
>> + *
>> + */
>> +void *handshake_req_private(struct handshake_req *req)
>> +{
>> + return (void *)&req->hr_priv;
>> +}
>> +EXPORT_SYMBOL(handshake_req_private);
>> +
>> +static bool __add_pending_locked(struct handshake_net *hn,
>> + struct handshake_req *req)
>> +{
>> + if (!list_empty(&req->hr_list))
>> + return false;
>> + hn->hn_pending++;
>> + list_add_tail(&req->hr_list, &hn->hn_requests);
>> + return true;
>> +}
>> +
>> +void __remove_pending_locked(struct handshake_net *hn,
>> + struct handshake_req *req)
>> +{
>> + hn->hn_pending--;
>> + list_del_init(&req->hr_list);
>> +}
>> +
>> +/*
>> + * Returns %true if the request was found on @net's pending list,
>> + * otherwise %false.
>> + *
>> + * If @req was on a pending list, it has not yet been accepted.
>> + */
>> +static bool remove_pending(struct handshake_net *hn, struct handshake_req *req)
>> +{
>> + bool ret;
>> +
>> + ret = false;
>> +
>> + spin_lock(&hn->hn_lock);
>> + if (!list_empty(&req->hr_list)) {
>> + __remove_pending_locked(hn, req);
>> + ret = true;
>> + }
>> + spin_unlock(&hn->hn_lock);
>> +
>> + return ret;
>> +}
>> +
>> +/**
>> + * handshake_req_submit - consumer API to submit a handshake request
>> + * @req: handshake arguments
>> + * @flags: memory allocation flags
>> + *
>> + * Return values:
>> + * %0: Request queued
>> + * %-EBUSY: A handshake is already under way for this socket
>> + * %-ESRCH: No handshake agent is available
>> + * %-EAGAIN: Too many pending handshake requests
>> + * %-ENOMEM: Failed to allocate memory
>> + * %-EMSGSIZE: Failed to construct notification message
>> + * %-EOPNOTSUPP: Handshake module not initialized
>> + *
>> + * A zero return value from handshake_request() means that
>> + * exactly one subsequent completion callback is guaranteed.
>> + *
>> + * A negative return value from handshake_request() means that
>> + * no completion callback will be done and that @req has been
>> + * destroyed.
>> + */
>> +int handshake_req_submit(struct handshake_req *req, gfp_t flags)
>> +{
>> + struct sock *sk = req->hr_sk;
>> + struct net *net = sock_net(sk);
>> + struct handshake_net *hn = handshake_pernet(net);
>> + int ret;
>> +
>> + if (!hn)
>> + return -EOPNOTSUPP;
>> +
>> + ret = -EAGAIN;
>> + if (READ_ONCE(hn->hn_pending) >= hn->hn_pending_max)
>> + goto out_err;
>> +
>> + req->hr_odestruct = sk->sk_destruct;
>> + sk->sk_destruct = handshake_sk_destruct;
>> + spin_lock(&hn->hn_lock);
>> + ret = -EOPNOTSUPP;
>> + if (test_bit(HANDSHAKE_F_NET_DRAINING, &hn->hn_flags))
>> + goto out_unlock;
>> + ret = -EBUSY;
>> + if (!handshake_req_hash_add(req))
>> + goto out_unlock;
>> + if (!__add_pending_locked(hn, req))
>> + goto out_unlock;
>> + spin_unlock(&hn->hn_lock);
>> +
>> + ret = handshake_genl_notify(net, req->hr_proto->hp_handler_class,
>> + flags);
>> + if (ret) {
>> + trace_handshake_notify_err(net, req, sk, ret);
>> + if (remove_pending(hn, req))
>> + goto out_err;
>> + }
>> +
>> + trace_handshake_submit(net, req, sk);
>> + return 0;
>> +
>> +out_unlock:
>> + spin_unlock(&hn->hn_lock);
>> +out_err:
>> + trace_handshake_submit_err(net, req, sk, ret);
>> + handshake_req_destroy(req);
>> + return ret;
>> +}
>> +EXPORT_SYMBOL(handshake_req_submit);
>> +
>> +void handshake_complete(struct handshake_req *req, unsigned int status,
>> + struct genl_info *info)
>> +{
>> + struct sock *sk = req->hr_sk;
>> + struct net *net = sock_net(sk);
>> +
>> + if (!test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
>> + trace_handshake_complete(net, req, sk, status);
>> + req->hr_proto->hp_done(req, status, info);
>> + __sock_put(sk);
>> + }
>> +}
>> +
>> +/**
>> + * handshake_req_cancel - consumer API to cancel an in-progress handshake
>> + * @sock: socket on which there is an ongoing handshake
>> + *
>> + * XXX: Perhaps killing the user space agent might also be necessary?
>> + *
>> + * Request cancellation races with request completion. To determine
>> + * who won, callers examine the return value from this function.
>> + *
>> + * Return values:
>> + * %true - Uncompleted handshake request was canceled or not found
>> + * %false - Handshake request already completed
>> + */
>> +bool handshake_req_cancel(struct socket *sock)
>> +{
>> + struct handshake_req *req;
>> + struct handshake_net *hn;
>> + struct sock *sk;
>> + struct net *net;
>> +
>> + sk = sock->sk;
>> + net = sock_net(sk);
>> + req = handshake_req_hash_lookup(sk);
>> + if (!req) {
>> + trace_handshake_cancel_none(net, req, sk);
>> + return true;
>> + }
>> +
>> + hn = handshake_pernet(net);
>> + if (hn && remove_pending(hn, req)) {
>> + /* Request hadn't been accepted */
>> + trace_handshake_cancel(net, req, sk);
>> + return true;
>> + }
>> + if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
>> + /* Request already completed */
>> + trace_handshake_cancel_busy(net, req, sk);
>> + return false;
>> + }
>> +
>> + __sock_put(sk);
>> + trace_handshake_cancel(net, req, sk);
>> + return true;
>> +}
>> +EXPORT_SYMBOL(handshake_req_cancel);
>> diff --git a/net/handshake/trace.c b/net/handshake/trace.c
>> new file mode 100644
>> index 000000000000..1c4d8e27e17a
>> --- /dev/null
>> +++ b/net/handshake/trace.c
>> @@ -0,0 +1,20 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Trace points for transport security layer handshakes.
>> + *
>> + * Author: Chuck Lever <chuck.lever@...cle.com>
>> + *
>> + * Copyright (c) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#include <linux/types.h>
>> +
>> +#include <net/sock.h>
>> +#include <net/netlink.h>
>> +#include <net/genetlink.h>
>> +
>> +#include "handshake.h"
>> +
>> +#define CREATE_TRACE_POINTS
>> +
>> +#include <trace/events/handshake.h>
>>
>>
>>
>
> --
> Jeff Layton <jlayton@...hat.com>
--
Chuck Lever