Message-ID: <90C7DF9C-6860-4317-8D01-C60C718C8257@oracle.com>
Date: Mon, 27 Feb 2023 15:39:44 +0000
From: Chuck Lever III <chuck.lever@...cle.com>
To: Hannes Reinecke <hare@...e.de>
CC: Chuck Lever <cel@...nel.org>, "kuba@...nel.org" <kuba@...nel.org>,
"pabeni@...hat.com" <pabeni@...hat.com>,
"edumazet@...gle.com" <edumazet@...gle.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"kernel-tls-handshake@...ts.linux.dev"
<kernel-tls-handshake@...ts.linux.dev>
Subject: Re: [PATCH v5 1/2] net/handshake: Create a NETLINK service for
handling handshake requests
> On Feb 27, 2023, at 10:14 AM, Hannes Reinecke <hare@...e.de> wrote:
>
> On 2/27/23 15:59, Chuck Lever III wrote:
>>> On Feb 27, 2023, at 4:24 AM, Hannes Reinecke <hare@...e.de> wrote:
>>>
>>> On 2/24/23 20:19, Chuck Lever wrote:
> [ .. ]
>>>> + req = sock->sk->sk_handshake_req;
>>>> + if (!req) {
>>>> + err = -EBUSY;
>>>> + goto out_status;
>>>> + }
>>>> +
>>>> + trace_handshake_cmd_done(net, req, sock, fd);
>>>> +
>>>> + status = -EIO;
>>>> + if (tb[HANDSHAKE_A_DONE_STATUS])
>>>> + status = nla_get_u32(tb[HANDSHAKE_A_DONE_STATUS]);
>>>> +
>>> And this makes me ever so slightly uneasy.
>>>
>>> As 'status' is a netlink attribute it's inevitably defined as 'unsigned'.
>>> Yet we assume that 'status' is a negative number, leaving us _technically_ in uncharted territory.
>> Ah, that's an oversight.
>>> And that is notwithstanding the problem that we haven't even defined _what_ should be in the status attribute.
>> It's now an errno value.
>>> Reading the code I assume that it's either '0' for success or a negative number (ie the error code) on failure.
>>> Which implicitly means that we _never_ set a positive number here.
>>> So what would we lose if we declare 'status' to carry the _positive_ error number instead?
>>> It would bring us in line with the actual netlink attribute definition, and we wouldn't need
>>> to worry about possible integer overflows, yadda yadda...
>>>
>>> Hmm?
>> It can also be argued that errnos in user space are positive-valued;
>> therefore, this user-space-visible protocol should use a positive
>> errno.
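To illustrate what I mean, the DONE handler could convert at the
boundary. A rough sketch (not what's in v5), assuming the attribute
carries a positive errno value:

	status = 0;
	if (tb[HANDSHAKE_A_DONE_STATUS]) {
		/* Sketch: HANDSHAKE_A_DONE_STATUS is an unsigned netlink
		 * attribute carrying a positive errno; convert it to the
		 * kernel's negative-errno convention for internal use. */
		u32 err_val = nla_get_u32(tb[HANDSHAKE_A_DONE_STATUS]);

		status = err_val ? -((int)err_val) : 0;
	}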
> Thanks.
>
> [ .. ]
>>>> +
>>>> +/**
>>>> + * handshake_req_cancel - consumer API to cancel an in-progress handshake
>>>> + * @sock: socket on which there is an ongoing handshake
>>>> + *
>>>> + * XXX: Perhaps killing the user space agent might also be necessary?
>>>
>>> I thought we had agreed that we would be sending a signal to the userspace process?
>> We had discussed killing the handler, but I don't think it's necessary.
>> I'd rather not do something that drastic unless we have no other choice.
>> So far my testing hasn't shown a need for killing the child process.
>> I'm also concerned that the kernel could reuse the handler's process ID.
>> handshake_req_cancel would kill something that is not a handshake agent.
> Hmm? If that were the case, wouldn't we be sending the netlink message to the
> wrong process, too?
Notifications go to anyone who is listening for handshake requests
and contain nothing but the handler class number ("who is to respond
to this notification?"). It is up to those processes to send an ACCEPT
to the kernel, and then later a DONE.
So... listeners have to register to get notifications, and the
registration goes away as soon as the netlink socket is closed. That
is what the long-lived parent tlshd process does.
After notification, the handshake is driven entirely by the handshake
agent (the tlshd child process). The kernel is not otherwise sending
unsolicited netlink messages to anyone.
If you're concerned about the response messages that the kernel
sends back to the handshake agent... any new process would have to
have a netlink socket open, resolved to the HANDSHAKE family, and
it would have to recognize the message sequence ID in the response
message. Very very unlikely that all that would happen.
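For illustration, the agent side of ACCEPT looks roughly like this.
This is a libnl sketch; the family name and the HANDSHAKE_* constants
are taken from this series' uAPI header and may still change:

#include <errno.h>
#include <netlink/netlink.h>
#include <netlink/genl/genl.h>
#include <netlink/genl/ctrl.h>

static int send_accept(struct nl_sock *sk, int family_id,
		       unsigned int handler_class)
{
	struct nl_msg *msg;
	int err;

	msg = nlmsg_alloc();
	if (!msg)
		return -ENOMEM;

	/* ACCEPT claims the pending handshake request for this handler
	 * class; the kernel's reply carries the socket fd and the
	 * handshake parameters. */
	if (!genlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, family_id, 0, 0,
			 HANDSHAKE_CMD_ACCEPT, 1)) {
		nlmsg_free(msg);
		return -EMSGSIZE;
	}
	err = nla_put_u32(msg, HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
			  handler_class);
	if (!err)
		err = nl_send_auto(sk, msg);

	nlmsg_free(msg);
	return err < 0 ? err : 0;
}

The family_id would come from genl_ctrl_resolve(sk, "handshake"), and
the agent later sends a DONE carrying the status and the same socket fd.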
> And in the absence of any timeout handler: what do we do if userspace is stuck / doesn't make forward progress?
> At some point TCP will time out, and the client will close the connection.
> Leaving us with (potentially) broken / stuck processes. Surely we would need to initiate some cleanup here, no?
I'm not sure. Test and see.
In my experience, one peer or the other closes the socket, and the
other follows suit. The handshake agent hits an error when it tries
to use the socket, and exits.
>>> Ideally we would be sending a SIGHUP, wait for some time on the userspace
>>> process to respond with a 'done' message, and send a 'KILL' signal if we
>>> haven't received one.
>>>
>>> Obs: Sending a KILL signal would imply that userspace is able to cope with
>>> children dying. Which pretty much excludes pthreads, I would think.
>>>
>>> Guess I'll have to consult Stevens :-)
>> Basically what cancel does is atomically disarm the "done" callback.
>> The socket belongs to the kernel, so it will live until the kernel is
>> good and through with it.
> Oh, the socket does. But the process handling the socket is not.
> So even if we close the socket from the kernel there's no guarantee that userspace will react to it.
If the kernel finishes first (ie, cancels and closes the socket,
as it is supposed to) the user space endpoint is dead. I don't
think it matters what the handshake agent does at that point,
although if this happens frequently, it might amount to a
resource leak.
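For reference, the consumer side is meant to look something like this
(untested sketch using the cancel API as posted in this series):

	/* The upper layer gives up waiting for the handshake to
	 * complete. If cancel wins the race, the done callback has been
	 * disarmed and the consumer tears the connection down itself;
	 * otherwise the completion already ran and owns the outcome. */
	if (handshake_req_cancel(sock) == 0)
		kernel_sock_shutdown(sock, SHUT_RDWR);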
> Problem here is with using different key materials.
> As the current handshake can only deal with one key at a time the only chance we have for several possible keys is to retry the handshake with the next key.
> But out of necessity we have to use the _same_ connection (as tlshd doesn't control the socket). So we cannot close the socket, and hence we can't notify userspace to give up the handshake attempt.
> Being able to send a signal would be simple; sending SIGHUP to userspace, and wait for the 'done' call.
> If it doesn't come we can terminate all attempts.
> But if we get the 'done' call we know it's safe to start with the next attempt.
We solve this problem by enabling the kernel to provide all those
materials to tlshd in one go.
I don't think there's a "retry" situation here. Once the handshake
has failed, the client peer has to know to try again. That would
mean retrying would have to be part of the upper layer protocol.
Does an NVMe initiator know it has to drive another handshake if
the first one fails, or does it rely on the handshake itself to
try all available identities?
We don't have a choice but to provide all the keys at once and
let the handshake negotiation deal with it.
I'm working on DONE passing multiple remote peer IDs back to the
kernel now. I don't see why ACCEPT couldn't pass multiple peer IDs
the other way.
Note that currently the handshake upcall mechanism supports only
one handshake per socket lifetime, as the handshake_req is
released by the socket's sk_destruct callback.
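The shape of the DONE parsing would be roughly as follows (sketch only;
the remote peer ID attribute is a placeholder name, since that patch
isn't posted yet):

	struct nlattr *nla;
	u32 peerids[5];		/* arbitrary bound, for the sketch */
	int rem, i = 0;

	/* Netlink allows user space to repeat an attribute type, so
	 * walk the whole message and collect every instance of the
	 * (hypothetical) remote peer ID attribute. */
	nlmsg_for_each_attr(nla, info->nlhdr, GENL_HDRLEN, rem) {
		if (nla_type(nla) != HANDSHAKE_A_DONE_REMOTE_PEERID)
			continue;
		if (i == ARRAY_SIZE(peerids))
			break;
		peerids[i++] = nla_get_u32(nla);
	}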
>>>> + *
>>>> + * Request cancellation races with request completion. To determine
>>>> + * who won, callers examine the return value from this function.
>>>> + *
>>>> + * Return values:
>>>> + * %0 - Uncompleted handshake request was canceled or not found
>>>> + * %-EBUSY - Handshake request already completed
>>>
>>> EBUSY? Wouldn't EAGAIN be more appropriate?
>> I don't think EAGAIN would be appropriate at all. The situation
>> is that the handshake completed, so there's no need to call cancel
>> again. Its synonym, EWOULDBLOCK, is also not a good semantic fit.
>>> After all, the request is everything _but_ busy...
>> I'm open to suggestion.
>> One option is to use a boolean return value instead of an errno.
> Yeah, that's probably better.
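Something like this, then (untested sketch of the boolean variant and
a caller):

	/* Returns true if the request was disarmed before it completed,
	 * false if the completion got there first. */
	bool handshake_req_cancel(struct socket *sock);

	if (!handshake_req_cancel(sock)) {
		/* Too late: the done callback already ran and owns
		 * the outcome. */
		return;
	}
	/* Cancel won the race; the caller may clean up the socket. */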
>
> BTW: thanks for the tracepoints!
>
> Cheers,
>
> Hannes
>
--
Chuck Lever