[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <D06D70DC-D053-4212-B72C-82C1BB1AF9F2@oracle.com>
Date: Tue, 28 Feb 2023 14:28:50 +0000
From: Chuck Lever III <chuck.lever@...cle.com>
To: Hannes Reinecke <hare@...e.de>
CC: Chuck Lever <cel@...nel.org>, "kuba@...nel.org" <kuba@...nel.org>,
"pabeni@...hat.com" <pabeni@...hat.com>,
"edumazet@...gle.com" <edumazet@...gle.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"kernel-tls-handshake@...ts.linux.dev"
<kernel-tls-handshake@...ts.linux.dev>
Subject: Re: [PATCH v5 1/2] net/handshake: Create a NETLINK service for
handling handshake requests
> On Feb 28, 2023, at 1:58 AM, Hannes Reinecke <hare@...e.de> wrote:
>
> On 2/27/23 19:10, Chuck Lever III wrote:
>>> On Feb 27, 2023, at 12:21 PM, Hannes Reinecke <hare@...e.de> wrote:
>>>
>>>> On 2/27/23 16:39, Chuck Lever III wrote:
>>>>> On Feb 27, 2023, at 10:14 AM, Hannes Reinecke <hare@...e.de> wrote:
>>>>>
>>>>> Problem here is with using different key materials.
>>>>> As the current handshake can only deal with one key at a time
>>>>> the only chance we have for several possible keys is to retry
>>>>> the handshake with the next key.
>>>>> But out of necessity we have to use the _same_ connection
>>>>> (as tlshd doesn't control the socket). So we cannot close
>>>>> the socket, and hence we can't notify userspace to give up the handshake attempt.
>>>>> Being able to send a signal would be simple; sending SIGHUP to userspace, and wait for the 'done' call.
>>>>> If it doesn't come we can terminate all attempts.
>>>>> But if we get the 'done' call we know it's safe to start with the next attempt.
>>>> We solve this problem by enabling the kernel to provide all those
>>>> materials to tlshd in one go.
>>> Ah. Right, that would work, too; provide all possible keys to the
>>> 'accept' call and let the userspace agent figure out what to do with
>>> them. That makes life certainly easier for the kernel side.
>>>
>>>> I don't think there's a "retry" situation here. Once the handshake
>>>> has failed, the client peer has to know to try again. That would
>>>> mean retrying would have to be part of the upper layer protocol.
>>>> Does an NVMe initiator know it has to drive another handshake if
>>>> the first one fails, or does it rely on the handshake itself to
>>>> try all available identities?
>>>> We don't have a choice but to provide all the keys at once and
>>>> let the handshake negotiation deal with it.
>>>> I'm working on DONE passing multiple remote peer IDs back to the
>>>> kernel now. I don't see why ACCEPT couldn't pass multiple peer IDs
>>>> the other way.
>>> Nope. That's not required.
>>> DONE can only ever have one peer id (TLS 1.3 specifies that the client
>>> sends a list of identities, the server picks one, and sends that one back
>>> to the client). So for DONE we will only ever have 1 peer ID.
>>> If we allow for several peer IDs to be present in the client ACCEPT message
>>> then we'd need to include the resulting peer ID in the client DONE, too;
>>> otherwise we'll need it for the server DONE only.
>>>
>>> So all in all I think we should be going with the multiple IDs in the
>>> ACCEPT call (ie move the key id from being part of the message into an
>>> attribute), and have a peer id present in the DONE all for both versions,
>>> server and client.
>> To summarize:
>> ---
>> The ACCEPT request (from tlshd) would have just the handler class
>> "Which handler is responding". The kernel uses that to find a
>> handshake request waiting for that type of handler. In our case,
>> "tlshd".
>> The ACCEPT response (from the kernel) would have the socket fd,
>> the handshake parameters, and zero or more peer ID key serial
>> numbers. (Today, just zero or one peer IDs).
>> > There is also an errno status in the ACCEPT response, which
>> the kernel can use to indicate things like "no requests in that
>> class were found" or that the request was otherwise improperly
>> formed.
>> ---
>> The DONE request (from tlshd) would have the socket fd (and
>> implicitly, the handler's PID), the session status, and zero
>> or one remote peer ID key serial numbers.
>> > The DONE response (from the kernel) is an ACK. (Today it's
>> more than that, but that's broken and will be removed).
>> ---
>> For the DONE request, the session status is one of:
>> 0: session established -- see @peerid for authentication status
>> EIO: local error
>> EACCES: handshake rejected
>> For server handshake completion:
>> @peerid contains the remote peer ID if the session was
>> authenticated, or TLS_NO_PEERID if the session was not
>> authenticated.
>> status == EACCES if authentication material was present from
>> both peers but verification failed.
>> For client handshake completion:
>> @peerid contains the remote peer ID if authentication was
>> requested and the session was authenticated
>> status == EACCES if authentication was requested and the
>> session was not authenticated, or if verification failed.
>> (Maybe client could work like the server side, and the
>> kernel consumer would need to figure out if it cares
>> whether there was authentication).
> Yes, that would be my preference. Always return @peerid
> for DONE if the TLS session was established.
You mean if the TLS session was authenticated. The server
won't receive a remote peer identity if the client peer
doesn't authenticate.
> We might also consider returning @peerid with EACCESS
> to indicate the offending ID.
I'll look into that.
>> Is that adequate?
> Yes, it is.
What about the narrow set of DONE status values? You've
recently wanted to add ENOMEM, ENOKEY, and EINVAL to
this set. My experience is that these status values are
nearly always obscured before they can get back to the
requesting user.
Can the kernel make use of ENOMEM, for example? It might
be able to retry, I suppose... retrying is not sensible
for the server side.
> So the only bone of contention is the timeout; as we won't
> be implementing signals I still think that we should have
> a 'timeout' attribute. And if only to feed the TLS timeout
> parameter for gnutls ...
I'm still not seeing the case for making it an individual
parameter for each handshake request. Maybe a config
parameter, if a short timeout is actually needed... even
then, maybe a built-in timeout is preferable to yet another
tuning knob that can be abused.
I'd like to see some testing results to determine that a
short timeout is the only way to handle corner cases.
--
Chuck Lever
Powered by blists - more mailing lists