Message-ID: <1BD8AD98-0775-4E65-ABC5-23A83AC98D4B@oracle.com>
Date: Wed, 22 Mar 2023 13:35:07 +0000
From: Chuck Lever III <chuck.lever@...cle.com>
To: Paolo Abeni <pabeni@...hat.com>
CC: Chuck Lever <cel@...nel.org>, Jakub Kicinski <kuba@...nel.org>,
Eric Dumazet <edumazet@...gle.com>,
"open list:NETWORKING [GENERAL]" <netdev@...r.kernel.org>,
"kernel-tls-handshake@...ts.linux.dev"
<kernel-tls-handshake@...ts.linux.dev>,
John Haxby <john.haxby@...cle.com>
Subject: Re: [PATCH v7 1/2] net/handshake: Create a NETLINK service for handling handshake requests

> On Mar 22, 2023, at 5:03 AM, Paolo Abeni <pabeni@...hat.com> wrote:
>
> On Tue, 2023-03-21 at 13:58 +0000, Chuck Lever III wrote:
>>
>>> On Mar 21, 2023, at 7:27 AM, Paolo Abeni <pabeni@...hat.com> wrote:
>>>
>>> On Sat, 2023-03-18 at 12:18 -0400, Chuck Lever wrote:
>>>> +/**
>>>> + * handshake_req_alloc - consumer API to allocate a request
>>>> + * @sock: open socket on which to perform a handshake
>>>> + * @proto: security protocol
>>>> + * @flags: memory allocation flags
>>>> + *
>>>> + * Returns an initialized handshake_req or NULL.
>>>> + */
>>>> +struct handshake_req *handshake_req_alloc(struct socket *sock,
>>>> + const struct handshake_proto *proto,
>>>> + gfp_t flags)
>>>> +{
>>>> + struct sock *sk = sock->sk;
>>>> + struct net *net = sock_net(sk);
>>>> + struct handshake_net *hn = handshake_pernet(net);
>>>> + struct handshake_req *req;
>>>> +
>>>> + if (!hn)
>>>> + return NULL;
>>>> +
>>>> + req = kzalloc(struct_size(req, hr_priv, proto->hp_privsize), flags);
>>>> + if (!req)
>>>> + return NULL;
>>>> +
>>>> + sock_hold(sk);
>>>
>>> The hr_sk reference counting is unclear to me. It looks like
>>> handshake_req retains a reference to that socket, but
>>> handshake_req_destroy()/handshake_sk_destruct() do not release it.
>>
>> If we rely on sk_destruct to release the final reference count,
>> it will never get invoked.
>>
>>
>>> Perhaps it is better to move that sock_hold() into handshake_req_submit(),
>>> once the request has succeeded?
>>
>> I will do that.
>>
>> Personally, I find it clearer to bump a reference count when
>> saving a copy of the object's pointer, as is done in _alloc. But if
>> others find it easier the other way, I have no problem changing
>> it to suit community preferences.
>
> I made the above suggestion because it looks like the sk reference is
> not released in the handshake_req_submit() error path, but anything
> addressing that would be good enough for me.

Indeed, that was a bug. I've fixed it by moving the sock_hold()
into handshake_req_submit(), as discussed.

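A minimal userspace sketch of that arrangement, assuming the reference
is taken only after the last check in submit that can fail. The ref_*
names, the queue-depth check, and the -EAGAIN value are hypothetical
stand-ins for illustration, not code from the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for struct sock and struct handshake_req. */
struct ref_sock {
	int refcnt;
};

struct ref_req {
	struct ref_sock *sk;	/* saved at allocation time, no reference taken yet */
	bool submitted;
};

static void ref_sock_hold(struct ref_sock *sk)
{
	sk->refcnt++;
}

static void ref_sock_put(struct ref_sock *sk)
{
	assert(sk->refcnt > 0);
	sk->refcnt--;
}

/* Allocation only records the pointer. */
static void ref_req_init(struct ref_req *req, struct ref_sock *sk)
{
	req->sk = sk;
	req->submitted = false;
}

/*
 * Submit takes the reference only after the last check that can fail,
 * so the error path has nothing to undo. The queue-depth check and the
 * -EAGAIN value are illustrative assumptions.
 */
static int ref_req_submit(struct ref_req *req, int pending, int pending_max)
{
	if (pending >= pending_max)
		return -EAGAIN;

	ref_sock_hold(req->sk);
	req->submitted = true;
	return 0;
}

/* Completion or a successful cancel drops the reference taken by submit. */
static void ref_req_done(struct ref_req *req)
{
	if (!req->submitted)
		return;
	req->submitted = false;
	ref_sock_put(req->sk);
}
```

With this ordering a failed submit leaves the count untouched, and every
successful submit is balanced by exactly one put.
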
> [...]
>
>>>
>>>> +/**
>>>> + * handshake_req_cancel - consumer API to cancel an in-progress handshake
>>>> + * @sock: socket on which there is an ongoing handshake
>>>> + *
>>>> + * XXX: Perhaps killing the user space agent might also be necessary?
>>>> + *
>>>> + * Request cancellation races with request completion. To determine
>>>> + * who won, callers examine the return value from this function.
>>>> + *
>>>> + * Return values:
>>>> + * %true - Uncompleted handshake request was canceled or not found
>>>> + * %false - Handshake request already completed
>>>> + */
>>>> +bool handshake_req_cancel(struct socket *sock)
>>>> +{
>>>> + struct handshake_req *req;
>>>> + struct handshake_net *hn;
>>>> + struct sock *sk;
>>>> + struct net *net;
>>>> +
>>>> + sk = sock->sk;
>>>> + net = sock_net(sk);
>>>> + req = handshake_req_hash_lookup(sk);
>>>> + if (!req) {
>>>> + trace_handshake_cancel_none(net, req, sk);
>>>> + return true;
>>>> + }
>>>> +
>>>> + hn = handshake_pernet(net);
>>>> + if (hn && remove_pending(hn, req)) {
>>>> + /* Request hadn't been accepted */
>>>> + trace_handshake_cancel(net, req, sk);
>>>> + return true;
>>>> + }
>>>> + if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
>>>> + /* Request already completed */
>>>> + trace_handshake_cancel_busy(net, req, sk);
>>>> + return false;
>>>> + }
>>>> +
>>>> + __sock_put(sk);
>>>
>>> Same here.
>>
>> I'll move the sock_hold() to _submit, and cook up a comment or two.
>
> In such comments please also explain why sock_put() is not needed here
> (and above). e.g. who is retaining the extra sk ref.

Presumably the API consumer still holds its own reference at that
point, but perhaps these __sock_put() call sites should be replaced
with sock_put().

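The distinction matters because __sock_put() only decrements and assumes
another reference remains, while sock_put() releases the socket when the
count reaches zero. A rough userspace sketch of the two variants combined
with the test-and-set race between cancel and completion; the toy_* names
are hypothetical and this only models the behavior, it is not the patch:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct sock. */
struct toy_sock {
	atomic_int refcnt;
	bool freed;
};

/* Like __sock_put(): drop a reference that must not be the last one. */
static void toy_sock_put_nofree(struct toy_sock *sk)
{
	int old = atomic_fetch_sub(&sk->refcnt, 1);

	assert(old > 1);	/* the caller is assumed to hold its own reference */
	(void)old;
}

/* Like sock_put(): drop a reference and release the object on the last one. */
static void toy_sock_put(struct toy_sock *sk)
{
	if (atomic_fetch_sub(&sk->refcnt, 1) == 1)
		sk->freed = true;
}

struct toy_req {
	struct toy_sock *sk;
	atomic_bool completed;	/* models HANDSHAKE_F_REQ_COMPLETED */
};

/* Returns true if the cancel won the race, false if completion got there first. */
static bool toy_req_cancel(struct toy_req *req)
{
	/* atomic_exchange() returns the previous value, like test_and_set_bit() */
	if (atomic_exchange(&req->completed, true))
		return false;
	toy_sock_put(req->sk);	/* drop the reference taken at submit time */
	return true;
}

/* Returns true if this call performed the completion. */
static bool toy_req_complete(struct toy_req *req)
{
	if (atomic_exchange(&req->completed, true))
		return false;
	toy_sock_put(req->sk);
	return true;
}
```

Whichever side flips the flag first drops the request's reference, so the
put happens exactly once. Using the freeing variant stays correct even if
the consumer has already dropped its own reference.
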
>>> Side note, I think at this point some tests could surface here? If
>>> user-space-based self-tests are too cumbersome and/or do not offer
>>> adequate coverage perhaps you could consider using kunit?
>>
>> I'm comfortable with KUnit, having just added a bunch of tests
>> for the kernel's SunRPC GSS Kerberos implementation.
>>
>> There, however, I had clearly defined test cases to add, thanks
>> to the RFCs. I guess I'm a little unclear on what specific tests
>> would be necessary or valuable here. Suggestions and existing
>> examples are very welcome.
>
> I *think* that a good start would be exercising the expected code
> paths:
>
> handshake_req_alloc, handshake_req_submit, handshake_complete
> handshake_req_alloc, handshake_req_submit, handshake_cancel
> or even
> tls_*_hello_*, tls_handshake_accept, tls_handshake_done
> tls_*_hello_*, tls_handshake_accept, tls_handshake_cancel

These aren't user APIs, so I'm not sure this kind of testing is
especially valuable. I'm thinking maybe the netlink
operations would be a better thing to unit-test, and that
might be better done with user space tests...?

> plus explicitly triggering some error paths, e.g.
>
> hn_pending_max+1 consecutive submit with no accept
> handshake_cancel after handshake_complete
> multiple handshake_complete on the same req
> multiple handshake_cancel on the same req

OK. I'm wondering if a user agent needs to be running
for these, in which case running KUnit in its standalone
mode (i.e., under UML) might not work at all.
Just thinking out loud... KUnit might not be the right
tool for this job after all.

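To make the four error paths concrete, here is a small userspace model
of them. It is a sketch under assumptions, not KUnit and not the patch's
code: the q_* names, Q_PENDING_MAX, and the -EAGAIN value are
hypothetical, and the cancel semantics follow only the kernel-doc
contract quoted earlier (true for "canceled or not found", false for
"already completed"). It says nothing about whether a user agent would
be needed for the real thing.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define Q_PENDING_MAX 4	/* hypothetical stand-in for hn_pending_max */

enum q_state { Q_IDLE, Q_PENDING, Q_DONE };

struct q_net {
	int pending;
};

struct q_req {
	enum q_state state;
};

/* Queue a request unless the pending list is full. */
static int q_submit(struct q_net *hn, struct q_req *req)
{
	if (hn->pending >= Q_PENDING_MAX)
		return -EAGAIN;	/* error code chosen for illustration */
	hn->pending++;
	req->state = Q_PENDING;
	return 0;
}

/* Returns true only for the call that actually completes the request. */
static bool q_complete(struct q_net *hn, struct q_req *req)
{
	if (req->state != Q_PENDING)
		return false;
	assert(hn->pending > 0);
	hn->pending--;
	req->state = Q_DONE;
	return true;
}

/* true: canceled or not found; false: already completed. */
static bool q_cancel(struct q_net *hn, struct q_req *req)
{
	if (req->state == Q_DONE)
		return false;
	if (req->state == Q_PENDING) {
		hn->pending--;
		req->state = Q_IDLE;
	}
	return true;
}

/* Counts a failure instead of aborting, loosely like a KUnit expectation. */
#define Q_EXPECT(cond) do { if (!(cond)) failures++; } while (0)

/* Walks the four error paths from the quoted list; returns the failure count. */
static int q_run_error_paths(void)
{
	struct q_net hn = { .pending = 0 };
	struct q_req reqs[Q_PENDING_MAX + 1] = { { Q_IDLE } };
	int failures = 0;
	int i;

	/* hn_pending_max + 1 consecutive submits with no accept */
	for (i = 0; i < Q_PENDING_MAX; i++)
		Q_EXPECT(q_submit(&hn, &reqs[i]) == 0);
	Q_EXPECT(q_submit(&hn, &reqs[Q_PENDING_MAX]) == -EAGAIN);

	/* cancel after complete */
	Q_EXPECT(q_complete(&hn, &reqs[0]));
	Q_EXPECT(!q_cancel(&hn, &reqs[0]));

	/* multiple completes on the same request */
	Q_EXPECT(!q_complete(&hn, &reqs[0]));

	/* multiple cancels on the same request */
	Q_EXPECT(q_cancel(&hn, &reqs[1]));
	Q_EXPECT(q_cancel(&hn, &reqs[1]));

	/* one completed and one canceled request have left the queue */
	Q_EXPECT(hn.pending == Q_PENDING_MAX - 2);

	return failures;
}
```

Calling q_run_error_paths() from a test harness and checking for a zero
return exercises all four cases; the same expectations could be ported
to whichever framework ends up fitting.
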
--
Chuck Lever