[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALx6S36ewno1msS3cLhtjQNBz0rEXsFadCQunbjPuOo6goC3tg@mail.gmail.com>
Date: Thu, 18 Jan 2018 09:46:56 -0800
From: Tom Herbert <tom@...bertland.com>
To: James Chapman <jchapman@...alix.com>
Cc: Guillaume Nault <g.nault@...halink.fr>,
David Miller <davem@...emloft.net>,
Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next] kcm: do not attach sockets if sk_user_data is
already used
On Thu, Jan 18, 2018 at 7:40 AM, James Chapman <jchapman@...alix.com> wrote:
> On 18 January 2018 at 15:18, Guillaume Nault <g.nault@...halink.fr> wrote:
>> On Wed, Jan 17, 2018 at 02:25:38PM -0500, David Miller wrote:
>>> From: James Chapman <jchapman@...alix.com>
>>> Date: Wed, 17 Jan 2018 11:13:33 +0000
>>>
>>> > On 16 January 2018 at 19:00, David Miller <davem@...emloft.net> wrote:
>>> >> From: Tom Herbert <tom@...bertland.com>
>>> >> Date: Tue, 16 Jan 2018 09:36:41 -0800
>>> >>
>>> >>> sk_user_data is set with the sk_callback lock held in code below.
>>> >>> Should be able to take the lock earlier can do this check under the
>>> >>> lock.
>>> >>
>>> >> csock, and this csk, is obtained from an arbitrary one of the
>>> >> process's FDs. It can be any socket type or family, and that socket's
>>> >> family might set sk_user_data without the callback lock.
>>> >>
>>> >> The only socket type check is making sure it is not another PF_KCM
>>> >> socket. So that doesn't help with this problem.
>>> >
>>> > Is it the intention to update all socket code over time to write
>>> > sk_user_data within the sk_callback lock? If so, I'm happy to address
>>> > that in the l2tp code (and update the kcm patch to check sk_user_data
>>> > within the sk_callback lock). Or is the preferred solution to restrict
>>> > KCM to specific socket families, as suggested by Guillaume earlier in
>>> > the thread?
>>>
>>> I think we have a more fundamental issue here.
>>>
>>> sk->sk_user_data is a place where RPC layer specific data is hung off
>>> of. By this definition SunRPC, RXRPC, RDS, TIPC, and KCM are all
>>> using it correctly.
>>>
>>> Phonet has a similar issue to the one seen here, it tests and changes
>>> sk_user_data under lock_sock(). The only requirement it makes is
>>> that the socket type is not SOCK_STREAM. However, this one might be OK
>>> since only pep_sock sockets can be passed down into gprs_attach().
>>>
>> But, if I read it correctly, that doesn't prevent it from being passed
>> to kcm_attach() later on, which will overwrite sk_user_data (unless we
>> update the locking scheme and refuse to overwrite sk_user_data in a
>> race-free way).
>>
>> BTW couldn't the gprs_dev pointer be embedded in struct pep_sock?
>> This way pep_sk(sk)->gp could be used instead of sk->sk_user_data.
>> That'd probably be a violation of the phonet's layering, as that'd
>> tie gprs_dev to pep sockets. OTOH, only pep sockets can currently be
>> attached to gprs_dev, so in practice that might be a reasonable
>> compromise.
>>
>>> Most of these cases like SunRPC, RXRPC, etc. are fine because they
>>> only graft on top of TCP and UDP sockets.
>>>
>>> The weird situation here is that L2TP does tunneling and stores it's
>>> private state in sk->sk_user_data like an RPC layer would. And KCM
>>> allows basically any socket type to be attached.
>>>
>>> The RPC layers create their sockets internally, so I cannot see a way
>>> that those can be sent to a KCM attach operations. And I think that
>>> is why this RPC invariant is important for sk_user_data usage.
>>>
>> SunRPC seems to possibly set sk_user_data on user sockets: svc_addsock()
>> gets a socket using sockfd_lookup() then passes it to svc_setup_socket()
>> which in turn sets sk_user_data. I don't know anything about SunRPC, so
>> I might very well have missed important details, but I believe such a
>> socket could be passed to KCM which could lead to the same kind of
>> issues as for L2TP. Other RPCs look safe to me.
>>
>>> If all else was equal, even though it doesn't make much sense to KCM
>>> attach L2TP sockets to KCM, I would suggest to change L2TP to store
>>> it's private stuff elsewhere.
>>>
>>> But that is not the case. Anything using the generic UDP
>>> encapsulation layer is going to make use of sk->sk_user_data like this
>>> (see setup_udp_tunnel_sock).
>>>
>> Most UDP encapsulations only use kernel sockets though. It seems that
>> only L2TP and GTP use setup_udp_tunnel_sock() with userpsace sockets.
>> So it might be feasible to restrict usage of sk_user_data to kernel
>> sockets only.
>>
>> For L2TP, we probably can adapt l2tp_sock_to_tunnel() so that it does
>> a lookup in a hashtable indexed by the socket pointer, rather than
>> dereferencing sk_user_data. That doesn't look very satisfying to me,
>> but that's the only way I found so far.
>
> L2TP needs a way to get at its local data from the socket in the data path.
>
>> We also have another user of sk_user_data in l2tp_ppp, but since it
>> uses its own socket type, I guess we could simply embed the pointer in
>> its parent structure.
>>
>>> It looks like over time we've accumulated this new class of uses
>>> of sk->sk_user_data, ho hum...
>>>
>>> And it's not like we can add a test to KCM to avoid these socket
>>> types, because they will look like normal UDP datagram sockets.
>>>
>>> What a mess...
>>>
>>> Furthermore, even if you add a test to KCM, you will now need to
>>> add the same test to L2TP and anything else which uses sk_user_data
>>> for tunneling and for which userspace has access to the socket fd.
>>>
>>> And it will be racy, indeed, until all such users align to the same
>>> precise locking scheme for tests and updates to sk_user_data.
>>>
>>> Again, what a mess...
>>>
>> So, if I understand correctly, we can either restrict sk_user_data to
>> kernel sockets so that KCM couldn't act on them (but then why would we
>> make an exception for KCM and allow it to set sk_user_data on
>> non-kernel sockets?).
>> Or we could agree on a locking scheme for sk_user_data and update all
>> users so that they'd fail instead of overwriting it when it's not NULL.
>>
>> Assuming my understanding is correct, do you have any preference for
>> fixing this issue? Or any other ideas?
>
> Could we add a new pointer, say, encap_user_data to struct udp_sock
> and use it instead of sk_user_data for UDP-encap sockets?
Then that's increasing the udp_sock structure size for a narrow use
case which will get push back. I think it's going to be better to
stick with one sock pointer. We could maybe redefine sk_user_data as a
pointer to an allocated structure or array so it can hold multiple
user_data pointers (in lieu of chaining).
Tom
Powered by blists - more mailing lists