[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <cc33931c-2821-4292-85db-86017de6afeb@huawei.com>
Date: Fri, 8 Nov 2024 21:04:51 +0800
From: "liujian (CE)" <liujian56@...wei.com>
To: Kuniyuki Iwashima <kuniyu@...zon.com>
CC: <Dai.Ngo@...cle.com>, <anna@...nel.org>, <chuck.lever@...cle.com>,
<davem@...emloft.net>, <ebiederm@...ssion.com>, <edumazet@...gle.com>,
<geert+renesas@...der.be>, <jlayton@...nel.org>, <kuba@...nel.org>,
<linux-nfs@...r.kernel.org>, <neilb@...e.de>, <netdev@...r.kernel.org>,
<ofir.gal@...umez.com>, <okorniev@...hat.com>, <pabeni@...hat.com>,
<tom@...pey.com>, <trondmy@...nel.org>
Subject: Re: [PATCH net v2] sunrpc: fix one UAF issue caused by sunrpc kernel
tcp socket
在 2024/11/8 4:59, Kuniyuki Iwashima 写道:
> From: "liujian (CE)" <liujian56@...wei.com>
> Date: Thu, 7 Nov 2024 20:03:40 +0800
>>>> diff --git a/net/socket.c b/net/socket.c
>>>> index 042451f01c65..e64a02445b1a 100644
>>>> --- a/net/socket.c
>>>> +++ b/net/socket.c
>>>> @@ -1651,6 +1651,34 @@ int sock_create_kern(struct net *net, int family, int type, int protocol, struct
>>>> }
>>>> EXPORT_SYMBOL(sock_create_kern);
>>>>
>>>> +int sock_create_kern_getnet(struct net *net, int family, int type, int proto, struct socket **res)
>>>> +{
>>>> + struct sock *sk;
>>>> + int ret;
>>>> +
>>>> + if (!maybe_get_net(net))
>>>> + return -EINVAL;
>>>
>>> Is this really safe ?
>>>
>>> IIUC, maybe_get_net() is safe for a net only when it is fetched under
>>> RCU, then rcu_read_lock() prevents cleanup_net() from reusing the net
>>> by rcu_barrier().
>>>
>>> Otherwise, there should be a small chance that the same slab object is
>>> reused for another netns between fetching the net and reaching here.
>>>
>>> svc_create_socket() is called much later after the netns is fetched,
>>> and _svc_xprt_create() calls try_module_get() before ->xpo_create().
>>> So, it seems the path is not under RCU and maybe_get_net() must be
>>> called much earlier by each call site.
>>>
>>> For this reason, when I write a patch for the same issue in CIFS,
>>> I delayed put_net() to cifsd kthread so that the netns refcnt taken
>>> for each CIFS server info lives until the last __sock_create() attempt
>>> from cifsd.
>>>
>>> https://lore.kernel.org/linux-cifs/20241102212438.76691-1-kuniyu@amazon.com/
>>>
>> Okay, got it. thank you.
>> Looking at the nfs and nfsd processing flow, it seems that the call to
>> __sock_create() to create a TCP socket is always after the mount
>> operation get_net(). So it should be fine to use get_net() directly.
>
> Is there any chance that a concurrent unmount releases the
> last refcount by put_net() while another thread trying to call
> __sock_create() ?
>
> CIFS was the case.
>
>
>> So
>> here I'm going to change may_get_net() to get_net(), move
>> sock_create_kern_getnet() to the sunrpc module, and rename it to
>> something more appropriate. Is that okay?
>
> Could you go without adding a helper and do the conversion in sunrpc
> code as CIFS did ?
>
Ok, I will send v3 as you said.
Looking forward to your changes as described below.
Thank you.
> I plan to resurrect my patch and remove such socket conversion altogether
> in the next cycle after the CIFS fix lands on net-next.
>
> https://lore.kernel.org/netdev/20240227011041.97375-4-kuniyu@amazon.com/
> https://github.com/q2ven/linux/commits/427_2
> https://github.com/q2ven/linux/commit/2e54a8cc84f1e9ce60a0e4693c79a8e74c3dbeb9
>
> I inspected all the callers of __sock_create() and friends, and all
> __sock_create() can be replaced with sock_create_kern(), so I will
> unexport __sock_create() and then add a new parameter hold_net to it.
>
> Then, I'll rename sock_create_kern() to sock_create_net_noref() and add
> a fat comment to catch in-kernel users attention so that they no longer
> use _kern() API blindly without care about netns reference. Also, I'll
> add sock_create_net() and use it for MPTCP, SMC, CIFS, (and sunrpc) etc.
>
> RDS uses maybe_net_get() but I think this is still buggy and I need
> to check more.
Powered by blists - more mailing lists