lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <D2F7BBDB-B164-4D38-B4F6-AA33F1EE1773@oracle.com>
Date: Fri, 19 May 2023 14:54:11 +0000
From: Chuck Lever III <chuck.lever@...cle.com>
To: Ding Hui <dinghui@...gfor.com.cn>
CC: Jeff Layton <jlayton@...nel.org>,
        Trond Myklebust
	<trond.myklebust@...merspace.com>,
        Anna Schumaker <anna@...nel.org>,
        "David
 S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>, Jakub
 Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Linux NFS
 Mailing List <linux-nfs@...r.kernel.org>,
        "netdev@...r.kernel.org"
	<netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>,
        "stable@...r.kernel.org"
	<stable@...r.kernel.org>
Subject: Re: [PATCH] SUNRPC: Fix UAF in svc_tcp_listen_data_ready()



> On May 15, 2023, at 10:55 AM, Ding Hui <dinghui@...gfor.com.cn> wrote:
> 
> On 2023/5/15 20:01, Chuck Lever III wrote:
>>> On May 14, 2023, at 10:13 PM, Ding Hui <dinghui@...gfor.com.cn> wrote:
>>> 
>>> After the listener svc_sock be freed, and before invoking svc_tcp_accept()
>>> for the established child sock, there is a window that the newsock
>>> retaining a freed listener svc_sock in sk_user_data which cloning from
>>> parent.
>> Thank you, I will apply this (after testing it).
>> The next step is to figure out why SUNRPC is trying to accept
>> on a dead listener. Any thoughts about that?
> 
> A child sock is cloned from the listener sock, it inherits sk_data_ready
> and sk_user_data from its parent sock, which is svc_tcp_listen_data_ready()
> and listener svc_sock, the sk_state of the child becomes ESTABLISHED once
> after TCP handshake in protocol stack.
> 
> Case 1:
> 
> listener sock      | child sock            |   nfsd thread
> =>sk_data_ready    | =>sk_data_ready       |
> -------------------+-----------------------+----------------------
> svc_tcp_listen_data_ready
>  svsk is listener svc_sock
>  set_bit(XPT_CONN)
>                                             svc_recv
>                                               svc_tcp_accept(listener)
>                                                 kernel_accept get child sock as newsock
>                                                 svc_setup_socket(newsock)
>                                                   newsock->sk_data_ready=svc_data_ready
>                                                   newsock->sk_user_data=newsvsk
>                    svc_data_ready
>                      svsk is newsvsk
> 
> 
> Case 2:
> 
> listener sock      | child sock            |   nfsd thread
> =>sk_data_ready    | =>sk_data_ready       |
> -------------------+-----------------------+----------------------
> svc_tcp_listen_data_ready
>  svsk is listener svc_sock
>  set_bit(XPT_CONN)
>                    svc_tcp_listen_data_ready
>                      svsk is listener svc_sock
>                                             svc_recv
>                                               svc_tcp_accept(listener)
>                                                 kernel_accept get the child sock as newsock
>                                                 svc_setup_socket(newsock)
>                                                   newsock->sk_data_ready=svc_data_ready
>                                                   newsock->sk_user_data=newsvsk
>                    svc_data_ready
>                      svsk is newsvsk
> 
> 
> The UAF case:
> 
> listener sock      | child sock            |   rpc.nfsd 0
> =>sk_data_ready    | =>sk_data_ready       |
> -------------------+-----------------------+----------------------
> svc_tcp_listen_data_ready
>  svsk is listener svc_sock
>  set_bit(XPT_CONN)
>                                            svc_xprt_destroy_all
>                                              svc_xprt_free
>                                                kfree listener svc_sock
>                                            // the child sock has not yet been accepted,
>                                            // so it is not managed by SUNRPC for now.
>                    svc_tcp_listen_data_ready
>                      svsk is listener svc_sock
>                      svsk->sk_odata // UAF!

Thanks for the ladder diagrams, that helps.


>>> In the race windows if data is received on the newsock, we will
>>> observe use-after-free report in svc_tcp_listen_data_ready().
>>> 
>>> Reproduce by two tasks:
>>> 
>>> 1. while :; do rpc.nfsd 0 ; rpc.nfsd; done
>>> 2. while :; do echo "" | ncat -4 127.0.0.1 2049 ; done
>> I will continue attempting to reproduce, as I would like a
>> root cause for this issue.

svc_xprt shutdown seems to be unordered. I would think that
it should unregister with the portmapper, close the listener
so no new connections can be established, then close the
children last. That doesn't appear to be how it works, though.

But if the listener happens to be closed first, that frees
the svc_sock that might be relied upon by children sockets.
The shutdown ordering seems to be the crux of the problem,
and pinning the svc_xprt won't help, as you pointed out.

Making the children not rely on the listener does address
the crash, but doesn't fix the design. But the design
problem doesn't seem to be urgent. So I'm going to file a
low priority bug to document the design issue.


>>> KASAN report:
>>> 
>>>  ==================================================================
>>>  BUG: KASAN: slab-use-after-free in svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc]
>>>  Read of size 8 at addr ffff888139d96228 by task nc/102553
>>>  CPU: 7 PID: 102553 Comm: nc Not tainted 6.3.0+ #18
>>>  Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
>>>  Call Trace:
>>>   <IRQ>
>>>   dump_stack_lvl+0x33/0x50
>>>   print_address_description.constprop.0+0x27/0x310
>>>   print_report+0x3e/0x70
>>>   kasan_report+0xae/0xe0
>>>   svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc]
>>>   tcp_data_queue+0x9f4/0x20e0
>>>   tcp_rcv_established+0x666/0x1f60
>>>   tcp_v4_do_rcv+0x51c/0x850
>>>   tcp_v4_rcv+0x23fc/0x2e80
>>>   ip_protocol_deliver_rcu+0x62/0x300
>>>   ip_local_deliver_finish+0x267/0x350
>>>   ip_local_deliver+0x18b/0x2d0
>>>   ip_rcv+0x2fb/0x370
>>>   __netif_receive_skb_one_core+0x166/0x1b0
>>>   process_backlog+0x24c/0x5e0
>>>   __napi_poll+0xa2/0x500
>>>   net_rx_action+0x854/0xc90
>>>   __do_softirq+0x1bb/0x5de
>>>   do_softirq+0xcb/0x100
>>>   </IRQ>
>>>   <TASK>
>>>   ...
>>>   </TASK>
>>> 
>>>  Allocated by task 102371:
>>>   kasan_save_stack+0x1e/0x40
>>>   kasan_set_track+0x21/0x30
>>>   __kasan_kmalloc+0x7b/0x90
>>>   svc_setup_socket+0x52/0x4f0 [sunrpc]
>>>   svc_addsock+0x20d/0x400 [sunrpc]
>>>   __write_ports_addfd+0x209/0x390 [nfsd]
>>>   write_ports+0x239/0x2c0 [nfsd]
>>>   nfsctl_transaction_write+0xac/0x110 [nfsd]
>>>   vfs_write+0x1c3/0xae0
>>>   ksys_write+0xed/0x1c0
>>>   do_syscall_64+0x38/0x90
>>>   entry_SYSCALL_64_after_hwframe+0x72/0xdc
>>> 
>>>  Freed by task 102551:
>>>   kasan_save_stack+0x1e/0x40
>>>   kasan_set_track+0x21/0x30
>>>   kasan_save_free_info+0x2a/0x50
>>>   __kasan_slab_free+0x106/0x190
>>>   __kmem_cache_free+0x133/0x270
>>>   svc_xprt_free+0x1e2/0x350 [sunrpc]
>>>   svc_xprt_destroy_all+0x25a/0x440 [sunrpc]
>>>   nfsd_put+0x125/0x240 [nfsd]
>>>   nfsd_svc+0x2cb/0x3c0 [nfsd]
>>>   write_threads+0x1ac/0x2a0 [nfsd]
>>>   nfsctl_transaction_write+0xac/0x110 [nfsd]
>>>   vfs_write+0x1c3/0xae0
>>>   ksys_write+0xed/0x1c0
>>>   do_syscall_64+0x38/0x90
>>>   entry_SYSCALL_64_after_hwframe+0x72/0xdc
>>> 
>>> Fix the UAF by simply doing nothing in svc_tcp_listen_data_ready()
>>> if state != TCP_LISTEN, that will avoid dereferencing svsk for all
>>> child socket.
>>> 
>>> Link: https://lore.kernel.org/lkml/20230507091131.23540-1-dinghui@sangfor.com.cn/
>>> Fixes: fa9251afc33c ("SUNRPC: Call the default socket callbacks instead of open coding")
>>> Signed-off-by: Ding Hui <dinghui@...gfor.com.cn>
>>> Cc: <stable@...r.kernel.org>
>>> ---
>>> net/sunrpc/svcsock.c | 23 +++++++++++------------
>>> 1 file changed, 11 insertions(+), 12 deletions(-)
>>> 
>>> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
>>> index a51c9b989d58..9aca6e1e78e4 100644
>>> --- a/net/sunrpc/svcsock.c
>>> +++ b/net/sunrpc/svcsock.c
>>> @@ -825,12 +825,6 @@ static void svc_tcp_listen_data_ready(struct sock *sk)
>>> 
>>> trace_sk_data_ready(sk);
>>> 
>>> - if (svsk) {
>>> - /* Refer to svc_setup_socket() for details. */
>>> - rmb();
>>> - svsk->sk_odata(sk);
>>> - }
>>> -
>>> /*
>>> * This callback may called twice when a new connection
>>> * is established as a child socket inherits everything
>>> @@ -839,13 +833,18 @@ static void svc_tcp_listen_data_ready(struct sock *sk)
>>> *    when one of child sockets become ESTABLISHED.
>>> * 2) data_ready method of the child socket may be called
>>> *    when it receives data before the socket is accepted.
>>> - * In case of 2, we should ignore it silently.
>>> + * In case of 2, we should ignore it silently and DO NOT
>>> + * dereference svsk.
>>> */
>>> - if (sk->sk_state == TCP_LISTEN) {
>>> - if (svsk) {
>>> - set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
>>> - svc_xprt_enqueue(&svsk->sk_xprt);
>>> - }
>>> + if (sk->sk_state != TCP_LISTEN)
>>> + return;
>>> +
>>> + if (svsk) {
>>> + /* Refer to svc_setup_socket() for details. */
>>> + rmb();
>>> + svsk->sk_odata(sk);
>>> + set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
>>> + svc_xprt_enqueue(&svsk->sk_xprt);
>>> }
>>> }
>>> 
>>> -- 
>>> 2.17.1
>>> 
>> --
>> Chuck Lever
> 
> -- 
> Thanks,
> -dinghui


--
Chuck Lever



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ