lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 5 Oct 2023 20:14:37 +0200
From: Wenjia Zhang <wenjia@...ux.ibm.com>
To: "D. Wythe" <alibuda@...ux.alibaba.com>,
        Alexandra Winter <wintera@...ux.ibm.com>
Cc: jaka@...ux.ibm.com, kgraul@...ux.ibm.com, kuba@...nel.org,
        davem@...emloft.net, netdev@...r.kernel.org,
        linux-s390@...r.kernel.org, linux-rdma@...r.kernel.org
Subject: Re: [PATCH net] net/smc: fix panic smc_tcp_syn_recv_sock() while
 closing listen socket



On 26.09.23 11:06, D. Wythe wrote:
> 
> 
> On 9/26/23 3:18 PM, Alexandra Winter wrote:
>>
>> On 26.09.23 05:00, D. Wythe wrote:
>>> You are right. The key point is how to ensure the valid of smc sock 
>>> during the life time of clc sock, If so, READ_ONCE is good
>>> enough. Unfortunately, I found  that there are no such guarantee, so 
>>> it's still a life-time problem.
>> Did you discover a scenario, where clc sock could live longer than smc 
>> sock?
>> Wouldn't that be a dangerous scenario in itself? I still have some 
>> hope that the lifetime of an smc socket is by design longer
>> than that of the corresponding tcp socket.
> 
> 
> Hi Alexandra,
> 
> Yes there is. Considering scenario:
> 
> tcp_v4_rcv(skb)
> 
> /* req sock */
> reqsk = _inet_lookup_skb(skb)
> 
> /* listen sock */
> sk = reqsk(reqsk)->rsk_listener;
> sock_hold(sk);
> tcp_check_req(sk)
> 
> 
>                                                  smc_release /* release 
> smc listen sock */
>                                                  __smc_release
> smc_close_active()         /*  smc_sk->sk_state = SMC_CLOSED; */
>                                                      if 
> (smc_sk->sk_state == SMC_CLOSED)
> smc_clcsock_release();
> sock_release(clcsk);        /* close clcsock */
>      sock_put(sk);              /* might not  the final refcnt */
> 
> sock_put(smc_sk)    /* might be the final refcnt of smc_sock  */
> 
> syn_recv_sock(sk...)
> /* might be the final refcnt of tcp listen sock */
> sock_put(sk);
> 
> Fortunately, this scenario only affects smc_syn_recv_sock and 
> smc_hs_congested, as other callbacks already have locks to protect smc,
> which can guarantee that the sk_user_data is either NULL (set in 
> smc_close_active) or valid under the lock.
> I'm kind of confused with this scenario. How could the 
smc_clcsock_release()->sock_release(clcsk) happen?
Because the syn_recv_sock happens short prior to accept(), that means 
that the &smc->tcp_listen_work is already triggered but the real 
accept() is still not happening. At this moment, the incoming connection 
is being added into the accept queue. Thus, if the sk->sk_state is 
changed from SMC_LISTEN to SMC_CLOSED in smc_close_active(), there is 
still "flush_work(&smc->tcp_listen_work);" after that. That ensures the 
smc_clcsock_release() should not happen, if smc_clcsock_accept() is not 
finished. Do you think that the execution of the &smc->tcp_listen_work 
is already done? Or am I missing something?

>> Considering the const, maybe
>>> we need to do :
>>>
>>> 1. hold a refcnt of smc_sock for syn_recv_sock to keep smc sock valid 
>>> during life time of clc sock
>>> 2. put the refcnt of smc_sock in sk_destruct in tcp_sock to release 
>>> the very smc sock .
>>>
>>> In that way, we can always make sure the valid of smc sock during the 
>>> life time of clc sock. Then we can use READ_ONCE rather
>>> than lock.  What do you think ?
>> I am not sure I fully understand the details what you propose to do. 
>> And it is not only syn_recv_sock(), right?
>> You need to consider all relations between smc socks and tcp socks; 
>> fallback to tcp, initial creation, children of listen sockets, 
>> variants of shutdown, ... Preferrably a single simple mechanism covers 
>> all situations. Maybe there is such a mechanism already today?
>> (I don't think clcsock->sk->sk_user_data or sk_callback_lock provide 
>> this general coverage)
>> If we really have a gap, a general refcnt'ing on smc sock could be a 
>> solution, but needs to be designed carefully.
> 
> You are right , we need designed it with care, we will try the 
> referenced solutions internally first, and I will also send some RFCs so 
> that everyone can track the latest progress
> and make it can be all agreed.
>> Many thanks to you and the team to help make smc more stable and robust.
> 
> Our pleasure 😁.  The stability of smc is important to us too.
> 
> Best wishes,
> D. Wythe
> 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ