[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <745d3174-f497-4d6a-ba13-1074128ad99d@linux.ibm.com>
Date: Thu, 12 Oct 2023 13:51:54 +0200
From: Wenjia Zhang <wenjia@...ux.ibm.com>
To: "D. Wythe" <alibuda@...ux.alibaba.com>, kgraul@...ux.ibm.com,
jaka@...ux.ibm.com, wintera@...ux.ibm.com
Cc: kuba@...nel.org, davem@...emloft.net, netdev@...r.kernel.org,
linux-s390@...r.kernel.org, linux-rdma@...r.kernel.org
Subject: Re: [PATCH net 1/5] net/smc: fix dangling sock under state
SMC_APPFINCLOSEWAIT
On 12.10.23 04:37, D. Wythe wrote:
>
>
> On 10/12/23 4:31 AM, Wenjia Zhang wrote:
>>
>>
>> On 11.10.23 09:33, D. Wythe wrote:
>>> From: "D. Wythe" <alibuda@...ux.alibaba.com>
>>>
>>> Considering scenario:
>>>
>>> smc_cdc_rx_handler_rwwi
>>> __smc_release
>>> sock_set_flag
>>> smc_close_active()
>>> sock_set_flag
>>>
>>> __set_bit(DEAD) __set_bit(DONE)
>>>
>>> Dues to __set_bit is not atomic, the DEAD or DONE might be lost.
>>> if the DEAD flag lost, the state SMC_CLOSED will be never be reached
>>> in smc_close_passive_work:
>>>
>>> if (sock_flag(sk, SOCK_DEAD) &&
>>> smc_close_sent_any_close(conn)) {
>>> sk->sk_state = SMC_CLOSED;
>>> } else {
>>> /* just shutdown, but not yet closed locally */
>>> sk->sk_state = SMC_APPFINCLOSEWAIT;
>>> }
>>>
>>> Replace sock_set_flags or __set_bit to set_bit will fix this problem.
>>> Since set_bit is atomic.
>>>
>> I didn't really understand the scenario. What is
>> smc_cdc_rx_handler_rwwi()? What does it do? Don't it get the lock
>> during the runtime?
>>
>
> Hi Wenjia,
>
> Sorry for that, It is not smc_cdc_rx_handler_rwwi() but
> smc_cdc_rx_handler();
>
> Following is a more specific description of the issues
>
>
> lock_sock()
> __smc_release
>
> smc_cdc_rx_handler()
> smc_cdc_msg_recv()
> bh_lock_sock()
> smc_cdc_msg_recv_action()
> sock_set_flag(DONE) sock_set_flag(DEAD)
> __set_bit __set_bit
> bh_unlock_sock()
> release_sock()
>
>
>
> Note : |bh_lock_sock|and |lock_sock|are not mutually exclusive. They are
> actually used for different purposes and contexts.
>
>
ok, that's true that |bh_lock_sock|and |lock_sock|are not really
mutually exclusive. However, since bh_lock_sock() is used, this scenario
you described above should not happen, because that gets the
sk_lock.slock. Following this scenarios, IMO, only the following
situation can happen.
lock_sock()
__smc_release
smc_cdc_rx_handler()
smc_cdc_msg_recv()
bh_lock_sock()
smc_cdc_msg_recv_action()
sock_set_flag(DONE)
bh_unlock_sock()
sock_set_flag(DEAD)
release_sock()
>
>>> Signed-off-by: D. Wythe <alibuda@...ux.alibaba.com>
>>> ---
>>> net/smc/af_smc.c | 4 ++--
>>> net/smc/smc.h | 5 +++++
>>> net/smc/smc_cdc.c | 2 +-
>>> net/smc/smc_close.c | 2 +-
>>> 4 files changed, 9 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
>>> index bacdd97..5ad2a9f 100644
>>> --- a/net/smc/af_smc.c
>>> +++ b/net/smc/af_smc.c
>>> @@ -275,7 +275,7 @@ static int __smc_release(struct smc_sock *smc)
>>> if (!smc->use_fallback) {
>>> rc = smc_close_active(smc);
>>> - sock_set_flag(sk, SOCK_DEAD);
>>> + smc_sock_set_flag(sk, SOCK_DEAD);
>>> sk->sk_shutdown |= SHUTDOWN_MASK;
>>> } else {
>>> if (sk->sk_state != SMC_CLOSED) {
>>> @@ -1742,7 +1742,7 @@ static int smc_clcsock_accept(struct smc_sock
>>> *lsmc, struct smc_sock **new_smc)
>>> if (new_clcsock)
>>> sock_release(new_clcsock);
>>> new_sk->sk_state = SMC_CLOSED;
>>> - sock_set_flag(new_sk, SOCK_DEAD);
>>> + smc_sock_set_flag(new_sk, SOCK_DEAD);
>>> sock_put(new_sk); /* final */
>>> *new_smc = NULL;
>>> goto out;
>>> diff --git a/net/smc/smc.h b/net/smc/smc.h
>>> index 24745fd..e377980 100644
>>> --- a/net/smc/smc.h
>>> +++ b/net/smc/smc.h
>>> @@ -377,4 +377,9 @@ void smc_fill_gid_list(struct smc_link_group *lgr,
>>> int smc_nl_enable_hs_limitation(struct sk_buff *skb, struct
>>> genl_info *info);
>>> int smc_nl_disable_hs_limitation(struct sk_buff *skb, struct
>>> genl_info *info);
>>> +static inline void smc_sock_set_flag(struct sock *sk, enum
>>> sock_flags flag)
>>> +{
>>> + set_bit(flag, &sk->sk_flags);
>>> +}
>>> +
>>> #endif /* __SMC_H */
>>> diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c
>>> index 89105e9..01bdb79 100644
>>> --- a/net/smc/smc_cdc.c
>>> +++ b/net/smc/smc_cdc.c
>>> @@ -385,7 +385,7 @@ static void smc_cdc_msg_recv_action(struct
>>> smc_sock *smc,
>>> smc->sk.sk_shutdown |= RCV_SHUTDOWN;
>>> if (smc->clcsock && smc->clcsock->sk)
>>> smc->clcsock->sk->sk_shutdown |= RCV_SHUTDOWN;
>>> - sock_set_flag(&smc->sk, SOCK_DONE);
>>> + smc_sock_set_flag(&smc->sk, SOCK_DONE);
>>> sock_hold(&smc->sk); /* sock_put in close_work */
>>> if (!queue_work(smc_close_wq, &conn->close_work))
>>> sock_put(&smc->sk);
>>> diff --git a/net/smc/smc_close.c b/net/smc/smc_close.c
>>> index dbdf03e..449ef45 100644
>>> --- a/net/smc/smc_close.c
>>> +++ b/net/smc/smc_close.c
>>> @@ -173,7 +173,7 @@ void smc_close_active_abort(struct smc_sock *smc)
>>> break;
>>> }
>>> - sock_set_flag(sk, SOCK_DEAD);
>>> + smc_sock_set_flag(sk, SOCK_DEAD);
>>> sk->sk_state_change(sk);
>>> if (release_clcsock) {
>
Powered by blists - more mailing lists