Message-ID: <YgXKsNIdJIgEhEkd@TonyMac-Alibaba>
Date: Fri, 11 Feb 2022 10:32:16 +0800
From: Tony Lu <tonylu@...ux.alibaba.com>
To: Wen Gu <guwen@...ux.alibaba.com>
Cc: kgraul@...ux.ibm.com, davem@...emloft.net, kuba@...nel.org,
linux-s390@...r.kernel.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH net] net/smc: Avoid overwriting the copies of clcsock
callback functions
On Thu, Feb 10, 2022 at 04:56:00PM +0800, Wen Gu wrote:
>
>
> On 2022/2/10 10:50 am, Tony Lu wrote:
>
> > I am wondering whether there is a potential race. If ->use_fallback is
> > set to true while the rest of the replacement is still in progress,
> > others who tested and passed the ->use_fallback check would get the old
> > values from before the replacement.
> >
>
> Thanks for your comments.
>
> I understand your concern. But when I went through all the places that
> check smc->use_fallback, I did not find an exact potential race
> point. Please point it out if I missed something. Thank you.
>
> In my humble opinion, most of the operations after the smc->use_fallback
> check have no direct relationship with what is done in
> smc_switch_to_fallback() (the replacement of the clcsock callback
> functions), except for those in smc_sendmsg(), smc_recvmsg() and
> smc_sendpage():
>
> smc_sendmsg():
>
>     if (smc->use_fallback) {
>         rc = smc->clcsock->ops->sendmsg(smc->clcsock, msg, len);
>     }
>
> smc_recvmsg():
>
>     if (smc->use_fallback) {
>         rc = smc->clcsock->ops->recvmsg(smc->clcsock, msg, len, flags);
>     }
>
> smc_sendpage():
>
>     if (smc->use_fallback) {
>         rc = kernel_sendpage(smc->clcsock, page, offset,
>                              size, flags);
>     }
>
> If smc->use_fallback is set to true, but the callback functions
> (sk_data_ready ...) of clcsock have not been replaced yet at that moment,
> there may be a race as you described.
>
> But it won't happen, because fallback must already be done before sending
> and receiving.
>
> What do you think about it?
>
I am concerned about the non-blocking work in the workqueue. If we can
make sure the order of fallback is deterministic, it would be safe. From
your analysis, I think it is safe for now.
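To make the ordering concern concrete, here is a minimal user-space sketch
in C11. It is not SMC code. It assumes a hypothetical struct fake_conn whose
send pointer stands in for the replaced callbacks, and it publishes
use_fallback with a release store only after the replacement is finished, so
a reader that passes the acquire check cannot observe the old pointer. The
names fake_conn, publish_fallback, do_send, smc_send and tcp_send are made
up for illustration; the kernel code relies on its own locking and may order
these steps differently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*send_fn)(int len);

/* Hypothetical stand-in for a connection; not a kernel structure. */
struct fake_conn {
	send_fn send;             /* stands in for the replaced callback */
	atomic_bool use_fallback; /* published last, read first */
};

/* Two distinguishable paths so a caller can tell which one ran. */
static int smc_send(int len) { return len; }
static int tcp_send(int len) { return -len; }

/* Writer: finish the replacement, then publish the flag. */
static void publish_fallback(struct fake_conn *c)
{
	c->send = tcp_send;
	atomic_store_explicit(&c->use_fallback, true, memory_order_release);
}

/* Reader: an acquire load that sees true also sees the new pointer. */
static int do_send(struct fake_conn *c, int len)
{
	if (atomic_load_explicit(&c->use_fallback, memory_order_acquire)) {
		assert(c->send != NULL);
		return c->send(len);
	}
	return smc_send(len);
}
```

With this ordering, a caller that initializes send to smc_send gets the
smc_send() result before publish_fallback() runs and the tcp_send() result
afterwards; there is no window in which the flag is true but the pointer is
still the old one.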
Let's get back to the patch: the original version of switch_to_fallback()
has implicit reentrant semantics. This fix should work, thanks.
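A minimal user-space sketch of why entering the switch twice matters,
assuming hypothetical fake_sock and fake_smc types in place of the real
kernel structures. If the switch ran a second time without a guard, the
saved copy of the original callback would be overwritten with the
replacement function, which would then forward to itself. The early return
below is only one way to avoid that and is not necessarily what the patch
does; every name except use_fallback, clcsock and sk_data_ready is an
assumption made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real struct sock / struct smc_sock. */
struct fake_sock {
	void (*sk_data_ready)(struct fake_sock *sk);
	void *user_data;  /* back-pointer to the owning fake_smc */
	int ready_calls;  /* counts calls that reached the original */
};

struct fake_smc {
	struct fake_sock *clcsock;
	bool use_fallback;
	/* Saved copy of the original clcsock callback. */
	void (*clcsk_data_ready)(struct fake_sock *sk);
};

/* The original callback that must stay reachable after the switch. */
static void orig_data_ready(struct fake_sock *sk)
{
	sk->ready_calls++;
}

/* Replacement callback: forwards to the saved original. */
static void smc_fback_data_ready(struct fake_sock *sk)
{
	struct fake_smc *smc = sk->user_data;

	/* If the saved copy had been overwritten, this would recurse. */
	assert(smc->clcsk_data_ready != smc_fback_data_ready);
	smc->clcsk_data_ready(sk);
}

/* Safe to call more than once: only the first call saves and replaces. */
static void switch_to_fallback(struct fake_smc *smc)
{
	if (smc->use_fallback)
		return;
	smc->use_fallback = true;
	smc->clcsk_data_ready = smc->clcsock->sk_data_ready;
	smc->clcsock->sk_data_ready = smc_fback_data_ready;
}
```

Calling switch_to_fallback() twice on the same fake_smc leaves
clcsk_data_ready pointing at orig_data_ready, so a later sk_data_ready()
call still reaches the original callback exactly once.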
Thanks for your detailed investigation.
Best regards,
Tony Lu