Message-ID: <4ec6e460-96d1-fedc-96ff-79a98fd38de8@linux.ibm.com>
Date: Wed, 29 Dec 2021 13:56:42 +0100
From: Karsten Graul <kgraul@...ux.ibm.com>
To: Wen Gu <guwen@...ux.alibaba.com>, davem@...emloft.net,
kuba@...nel.org
Cc: linux-s390@...r.kernel.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, dust.li@...ux.alibaba.com,
tonylu@...ux.alibaba.com
Subject: Re: [RFC PATCH net v2 1/2] net/smc: Resolve the race between link
group access and termination
On 28/12/2021 16:13, Wen Gu wrote:
> We encountered some crashes caused by the race between the access
> and the termination of link groups.
While I agree with the problems you found, I am not sure the solution is the right one.
At the moment conn->lgr is checked all over the code as an indication of whether a connection
still has a valid link group. If you change these semantics by leaving conn->lgr set
after the connection has been unregistered from its link group, I expect various new problems
to appear.
For me the right solution would be to use correct locking before conn->lgr is checked and used.
In smc_lgr_unregister_conn() the lgr->conns_lock is already used where conn->lgr is unset (note that
it would be better to have the "conn->lgr = NULL;" line INSIDE the lock in this function).
And in every place in the code where conn->lgr is used, you would take the read_lock while lgr is accessed.
This could solve the problem, using existing mechanisms, right? Opinions?