[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <67fbe385-3682-be4e-15fe-f26cc56fd56b@gmail.com>
Date: Mon, 13 Mar 2023 14:43:43 -0500
From: Bob Pearson <rpearsonhpe@...il.com>
To: Joel Fernandes <joel@...lfernandes.org>,
Uladzislau Rezki <urezki@...il.com>
Cc: Zhu Yanjun <zyjzyj2000@...il.com>, Jason Gunthorpe <jgg@...pe.ca>,
Leon Romanovsky <leon@...nel.org>, linux-rdma@...r.kernel.org,
Jason Gunthorpe <jgg@...dia.com>,
LKML <linux-kernel@...r.kernel.org>, RCU <rcu@...r.kernel.org>,
"Paul E . McKenney" <paulmck@...nel.org>,
Oleksiy Avramchenko <oleksiy.avramchenko@...y.com>,
Jens Axboe <axboe@...nel.dk>,
Philipp Reisner <philipp.reisner@...bit.com>,
Bryan Tan <bryantan@...are.com>,
Steven Rostedt <rostedt@...dmis.org>,
Eric Dumazet <edumazet@...gle.com>,
Ariel Levkovich <lariel@...dia.com>,
Theodore Ts'o <tytso@....edu>, Julian Anastasov <ja@....bg>
Subject: Re: [PATCH 07/13] RDMA/rxe: Rename kfree_rcu() to
kfree_rcu_mightsleep()
On 3/9/23 18:55, Joel Fernandes wrote:
> On Thu, Mar 09, 2023 at 03:13:08PM +0100, Uladzislau Rezki wrote:
>>> On Wed, Feb 01, 2023 at 04:08:13PM +0100, Uladzislau Rezki (Sony) wrote:
>>>> The kfree_rcu()'s single argument name is deprecated therefore
>>>> rename it to kfree_rcu_mightsleep() variant. The goal is explicitly
>>>> underline that it is for sleepable contexts.
>>>>
>>>> Please check the RXE driver in a way that a single argument can
>>>> be used. Briefly looking at it and rcu_head should be embed to
>>>> free an obj over RCU-core. The context might be atomic.
>>>>
>>>> Cc: Bob Pearson <rpearsonhpe@...il.com>
>>>> Cc: Jason Gunthorpe <jgg@...dia.com>
>>>> Signed-off-by: Uladzislau Rezki (Sony) <urezki@...il.com>
>>>> ---
>>>> drivers/infiniband/sw/rxe/rxe_pool.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>> Could you please add you reviwed-by or Acked-by tags so we can bring
>>> our series with renaming for the next merge window?
>>>
>>> Thanks!
>>>
>> __rxe_cleanup() can be called in two contexts, sleepable and not.
>> Therefore usage of a single argument of the kvfree_rcu() is not correct
>> here.
>>
>> Could you please fix and check your driver? If my above statement
>> is not correct, please provide Acked-by or Reviwed-by tags to the
>> path that is in question.
>>
>> Otherwise please add an rcu_head in your data to free objects over
>> kvfree_rcu() using double argument API.
>>
>> Could you please support?
>
> Also this one needs renaming? It came in because of the commit in 6.3-rc1:
> 72a03627443d ("RDMA/rxe: Remove rxe_alloc()")
>
> It could be squashed into this patch itself since it is infiniband related.
>
> Paul noticed that this breaks dropping the old API on -next, so it is
> blocking the renaming.
>
> ---8<-----------------------
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
> index b10aa1580a64..ae3a100e18fb 100644
> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> @@ -731,7 +731,7 @@ int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
> return -EINVAL;
>
> rxe_cleanup(mr);
> - kfree_rcu(mr);
> + kfree_rcu_mightsleep(mr);
> return 0;
> }
>
I just got back from a 1 week vacation and missed all this.
The "RDMA/rxe: Remove rxe_alloc()" patch just moved the memory allocation for MR (verbs) objects outside
of the rxe_pool code since it only applied to MRs and not the other verbs objects (AH, QP, CQ, ...).
That code has to handle a unique situation for AH objects which can be created or destroyed by connection
manager code in atomic context while all the other ones including MRs are always created/destroyed in process
context. All objects other than MR's are created/destroyed in the rdma-core code (drivers/infiniband/core).
The rxe driver keeps xarray's of pointers to the various objects which are protected by rcu locking and so
it made sense to use kfree_rcu to delete the object with a delay. In the MR case ..._mightsleep seems harmless
and should not be an issue.
However on reflection, all the references to the MR objects are ref counted and they have been dropped before
reaching the kfree and so there really never was a good reason to use kfree_rcu in the first place. So
a better solution would be to replace kfree_rcu with kfree. There is a timeout in completion_done() that
triggers a WARN_ON() and this is only seen if the driver is broken for some reason but that is equivalent to
getting a seg fault so no reason to further delay the kfree.
Reviewed-by: Bob Pearson <rpearsonhpe@...il.com>
Powered by blists - more mailing lists