Message-ID: <YdLpWxwn7WPdvEno@unreal>
Date: Mon, 3 Jan 2022 14:17:31 +0200
From: Leon Romanovsky <leon@...nel.org>
To: Hillf Danton <hdanton@...a.com>
Cc: Jason Gunthorpe <jgg@...dia.com>,
Aharon Landau <aharonl@...dia.com>,
linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org
Subject: Re: [PATCH rdma-next v1 6/7] RDMA/mlx5: Delay the deregistration of
a non-cache mkey
On Sun, Jan 02, 2022 at 11:03:10AM +0800, Hillf Danton wrote:
> On Thu, 30 Dec 2021 13:23:23 +0200
> > From: Aharon Landau <aharonl@...dia.com>
> >
> > When restarting an application with many non-cached mkeys, all the mkeys
> > will be destroyed and then recreated.
> >
> > This process takes a long time (about 20 seconds for deregistration and
> > 28 seconds for registration of 100,000 MRs).
> >
> > To shorten the restart runtime, insert the mkeys temporarily into the
> > cache and schedule a delayed work to destroy them later. If there is no
> > fitting entry to these mkeys, create a temporary entry that fits them.
> >
> > If 30 seconds have passed and no user reclaimed the temporarily cached
> > mkeys, the scheduled work will destroy the mkeys and the temporary
> > entries.
> >
> > When restarting an application, the mkeys will still be in the cache
> > when trying to reg them again, therefore, the registration will be
> > faster (4 seconds for deregistration and 5 seconds for registration of
> > 100,000 MRs).
> >
> > Signed-off-by: Aharon Landau <aharonl@...dia.com>
> > Signed-off-by: Leon Romanovsky <leonro@...dia.com>
> > ---
> > drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 +
> > drivers/infiniband/hw/mlx5/mr.c | 131 ++++++++++++++++++++++++++-
> > 2 files changed, 132 insertions(+), 2 deletions(-)
<...>
> > + if (!ent->is_tmp)
> > + mr->mmkey.cache_ent = ent;
> > + else {
> > + ent->total_mrs--;
> > + cancel_delayed_work(&ent->dev->cache.remove_ent_dwork);
> > + queue_delayed_work(ent->dev->cache.wq,
> > + &ent->dev->cache.remove_ent_dwork,
> > + msecs_to_jiffies(30 * 1000));
> > + }
>
> Nit: collapse cancel and queue into mod_delayed_work().
>
> > }
<...>
> > + INIT_WORK(&ent->work, cache_work_func);
> > + INIT_DELAYED_WORK(&ent->dwork, delayed_cache_work_func);
>
> More important IMHO is to cut work in a separate patch given that dwork can
> be queued with zero delay and both work callbacks are simple wrappers of
> __cache_work_func().
Thanks, I'll collect more feedback and resubmit.
>
> Hillf
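
For readers following the mod_delayed_work() nit above, here is a rough userspace sketch of why the cancel-then-queue pair and a single re-arm call leave a delayed work in the same state. It is a toy model, not kernel code: struct toy_dwork, the toy_* helpers and the toy_now clock are hypothetical names invented for illustration, and the model ignores locking, the workqueue itself and the window between the two separate calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a delayed work item: tracks only pending state and expiry. */
struct toy_dwork {
	bool pending;
	unsigned long expires;	/* absolute tick at which the work would run */
};

static unsigned long toy_now;	/* toy clock, standing in for jiffies */

/* Models cancel_delayed_work(): returns true if a pending work was cancelled. */
static bool toy_cancel(struct toy_dwork *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Models queue_delayed_work(): leaves an already pending work untouched. */
static bool toy_queue(struct toy_dwork *w, unsigned long delay)
{
	if (w->pending)
		return false;
	w->pending = true;
	w->expires = toy_now + delay;
	return true;
}

/* Models mod_delayed_work(): always (re)arms; returns true if it was pending. */
static bool toy_mod(struct toy_dwork *w, unsigned long delay)
{
	bool was_pending = w->pending;

	w->pending = true;
	w->expires = toy_now + delay;
	return was_pending;
}

/* Re-arm two identical pending works both ways and compare the results. */
static bool toy_same_result(unsigned long delay)
{
	struct toy_dwork a = { .pending = true, .expires = toy_now + 1 };
	struct toy_dwork b = a;

	toy_cancel(&a);
	toy_queue(&a, delay);	/* two steps, as in the quoted hunk */
	toy_mod(&b, delay);	/* one step, as the nit suggests */

	return a.pending == b.pending && a.expires == b.expires;
}
```

In the quoted hunk this would presumably mean a single mod_delayed_work() call taking the same workqueue, delayed work and delay arguments that are currently passed to queue_delayed_work().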