Message-ID: <CAEz=LcsoDbGmTmVmGyPbcoFjahyf-ruuddjbSXE2W5EH9KDtmA@mail.gmail.com>
Date: Wed, 1 Nov 2023 08:58:27 +0800
From: Greg Sword <gregsword0@...il.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: Zhu Yanjun <yanjun.zhu@...ux.dev>,
"Zhijian Li (Fujitsu)" <lizhijian@...itsu.com>,
"zyjzyj2000@...il.com" <zyjzyj2000@...il.com>,
"leon@...nel.org" <leon@...nel.org>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"rpearsonhpe@...il.com" <rpearsonhpe@...il.com>,
"Daisuke Matsuda (Fujitsu)" <matsuda-daisuke@...itsu.com>,
"bvanassche@....org" <bvanassche@....org>
Subject: Re: [PATCH RFC 1/2] RDMA/rxe: don't allow registering !PAGE_SIZE mr
On Tue, Oct 31, 2023 at 9:19 PM Jason Gunthorpe <jgg@...pe.ca> wrote:
>
> On Tue, Oct 31, 2023 at 04:52:23PM +0800, Zhu Yanjun wrote:
> > 在 2023/10/30 20:40, Jason Gunthorpe 写道:
> > > On Mon, Oct 30, 2023 at 07:51:41AM +0000, Zhijian Li (Fujitsu) wrote:
> > > >
> > > >
> > > > On 27/10/2023 13:41, Li Zhijian wrote:
> > > > > mr->page_list only stores *page without the page offset, so when
> > > > > page_size != PAGE_SIZE the address cannot be restored correctly
> > > > > because the page_offset ends up wrong.
> > > > >
> > > > > Note that this patch will break some ULPs that try to register a 4K
> > > > > MR when PAGE_SIZE is not 4K.
> > > > > SRP and nvme over RXE are known to be impacted.
> > > > >
> > > > > Signed-off-by: Li Zhijian <lizhijian@...itsu.com>
> > > > > ---
> > > > > drivers/infiniband/sw/rxe/rxe_mr.c | 6 ++++++
> > > > > 1 file changed, 6 insertions(+)
> > > > >
> > > > > diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
> > > > > index f54042e9aeb2..61a136ea1d91 100644
> > > > > --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> > > > > +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> > > > > @@ -234,6 +234,12 @@ int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sgl,
> > > > >  	struct rxe_mr *mr = to_rmr(ibmr);
> > > > >  	unsigned int page_size = mr_page_size(mr);
> > > > > +	if (page_size != PAGE_SIZE) {
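
For context, roughly how rxe rebuilds a kernel address from mr->page_list
today (a paraphrased sketch, not the exact rxe source; variable names are
assumptions) -- this is where the offset is lost when page_size < PAGE_SIZE:

	/* Only the struct page is kept in mr->page_list.  With a 4K MR
	 * page size on a 64K PAGE_SIZE kernel, the bits selecting the 4K
	 * chunk inside the 64K page are gone, so page_offset is applied
	 * to the wrong base.
	 */
	index = (iova >> mr->page_shift) - (mr->ibmr.iova >> mr->page_shift);
	page = xa_load(&mr->page_list, index);
	page_offset = iova & (mr_page_size(mr) - 1);
	vaddr = page_address(page) + page_offset; /* wrong if page_size < PAGE_SIZE */
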
> > > >
> > > > It seems this condition is too strict; it should be:
> > > > if (!IS_ALIGNED(page_size, PAGE_SIZE))
> > > >
> > > > That way, a page_size of (N * PAGE_SIZE) can keep working as before,
> > > > because the offset (mr.iova & page_mask) is only lost when !IS_ALIGNED(page_size, PAGE_SIZE).
> > >
> > > That makes sense
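
A minimal sketch of that relaxed check in rxe_map_mr_sg() (the error
helper and return value are assumptions, not taken from the posted patch):

	/* Allow any MR page size that is a whole multiple of PAGE_SIZE;
	 * reject only sizes (e.g. 4K on a 64K-page kernel) whose in-page
	 * offset cannot be represented by the struct page in page_list.
	 */
	if (!IS_ALIGNED(page_size, PAGE_SIZE)) {
		rxe_dbg_mr(mr, "unsupported mr page size %u\n", page_size);
		return -EINVAL;
	}
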
> >
> > I read all the discussions very carefully.
> >
> > Thanks, Greg.
> >
> > RXE only supports PAGE_SIZE: when CONFIG_ARM64_64K_PAGES is enabled,
> > PAGE_SIZE is 64K; when CONFIG_ARM64_64K_PAGES is disabled, PAGE_SIZE is
> > 4K.
> >
> > But NVMe calls ib_map_mr_sg with a fixed page size of SZ_4K, which stays
> > 4K even when CONFIG_ARM64_64K_PAGES is enabled. So this is not a problem
> > in RXE; the problem is in NVMe.
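
For illustration, the NVMe-side call being described looks roughly like
this (paraphrased from the nvme-rdma host driver; variable names are
approximate) -- the ULP always asks for a 4K MR page size regardless of
the kernel's PAGE_SIZE:

	/* SZ_4K is passed as the MR page size even on a
	 * CONFIG_ARM64_64K_PAGES kernel, which rxe currently cannot
	 * represent in its page_list.
	 */
	nr = ib_map_mr_sg(req->mr, sgl, count, NULL, SZ_4K);
	if (nr < count)
		return nr < 0 ? nr : -EINVAL;
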
>
> Maybe, but there are no real RDMA devices that don't support 4K.
>
> The xarray conversion may need revision to use physical addresses
> instead of storing struct pages so it can handle this kind of
> segmentation.
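
A rough sketch of that direction (an assumption about what such a revision
could look like, not an actual patch): rxe_set_page() would store the
physical address as an xarray value instead of the struct page:

	static int rxe_set_page(struct ib_mr *ibmr, u64 dma_addr)
	{
		struct rxe_mr *mr = to_rmr(ibmr);
		/* rxe's "dma" addresses are kernel virtual addresses; keep the
		 * physical address so the offset of a sub-PAGE_SIZE MR page
		 * survives, instead of keeping only the struct page.
		 */
		phys_addr_t pa = virt_to_phys((void *)(unsigned long)dma_addr);
		int err;

		if (unlikely(mr->nbuf == mr->num_buf))
			return -ENOMEM;

		err = xa_err(xa_store(&mr->page_list, mr->nbuf,
				      xa_mk_value(pa), GFP_KERNEL));
		if (err)
			return err;

		mr->nbuf++;
		return 0;
	}
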
This problem cannot be fixed until rxe supports multiple page sizes,
including a 4K page size.
For now it is not fixed; this is an intermediate step.
>
> Certainly in the meantime it should be rejected.
>
> Jason