Message-ID: <BL1PR21MB3283790E8270ED6C639AAB0DD6D19@BL1PR21MB3283.namprd21.prod.outlook.com>
Date: Wed, 18 May 2022 05:59:00 +0000
From: Ajay Sharma <sharmaajay@...rosoft.com>
To: Jason Gunthorpe <jgg@...pe.ca>, Long Li <longli@...rosoft.com>
CC: KY Srinivasan <kys@...rosoft.com>,
Haiyang Zhang <haiyangz@...rosoft.com>,
Stephen Hemminger <sthemmin@...rosoft.com>,
Wei Liu <wei.liu@...nel.org>, Dexuan Cui <decui@...rosoft.com>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Leon Romanovsky <leon@...nel.org>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
Ajay Sharma <sharmaajay@...rosoft.com>
Subject: RE: [EXTERNAL] Re: [PATCH 05/12] net: mana: Set the DMA device max
page size
Thanks Long.
Hello Jason,
I am the author of the patch.
To your comment below:
"As I've already said, you are supposed to set the value that limits to ib_sge and *NOT* the value that is related to ib_umem_find_best_pgsz. It is usually 2G because the ib_sge's typically work on a 32 bit length."
The ib_sge is limited by __sg_alloc_table_from_pages(), which uses ib_dma_max_seg_size(); that value is the one the eth driver sets using dma_set_max_seg_size(). Currently our hw does not support PTEs larger than 2M.
ib_umem_find_best_pgsz() takes a page-size bitmap, PAGE_SZ_BM, as an input. The bitmap has a bit set for each page size supported by the HW:
#define PAGE_SZ_BM (SZ_4K | SZ_8K | SZ_16K | SZ_32K | SZ_64K | SZ_128K \
| SZ_256K | SZ_512K | SZ_1M | SZ_2M)
Are you suggesting that we are too restrictive in the bitmap we are passing, or that we should not set this bitmap and instead let the function choose a default?
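As a rough illustration of the interaction described above, here is a minimal user-space sketch. It is a simplified model and not the kernel's ib_umem_find_best_pgsz() logic: model_best_pgsz() and its parameters are hypothetical names, and it assumes a single contiguous DMA segment whose length is capped by the DMA max segment size. It picks the largest size in the bitmap that does not exceed the segment length and for which the DMA address and the iova share the same offset within the page.

```c
#include <assert.h>
#include <stdint.h>

/* Page sizes from the bitmap quoted above (same values as the kernel's SZ_* constants). */
#define SZ_4K   0x00001000ULL
#define SZ_8K   0x00002000ULL
#define SZ_16K  0x00004000ULL
#define SZ_32K  0x00008000ULL
#define SZ_64K  0x00010000ULL
#define SZ_128K 0x00020000ULL
#define SZ_256K 0x00040000ULL
#define SZ_512K 0x00080000ULL
#define SZ_1M   0x00100000ULL
#define SZ_2M   0x00200000ULL

#define PAGE_SZ_BM (SZ_4K | SZ_8K | SZ_16K | SZ_32K | SZ_64K | SZ_128K \
                    | SZ_256K | SZ_512K | SZ_1M | SZ_2M)

/* Sanity check on the bitmap itself: nothing smaller than 4K is advertised. */
static_assert((PAGE_SZ_BM & (SZ_4K - 1)) == 0, "bitmap has no size below 4K");

/*
 * Hypothetical, simplified model (NOT the kernel implementation).
 * Returns the largest page size in pgsz_bitmap that
 *   - is no larger than seg_len (the capped segment length), and
 *   - leaves dma_addr and iova with the same offset within the page.
 * Returns 0 if no size in the bitmap qualifies.
 */
uint64_t model_best_pgsz(uint64_t pgsz_bitmap, uint64_t dma_addr,
                         uint64_t iova, uint64_t seg_len)
{
    uint64_t best = 0;

    for (int bit = 0; bit < 64; bit++) {
        uint64_t sz = 1ULL << bit;

        if (!(pgsz_bitmap & sz))
            continue;           /* size not supported by the HW */
        if (sz > seg_len)
            break;              /* larger sizes cannot fit either */
        if ((dma_addr ^ iova) & (sz - 1))
            break;              /* offsets differ; larger sizes fail too */
        best = sz;
    }
    return best;
}
```

In this model, a 2M-aligned buffer with a 64K segment cap yields SZ_64K, while a cap of 2M or more yields SZ_2M, because PAGE_SZ_BM has no larger bit set. Under these assumptions, a larger max segment size (for example 2G) would not change the result; the bitmap alone bounds the page size.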
Regards,
Ajay
-----Original Message-----
From: Jason Gunthorpe <jgg@...pe.ca>
Sent: Tuesday, May 17, 2022 5:04 PM
To: Long Li <longli@...rosoft.com>
Cc: Ajay Sharma <sharmaajay@...rosoft.com>; KY Srinivasan <kys@...rosoft.com>; Haiyang Zhang <haiyangz@...rosoft.com>; Stephen Hemminger <sthemmin@...rosoft.com>; Wei Liu <wei.liu@...nel.org>; Dexuan Cui <decui@...rosoft.com>; David S. Miller <davem@...emloft.net>; Jakub Kicinski <kuba@...nel.org>; Paolo Abeni <pabeni@...hat.com>; Leon Romanovsky <leon@...nel.org>; linux-hyperv@...r.kernel.org; netdev@...r.kernel.org; linux-kernel@...r.kernel.org; linux-rdma@...r.kernel.org
Subject: [EXTERNAL] Re: [PATCH 05/12] net: mana: Set the DMA device max page size
On Tue, May 17, 2022 at 08:04:58PM +0000, Long Li wrote:
> > Subject: Re: [PATCH 05/12] net: mana: Set the DMA device max page
> > size
> >
> > On Tue, May 17, 2022 at 07:32:51PM +0000, Long Li wrote:
> > > > Subject: Re: [PATCH 05/12] net: mana: Set the DMA device max
> > > > page size
> > > >
> > > > On Tue, May 17, 2022 at 02:04:29AM -0700, longli@...uxonhyperv.com wrote:
> > > > > From: Long Li <longli@...rosoft.com>
> > > > >
> > > > > The system chooses default 64K page size if the device does
> > > > > not specify the max page size the device can handle for DMA.
> > > > > This do not work well when device is registering large chunk
> > > > > of memory in that a large page size is more efficient.
> > > > >
> > > > > Set it to the maximum hardware supported page size.
> > > >
> > > > For RDMA devices this should be set to the largest segment size
> > > > an ib_sge can take in when posting work. It should not be the
> > > > page size of MR. 2M is a weird number for that, are you sure it is right?
> > >
> > > Yes, this is the maximum page size used in hardware page tables.
> >
> > As I said, it should be the size of the sge in the WQE, not the
> > "hardware page tables"
>
> This driver uses the following code to figure out the largest page
> size for memory registration with hardware:
>
> page_sz = ib_umem_find_best_pgsz(mr->umem, PAGE_SZ_BM, iova);
>
> In this function, mr->umem is created with ib_dma_max_seg_size() as
> its max segment size when creating its sgtable.
>
> The purpose of setting DMA page size to 2M is to make sure this
> function returns the largest possible MR size that the hardware can
> take. Otherwise, this function will return 64k: the default DMA size.
As I've already said, you are supposed to set the value that limits to ib_sge and *NOT* the value that is related to ib_umem_find_best_pgsz. It is usually 2G because the ib_sge's typically work on a 32 bit length.
Jason