Date:   Wed, 18 May 2022 05:59:00 +0000
From:   Ajay Sharma <sharmaajay@...rosoft.com>
To:     Jason Gunthorpe <jgg@...pe.ca>, Long Li <longli@...rosoft.com>
CC:     KY Srinivasan <kys@...rosoft.com>,
        Haiyang Zhang <haiyangz@...rosoft.com>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        Wei Liu <wei.liu@...nel.org>, Dexuan Cui <decui@...rosoft.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Leon Romanovsky <leon@...nel.org>,
        "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
        Ajay Sharma <sharmaajay@...rosoft.com>
Subject: RE: [EXTERNAL] Re: [PATCH 05/12] net: mana: Set the DMA device max
 page size

Thanks, Long.

Hello Jason,

I am the author of the patch. Regarding your comment below:

"As I've already said, you are supposed to set the value that limits to ib_sge and *NOT* the value that is related to ib_umem_find_best_pgsz. It is usually 2G because the ib_sge's typically work on a 32 bit length."

The ib_sge is limited by __sg_alloc_table_from_pages(), which uses ib_dma_max_seg_size(); that value is what the Ethernet driver sets with dma_set_max_seg_size(). Currently our hardware does not support PTEs larger than 2M.
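
To make the effect of the segment-size cap concrete, here is a minimal userspace sketch. It is not kernel code: count_segments() is a hypothetical helper, and it assumes the only constraint is that no scatterlist entry may exceed the cap (the real allocator also handles page alignment and non-contiguous pages).

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's SZ_* constants. */
#define SZ_64K 0x00010000ULL
#define SZ_2M  0x00200000ULL

/*
 * Hypothetical model, not a kernel function: number of scatterlist
 * entries needed to describe one physically contiguous run of `len`
 * bytes when no single entry may exceed `max_seg` bytes.
 */
static uint64_t count_segments(uint64_t len, uint64_t max_seg)
{
    assert(max_seg > 0);
    /* Ceiling division without overflowing on large `len`. */
    return len / max_seg + (len % max_seg != 0);
}
```

Under this model, count_segments(SZ_2M, SZ_64K) is 32 while count_segments(SZ_2M, SZ_2M) is 1, so a 2M contiguous region only stays in one piece when the cap is at least 2M.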

ib_umem_find_best_pgsz() takes a page-size bitmap (PAGE_SZ_BM below) as input. The bitmap has a bit set for each page size supported by the hardware.

#define PAGE_SZ_BM (SZ_4K | SZ_8K | SZ_16K | SZ_32K | SZ_64K | SZ_128K \
		    | SZ_256K | SZ_512K | SZ_1M | SZ_2M)
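
As a rough illustration of how such a bitmap bounds the result, the sketch below picks the largest supported page size that does not exceed a given power-of-two limit. This is a simplified userspace model and not the implementation of ib_umem_find_best_pgsz(): largest_supported_pgsz() and its `limit` parameter are hypothetical, and the real function derives its limit from the addresses and lengths in the umem.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's SZ_* constants. */
#define SZ_4K   0x00001000ULL
#define SZ_8K   0x00002000ULL
#define SZ_16K  0x00004000ULL
#define SZ_32K  0x00008000ULL
#define SZ_64K  0x00010000ULL
#define SZ_128K 0x00020000ULL
#define SZ_256K 0x00040000ULL
#define SZ_512K 0x00080000ULL
#define SZ_1M   0x00100000ULL
#define SZ_2M   0x00200000ULL
#define SZ_2G   0x80000000ULL

/* Same bitmap as in the message above. */
#define PAGE_SZ_BM (SZ_4K | SZ_8K | SZ_16K | SZ_32K | SZ_64K | SZ_128K \
                    | SZ_256K | SZ_512K | SZ_1M | SZ_2M)

/*
 * Hypothetical helper: return the largest page size present in
 * `pgsz_bitmap` that is <= `limit`, or 0 if none fits.
 * `limit` must be a power of two.
 */
static uint64_t largest_supported_pgsz(uint64_t pgsz_bitmap, uint64_t limit)
{
    assert(limit != 0 && (limit & (limit - 1)) == 0);

    /* Keep only the bits for page sizes up to and including `limit`. */
    uint64_t candidates = pgsz_bitmap & ((limit << 1) - 1);
    uint64_t best = 0;

    /* Peel off the lowest set bit each time; the last one is the highest. */
    while (candidates) {
        best = candidates & -candidates;
        candidates &= candidates - 1;
    }
    return best;
}
```

In this model a limit of SZ_64K yields SZ_64K, while any limit of SZ_2M or above (including SZ_2G) yields SZ_2M, because the bitmap itself caps the result.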

Are you suggesting that the bitmap we are passing is too restrictive, or that we should not set this bitmap and let the function choose a default?

Regards,
Ajay

-----Original Message-----
From: Jason Gunthorpe <jgg@...pe.ca> 
Sent: Tuesday, May 17, 2022 5:04 PM
To: Long Li <longli@...rosoft.com>
Cc: Ajay Sharma <sharmaajay@...rosoft.com>; KY Srinivasan <kys@...rosoft.com>; Haiyang Zhang <haiyangz@...rosoft.com>; Stephen Hemminger <sthemmin@...rosoft.com>; Wei Liu <wei.liu@...nel.org>; Dexuan Cui <decui@...rosoft.com>; David S. Miller <davem@...emloft.net>; Jakub Kicinski <kuba@...nel.org>; Paolo Abeni <pabeni@...hat.com>; Leon Romanovsky <leon@...nel.org>; linux-hyperv@...r.kernel.org; netdev@...r.kernel.org; linux-kernel@...r.kernel.org; linux-rdma@...r.kernel.org
Subject: [EXTERNAL] Re: [PATCH 05/12] net: mana: Set the DMA device max page size

On Tue, May 17, 2022 at 08:04:58PM +0000, Long Li wrote:
> > Subject: Re: [PATCH 05/12] net: mana: Set the DMA device max page size
> >
> > On Tue, May 17, 2022 at 07:32:51PM +0000, Long Li wrote:
> > > > Subject: Re: [PATCH 05/12] net: mana: Set the DMA device max page size
> > > >
> > > > On Tue, May 17, 2022 at 02:04:29AM -0700, longli@...uxonhyperv.com wrote:
> > > > > From: Long Li <longli@...rosoft.com>
> > > > >
> > > > > The system chooses a default 64K page size if the device does
> > > > > not specify the max page size it can handle for DMA.
> > > > > This does not work well when the device is registering a large chunk
> > > > > of memory, where a larger page size is more efficient.
> > > > >
> > > > > Set it to the maximum hardware supported page size.
> > > >
> > > > For RDMA devices this should be set to the largest segment size 
> > > > an ib_sge can take in when posting work. It should not be the 
> > > > page size of MR. 2M is a weird number for that, are you sure it is right?
> > >
> > > Yes, this is the maximum page size used in hardware page tables.
> >
> > As I said, it should be the size of the sge in the WQE, not the 
> > "hardware page tables"
>
> This driver uses the following code to figure out the largest page 
> size for memory registration with hardware:
>
> page_sz = ib_umem_find_best_pgsz(mr->umem, PAGE_SZ_BM, iova);
>
> In this function, mr->umem is created with ib_dma_max_seg_size() as 
> its max segment size when creating its sgtable.
>
> The purpose of setting DMA page size to 2M is to make sure this 
> function returns the largest possible MR size that the hardware can 
> take. Otherwise, this function will return 64k: the default DMA size.

As I've already said, you are supposed to set the value that limits to ib_sge and *NOT* the value that is related to ib_umem_find_best_pgsz. It is usually 2G because the ib_sge's typically work on a 32 bit length.

Jason
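
One plausible reading of the 2G figure above, offered as an assumption rather than something stated in the thread: if a segment size is expected to be a power of two and must fit in a 32-bit length field, the largest such value is 2^31 bytes. The sketch below checks that arithmetic in userspace; struct sge_model is a stand-in modeled on ib_sge (64-bit address, 32-bit length, 32-bit lkey), and max_pow2_len_u32() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in modeled on struct ib_sge; not the kernel definition. */
struct sge_model {
    uint64_t addr;
    uint32_t length;   /* 32-bit length, as referred to in the thread */
    uint32_t lkey;
};

/* Hypothetical helper: largest power of two representable in 32 bits. */
static uint64_t max_pow2_len_u32(void)
{
    uint64_t v = 1;

    /* Double until the next doubling would no longer fit in uint32_t. */
    while ((v << 1) <= UINT32_MAX)
        v <<= 1;
    return v;
}
```

Here max_pow2_len_u32() evaluates to 0x80000000, that is 2G, which still fits in the `length` member of struct sge_model.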
