linux-kernel - Re: [PATCH] mm/gup: restore the ability to pin more than 2GB at a time


Message-ID: <20241031000218.GA6900@nvidia.com>
Date: Wed, 30 Oct 2024 21:02:18 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: John Hubbard <jhubbard@...dia.com>
Cc: David Hildenbrand <david@...hat.com>,
	Alistair Popple <apopple@...dia.com>,
	Christoph Hellwig <hch@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org,
	linux-stable@...r.kernel.org,
	Vivek Kasireddy <vivek.kasireddy@...el.com>,
	Dave Airlie <airlied@...hat.com>, Gerd Hoffmann <kraxel@...hat.com>,
	Matthew Wilcox <willy@...radead.org>, Peter Xu <peterx@...hat.com>,
	Arnd Bergmann <arnd@...db.de>,
	Daniel Vetter <daniel.vetter@...ll.ch>,
	Dongwon Kim <dongwon.kim@...el.com>,
	Hugh Dickins <hughd@...gle.com>,
	Junxiao Chang <junxiao.chang@...el.com>,
	Mike Kravetz <mike.kravetz@...cle.com>,
	Oscar Salvador <osalvador@...e.de>
Subject: Re: [PATCH] mm/gup: restore the ability to pin more than 2GB at a
 time

On Wed, Oct 30, 2024 at 11:34:49AM -0700, John Hubbard wrote:

> From a very high level design perspective, it's not yet clear to me
> that there is either a "preferred" or "not recommended" aspect to
> pinning in batches vs. all at once here, as long as one stays
> below the type (int, long, unsigned...) limits of the API. Batching
> seems like what you do if the internal implementation is crippled
> and unable to meet its API requirements. So the fact that many
> callers do batching is sort of "tail wags dog".

No... everything needs to do batching, because nothing should be
storing a linear struct page array that enormous. That is going to
create vmemap pressure that is not desirable.

For instance, RDMA pins in batches, copies the pins into a
scatterlist, and never has an allocation over PAGE_SIZE.

iommufd transfers them into a radix tree.

It is not so much that there is a limit, but that good kernel code
just *shouldn't* be allocating gigantic contiguous memory arrays at
all.

Jason
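
The batching pattern described in the reply above can be sketched in
userspace C. This is only an illustration under stated assumptions, not
kernel code: fake_pin_pages(), page_handle_t, consume_fn,
pin_in_batches() and SKETCH_PAGE_SIZE are all hypothetical names made up
for the example. The pin call is a stub standing in for a GUP-style API,
and the consumer callback stands in for whatever long-lived structure
the caller keeps (a scatterlist, a tree, and so on). The point it tries
to show is that the temporary array never exceeds one page, no matter
how many pages are pinned in total.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed page size for this sketch; kernel code would use PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for "struct page *". */
typedef uintptr_t page_handle_t;

/* Number of handles that fit in one page-sized temporary buffer. */
#define BATCH_MAX (SKETCH_PAGE_SIZE / sizeof(page_handle_t))

struct sketch_stats {
	size_t batches;       /* number of pin calls made */
	size_t largest_alloc; /* largest single allocation, in bytes */
};

/* Hypothetical stub for a GUP-style pin call: fills nr handles. */
long fake_pin_pages(size_t first, size_t nr, page_handle_t *out)
{
	for (size_t i = 0; i < nr; i++)
		out[i] = (page_handle_t)(first + i);
	return (long)nr;
}

/* Consumer that moves a batch into the caller's own structure. */
typedef void (*consume_fn)(const page_handle_t *batch, size_t nr, void *ctx);

/* Trivial consumer for the sketch: just counts the pages it is handed. */
void count_pages(const page_handle_t *batch, size_t nr, void *ctx)
{
	(void)batch;
	*(size_t *)ctx += nr;
}

/* Pin npages in page-sized batches; returns 0 on success, -1 on error. */
int pin_in_batches(size_t npages, consume_fn consume, void *ctx,
		   struct sketch_stats *st)
{
	size_t bytes = BATCH_MAX * sizeof(page_handle_t);
	page_handle_t *batch = malloc(bytes);
	size_t done = 0;

	if (!batch)
		return -1;
	assert(bytes <= SKETCH_PAGE_SIZE);
	st->largest_alloc = bytes;

	while (done < npages) {
		size_t want = npages - done;
		long got;

		if (want > BATCH_MAX)
			want = BATCH_MAX;
		got = fake_pin_pages(done, want, batch);
		if (got <= 0) {
			free(batch);
			return -1;
		}
		/* Hand the batch off, then reuse the same small buffer. */
		consume(batch, (size_t)got, ctx);
		done += (size_t)got;
		st->batches++;
	}
	free(batch);
	return 0;
}
```

Calling pin_in_batches(1000000, count_pages, &seen, &st) with seen and
st zero-initialized walks the whole range while the only array
allocated stays within SKETCH_PAGE_SIZE bytes; the total size of the
request only changes the number of loop iterations.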