Message-ID: <Zys1luxxLWwy0yXh@localhost.localdomain>
Date: Wed, 6 Nov 2024 10:23:34 +0100
From: Oscar Salvador <osalvador@...e.de>
To: John Hubbard <jhubbard@...dia.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org,
	David Hildenbrand <david@...hat.com>,
	Vivek Kasireddy <vivek.kasireddy@...el.com>,
	Dave Airlie <airlied@...hat.com>, Gerd Hoffmann <kraxel@...hat.com>,
	Matthew Wilcox <willy@...radead.org>,
	Christoph Hellwig <hch@...radead.org>,
	Jason Gunthorpe <jgg@...dia.com>, Peter Xu <peterx@...hat.com>,
	Arnd Bergmann <arnd@...db.de>,
	Daniel Vetter <daniel.vetter@...ll.ch>,
	Dongwon Kim <dongwon.kim@...el.com>,
	Hugh Dickins <hughd@...gle.com>,
	Junxiao Chang <junxiao.chang@...el.com>,
	Mike Kravetz <mike.kravetz@...cle.com>,
	linux-stable@...r.kernel.org
Subject: Re: [PATCH v2 1/1] [PATCH] mm/gup: avoid an unnecessary allocation
 call for FOLL_LONGTERM cases

On Mon, Nov 04, 2024 at 07:29:44PM -0800, John Hubbard wrote:
> commit 53ba78de064b ("mm/gup: introduce
> check_and_migrate_movable_folios()") created a new constraint on the
> pin_user_pages*() API family: a potentially large internal allocation
> must now occur, for FOLL_LONGTERM cases.
> 
> A user-visible consequence has now appeared: user space can no longer
> pin more than 2GB of memory on x86_64. That's because, on a 4KB
> PAGE_SIZE system, when user space tries to (indirectly, via a device
> driver that calls pin_user_pages()) pin 2GB, this requires an allocation
> of a folio pointers array of MAX_PAGE_ORDER size, which is the limit for
> kmalloc().
> 
> In addition to the directly visible effect described above, there is
> also the problem of adding an unnecessary allocation. The **pages array
> argument has already been allocated, and there is no need for a
> redundant **folios array allocation in this case.
> 
> Fix this by avoiding the new allocation entirely. This is done by
> referring to either the original page[i] within **pages, or to the
> associated folio. Thanks to David Hildenbrand for suggesting this
> approach and for providing the initial implementation (which I've tested
> and adjusted slightly) as well.
> 
> Fixes: 53ba78de064b ("mm/gup: introduce check_and_migrate_movable_folios()")
> Suggested-by: David Hildenbrand <david@...hat.com>
> Cc: Vivek Kasireddy <vivek.kasireddy@...el.com>
> Cc: Dave Airlie <airlied@...hat.com>
> Cc: Gerd Hoffmann <kraxel@...hat.com>
> Cc: Matthew Wilcox <willy@...radead.org>
> Cc: Christoph Hellwig <hch@...radead.org>
> Cc: Jason Gunthorpe <jgg@...dia.com>
> Cc: Peter Xu <peterx@...hat.com>
> Cc: Arnd Bergmann <arnd@...db.de>
> Cc: Daniel Vetter <daniel.vetter@...ll.ch>
> Cc: Dongwon Kim <dongwon.kim@...el.com>
> Cc: Hugh Dickins <hughd@...gle.com>
> Cc: Junxiao Chang <junxiao.chang@...el.com>
> Cc: Mike Kravetz <mike.kravetz@...cle.com>
> Cc: Oscar Salvador <osalvador@...e.de>
> Cc: linux-stable@...r.kernel.org
> Signed-off-by: John Hubbard <jhubbard@...dia.com>

Hi John, thanks for doing this.

Reviewed-by: Oscar Salvador <osalvador@...e.de>

Nit below:

> +static int
> +migrate_longterm_unpinnable_folios(struct list_head *movable_folio_list,
> +				   struct pages_or_folios *pofs)
>  {
>  	int ret;
>  	unsigned long i;
>  
> -	for (i = 0; i < nr_folios; i++) {
> -		struct folio *folio = folios[i];
> +	for (i = 0; i < pofs->nr_entries; i++) {
> +		struct folio *folio = pofs_get_folio(pofs, i);
>  
>  		if (folio_is_device_coherent(folio)) {
>  			/*
> @@ -2344,7 +2380,7 @@ static int migrate_longterm_unpinnable_folios(
>  			 * convert the pin on the source folio to a normal
>  			 * reference.
>  			 */
> -			folios[i] = NULL;
> +			pofs_clear_entry(pofs, i);
>  			folio_get(folio);
>  			gup_put_folio(folio, 1, FOLL_PIN);
>  
> @@ -2363,8 +2399,8 @@ static int migrate_longterm_unpinnable_folios(
>  		 * calling folio_isolate_lru() which takes a reference so the
>  		 * folio won't be freed if it's migrating.
>  		 */
> -		unpin_folio(folios[i]);
> -		folios[i] = NULL;
> +		unpin_folio(pofs_get_folio(pofs, i));

We already retrieved the folio at the top of the loop, so can't we just
reuse it here instead of calling pofs_get_folio() again?


-- 
Oscar Salvador
SUSE Labs
