linux-kernel - Re: [v2 linus-tree PATCH] mm: gup: do not call try_grab

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20240627164240.47ae4e1d0e7b1ddb11aedaf3@linux-foundation.org>
Date: Thu, 27 Jun 2024 16:42:40 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Yang Shi <yang@...amperecomputing.com>
Cc: peterx@...hat.com, yangge1116@....com, david@...hat.com,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org, stable@...r.kernel.org,
 Vivek Kasireddy <vivek.kasireddy@...el.com>
Subject: Re: [v2 linus-tree PATCH] mm: gup: do not call try_grab_folio() in
 slow path

On Thu, 27 Jun 2024 16:16:01 -0700 Yang Shi <yang@...amperecomputing.com> wrote:

> The try_grab_folio() is supposed to be used in fast path and it elevates
> folio refcount by using add ref unless zero.  We are guaranteed to have
> at least one stable reference in slow path, so the simple atomic add
> could be used.  The performance difference should be trivial, but the
> misuse may be confusing and misleading.
> 
> In another thread [1] a kernel warning was reported when pinning folio
> in CMA memory when launching SEV virtual machine.  The splat looks like:
> 
> [  464.325306] WARNING: CPU: 13 PID: 6734 at mm/gup.c:1313 __get_user_pages+0x423/0x520
> [  464.325464] CPU: 13 PID: 6734 Comm: qemu-kvm Kdump: loaded Not tainted 6.6.33+ #6
> [  464.325477] RIP: 0010:__get_user_pages+0x423/0x520
> [  464.325515] Call Trace:
> [  464.325520]  <TASK>
> [  464.325523]  ? __get_user_pages+0x423/0x520
> [  464.325528]  ? __warn+0x81/0x130
> [  464.325536]  ? __get_user_pages+0x423/0x520
> [  464.325541]  ? report_bug+0x171/0x1a0
> [  464.325549]  ? handle_bug+0x3c/0x70
> [  464.325554]  ? exc_invalid_op+0x17/0x70
> [  464.325558]  ? asm_exc_invalid_op+0x1a/0x20
> [  464.325567]  ? __get_user_pages+0x423/0x520
> [  464.325575]  __gup_longterm_locked+0x212/0x7a0
> [  464.325583]  internal_get_user_pages_fast+0xfb/0x190
> [  464.325590]  pin_user_pages_fast+0x47/0x60
> [  464.325598]  sev_pin_memory+0xca/0x170 [kvm_amd]
> [  464.325616]  sev_mem_enc_register_region+0x81/0x130 [kvm_amd]
> 
> Per the analysis done by yangge, when starting the SEV virtual machine,
> it will call pin_user_pages_fast(..., FOLL_LONGTERM, ...) to pin the
> memory.  But the page is in CMA area, so fast GUP will fail then
> fallback to the slow path due to the longterm pinnalbe check in
> try_grab_folio().
> The slow path will try to pin the pages then migrate them out of CMA
> area.  But the slow path also uses try_grab_folio() to pin the page,
> it will also fail due to the same check then the above warning
> is triggered.
> 

The remainder of mm-unstable actually applies OK on top of this.

I applied the below as a fixup to Vivek's "mm/gup: introduce
memfd_pin_folios() for pinning memfd folios".  After this, your v1
patch reverts cleanly.

--- a/mm/gup.c~mm-gup-introduce-memfd_pin_folios-for-pinning-memfd-folios-fix
+++ a/mm/gup.c
@@ -3856,14 +3856,15 @@ long memfd_pin_folios(struct file *memfd
 				    next_idx != folio_index(fbatch.folios[i]))
 					continue;
 
-				folio = try_grab_folio(&fbatch.folios[i]->page,
-						       1, FOLL_PIN);
-				if (!folio) {
+				if (try_grab_folio(fbatch.folios[i],
+						       1, FOLL_PIN)) {
 					folio_batch_release(&fbatch);
 					ret = -EINVAL;
 					goto err;
 				}
 
+				folio = fbatch.folios[i];
+
 				if (nr_folios == 0)
 					*offset = offset_in_folio(folio, start);
 
_