Message-ID: <jmxnalmkkc5ztfhokqtzqihsdji2gprnv5z4tzruxi6iqgfkni@aerronulpyem>
Date: Tue, 28 Oct 2025 11:48:51 +0000
From: Pedro Falcato <pfalcato@...e.de>
To: Dev Jain <dev.jain@....com>
Cc: linux-kernel@...r.kernel.org, stable@...r.kernel.org,
David Hildenbrand <david@...hat.com>, Andrew Morton <akpm@...ux-foundation.org>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>, Barry Song <baohua@...nel.org>,
"open list:MEMORY MAPPING" <linux-mm@...ck.org>
Subject: Re: [PATCH] mm/mremap: Honour writable bit in mremap pte batching
On Tue, Oct 28, 2025 at 12:09:52PM +0530, Dev Jain wrote:
> Currently mremap folio pte batch ignores the writable bit during figuring
> out a set of similar ptes mapping the same folio. Suppose that the first
> pte of the batch is writable while the others are not - set_ptes will
> end up setting the writable bit on the other ptes, which is a violation
> of mremap semantics. Therefore, use FPB_RESPECT_WRITE to check the writable
> bit while determining the pte batch.
>
Hmm, it seems to me like we're doing the wrong thing by default here?
I must admit I haven't followed the contpte work as much as I would've
liked, but it doesn't make much sense to me why FPB_RESPECT_WRITE would
be an option you have to explicitly pass, and where folio_pte_batch (the
"simple" interface) doesn't Just Do The Right Thing for naive callers.
Auditing all callers:
- khugepaged clears a variable number of ptes
- memory.c clears a variable number of ptes
- mempolicy.c grabs folios for migrations
- mlock.c steps over nr_ptes - 1 ptes, speeding up traversal
- mremap is borked since we're remapping nr_ptes ptes
- rmap.c TTU unmaps nr_ptes ptes for a given folio
So while the vast majority of callers don't seem to care, it would make
sense for folio_pte_batch() to work conservatively by default, with
folio_pte_batch_flags() allowing further batching (or maybe we would
add a separate folio_pte_batch_clear() or folio_pte_batch_greedy() or
whatnot).
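To make the concern concrete, here is a minimal userspace sketch of the
failure mode and of a conservative default. It is a toy model, not the
kernel code: struct toy_pte, TOY_FPB_RESPECT_WRITE, toy_pte_batch_flags(),
toy_pte_batch() and toy_set_ptes() are all hypothetical names made up for
illustration, and the real folio_pte_batch*() helpers take different
arguments and check more than the pfn and the writable bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy PTE: a page frame number plus a writable bit. Not the kernel's pte_t. */
struct toy_pte {
	unsigned long pfn;
	bool writable;
};

/* Hypothetical flag, named after the FPB_RESPECT_WRITE used by the patch. */
#define TOY_FPB_RESPECT_WRITE 0x1u

/*
 * Count how many entries starting at ptes[0] map consecutive pfns.
 * With TOY_FPB_RESPECT_WRITE set, the batch also ends at the first
 * entry whose writable bit differs from that of ptes[0].
 */
static size_t toy_pte_batch_flags(const struct toy_pte *ptes, size_t max_nr,
				  unsigned int flags)
{
	size_t nr;

	assert(max_nr >= 1);
	for (nr = 1; nr < max_nr; nr++) {
		if (ptes[nr].pfn != ptes[0].pfn + nr)
			break;
		if ((flags & TOY_FPB_RESPECT_WRITE) &&
		    ptes[nr].writable != ptes[0].writable)
			break;
	}
	return nr;
}

/* Assumed "simple" interface: conservative unless the caller opts out. */
static size_t toy_pte_batch(const struct toy_pte *ptes, size_t max_nr)
{
	return toy_pte_batch_flags(ptes, max_nr, TOY_FPB_RESPECT_WRITE);
}

/*
 * Model of installing a batch: every destination entry is derived from
 * the first source entry, and only the pfn advances.
 */
static void toy_set_ptes(struct toy_pte *dst, struct toy_pte first, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		dst[i].pfn = first.pfn + i;
		dst[i].writable = first.writable;
	}
}
```

With four consecutive pfns where only the first entry is writable, the
flag-less call batches all four, and toy_set_ptes() then marks every
destination entry writable. The conservative wrapper stops after one
entry, so the read-only entries are handled in their own batch.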
> Cc: stable@...r.kernel.org #6.17
> Fixes: f822a9a81a31 ("mm: optimize mremap() by PTE batching")
> Reported-by: David Hildenbrand <david@...hat.com>
> Debugged-by: David Hildenbrand <david@...hat.com>
> Signed-off-by: Dev Jain <dev.jain@....com>
But the solution itself looks okay to me. So, FWIW:
Acked-by: Pedro Falcato <pfalcato@...e.de>
--
Pedro