[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Z77zNK7mRdjwILL7@gmail.com>
Date: Wed, 26 Feb 2025 11:55:48 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Matthew Wilcox <willy@...radead.org>, linux-kernel@...r.kernel.org,
linux-tip-commits@...r.kernel.org,
kernel test robot <oliver.sang@...el.com>, x86@...nel.org
Subject: Re: [tip: x86/mm] x86/mm: Clear _PAGE_DIRTY when we clear _PAGE_RW
* Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> On Tue, 25 Feb 2025 at 20:00, Matthew Wilcox <willy@...radead.org> wrote:
> >
> > I think the entire point of this file is to manipulate kernel mappings.
>
> Very likely. But just looking at the patch, it was very non-obvious.
Yeah, agreed - and I've extended the changelog the following way:
Subject: [PATCH] x86/mm: Clear _PAGE_DIRTY for kernel mappings when we clear _PAGE_RW
The bit pattern of _PAGE_DIRTY set and _PAGE_RW clear is used to mark
shadow stacks. This is currently checked for in mk_pte() but not
pfn_pte(). If we add the check to pfn_pte(), it catches vfree()
calling set_direct_map_invalid_noflush() which calls
__change_page_attr() which loads the old protection bits from the
PTE, clears the specified bits and uses pfn_pte() to construct the
new PTE.
We should, therefore, for kernel mappings, clear the _PAGE_DIRTY bit
consistently whenever we clear _PAGE_RW. I opted to do it in the
callers in case we want to use __change_page_attr() to create shadow
stacks inside the kernel at some point in the future. Arguably, we
might also want to clear _PAGE_ACCESSED here.
Note that the 3 functions involved:
__set_pages_np()
kernel_map_pages_in_pgd()
kernel_unmap_pages_in_pgd()
Only ever manipulate non-swappable kernel mappings, so maintaining
the DIRTY:1|RW:0 special pattern for shadow stacks and DIRTY:0
pattern for non-shadow-stack entries can be maintained consistently
and doesn't result in the unintended clearing of a live dirty bit
that could corrupt (destroy) dirty bit information for user mappings.
Is this explanation better?
Thanks,
Ingo
Powered by blists - more mailing lists