Message-ID: <1f2a8ed4-aaff-4be7-b3b6-63d2841a2908@redhat.com>
Date: Thu, 14 Mar 2024 17:42:20 +0100
From: David Hildenbrand <david@...hat.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org, x86@...nel.org,
Wupeng Ma <mawupeng1@...wei.com>, Dave Hansen <dave.hansen@...ux.intel.com>,
Andy Lutomirski <luto@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>, "H. Peter Anvin" <hpa@...or.com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v1] x86/mm/pat: fix VM_PAT handling in COW mappings
On 12.03.24 20:38, David Hildenbrand wrote:
> On 12.03.24 20:22, Matthew Wilcox wrote:
>> On Tue, Mar 12, 2024 at 07:11:18PM +0100, David Hildenbrand wrote:
>>> PAT handling won't do the right thing in COW mappings: the first PTE
>>> (or, in fact, all PTEs) can be replaced during write faults to point at
>>> anon folios. Reliably recovering the correct PFN and cachemode using
>>> follow_phys() from PTEs will not work in COW mappings.
>>
>> I guess the first question is: Why do we want to support COW mappings
>> of VM_PAT areas? What breaks if we just disallow it?
>
> Well, that was my first approach. Then I decided to be less radical (IOW
> make my life easier by breaking less user space) and "fix it" with
> minimal effort.
>
> Breaking some weird user space is possible, although I believe that
> for most such mappings MAP_PRIVATE doesn't make too much sense.
>
> Nasty COW support for VM_PFNMAP mappings dates back forever. So does PAT
> support.
>
> I can try digging through some possible user space users tomorrow.

As discussed, MAP_PRIVATE doesn't make too much sense for most PFNMAP
mappings.

However, /dev/mem and /proc/vmcore are still used with MAP_PRIVATE in
some cases.

Side note: /proc/vmcore is a bit weird: mmap_vmcore() sets VM_MIXEDMAP,
and then we might call remap_pfn_range(), which sets VM_PFNMAP. I'm not
so sure if that's what we want to happen ...

As far as I can see, makedumpfile always mmap's memory to be dumped
(/dev/mem, /proc/vmcore) using PROT_READ+MAP_PRIVATE, resulting in a COW
mapping.
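
To illustrate why a read-only private mapping still counts as COW, here is
a minimal user-space sketch. It is not makedumpfile's code:
map_private_readonly() and is_cow_mapping_model() are hypothetical names,
and the two VM_* values are copied from include/linux/mm.h only so the
sketch compiles on its own. The model assumes what I believe mmap() does
today: VM_MAYWRITE stays set for MAP_PRIVATE even if only PROT_READ is
requested, so the is_cow_mapping()-style check succeeds.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Flag values as found in include/linux/mm.h, duplicated here (assumption:
 * unchanged in the tree being discussed) so this builds in user space.
 */
#define VM_SHARED	0x00000008UL
#define VM_MAYWRITE	0x00000020UL

/*
 * User-space model of the kernel's is_cow_mapping() test: the mapping is
 * not shared, but it may become writable (e.g., via mprotect()).
 */
static bool is_cow_mapping_model(unsigned long vm_flags)
{
	return (vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/*
 * Hypothetical helper: map "len" bytes of "path" at "offset" the way the
 * mail describes, PROT_READ + MAP_PRIVATE. Returns MAP_FAILED on error.
 */
static void *map_private_readonly(const char *path, size_t len, off_t offset)
{
	int fd = open(path, O_RDONLY);
	void *p;

	if (fd < 0)
		return MAP_FAILED;

	p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, offset);
	close(fd);	/* the mapping stays valid after close() */
	return p;
}
```

A MAP_SHARED mapping of the same file would carry VM_SHARED and therefore
not pass the check; only the private variant ends up on the COW path.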

In my opinion, we should use this fairly simple fix to keep it working
for now and look into disabling any MAP_PRIVATE of VM_PFNMAP separately,
for all architectures.

But I'll leave the decision to x86 maintainers.

--
Cheers,
David / dhildenb