Message-ID: <0743193c-80a0-4ef8-9cd7-cb732f3761ab@redhat.com>
Date: Tue, 14 Jan 2025 14:17:50 +0100
From: David Hildenbrand <david@...hat.com>
To: Jason Gunthorpe <jgg@...dia.com>, Ankit Agrawal <ankita@...dia.com>
Cc: "maz@...nel.org" <maz@...nel.org>,
"oliver.upton@...ux.dev" <oliver.upton@...ux.dev>,
"joey.gouly@....com" <joey.gouly@....com>,
"suzuki.poulose@....com" <suzuki.poulose@....com>,
"yuzenghui@...wei.com" <yuzenghui@...wei.com>,
"catalin.marinas@....com" <catalin.marinas@....com>,
"will@...nel.org" <will@...nel.org>,
"ryan.roberts@....com" <ryan.roberts@....com>,
"shahuang@...hat.com" <shahuang@...hat.com>,
"lpieralisi@...nel.org" <lpieralisi@...nel.org>,
Aniket Agashe <aniketa@...dia.com>, Neo Jia <cjia@...dia.com>,
Kirti Wankhede <kwankhede@...dia.com>,
"Tarun Gupta (SW-GPU)" <targupta@...dia.com>,
Vikram Sethi <vsethi@...dia.com>, Andy Currid <acurrid@...dia.com>,
Alistair Popple <apopple@...dia.com>, John Hubbard <jhubbard@...dia.com>,
Dan Williams <danw@...dia.com>, Zhi Wang <zhiw@...dia.com>,
Matt Ochs <mochs@...dia.com>, Uday Dhoke <udhoke@...dia.com>,
Dheeraj Nigam <dnigam@...dia.com>,
"alex.williamson@...hat.com" <alex.williamson@...hat.com>,
"sebastianene@...gle.com" <sebastianene@...gle.com>,
"coltonlewis@...gle.com" <coltonlewis@...gle.com>,
"kevin.tian@...el.com" <kevin.tian@...el.com>,
"yi.l.liu@...el.com" <yi.l.liu@...el.com>, "ardb@...nel.org"
<ardb@...nel.org>, "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"gshan@...hat.com" <gshan@...hat.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"kvmarm@...ts.linux.dev" <kvmarm@...ts.linux.dev>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH v2 1/1] KVM: arm64: Allow cacheable stage 2 mapping using
VMA flags

On 13.01.25 17:27, Jason Gunthorpe wrote:
> On Fri, Jan 10, 2025 at 09:15:53PM +0000, Ankit Agrawal wrote:
>>>>>> This patch solves the problems where it is possible for the kernel to
>>>>>> have VMAs pointing at cachable memory without causing
>>>>>> pfn_is_map_memory() to be true, eg DAX memremap cases and CXL/pre-CXL
>>>>>> devices. This memory is now properly marked as cachable in KVM.
>>>>>
>>>>> Does this only imply in worse performance, or does this also affect
>>>>> correctness? I suspect performance is the problem, correct?
>>>>
>>>> Correctness. Things like atomics don't work on non-cachable mappings.
>>>
>>> Hah! This needs to be highlighted in the patch description. And maybe
>>> this even implies Fixes: etc?
>>
>> Understood. I'll put that in the patch description.
>>
>>>>> Likely you assume to never end up with COW VM_PFNMAP -- I think it's
>>>>> possible when doing a MAP_PRIVATE /dev/mem mapping on systems that allow
>>>>> for mapping /dev/mem. Maybe one could just reject such cases (if KVM PFN
>>>>> lookup code not already rejects them, which might just be that case IIRC).
>>>>
>>>> At least VFIO enforces SHARED or it won't create the VMA.
>>>>
>>>> drivers/vfio/pci/vfio_pci_core.c: if ((vma->vm_flags & VM_SHARED) == 0)
>>>
>>> That makes a lot of sense for VFIO.
>>
>> So, I suppose we don't need to check this? Specially if we only extend the
>> changes to the following case:

I would check if it is a VM_PFNMAP, and outright refuse any page if
is_cow_mapping(vma->vm_flags) is true.

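To make that check concrete, here is a rough userspace sketch, not the actual KVM code. is_cow_mapping() mirrors the helper in include/linux/mm.h; the flag values are copied in only so the snippet builds outside the kernel, and refuse_cow_pfnmap() is a hypothetical name made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Copied here only so the sketch builds in userspace; in the kernel these
 * come from include/linux/mm.h. Treat the values as illustrative. */
typedef unsigned long vm_flags_t;
#define VM_SHARED   0x00000008UL
#define VM_MAYWRITE 0x00000020UL
#define VM_PFNMAP   0x00000400UL

/* Same test as the kernel helper: a private mapping that may be written. */
static inline bool is_cow_mapping(vm_flags_t flags)
{
	return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/* Hypothetical helper: true if this is a VM_PFNMAP VMA that is also a COW
 * mapping, i.e. the case that should be refused outright. */
bool refuse_cow_pfnmap(vm_flags_t vm_flags)
{
	return (vm_flags & VM_PFNMAP) && is_cow_mapping(vm_flags);
}

/* Small usage example covering the interesting combinations. */
void refuse_cow_pfnmap_examples(void)
{
	/* MAP_PRIVATE /dev/mem style mapping: refuse. */
	assert(refuse_cow_pfnmap(VM_PFNMAP | VM_MAYWRITE));
	/* VFIO style shared mapping: fine. */
	assert(!refuse_cow_pfnmap(VM_PFNMAP | VM_SHARED | VM_MAYWRITE));
}
```

A VFIO-style VM_SHARED mapping passes, while a MAP_PRIVATE /dev/mem style mapping (VM_PFNMAP with VM_MAYWRITE but without VM_SHARED) gets refused.
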
>> - type is VM_PFNMAP &&
>> - user mapping is cacheable (MT_NORMAL or MT_NORMAL_TAGGED) &&
>> - The suggested VM_FORCE_CACHED is set.
>
> Do we really want another weirdly defined VMA flag? I'd really like to
> avoid this..

Agreed.

Can't we do a "this is a weird VM_PFNMAP thing, let's consult the VMA
prot + whatever PFN information to find out if it is weird-device and
how we could safely map it?"

Ideally, we'd separate this logic from the "this is a normal VMA that
doesn't need any such special casing", and even stop playing PFN games
on these normal VMAs completely.

> How is the VFIO going to know any better if it should set
> the flag when the questions seem to be around things like MTE that
> have nothing to do with VFIO?

I assume MTE does not apply at all to VM_PFNMAP; at least
arch_calc_vm_flag_bits() tells me that VM_MTE_ALLOWED should never get
set there.

So for VFIO and friends with VM_PFNMAP mappings, maybe we can ignore
MTE completely?

--
Cheers,
David / dhildenb