Message-ID: <aG0TaqOgKInyopW0@willie-the-truck>
Date: Tue, 8 Jul 2025 13:47:38 +0100
From: Will Deacon <will@...nel.org>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: David Hildenbrand <david@...hat.com>, Ankit Agrawal <ankita@...dia.com>,
"maz@...nel.org" <maz@...nel.org>,
"oliver.upton@...ux.dev" <oliver.upton@...ux.dev>,
"joey.gouly@....com" <joey.gouly@....com>,
"suzuki.poulose@....com" <suzuki.poulose@....com>,
"yuzenghui@...wei.com" <yuzenghui@...wei.com>,
"catalin.marinas@....com" <catalin.marinas@....com>,
"ryan.roberts@....com" <ryan.roberts@....com>,
"shahuang@...hat.com" <shahuang@...hat.com>,
"lpieralisi@...nel.org" <lpieralisi@...nel.org>,
"ddutile@...hat.com" <ddutile@...hat.com>,
"seanjc@...gle.com" <seanjc@...gle.com>,
Aniket Agashe <aniketa@...dia.com>, Neo Jia <cjia@...dia.com>,
Kirti Wankhede <kwankhede@...dia.com>,
Krishnakant Jaju <kjaju@...dia.com>,
"Tarun Gupta (SW-GPU)" <targupta@...dia.com>,
Vikram Sethi <vsethi@...dia.com>, Andy Currid <acurrid@...dia.com>,
Alistair Popple <apopple@...dia.com>,
John Hubbard <jhubbard@...dia.com>, Dan Williams <danw@...dia.com>,
Zhi Wang <zhiw@...dia.com>, Matt Ochs <mochs@...dia.com>,
Uday Dhoke <udhoke@...dia.com>, Dheeraj Nigam <dnigam@...dia.com>,
"alex.williamson@...hat.com" <alex.williamson@...hat.com>,
"sebastianene@...gle.com" <sebastianene@...gle.com>,
"coltonlewis@...gle.com" <coltonlewis@...gle.com>,
"kevin.tian@...el.com" <kevin.tian@...el.com>,
"yi.l.liu@...el.com" <yi.l.liu@...el.com>,
"ardb@...nel.org" <ardb@...nel.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"gshan@...hat.com" <gshan@...hat.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"tabba@...gle.com" <tabba@...gle.com>,
"qperret@...gle.com" <qperret@...gle.com>,
"kvmarm@...ts.linux.dev" <kvmarm@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>,
"maobibo@...ngson.cn" <maobibo@...ngson.cn>,
"pbonzini@...hat.com" <pbonzini@...hat.com>
Subject: Re: [PATCH v9 3/6] KVM: arm64: Block cacheable PFNMAP mapping

On Fri, Jul 04, 2025 at 01:47:50PM -0300, Jason Gunthorpe wrote:
> On Fri, Jul 04, 2025 at 05:04:17PM +0100, Will Deacon wrote:
> > On Fri, Jul 04, 2025 at 02:21:32PM +0200, David Hildenbrand wrote:
> > > On 30.06.25 14:25, Jason Gunthorpe wrote:
> > > > On Mon, Jun 30, 2025 at 01:56:43AM +0000, Ankit Agrawal wrote:
> > > > > > Sorry for the drive-by comment, but I was looking at this old series from
> > > > > > Paolo (look at the cover letter and patch 5):
> > > > > >
> > > > > > https://lore.kernel.org/r/20250109133817.314401-1-pbonzini@redhat.com
> > > > > >
> > > > > > in which he points out that the arm64 get_vma_page_shift() function
> > > > > > incorrectly assumes that a VM_PFNMAP VMA is physically contiguous, which
> > > > > > may not be the case if a driver calls remap_pfn_range() to mess around
> > > > > > with mappings within the VMA. I think that implies that the optimisation
> > > > > > in 2aa53d68cee6 ("KVM: arm64: Try stage2 block mapping for host device
> > > > > > MMIO") is unsound.
> > > > >
> > > > > Hm yeah, that does seem problematic. Perhaps we need a new
> > > > > vma flag that could help the driver communicate to the KVM that the
> > > > > mapping is contiguous and it can go ahead with the optimization?
> > > > > E.g. something similar to VM_ALLOW_ANY_UNCACHED.
> > > >
> > > > I think Paolo has the right direction - remove any attempts by KVM to
> > > > expand contiguity, it should only copy the primary's PTEs and rely on
> > > > the primary to discover contiguity. No new flags.
> > >
> > > 100%
> >
> > The part I don't understand, however, is that I can't see an MMU notifier
> > anywhere on the successful remap_pfn_range() path. So if a driver is
> > using that interface to change the mapping properties of a VM_PFNMAP VMA,
> > how do we ensure that the guest doesn't use whatever stale mappings it's
> > faulted in previously? Did I just miss something?
>
> Generally mmu notifiers are for invalidation, not used when
> establishing new mappings.
>
> It is not legal to use remap_pfn_range() to replace already mapped
> PTEs. It can only be used during a fop mmap callback to establish the
> first mapping during VMA creation. Thus there can be no present
> mapping cached in a secondary and no need to invalidate.

Aha, thanks, I had completely missed that. I thought remap_pfn_range()
was quietly unmapping any existing mapping it ran into, but that's not
the case at all.

Cheers,

Will