Message-ID: <20250630122501.GQ167785@nvidia.com>
Date: Mon, 30 Jun 2025 09:25:01 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Ankit Agrawal <ankita@...dia.com>
Cc: Will Deacon <will@...nel.org>, "maz@...nel.org" <maz@...nel.org>,
"oliver.upton@...ux.dev" <oliver.upton@...ux.dev>,
"joey.gouly@....com" <joey.gouly@....com>,
"suzuki.poulose@....com" <suzuki.poulose@....com>,
"yuzenghui@...wei.com" <yuzenghui@...wei.com>,
"catalin.marinas@....com" <catalin.marinas@....com>,
"ryan.roberts@....com" <ryan.roberts@....com>,
"shahuang@...hat.com" <shahuang@...hat.com>,
"lpieralisi@...nel.org" <lpieralisi@...nel.org>,
"david@...hat.com" <david@...hat.com>,
"ddutile@...hat.com" <ddutile@...hat.com>,
"seanjc@...gle.com" <seanjc@...gle.com>,
Aniket Agashe <aniketa@...dia.com>, Neo Jia <cjia@...dia.com>,
Kirti Wankhede <kwankhede@...dia.com>,
Krishnakant Jaju <kjaju@...dia.com>,
"Tarun Gupta (SW-GPU)" <targupta@...dia.com>,
Vikram Sethi <vsethi@...dia.com>, Andy Currid <acurrid@...dia.com>,
Alistair Popple <apopple@...dia.com>,
John Hubbard <jhubbard@...dia.com>, Dan Williams <danw@...dia.com>,
Zhi Wang <zhiw@...dia.com>, Matt Ochs <mochs@...dia.com>,
Uday Dhoke <udhoke@...dia.com>, Dheeraj Nigam <dnigam@...dia.com>,
"alex.williamson@...hat.com" <alex.williamson@...hat.com>,
"sebastianene@...gle.com" <sebastianene@...gle.com>,
"coltonlewis@...gle.com" <coltonlewis@...gle.com>,
"kevin.tian@...el.com" <kevin.tian@...el.com>,
"yi.l.liu@...el.com" <yi.l.liu@...el.com>,
"ardb@...nel.org" <ardb@...nel.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"gshan@...hat.com" <gshan@...hat.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"tabba@...gle.com" <tabba@...gle.com>,
"qperret@...gle.com" <qperret@...gle.com>,
"kvmarm@...ts.linux.dev" <kvmarm@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>,
"maobibo@...ngson.cn" <maobibo@...ngson.cn>,
"pbonzini@...hat.com" <pbonzini@...hat.com>
Subject: Re: [PATCH v9 3/6] KVM: arm64: Block cacheable PFNMAP mapping
On Mon, Jun 30, 2025 at 01:56:43AM +0000, Ankit Agrawal wrote:
> > Sorry for the drive-by comment, but I was looking at this old series from
> > Paolo (look at the cover letter and patch 5):
> >
> > https://lore.kernel.org/r/20250109133817.314401-1-pbonzini@redhat.com
> >
> > in which he points out that the arm64 get_vma_page_shift() function
> > incorrectly assumes that a VM_PFNMAP VMA is physically contiguous, which
> > may not be the case if a driver calls remap_pfn_range() to mess around
> > with mappings within the VMA. I think that implies that the optimisation
> > in 2aa53d68cee6 ("KVM: arm64: Try stage2 block mapping for host device
> > MMIO") is unsound.
>
> Hm yeah, that does seem problematic. Perhaps we need a new
> vma flag that could help the driver communicate to the KVM that the
> mapping is contiguous and it can go ahead with the optimization?
> E.g. something similar to VM_ALLOW_ANY_UNCACHED.

I think Paolo has the right direction - remove any attempts by KVM to
expand contiguity, it should only copy the primary's PTEs and rely on
the primary to discover contiguity. No new flags.
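
To make the contrast concrete, here is a rough userspace toy model, not
kernel code. Every name in it is hypothetical, and toy_order_from_vma()
is only loosely modeled on the idea behind get_vma_page_shift(), not on
its actual implementation. It shows a VMA-based size guess next to one
that simply mirrors the primary's entry:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model of one primary (host) page table entry. */
struct toy_pte {
	uint64_t pfn;		/* physical frame the primary maps here */
	unsigned int order;	/* log2(pages) covered by this primary entry */
};

/*
 * VMA-based guess: pick the largest order (up to max_order, assumed < 64)
 * whose naturally aligned block around fault_pg fits inside the VMA
 * [vma_start_pg, vma_end_pg). It never looks at what is actually mapped,
 * so it silently assumes the VMA is physically contiguous.
 */
static unsigned int toy_order_from_vma(uint64_t vma_start_pg,
				       uint64_t vma_end_pg,
				       uint64_t fault_pg,
				       unsigned int max_order)
{
	assert(vma_start_pg <= fault_pg && fault_pg < vma_end_pg);

	for (unsigned int order = max_order; order > 0; order--) {
		uint64_t size = 1ULL << order;
		uint64_t base = fault_pg & ~(size - 1);

		if (base >= vma_start_pg && base + size <= vma_end_pg)
			return order;
	}
	return 0;
}

/* PTE-based choice: mirror whatever size the primary already uses. */
static unsigned int toy_order_from_primary(struct toy_pte primary)
{
	return primary.order;
}
```

For a VMA covering pages [0, 1024) that a driver has populated with
unrelated single pages, the primary entries all have order 0, yet the
VMA-based guess with max_order 9 still picks order 9. The PTE-based
choice cannot make that mistake because it never expands past what the
primary mapped.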

Instead we need to keep pushing forward with Peter Xu's work to have
PFNMAPs use large PTEs. I believe ARM still has some missing parts for
full support, but overall it is a far better direction.

> > But it got me thinking -- given that remap_pfn_range() also takes a 'prot'
> > argument, how do we ensure that this is reflected in the guest? It feels
> > a bit dodgy to rely on drivers always passing 'vma->vm_page_prot'.

But broadly, if you take that perspective, KVM has a misdesign. KVM
should be getting the VMA's PTE so that the S2 PTE can duplicate the
page-by-page flags, instead of only copying the writable bit.

So, let's call this out of scope for this series. ARM64 KVM does not
support page-by-page cacheability, it never has, and there is no known
use case right now. The only user of VMA_CACHED PFNMAPs is being
consistent.

Jason