[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240814144307.GP2032816@nvidia.com>
Date: Wed, 14 Aug 2024 11:43:07 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Peter Xu <peterx@...hat.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Oscar Salvador <osalvador@...e.de>,
Axel Rasmussen <axelrasmussen@...gle.com>,
linux-arm-kernel@...ts.infradead.org, x86@...nel.org,
Will Deacon <will@...nel.org>, Gavin Shan <gshan@...hat.com>,
Paolo Bonzini <pbonzini@...hat.com>, Zi Yan <ziy@...dia.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Catalin Marinas <catalin.marinas@....com>,
Ingo Molnar <mingo@...hat.com>,
Alistair Popple <apopple@...dia.com>,
Borislav Petkov <bp@...en8.de>,
David Hildenbrand <david@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>, kvm@...r.kernel.org,
Dave Hansen <dave.hansen@...ux.intel.com>,
Alex Williamson <alex.williamson@...hat.com>,
Yan Zhao <yan.y.zhao@...el.com>
Subject: Re: [PATCH 00/19] mm: Support huge pfnmaps
On Wed, Aug 14, 2024 at 07:35:01AM -0700, Sean Christopherson wrote:
> On Wed, Aug 14, 2024, Jason Gunthorpe wrote:
> > On Fri, Aug 09, 2024 at 12:08:50PM -0400, Peter Xu wrote:
> > > Overview
> > > ========
> > >
> > > This series is based on mm-unstable, commit 98808d08fc0f of Aug 7th latest,
> > > plus dax 1g fix [1]. Note that this series should also apply if without
> > > the dax 1g fix series, but when without it, mprotect() will trigger similar
> > > errors otherwise on PUD mappings.
> > >
> > > This series implements huge pfnmaps support for mm in general. Huge pfnmap
> > > allows e.g. VM_PFNMAP vmas to map in either PMD or PUD levels, similar to
> > > what we do with dax / thp / hugetlb so far to benefit from TLB hits. Now
> > > we extend that idea to PFN mappings, e.g. PCI MMIO bars where it can grow
> > > as large as 8GB or even bigger.
> >
> > FWIW, I've started to hear people talk about needing this in the VFIO
> > context with VMs.
> >
> > vfio/iommufd will reassemble the contiguous range from the 4k PFNs to
> > setup the IOMMU, but KVM is not able to do it so reliably.
>
> Heh, KVM should very reliably do the exact opposite, i.e. KVM should never create
> a huge page unless the mapping is huge in the primary MMU. And that's very much
> by design, as KVM has no knowledge of what actually resides at a given PFN, and
> thus can't determine whether or not its safe to create a huge page if KVM happens
> to realize the VM has access to a contiguous range of memory.
Oh? Someone told me recently x86 kvm had code to reassemble contiguous
ranges?
I don't quite understand your safety argument, if the VMA has 1G of
contiguous physical memory described with 4K it is definitely safe for
KVM to reassemble that same memory and represent it as 1G.
Jason
Powered by blists - more mailing lists