Message-ID: <e69815db698474e113dec16bd33116e54cb21c2a.camel@intel.com>
Date: Mon, 19 Jan 2026 08:49:58 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "seanjc@...gle.com" <seanjc@...gle.com>, "Zhao, Yan Y"
<yan.y.zhao@...el.com>
CC: "Du, Fan" <fan.du@...el.com>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"Li, Xiaoyao" <xiaoyao.li@...el.com>, "Hansen, Dave" <dave.hansen@...el.com>,
"thomas.lendacky@....com" <thomas.lendacky@....com>, "tabba@...gle.com"
<tabba@...gle.com>, "vbabka@...e.cz" <vbabka@...e.cz>, "david@...nel.org"
<david@...nel.org>, "michael.roth@....com" <michael.roth@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"binbin.wu@...ux.intel.com" <binbin.wu@...ux.intel.com>, "Peng, Chao P"
<chao.p.peng@...el.com>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
"ackerleytng@...gle.com" <ackerleytng@...gle.com>, "kas@...nel.org"
<kas@...nel.org>, "nik.borisov@...e.com" <nik.borisov@...e.com>, "Weiny, Ira"
<ira.weiny@...el.com>, "francescolavra.fl@...il.com"
<francescolavra.fl@...il.com>, "Yamahata, Isaku" <isaku.yamahata@...el.com>,
"sagis@...gle.com" <sagis@...gle.com>, "Gao, Chao" <chao.gao@...el.com>,
"Edgecombe, Rick P" <rick.p.edgecombe@...el.com>, "Miao, Jun"
<jun.miao@...el.com>, "Annapurve, Vishal" <vannapurve@...gle.com>,
"jgross@...e.com" <jgross@...e.com>, "pgonda@...gle.com" <pgonda@...gle.com>,
"x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCH v3 11/24] KVM: x86/mmu: Introduce
kvm_split_cross_boundary_leafs()
On Mon, 2026-01-19 at 08:35 +0000, Huang, Kai wrote:
> On Mon, 2026-01-19 at 09:28 +0800, Zhao, Yan Y wrote:
> > > I find the "cross_boundary" terminology extremely confusing. I also dislike
> > > the concept itself, in the sense that it shoves a weird, specific concept into
> > > the guts of the TDP MMU.
> > > The other wart is that it's inefficient when punching a large hole. E.g. say
> > > there's a 16TiB guest_memfd instance (no idea if that's even possible), and then
> > > userspace punches a 12TiB hole. Walking all ~12TiB just to _maybe_ split the
> > > head and tail pages is asinine.
> > That's a reasonable concern. I actually thought about it.
> > My consideration was as follows:
> > Currently, we don't have such large areas. Usually, the conversion ranges are
> > less than 1GB. Though the initial conversion, which converts all memory from
> > private to shared, may cover a wide range, there are usually no mappings at
> > that stage. So, the traversal should be very fast (it doesn't even need to
> > descend to the 2MB/1GB level).
> >
> > If the caller of kvm_split_cross_boundary_leafs() finds it needs to convert a
> > very large range at runtime, it can optimize by invoking the API twice:
> > once for range [start, ALIGN(start, 1GB)), and
> > once for range [ALIGN_DOWN(end, 1GB), end).
> >
> > I can also implement this optimization within kvm_split_cross_boundary_leafs()
> > by checking the range size if you think that would be better.
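
[For illustration, the two-invocation approach described above might look like
the sketch below. It assumes a simplified (kvm, start, end) GFN-based signature
for kvm_split_cross_boundary_leafs(), and the caller-side helper name is
invented; neither is taken from the series.]

/*
 * Hypothetical caller-side sketch (not from the series): split only the
 * head and tail of a large conversion range.  A leaf crossing the range
 * boundary must live within 1GB of 'start' or 'end', so the 1GB-aligned
 * middle of the range can be skipped entirely.
 */
static int split_boundaries_only(struct kvm *kvm, gfn_t start, gfn_t end)
{
	const gfn_t nr_1g = KVM_PAGES_PER_HPAGE(PG_LEVEL_1G);
	gfn_t head_end = min(end, ALIGN(start, nr_1g));
	gfn_t tail_start = max(start, ALIGN_DOWN(end, nr_1g));
	int ret = 0;

	/* Head: [start, ALIGN(start, 1GB)), if not empty. */
	if (head_end > start)
		ret = kvm_split_cross_boundary_leafs(kvm, start, head_end);

	/* Tail: [ALIGN_DOWN(end, 1GB), end), unless it overlaps the head. */
	if (!ret && end > tail_start && tail_start >= head_end)
		ret = kvm_split_cross_boundary_leafs(kvm, tail_start, end);

	return ret;
}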
>
> I am not sure why we even need kvm_split_cross_boundary_leafs(), if you
> want to do that optimization.
>
> I think I raised this in v2 and asked why not just let the caller figure
> out the ranges to split for a given range (see the end of [*]): the "cross
> boundary" case can only happen at the beginning and the end of the given
> range, if at all.
>
> [*]:
> https://lore.kernel.org/all/35fd7d70475d5743a3c45bc5b8118403036e439b.camel@intel.com/
Hmm.. thinking again: if you have multiple places needing to do this, then
kvm_split_cross_boundary_leafs() may still serve as a helper to calculate the
ranges to split.
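
[Concretely, that reading could reduce kvm_split_cross_boundary_leafs() to the
same head/tail arithmetic as the earlier sketch, sitting on top of a generic
splitter. kvm_tdp_mmu_split_leafs() below is a placeholder name, not an
existing function, and the (kvm, start, end) signature is again assumed.]

/*
 * Illustrative only: the "cross boundary" knowledge lives here as pure
 * range calculation; the underlying walk stays generic.
 * kvm_tdp_mmu_split_leafs() is a made-up name for whatever primitive
 * actually splits all leafs in a GFN range.
 */
int kvm_split_cross_boundary_leafs(struct kvm *kvm, gfn_t start, gfn_t end)
{
	const gfn_t nr_1g = KVM_PAGES_PER_HPAGE(PG_LEVEL_1G);
	gfn_t head_end = min(end, ALIGN(start, nr_1g));
	gfn_t tail_start = max(start, ALIGN_DOWN(end, nr_1g));
	int ret = 0;

	if (head_end > start)
		ret = kvm_tdp_mmu_split_leafs(kvm, start, head_end);
	if (!ret && end > tail_start && tail_start >= head_end)
		ret = kvm_tdp_mmu_split_leafs(kvm, tail_start, end);

	return ret;
}

[That shape would keep the TDP MMU walk itself free of the boundary concept
while still giving multiple call sites a single place for the alignment math.]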