Message-ID: <a228fc5a355aa8dc80314648a8c37a6500d34ebc.camel@intel.com>
Date: Fri, 16 May 2025 23:47:19 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "Huang, Kai" <kai.huang@...el.com>, "Zhao, Yan Y" <yan.y.zhao@...el.com>
CC: "Shutemov, Kirill" <kirill.shutemov@...el.com>, "Li, Xiaoyao"
	<xiaoyao.li@...el.com>, "Du, Fan" <fan.du@...el.com>, "Hansen, Dave"
	<dave.hansen@...el.com>, "david@...hat.com" <david@...hat.com>, "Li,
 Zhiquan1" <zhiquan1.li@...el.com>, "vbabka@...e.cz" <vbabka@...e.cz>,
	"tabba@...gle.com" <tabba@...gle.com>, "thomas.lendacky@....com"
	<thomas.lendacky@....com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "seanjc@...gle.com" <seanjc@...gle.com>,
	"Weiny, Ira" <ira.weiny@...el.com>, "michael.roth@....com"
	<michael.roth@....com>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
	"Yamahata, Isaku" <isaku.yamahata@...el.com>, "ackerleytng@...gle.com"
	<ackerleytng@...gle.com>, "binbin.wu@...ux.intel.com"
	<binbin.wu@...ux.intel.com>, "Peng, Chao P" <chao.p.peng@...el.com>,
	"quic_eberman@...cinc.com" <quic_eberman@...cinc.com>, "Annapurve, Vishal"
	<vannapurve@...gle.com>, "jroedel@...e.de" <jroedel@...e.de>, "Miao, Jun"
	<jun.miao@...el.com>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
	"pgonda@...gle.com" <pgonda@...gle.com>, "x86@...nel.org" <x86@...nel.org>
Subject: Re: [RFC PATCH 09/21] KVM: TDX: Enable 2MB mapping size after TD is
 RUNNABLE

On Fri, 2025-05-16 at 22:35 +0000, Huang, Kai wrote:
> > For TDs that expect #VE, guests access private memory before accepting it.
> > In that case, when KVM receives an EPT violation, there's no expected level
> > from the TDX module. Returning PG_LEVEL_4K at the end basically disables
> > huge pages for those TDs.
> 
> Just want to make sure I understand correctly:
> 
> Linux TDs always ACCEPT memory first before touching that memory, therefore
> KVM should always be able to get the accept level for Linux TDs.
> 
> In other words, returning PG_LEVEL_4K doesn't impact establishing large page
> mapping for Linux TDs.
> 
> However, other TDs may choose to touch memory first to receive #VE and then
> accept that memory.  Returning PG_LEVEL_2M allows those TDs to use large page
> mappings in SEPT.  Otherwise, returning PG_LEVEL_4K essentially disables large
> page for them (since we don't support PROMOTE for now?).
> 
> But in the above text you mentioned that, if doing so, because we choose to
> ignore splitting requests on read, returning 2M could result in *endless* EPT
> violations.
>
> So to me it seems you chose a design that could bring a performance gain for
> certain non-Linux TDs when they follow a certain behaviour, but otherwise
> could result in endless EPT violations in KVM.
>
> I am not sure how this is OK.  Or perhaps I am misunderstanding?

Good point. And suppose we just pass the 4k level when the EPT violation doesn't
carry the accept size, and force prefetch to 4k too, like this does. Then what
still needs fault path demotion? Guest double accept bugs?
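
For illustration only, the fallback under discussion could be sketched roughly
as below. This is a standalone sketch, not the patch: the enum is a local
stand-in that mirrors the kernel's PG_LEVEL_* names, and struct fault_info and
pick_map_level() are hypothetical names invented here. It assumes the only
input is whether the EPT violation carried an accept level.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins mirroring the kernel's page-level names; not the real header. */
enum pg_level {
	PG_LEVEL_4K = 1,
	PG_LEVEL_2M = 2,
};

/* Hypothetical summary of what an EPT violation tells KVM. */
struct fault_info {
	bool has_accept_level;      /* true if the fault came from an ACCEPT */
	enum pg_level accept_level; /* only meaningful when has_accept_level */
};

/*
 * Hypothetical policy: honor the accept level when the fault carries one,
 * otherwise fall back to 4K so a 2M mapping is never installed on a guess.
 */
static enum pg_level pick_map_level(const struct fault_info *f)
{
	if (!f->has_accept_level)
		return PG_LEVEL_4K;

	assert(f->accept_level == PG_LEVEL_4K || f->accept_level == PG_LEVEL_2M);
	return f->accept_level;
}
```

Under this sketch a TD that touches memory first to take a #VE always gets 4K,
while a TD that accepts first gets whatever level it accepted at.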
