lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <92f4d9766a5b381abeac6c4f6afcf0e4cba183d9.camel@intel.com>
Date: Thu, 11 Dec 2025 00:07:49 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "kvm@...r.kernel.org" <kvm@...r.kernel.org>, "linux-coco@...ts.linux.dev"
	<linux-coco@...ts.linux.dev>, "Huang, Kai" <kai.huang@...el.com>, "Li,
 Xiaoyao" <xiaoyao.li@...el.com>, "Hansen, Dave" <dave.hansen@...el.com>,
	"Zhao, Yan Y" <yan.y.zhao@...el.com>, "Wu, Binbin" <binbin.wu@...el.com>,
	"kas@...nel.org" <kas@...nel.org>, "seanjc@...gle.com" <seanjc@...gle.com>,
	"mingo@...hat.com" <mingo@...hat.com>, "pbonzini@...hat.com"
	<pbonzini@...hat.com>, "tglx@...utronix.de" <tglx@...utronix.de>, "Yamahata,
 Isaku" <isaku.yamahata@...el.com>, "nik.borisov@...e.com"
	<nik.borisov@...e.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "Annapurve, Vishal" <vannapurve@...gle.com>,
	"Gao, Chao" <chao.gao@...el.com>, "bp@...en8.de" <bp@...en8.de>,
	"x86@...nel.org" <x86@...nel.org>
CC: "kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>
Subject: Re: [PATCH v4 06/16] x86/virt/tdx: Improve PAMT refcounts allocation
 for sparse memory

On Thu, 2025-11-27 at 09:36 +0200, Nikolay Borisov wrote:
> I agree with your analysis but this needs to be described not only in 
> the commit message but also as a code comment because you intentionally 
> omit locking since that particular pte (at that point) can only have a 
> single user so no race conditions are possible.

While commenting and writing up the log reasoning for why this is safe, I ended
up revisiting why we need to do all this manual PTE modification in the first
place. 

In Kiryl's v2 he had the vmalloc area allocated with VM_IOREMAP. Kai asked why,
and seemingly as a result, it was changed to VM_SPARSE. It turns out the
VM_SPARSE flag is a newer thing for dealing with sparsely populated vmalloc
address space mappings, exactly like we have here. It comes with some helpers
for mapping and unmapping sparse vmalloc ranges. So I played around with these a
bit.

The current apply_to_page_range() version is optimally efficient, but the
allocation is a one-time operation, and the freeing is actually only called in
an error path. I'm not sure it needs to be optimal.

If you use the helpers to populate, you pretty much just need to allocate all
the pages up front, and then call vm_area_map_pages() to map them. To unmap (the
error path), the code pattern is to iterate through the mapping a page at a
time, fetch the page with vmalloc_to_page(), unmap and free the page. At most
this is 2,097,152 iterations. Not fast, but not going to hang boot either.

So rather than the try to justify the locking, I'm thinking to go with a
stupider, simpler mapping/unmapping method that uses the ready made VM_SPARSE
helpers.

I hate to change things at this point, but I think the discussed reasoning is
going to beg the question of why it needs to be complicated.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ