Message-ID: <diqzldtsfsdr.fsf@ackerleytng-ctop.c.googlers.com>
Date: Wed, 26 Feb 2025 18:55:12 +0000
From: Ackerley Tng <ackerleytng@...gle.com>
To: Ackerley Tng <ackerleytng@...gle.com>
Cc: peterx@...hat.com, tabba@...gle.com, quic_eberman@...cinc.com, 
	roypat@...zon.co.uk, jgg@...dia.com, david@...hat.com, rientjes@...gle.com, 
	fvdl@...gle.com, jthoughton@...gle.com, seanjc@...gle.com, 
	pbonzini@...hat.com, zhiquan1.li@...el.com, fan.du@...el.com, 
	jun.miao@...el.com, isaku.yamahata@...el.com, muchun.song@...ux.dev, 
	mike.kravetz@...cle.com, erdemaktas@...gle.com, vannapurve@...gle.com, 
	qperret@...gle.com, jhubbard@...dia.com, willy@...radead.org, 
	shuah@...nel.org, brauner@...nel.org, bfoster@...hat.com, 
	kent.overstreet@...ux.dev, pvorel@...e.cz, rppt@...nel.org, 
	richard.weiyang@...il.com, anup@...infault.org, haibo1.xu@...el.com, 
	ajones@...tanamicro.com, vkuznets@...hat.com, maciej.wieczor-retman@...el.com, 
	pgonda@...gle.com, oliver.upton@...ux.dev, linux-kernel@...r.kernel.org, 
	linux-mm@...ck.org, kvm@...r.kernel.org, linux-kselftest@...r.kernel.org, 
	linux-fsdevel@...ck.org
Subject: Re: [RFC PATCH 14/39] KVM: guest_memfd: hugetlb: initialization and cleanup

Ackerley Tng <ackerleytng@...gle.com> writes:

> Peter Xu <peterx@...hat.com> writes:
>
>> On Tue, Sep 10, 2024 at 11:43:45PM +0000, Ackerley Tng wrote:
>>> +/**
>>> + * Removes folios in range [@lstart, @lend) from page cache of inode, updates
>>> + * inode metadata and hugetlb reservations.
>>> + */
>>> +static void kvm_gmem_hugetlb_truncate_folios_range(struct inode *inode,
>>> +						   loff_t lstart, loff_t lend)
>>> +{
>>> +	struct kvm_gmem_hugetlb *hgmem;
>>> +	struct hstate *h;
>>> +	int gbl_reserve;
>>> +	int num_freed;
>>> +
>>> +	hgmem = kvm_gmem_hgmem(inode);
>>> +	h = hgmem->h;
>>> +
>>> +	num_freed = kvm_gmem_hugetlb_filemap_remove_folios(inode->i_mapping,
>>> +							   h, lstart, lend);
>>> +
>>> +	gbl_reserve = hugepage_subpool_put_pages(hgmem->spool, num_freed);
>>> +	hugetlb_acct_memory(h, -gbl_reserve);
>>
>> I wonder whether this is needed, and whether hugetlb_acct_memory() needs to
>> be exported in the other patch.
>>
>> IIUC subpools manages the global reservation on its own when min_pages is
>> set (which should be gmem's case, where both max/min set to gmem size).
>> That's in hugepage_put_subpool() -> unlock_or_release_subpool().
>>
>
> Thank you for pointing this out! You are right and I will remove
> hugetlb_acct_memory() from here.
>

I looked further at the folio cleanup process in free_huge_folio() and I
realized I should be returning the pages to the subpool via
free_huge_folio(). There should be no call to
hugepage_subpool_put_pages() directly from this truncate function.

To use free_huge_folio() to return the pages to the subpool, I will
clear the restore_reserve flag as soon as guest_memfd allocates a folio,
so that every guest_memfd hugetlb folio has restore_reserve cleared.

With the restore_reserve flag cleared, free_huge_folio() will do
hugepage_subpool_put_pages(), and then restore the reservation in hstate
as well.

Returning the folio to the subpool on freeing is the correct approach:
if/when the folio_put() callback is used, the filemap may not hold the
last refcount on the folio, so truncation is not necessarily the point
at which the folio should be returned to the subpool.
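
To illustrate the ordering described above, here is a rough userspace
model of the accounting. It is only a sketch: every toy_* name is
hypothetical, the structs are stand-ins rather than the kernel's
hugepage_subpool/hstate/folio, and the put-pages arithmetic is a
simplified reading of the subpool logic (min_pages set, locking and
max_pages handling omitted), not a copy of mm/hugetlb.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins (hypothetical), NOT the kernel structures. */
struct toy_subpool {
	long min_pages;		/* reservations the subpool guarantees */
	long used_pages;	/* pages currently handed out */
	long rsv_pages;		/* reservations currently held by the subpool */
};

struct toy_hstate {
	long resv_huge_pages;	/* global reservation count */
};

struct toy_folio {
	struct toy_subpool *spool;
	bool restore_reserve;	/* always false in the proposed gmem scheme */
	int refcount;
};

/*
 * Simplified model of returning pages to a subpool. Returns how many
 * global reservations may be dropped; 0 means the subpool is below its
 * minimum and keeps the reservation for itself.
 */
static long toy_subpool_put_pages(struct toy_subpool *sp, long delta)
{
	long ret = delta;

	sp->used_pages -= delta;
	if (sp->used_pages < sp->min_pages) {
		if (sp->rsv_pages + delta <= sp->min_pages)
			ret = 0;
		else
			ret = sp->rsv_pages + delta - sp->min_pages;

		sp->rsv_pages += delta;
		if (sp->rsv_pages > sp->min_pages)
			sp->rsv_pages = sp->min_pages;
	}
	return ret;
}

/* Model of the free path: subpool first, then the hstate reservation. */
static void toy_free_huge_folio(struct toy_hstate *h, struct toy_folio *f)
{
	bool restore = f->restore_reserve;

	f->restore_reserve = false;
	if (!restore && toy_subpool_put_pages(f->spool, 1) == 0)
		restore = true;
	if (restore)
		h->resv_huge_pages++;
}

/* Accounting only changes when the last reference goes away. */
static void toy_folio_put(struct toy_hstate *h, struct toy_folio *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0)
		toy_free_huge_folio(h, f);
}
```

In this model, take a folio with two references (the filemap plus one
other holder) and restore_reserve cleared. The first toy_folio_put(),
standing in for truncation, changes no counters. Only the final put
returns the page to the subpool and bumps the reservation count, which
is why the truncate path should not call hugepage_subpool_put_pages()
itself.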

>>> +
>>> +	spin_lock(&inode->i_lock);
>>> +	inode->i_blocks -= blocks_per_huge_page(h) * num_freed;
>>> +	spin_unlock(&inode->i_lock);
>>> +}
