lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 26 Mar 2024 12:54:10 +0000
From: Will Deacon <will@...nel.org>
To: Nanyong Sun <sunnanyong@...wei.com>
Cc: David Rientjes <rientjes@...gle.com>,
	Catalin Marinas <catalin.marinas@....com>,
	Matthew Wilcox <willy@...radead.org>, muchun.song@...ux.dev,
	Andrew Morton <akpm@...ux-foundation.org>,
	anshuman.khandual@....com, wangkefeng.wang@...wei.com,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, Yu Zhao <yuzhao@...gle.com>,
	Yosry Ahmed <yosryahmed@...gle.com>,
	Sourav Panda <souravpanda@...gle.com>
Subject: Re: [PATCH v3 0/3] A Solution to Re-enable hugetlb vmemmap optimize

On Mon, Mar 25, 2024 at 11:24:34PM +0800, Nanyong Sun wrote:
> On 2024/3/14 7:32, David Rientjes wrote:
> > On Thu, 8 Feb 2024, Will Deacon wrote:
> > > > How about take a new lock with irq disabled during BBM, like:
> > > > 
> > > > +void vmemmap_update_pte(unsigned long addr, pte_t *ptep, pte_t pte)
> > > > +{
> > > > +     (NEW_LOCK);
> > > > +    pte_clear(&init_mm, addr, ptep);
> > > > +    flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> > > > +    set_pte_at(&init_mm, addr, ptep, pte);
> > > > +    spin_unlock_irq(NEW_LOCK);
> > > > +}
> > > I really think the only maintainable way to achieve this is to avoid the
> > > possibility of a fault altogether.
> > > 
> > Nanyong, are you still actively working on making HVO possible on arm64?
> > 
> > This would yield a substantial memory savings on hosts that are largely
> > configured with hugetlbfs.  In our case, the size of this hugetlbfs pool
> > is actually never changed after boot, but it sounds from the thread that
> > there was an idea to make HVO conditional on FEAT_BBM.  Is this being
> > pursued?
> > 
> > If so, any testing help needed?
> I'm afraid that FEAT_BBM may not solve the problem here, because from Arm
> ARM,
> I see that FEAT_BBM is only used for changing block size. Therefore, in this
> HVO feature,
> it can work in the split PMD stage, that is, BBM can be avoided in
> vmemmap_split_pmd,
> but in the subsequent vmemmap_remap_pte, the Output address of PTE still
> needs to be
> changed. I'm afraid FEAT_BBM is not competent for this stage. Perhaps my
> understanding
> of ARM FEAT_BBM is wrong, and I hope someone can correct me.
> Actually, the solution I first considered was to use the stop_machine
> method, but we have
> products that rely on /proc/sys/vm/nr_overcommit_hugepages to dynamically
> use hugepages,
> so I have to consider performance issues. If your product does not change
> the amount of huge
> pages after booting, using stop_machine() may be a feasible way.
> So far, I still haven't come up with a good solution.

Oh, I hadn't appreciated that you needed to remap the memmap live. How
do you synchronise the two copies in that case? I think we (i.e. the arch
folks) probably need some more explanation on exactly who can race with
what here, otherwise I don't grok how this can work.

Thanks,

Will

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ