Message-ID: <d1671959-74a4-8ea5-81f0-539df8d9c0f0@huawei.com>
Date: Sat, 27 Jan 2024 13:04:15 +0800
From: Nanyong Sun <sunnanyong@...wei.com>
To: Catalin Marinas <catalin.marinas@....com>
CC: <will@...nel.org>, <mike.kravetz@...cle.com>, <muchun.song@...ux.dev>,
<akpm@...ux-foundation.org>, <anshuman.khandual@....com>,
<willy@...radead.org>, <wangkefeng.wang@...wei.com>,
<linux-arm-kernel@...ts.infradead.org>, <linux-kernel@...r.kernel.org>,
<linux-mm@...ck.org>
Subject: Re: [PATCH v3 0/3] A Solution to Re-enable hugetlb vmemmap optimize
On 2024/1/26 2:06, Catalin Marinas wrote:
> On Sat, Jan 13, 2024 at 05:44:33PM +0800, Nanyong Sun wrote:
>> HVO was previously disabled on arm64 [1] due to the lack of the necessary
>> BBM (break-before-make) logic when changing page tables.
>> This set of patches fixes this by adding the necessary BBM sequence when
>> changing page tables, and by supporting vmemmap page fault handling to
>> fix up kernel address translation faults when vmemmap is concurrently accessed.
> I'm not keen on this approach. I'm not even sure it's safe. In the
> second patch, you take the init_mm.page_table_lock on the fault path but
> are we sure this is unlocked when the fault was taken?
I think this situation is impossible. In the implementation of the
second patch, vmemmap_update_pte() already holds init_mm.page_table_lock
for the whole window in which the page table entry is invalid (the
window during which a page fault may occur), and only unlocks it once
the page table update is done. So another thread cannot hold
init_mm.page_table_lock and trigger such a page fault at the same time.
If I have missed any points in my thinking, please correct me. Thank you.
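
Concretely, the serialization looks roughly like the sketch below. This
is only an illustration of the locking order I am describing, not the
literal code from the patches; vmemmap_pte_is_valid() is a made-up
helper standing in for the real validity check:

#include <linux/mm.h>
#include <asm/tlbflush.h>

/*
 * Sketch only: illustrates the locking order described above, not the
 * exact code from the series.
 */
static void vmemmap_update_pte(unsigned long addr, pte_t *ptep, pte_t pte)
{
	spin_lock(&init_mm.page_table_lock);

	/* Break: invalidate the old entry and flush the TLB. */
	pte_clear(&init_mm, addr, ptep);
	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);

	/*
	 * A concurrent access to this vmemmap page now faults; the
	 * fault handler below spins on init_mm.page_table_lock, so it
	 * cannot complete before the new entry is in place.
	 */

	/* Make: install the new entry. */
	set_pte_at(&init_mm, addr, ptep, pte);

	spin_unlock(&init_mm.page_table_lock);
}

/* Called from the kernel fault path for a faulting vmemmap address. */
static bool vmemmap_handle_page_fault(unsigned long addr)
{
	bool valid;

	/* Serializes against the updater above. */
	spin_lock(&init_mm.page_table_lock);
	valid = vmemmap_pte_is_valid(addr);	/* made-up helper */
	spin_unlock(&init_mm.page_table_lock);

	/* If the entry is valid again, the access can simply be retried. */
	return valid;
}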
> Basically you can
> get a fault anywhere something accesses a struct page.
>
> How often is this code path called? I wonder whether a stop_machine()
> approach would be simpler.
It is called whenever hugetlb pages are allocated or freed. We cannot
restrict users to allocating or freeing hugetlb pages only at boot, or
only when no workload is running on the other CPUs, so if we used
stop_machine(), it would be triggered 8 times for every 2M hugepage and
4096 times for every 1G hugepage, which is probably too expensive.
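(For reference, those counts follow from the vmemmap size, assuming 4K
base pages and a 64-byte struct page: a 2M hugepage covers 512 struct
pages, i.e. 512 * 64 = 32K of vmemmap = 8 base pages, and a 1G hugepage
covers 262144 * 64 bytes = 16M = 4096 base pages, with roughly one PTE
update per vmemmap base page.)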
I also see that on x86, optimizations such as batched TLB flushing have
been done to improve performance, which suggests that some users do
care about the performance of hugetlb allocation:
https://lwn.net/ml/linux-kernel/20230905214412.89152-1-mike.kravetz@oracle.com/
> Andrew, I'd suggest we drop these patches from the mm tree for the time
> being. They haven't received much review from the arm64 folk. Thanks.
>