Message-ID: <20241204222054.GA37229@cathedrallabs.org>
Date: Wed, 4 Dec 2024 17:20:54 -0500
From: Aristeu Rozanski <aris@...vo.org>
To: Koichiro Den <koichiro.den@...onical.com>
Cc: linux-mm@...ck.org, muchun.song@...ux.dev, akpm@...ux-foundation.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] hugetlb: prioritize surplus allocation from current node

On Thu, Dec 05, 2024 at 01:55:03AM +0900, Koichiro Den wrote:
> Previously, surplus allocations triggered by mmap were typically made
> from the node where the process was running. On a page fault, the area
> was reliably dequeued from the hugepage_freelists for that node.
> However, since commit 003af997c8a9 ("hugetlb: force allocating surplus
> hugepages on mempolicy allowed nodes"), dequeue_hugetlb_folio_vma() may
> fall back to other nodes unnecessarily even if there is no MPOL_BIND
> policy, causing folios to be dequeued from nodes other than the current
> one.
> 
> Also, allocating from the node where the current process is running is
> likely to result in a performance win, as mmap-ing processes often
> touch the area not so long after allocation. This change minimizes
> surprises for users relying on the previous behavior while maintaining
> the benefit introduced by the commit.
> 
> So, prioritize the node the current process is running on when possible.
> 
> Signed-off-by: Koichiro Den <koichiro.den@...onical.com>
> ---
>  mm/hugetlb.c | 20 +++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 5c8de0f5c760..0fa24e105202 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2463,7 +2463,13 @@ static int gather_surplus_pages(struct hstate *h, long delta)
>  	long needed, allocated;
>  	bool alloc_ok = true;
>  	int node;
> -	nodemask_t *mbind_nodemask = policy_mbind_nodemask(htlb_alloc_mask(h));
> +	nodemask_t *mbind_nodemask, alloc_nodemask;
> +
> +	mbind_nodemask = policy_mbind_nodemask(htlb_alloc_mask(h));
> +	if (mbind_nodemask)
> +		nodes_and(alloc_nodemask, *mbind_nodemask, cpuset_current_mems_allowed);
> +	else
> +		alloc_nodemask = cpuset_current_mems_allowed;
>  
>  	lockdep_assert_held(&hugetlb_lock);
>  	needed = (h->resv_huge_pages + delta) - h->free_huge_pages;
> @@ -2479,8 +2485,16 @@ static int gather_surplus_pages(struct hstate *h, long delta)
>  	spin_unlock_irq(&hugetlb_lock);
>  	for (i = 0; i < needed; i++) {
>  		folio = NULL;
> -		for_each_node_mask(node, cpuset_current_mems_allowed) {
> -			if (!mbind_nodemask || node_isset(node, *mbind_nodemask)) {
> +
> +		/* Prioritize current node */
> +		if (node_isset(numa_mem_id(), alloc_nodemask))
> +			folio = alloc_surplus_hugetlb_folio(h, htlb_alloc_mask(h),
> +					numa_mem_id(), NULL);
> +
> +		if (!folio) {
> +			for_each_node_mask(node, alloc_nodemask) {
> +				if (node == numa_mem_id())
> +					continue;
>  				folio = alloc_surplus_hugetlb_folio(h, htlb_alloc_mask(h),
>  						node, NULL);
>  				if (folio)

Acked-by: Aristeu Rozanski <aris@...vo.org>
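
For readers following the thread, the node ordering in the quoted hunk
can be modelled in user space. This is only a sketch under stated
assumptions, not the kernel code: nodemask_model_t, build_alloc_mask(),
alloc_prefer_current() and take_from_pool() are hypothetical names, a
plain bitmask stands in for nodemask_t, and a callback stands in for
alloc_surplus_hugetlb_folio(). It shows the two steps the patch
describes: intersect the MPOL_BIND mask (if any) with the cpuset mask,
then try the current node before the remaining allowed nodes.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8

/* Hypothetical stand-in for nodemask_t: bit n set means node n is allowed. */
typedef unsigned int nodemask_model_t;

/* Hypothetical stand-in for alloc_surplus_hugetlb_folio(): true on success. */
typedef bool (*try_alloc_fn)(int node, void *ctx);

/*
 * Mirror of the mask setup in the patch: with an MPOL_BIND mask, allow only
 * nodes present in both masks; without one, use the cpuset mask as-is.
 */
static nodemask_model_t build_alloc_mask(const nodemask_model_t *mbind,
					 nodemask_model_t cpuset_allowed)
{
	return mbind ? (*mbind & cpuset_allowed) : cpuset_allowed;
}

/*
 * Try the current node first when it is allowed, then every other allowed
 * node in ascending order. Returns the node that satisfied the request,
 * or -1 if no allowed node could.
 */
static int alloc_prefer_current(nodemask_model_t allowed, int cur_node,
				try_alloc_fn try_alloc, void *ctx)
{
	assert(cur_node >= 0 && cur_node < MAX_NODES);

	/* Prioritize the current node. */
	if (((allowed >> cur_node) & 1u) && try_alloc(cur_node, ctx))
		return cur_node;

	/* Fall back to the remaining allowed nodes, skipping the current one. */
	for (int node = 0; node < MAX_NODES; node++) {
		if (node == cur_node || !((allowed >> node) & 1u))
			continue;
		if (try_alloc(node, ctx))
			return node;
	}
	return -1;
}

/*
 * Example allocator for the model: ctx points to an int array of per-node
 * free counts; the call succeeds and consumes one page if the node has any.
 */
static bool take_from_pool(int node, void *ctx)
{
	int *free_pages = ctx;

	if (free_pages[node] <= 0)
		return false;
	free_pages[node]--;
	return true;
}
```

With nodes 0 and 1 allowed and the caller on node 1, the model takes
from node 1 until it is empty and only then falls back to node 0. If a
bind mask excludes node 1, the current node is skipped entirely.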

-- 
Aristeu

