lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 23 Feb 2024 15:19:34 +0100
From: Oscar Salvador <osalvador@...e.de>
To: Baolin Wang <baolin.wang@...ux.alibaba.com>
Cc: akpm@...ux-foundation.org, muchun.song@...ux.dev, david@...hat.com,
	linmiaohe@...wei.com, naoya.horiguchi@....com, mhocko@...nel.org,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 2/3] mm: hugetlb: make the hugetlb migration strategy
 consistent

On Fri, Feb 23, 2024 at 10:56:48AM +0800, Baolin Wang wrote:
 
> I previously considered passing the MR_reason argument to the
> htlb_modify_alloc_mask(), which is only used by hugetlb migration.
> But in alloc_hugetlb_folio_nodemask(), if there are available hugetlb on
> other nodes, we should allow migrating, that will not break the per-node
> hugetlb pool.
> 
> That's why I just change the gfp_mask for allocating a new hguetlb when
> migration, that can break the pool.

Code-wise I think this is good, but I'm having some feelings
about where filter out the mask.
Ok, I'm trying to get my head around this.
It's been a while since I looked into hugetlb code, so here we go.

You mentioned that the only reason not to fiddle with gfp_mask before calling
in alloc_hugetlb_folio_nodemask(), was that we might be able to find a hugetlb
page in another node, and that that's ok because since all nodes remain with
the same number of hugetlb pages, per-node pool doesn't get broken.

Now, I see that dequeue_hugetlb_folio_nodemask() first tries to get the zonelist
of the preferred node, and AFAICS, if it has !GFP_THISNODE, it should also
get the zonelists of all other nodes, so we might fallback.

In the hope of finding a way to be able to filter out in htlb_modify_alloc_mask(),
I was trying to see whether we could skip GFP_THISNODE in
dequeue_hugetlb_folio_nodemask() but no because we might end up dequeueing
a hugetlb which sits in another node, while we really specified __GFP_THISNODE.

The only way might be to somehow decouple dequeue_hugetlb_folio_nodemask()
from alloc_hugetlb_folio_nodemask() and do some kind of gfp modification
between the two calls.

Another thing I dislike is the "-1" in alloc_hugetlb_folio_vma().
I think at least it deserves a comment like "Passing -1 will make us stick
to GFP_THISNODE".
Although that is another thing, we will pass "-1" which forces GFP_THISNODE
when allocating a newly fresh hugetlb page, but in dequeue_hugetlb_folio_nodemask()
we might get a page from a different node.
That doesn't break per-node pool, but it is somehow odd?
 

-- 
Oscar Salvador
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ