Message-ID: <b0d9f82f-6e29-4613-a7ab-183e888a8eff@redhat.com>
Date: Fri, 10 Jan 2025 09:14:47 +0100
From: David Hildenbrand <david@...hat.com>
To: yangge1116@....com, akpm@...ux-foundation.org
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org, 21cnbao@...il.com,
 baolin.wang@...ux.alibaba.com, liuzixing@...on.cn
Subject: Re: [PATCH] mm/hugetlb: prevent reuse of isolated free hugepages

On 10.01.25 03:56, yangge1116@....com wrote:
> From: yangge <yangge1116@....com>
> 
> When there are free hugetlb folios in the hugetlb pool, during the
> migration of in-use hugetlb folios, new folios is allocated from
> the free hugetlb pool. After the migration is completed, the old
> folios are released back to the free hugetlb pool. However, after
> the old folios are released to the free hugetlb pool, they may be
> reallocated. When replace_free_hugepage_folios() is executed later,
> it cannot release these old folios back to the buddy system.
> 
> As discussed with David in [1], when alloc_contig_range() is used
> to migrate multiple in-use hugetlb pages, it can lead to the issue
> described above. For example:
> 
> [huge 0] [huge 1]
> 
> To migrate huge 0, we obtain huge x from the pool. After the migration
> is completed, we return the now-freed huge 0 back to the pool. When
> it's time to migrate huge 1, we can simply reuse the now-freed huge 0
> from the pool. As a result, when replace_free_hugepage_folios() is
> executed, it cannot release huge 0 back to the buddy system.
> 
> To slove the proble above, we should prevent reuse of isolated free
> hugepages.

s/slove/solve/
s/proble/problem/

> 
> Link: https://lore.kernel.org/lkml/1734503588-16254-1-git-send-email-yangge1116@126.com/
> Fixes: 08d312ee4c0a ("mm: replace free hugepage folios after migration")

The commit id is not stable yet.

$ git tag --contains 08d312ee4c0a
mm-everything-2025-01-09-06-44
next-20250110


We should squash this into the original fix. Can you resend the whole 
thing and merge the patch descriptions?

> Signed-off-by: yangge <yangge1116@....com>
> ---
>   mm/hugetlb.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 9a55960..e5f9999 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -48,6 +48,7 @@
>   #include <linux/page_owner.h>
>   #include "internal.h"
>   #include "hugetlb_vmemmap.h"
> +#include <linux/page-isolation.h>
>   
>   int hugetlb_max_hstate __read_mostly;
>   unsigned int default_hstate_idx;
> @@ -1273,6 +1274,9 @@ static struct folio *dequeue_hugetlb_folio_node_exact(struct hstate *h,
>   		if (folio_test_hwpoison(folio))
>   			continue;
>   
> +		if (is_migrate_isolate_page(&folio->page))
> +			continue;
> +
>   		list_move(&folio->lru, &h->hugepage_activelist);
>   		folio_ref_unfreeze(folio, 1);
>   		folio_clear_hugetlb_freed(folio);

Sorry for not getting back to your previous mail, this week was a bit crazy.

This will work reliably if the huge page does not span more than a
single pageblock.

If it spans multiple pageblocks, we might have isolated only some of
them (e.g., only the last one). That might do for the common cases, but
unfortunately not for all cases (especially not gigantic pages, although
I recall we skip them during alloc_contig_pages(); I also recall some
oddities on ppc even without gigantic pages involved).

One option would be to stare at all involved pageblocks, although a bit 
nasty ... let me think about this.
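
A minimal userspace sketch of the difference between checking one
pageblock and checking all involved pageblocks. Everything here is an
assumption for illustration: the 512-page pageblock size, the
pageblock_isolated[] array standing in for the kernel's per-pageblock
migratetype, and both helper names are hypothetical and are not kernel
APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed geometry for this model only: 512 pages per pageblock. */
#define PAGEBLOCK_ORDER     9UL
#define PAGEBLOCK_NR_PAGES  (1UL << PAGEBLOCK_ORDER)
#define NR_PAGEBLOCKS       8UL

/* Hypothetical stand-in for "this pageblock is MIGRATE_ISOLATE". */
static bool pageblock_isolated[NR_PAGEBLOCKS];

/*
 * Model of a check that only looks at the pageblock containing the
 * first page of the huge page.
 */
static bool head_pageblock_isolated(unsigned long start_pfn)
{
	unsigned long pb = start_pfn / PAGEBLOCK_NR_PAGES;

	assert(pb < NR_PAGEBLOCKS);
	return pageblock_isolated[pb];
}

/*
 * Model of looking at every pageblock that the range
 * [start_pfn, start_pfn + nr_pages) touches.
 */
static bool any_pageblock_isolated(unsigned long start_pfn,
				   unsigned long nr_pages)
{
	unsigned long first, last, pb;

	assert(nr_pages > 0);
	first = start_pfn / PAGEBLOCK_NR_PAGES;
	last = (start_pfn + nr_pages - 1) / PAGEBLOCK_NR_PAGES;
	assert(last < NR_PAGEBLOCKS);

	for (pb = first; pb <= last; pb++) {
		if (pageblock_isolated[pb])
			return true;
	}
	return false;
}
```

In this model, a huge page starting at pageblock 2 and covering four
pageblocks, with only pageblock 5 marked isolated, is missed by the
head-only check but caught by the loop over all pageblocks.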

-- 
Cheers,

David / dhildenb

