Message-ID: <20231014000450.GA253713@monkey>
Date:   Fri, 13 Oct 2023 17:04:50 -0700
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Nathan Chancellor <nathan@...nel.org>
Cc:     Usama Arif <usama.arif@...edance.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
        muchun.song@...ux.dev, songmuchun@...edance.com,
        fam.zheng@...edance.com, liangma@...ngbit.com,
        punit.agrawal@...edance.com,
        Konrad Dybcio <konrad.dybcio@...aro.org>, llvm@...ts.linux.dev
Subject: Re: [PATCH] mm: hugetlb: Only prep and add allocated folios for
 non-gigantic pages

On 10/12/23 17:12, Mike Kravetz wrote:
> On 10/12/23 07:53, Mike Kravetz wrote:
> > On 10/11/23 17:03, Nathan Chancellor wrote:
> > > On Mon, Oct 09, 2023 at 06:23:45PM -0700, Mike Kravetz wrote:
> > > > On 10/09/23 15:56, Usama Arif wrote:
> > 
> > Thank you Nathan!  That is very helpful.
> > 
> > I will use this information to try and recreate.  If I can recreate, I
> > should be able to get to root cause.
> 
> I could easily recreate the issue using the provided instructions.  First
> thing I did was add a few printk's to check/verify state.  The beginning
> of gather_bootmem_prealloc looked like this:

Hi Nathan,

This is looking more and more like a Clang issue to me.  I did a little
more problem isolation today.  Here is what I did:

- Check out commit "hugetlb: restructure pool allocations" in linux-next
- Fix the known issue with early disable/enable IRQs via locking by
  applying:

commit 266789498210dff6cf9a14b64fa3a5cb2fcc5858
Author: Mike Kravetz <mike.kravetz@...cle.com>
Date:   Fri Oct 13 13:14:15 2023 -0700

    fix prep_and_add_allocated_folios locking

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c843506654f8..d8ab2d9b391b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2246,15 +2246,16 @@ static struct folio *alloc_fresh_hugetlb_folio(struct hstate *h,
 static void prep_and_add_allocated_folios(struct hstate *h,
 					struct list_head *folio_list)
 {
+	unsigned long flags;
 	struct folio *folio, *tmp_f;
 
 	/* Add all new pool pages to free lists in one lock cycle */
-	spin_lock_irq(&hugetlb_lock);
+	spin_lock_irqsave(&hugetlb_lock, flags);
 	list_for_each_entry_safe(folio, tmp_f, folio_list, lru) {
 		__prep_account_new_huge_page(h, folio_nid(folio));
 		enqueue_hugetlb_folio(h, folio);
 	}
-	spin_unlock_irq(&hugetlb_lock);
+	spin_unlock_irqrestore(&hugetlb_lock, flags);
 }
 
 /*

- Add the following code, which triggers a BUG only if the loop body runs
  for a list that list_empty() reported as empty.  That should NEVER happen.

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d8ab2d9b391b..be234831b33f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3294,11 +3294,21 @@ static void __init gather_bootmem_prealloc(void)
 	LIST_HEAD(folio_list);
 	struct huge_bootmem_page *m;
 	struct hstate *h, *prev_h = NULL;
+	bool empty;
+
+	empty = list_empty(&huge_boot_pages);
+	if (empty)
+		printk("gather_bootmem_prealloc: huge_boot_pages list empty\n");
 
 	list_for_each_entry(m, &huge_boot_pages, list) {
 		struct page *page = virt_to_page(m);
 		struct folio *folio = (void *)page;
 
+		if (empty) {
+			printk("    Traversing an empty list as if not empty!!!\n");
+			BUG();
+		}
+
 		h = m->hstate;
 		/*
 		 * It is possible to have multiple huge page sizes (hstates)

- As you have experienced, this will BUG if built with LLVM 17.0.2 and
  CONFIG_INIT_STACK_NONE.

- It will NOT BUG if built with LLVM 13.0.1, but will BUG if built with
  llvm-14.0.6-x86_64 and later.

As mentioned in the previous email, the generated code for loop entry
looks wrong to my untrained eyes.  Can you or someone on the llvm team
take a look?
-- 
Mike Kravetz
