Message-ID: <20231013001203.GA3812@monkey>
Date:   Thu, 12 Oct 2023 17:12:03 -0700
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Nathan Chancellor <nathan@...nel.org>
Cc:     Usama Arif <usama.arif@...edance.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
        muchun.song@...ux.dev, songmuchun@...edance.com,
        fam.zheng@...edance.com, liangma@...ngbit.com,
        punit.agrawal@...edance.com,
        Konrad Dybcio <konrad.dybcio@...aro.org>, llvm@...ts.linux.dev
Subject: Re: [PATCH] mm: hugetlb: Only prep and add allocated folios for
 non-gigantic pages

On 10/12/23 07:53, Mike Kravetz wrote:
> On 10/11/23 17:03, Nathan Chancellor wrote:
> > On Mon, Oct 09, 2023 at 06:23:45PM -0700, Mike Kravetz wrote:
> > > On 10/09/23 15:56, Usama Arif wrote:
> > 
> > I suspect the crash that our continuous integration spotted [1] is the
> > same issue that Konrad is seeing, as I have bisected that failure to
> > bfb41d6b2fe1 in next-20231009. However, neither the first half of your
> > diff (since the second half does not apply at bfb41d6b2fe1) nor the
> > original patch in this thread resolves the issue though, so maybe it is
> > entirely different from Konrad's?
> > 
> > For what it's worth, this issue is only visible for me when building for
> > arm64 using LLVM with CONFIG_INIT_STACK_NONE=y, instead of the default
> > CONFIG_INIT_STACK_ALL_ZERO=y (which appears to hide the problem?),
> > making it seem like it could be something with uninitialized memory... I
> > have not been able to reproduce it with GCC, which could also mean
> > something.
> 
> Thank you Nathan!  That is very helpful.
> 
> I will use this information to try and recreate.  If I can recreate, I
> should be able to get to root cause.

I could easily recreate the issue using the provided instructions.  The
first thing I did was add a few printks to check/verify state.  The
beginning of gather_bootmem_prealloc looked like this:

static void __init gather_bootmem_prealloc(void)
{
	LIST_HEAD(folio_list);
	struct huge_bootmem_page *m;
	struct hstate *h, *prev_h = NULL;

	if (list_empty(&huge_boot_pages))
		printk("gather_bootmem_prealloc: huge_boot_pages list empty\n");

	list_for_each_entry(m, &huge_boot_pages, list) {
		struct page *page = virt_to_page(m);
		struct folio *folio = (void *)page;

		printk("gather_bootmem_prealloc: loop entry m %lx\n",
							(unsigned long)m);

The STRANGE thing is that the printk after the list_empty() test would
print, and then we would enter the list_for_each_entry() loop anyway, as if
the list were not empty.  This is the cause of the addressing exception: m
pointed to the list head as opposed to an entry on the list.
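
For anyone who wants to see the expected behavior in isolation, here is a
minimal userspace sketch.  It is not the kernel code: the list macros are
simplified stand-ins for <linux/list.h>, and struct boot_page,
count_iterations() and first_candidate() are hypothetical names made up for
illustration.  It assumes GCC or Clang (for __typeof__) and that the list
node is the first member of the entry.  On an empty list the loop body
should run zero times, and the "entry" computed from head->next is just the
list head itself, which matches what m looked like above.

```c
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's <linux/list.h> helpers. */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each_entry(pos, head, member)				\
	for (pos = container_of((head)->next, __typeof__(*pos), member); \
	     &pos->member != (head);					\
	     pos = container_of(pos->member.next, __typeof__(*pos), member))

/* Hypothetical entry type; the list node is deliberately the first member. */
struct boot_page {
	struct list_head list;
	int id;
};

static void list_add_tail(struct list_head *new, struct list_head *head)
{
	new->prev = head->prev;
	new->next = head;
	head->prev->next = new;
	head->prev = new;
}

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Number of times the loop body runs for the given list. */
static int count_iterations(struct list_head *head)
{
	struct boot_page *m;
	int n = 0;

	list_for_each_entry(m, head, list)
		n++;
	return n;
}

/* The pointer the loop would start from: container_of(head->next). */
static void *first_candidate(struct list_head *head)
{
	return container_of(head->next, struct boot_page, list);
}
```

With a correctly compiled loop, count_iterations() on an empty list returns
0 even though first_candidate() returns the address of the head.  Entering
the loop body in that state means treating the head as if it were an entry.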

I have attached disassembly of gather_bootmem_prealloc with INIT_STACK_NONE
and INIT_STACK_ALL_ZERO.  The disassembly listings are for the code without
the printks.

This is the first time I have looked at arm assembly, so I may be missing
something.  However, in the INIT_STACK_NONE case it looks like we get the
address of huge_boot_pages into a register but do not use it to determine
if we should execute the loop.  The code generated with INIT_STACK_ALL_ZERO
does appear to check the list before entering the loop.

Can someone with more arm assembly experience take a quick look?  Since
huge_boot_pages is a global variable rather than on the stack, I can't
see how INIT_STACK_ALL_ZERO/INIT_STACK_NONE could make a difference.
-- 
Mike Kravetz

View attachment "disass_INIT_STACK_NONE" of type "text/plain" (9882 bytes)

View attachment "disass_INIT_STACK_ALL_ZERO" of type "text/plain" (10137 bytes)
