lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87zlpz6di0.fsf@saeurebad.de>
Date:	Thu, 05 Jun 2008 18:41:11 +0200
From:	Johannes Weiner <hannes@...urebad.de>
To:	Yasunori Goto <y-goto@...fujitsu.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>,
	Yinghai Lu <yhlu.kernel@...il.com>,
	Andi Kleen <andi@...stfloor.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH -mm 11/14] bootmem: respect goal more likely

Hi,

Yasunori Goto <y-goto@...fujitsu.com> writes:

> I gotcha!  :-)
>
>  	max -= PFN_DOWN(bdata->node_boot_start);
>  	start -= PFN_DOWN(bdata->node_boot_start);
> +	fallback -= PFN_DOWN(bdata->node_boot_start);
>  
>  	if (bdata->last_success > start) {
> -		/* Set goal here to trigger a retry on failure */
> -		start = goal = ALIGN(bdata->last_success, step);
> +		fallback = start; -------------------------------- (*)
> +		start = ALIGN(bdata->last_success, step);
>  	}
>
> (*) is root cause. "fallback" is set as 0, because start is index of bitmap
> at here. When normal zone is allocated first, and DMA zone is required by
> alloc_bootmem_low() later and first page is free yet, fallback is set as 0.
>
> +	if (fallback) {
> +		start = ALIGN(fallback, step);
> +		fallback = 0;
> +		goto find_block;
> +	}
> +
>
> As a result, this retry code is skipped, and alloc_bootmem_low() fails. 
> So, when I change here from fallback to a retry_flag, my box can boot
> up.

Ah, you got there too :-)

Here is the original Email I wrote but waited to send out until after I
slept:

---

Hi Yasunori,

here is hopefully the fix to your bootup problem.  Let me explain:

This is the last but one bootmem allocation on your box:

Jun  5 11:50:43 localhost kernel: bootmem::alloc_bootmem_core nid=0 size=40000 [16 pages] align=80 goal=100000000 limit=0
Jun  5 11:50:43 localhost kernel: bootmem::__reserve nid=0 start=1020588 end=1020598 flags=1

(->last_success is at 1020588 now)

And this is the last one:

Jun  5 11:50:43 localhost kernel: bootmem::alloc_bootmem_core nid=0 size=8000 [2 pages] align=80 goal=0 limit=ffffffff

Goal is zero.  So sidx is set to zero as well.  last_success is bigger
than sidx, so fallback gets the value of sidx and sidx is updated to
last_succes.  But now sidx (1020588) is bigger than the limit (3ffff)
and we fail the first search iteration because of that.  We should
fall back to the original sidx stored in fallback now but take a look
at what value fallback must have for this branch to be taken...

I am pretty sure this is the bug you encountered.  I spotted it in the
code but your logfile proves my theory. 

	Hannes

--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -444,7 +444,11 @@ static void * __init alloc_bootmem_core(
 	midx = max - PFN_DOWN(bdata->node_boot_start);
 
 	if (bdata->last_success > sidx) {
-		fallback = sidx;
+		/*
+		 * Handle the valid case of sidx being zero and still
+		 * catch the fallback below.
+		 */
+		fallback = sidx + 1;
 		sidx = ALIGN(bdata->last_success, step);
 	}
 
@@ -493,7 +497,7 @@ find_block:
 	}
 
 	if (fallback) {
-		sidx = ALIGN(fallback, step);
+		sidx = ALIGN(fallback - 1, step);
 		fallback = 0;
 		goto find_block;
 	}
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ