Message-Id: <20190131063930.GA28876@rapoport-lnx>
Date:   Thu, 31 Jan 2019 08:39:30 +0200
From:   Mike Rapoport <rppt@...ux.ibm.com>
To:     Christophe Leroy <christophe.leroy@....fr>
Cc:     Stephen Rothwell <sfr@...b.auug.org.au>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux Next Mailing List <linux-next@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Michael Ellerman <mpe@...erman.id.au>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        PowerPC <linuxppc-dev@...ts.ozlabs.org>
Subject: Re: linux-next: powerpc le qemu boot failure after merge of the akpm
 tree

On Thu, Jan 31, 2019 at 07:15:26AM +0100, Christophe Leroy wrote:
> 
> 
> Le 31/01/2019 à 07:06, Stephen Rothwell a écrit :
> >Hi all,
> >
> >On Thu, 31 Jan 2019 16:38:54 +1100 Stephen Rothwell <sfr@...b.auug.org.au> wrote:
> >>
> >>[I am guessing that it is something in Andrew's tree that has caused
> >>this.]
> >>
> >>My qemu boot of the powerpc pseries_le_defconfig config failed like this:
> >>
> >>htab_hash_mask    = 0x1ffff
> >>-----------------------------------------------------
> >>numa:   NODE_DATA [mem 0x7ffe7000-0x7ffebfff]
> >>Kernel panic - not syncing: sparse_buffer_init: Failed to allocate 2147483648 bytes align=0x10000 nid=0 from=fffffffffffffff

This means that sparse_buffer_init tries to allocate 2G for the sparsemap_buf...
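
As a quick sanity check of that figure, here is a small userspace sketch
that converts the byte count from the panic message into GiB. The helper
and macro names are made up for illustration and are not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Size reported in the panic message quoted above. */
#define PANIC_ALLOC_BYTES 2147483648ULL

/* Hypothetical helper: convert a byte count to whole GiB (1 GiB = 2^30 bytes). */
static inline uint64_t bytes_to_gib(uint64_t bytes)
{
	return bytes >> 30;
}

/* Hypothetical self-check: the reported size is exactly 2^31 bytes, i.e. 2 GiB. */
static inline void check_panic_alloc_size(void)
{
	assert(PANIC_ALLOC_BYTES == (1ULL << 31));
	assert(bytes_to_gib(PANIC_ALLOC_BYTES) == 2);
}
```

So a single 2 GiB request can only succeed if the VM has comfortably more
memory than that available to memblock at this point in boot.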

Stephen, how much memory do you give your VM?

> >>CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc4 #2
> >>Call Trace:
> >>[c00000000105bbd0] [c000000000b1345c] dump_stack+0xb0/0xf4 (unreliable)
> >>[c00000000105bc10] [c000000000111120] panic+0x168/0x3b8
> >>[c00000000105bcb0] [c000000000e701c8] sparse_init_nid+0x178/0x550
> >>[c00000000105bd70] [c000000000e709b4] sparse_init+0x210/0x238
> >>[c00000000105bdb0] [c000000000e468f4] initmem_init+0x1e0/0x260
> >>[c00000000105be80] [c000000000e3b9b0] setup_arch+0x354/0x3d4
> >>[c00000000105bef0] [c000000000e33afc] start_kernel+0x98/0x648
> >>[c00000000105bf90] [c00000000000b270] start_here_common+0x1c/0x52c
> >
> >A quick bisect leads to this:
> >
> >1c3c9328cde027eb875ba4692f0a5d66b0afe862 is the first bad commit
> >commit 1c3c9328cde027eb875ba4692f0a5d66b0afe862
> >Author: Mike Rapoport <rppt@...ux.ibm.com>
> >Date:   Thu Jan 31 10:51:32 2019 +1100
> >
> >     treewide: add checks for the return value of memblock_alloc*()
> >     Add check for the return value of memblock_alloc*() functions and call
> >     panic() in case of error.  The panic message repeats the one used by
> >     panicing memblock allocators with adjustment of parameters to include only
> >     relevant ones.
> >     The replacement was mostly automated with semantic patches like the one
> >     below with manual massaging of format strings.
> >     @@
> >     expression ptr, size, align;
> >     @@
> >     ptr = memblock_alloc(size, align);
> >     + if (!ptr)
> >     +       panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__,
> >     size, align);
> >     Link: http://lkml.kernel.org/r/1548057848-15136-20-git-send-email-rppt@linux.ibm.com
> >     Signed-off-by: Mike Rapoport <rppt@...ux.ibm.com>
> >     Reviewed-by: Guo Ren <ren_guo@...ky.com>                [c-sky]
> >     Acked-by: Paul Burton <paul.burton@...s.com>            [MIPS]
> >     Acked-by: Heiko Carstens <heiko.carstens@...ibm.com>    [s390]
> >     Reviewed-by: Juergen Gross <jgross@...e.com>            [Xen]
> >     Reviewed-by: Geert Uytterhoeven <geert@...ux-m68k.org>  [m68k]
> >     Cc: Catalin Marinas <catalin.marinas@....com>
> >     Cc: Christophe Leroy <christophe.leroy@....fr>
> >     Cc: Christoph Hellwig <hch@....de>
> >     Cc: "David S. Miller" <davem@...emloft.net>
> >     Cc: Dennis Zhou <dennis@...nel.org>
> >     Cc: Greentime Hu <green.hu@...il.com>
> >     Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> >     Cc: Guan Xuetao <gxt@....edu.cn>
> >     Cc: Guo Ren <guoren@...nel.org>
> >     Cc: Mark Salter <msalter@...hat.com>
> >     Cc: Matt Turner <mattst88@...il.com>
> >     Cc: Max Filippov <jcmvbkbc@...il.com>
> >     Cc: Michael Ellerman <mpe@...erman.id.au>
> >     Cc: Michal Simek <monstr@...str.eu>
> >     Cc: Petr Mladek <pmladek@...e.com>
> >     Cc: Richard Weinberger <richard@....at>
> >     Cc: Rich Felker <dalias@...c.org>
> >     Cc: Rob Herring <robh+dt@...nel.org>
> >     Cc: Rob Herring <robh@...nel.org>
> >     Cc: Russell King <linux@...linux.org.uk>
> >     Cc: Stafford Horne <shorne@...il.com>
> >     Cc: Tony Luck <tony.luck@...el.com>
> >     Cc: Vineet Gupta <vgupta@...opsys.com>
> >     Cc: Yoshinori Sato <ysato@...rs.sourceforge.jp>
> >     Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
> >
> >Which is just adding the panic we hit.  So, presumably, the bug is in a
> >preceding patch :-(
> >
> >I have left the kernel not booting for today.
> >
> 
> No I think the error is really in that patch, see my other mail.
> 
> See https://elixir.bootlin.com/linux/v5.0-rc4/source/mm/memblock.c#L1455,
> memblock_alloc_try_nid_raw() is not supposed to panic, so the last hunk of
> this patch should be reverted.

It is not supposed to panic, but it can still fail, so simply ignoring its
return value seems a bit odd, at least.
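
To illustrate the middle ground between panicking and ignoring the result,
here is a minimal userspace sketch. It assumes a non-panicking allocator
that returns NULL on failure; raw_alloc(), struct buf and buf_init() are
hypothetical stand-ins, not the memblock API or the actual code in
mm/sparse.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a non-panicking allocator: it returns NULL
 * when the request exceeds 'limit' instead of aborting.
 */
static void *raw_alloc(size_t size, size_t limit)
{
	if (size > limit)
		return NULL;
	return malloc(size);
}

/* Hypothetical bulk buffer, described by its start and end pointers. */
struct buf {
	char *start;
	char *end;
};

/*
 * Check the allocator's return value without panicking: on failure the
 * buffer is left empty and the caller is told, so it can fall back to
 * smaller allocations. Returns 0 on success, -1 on failure.
 */
static int buf_init(struct buf *b, size_t size, size_t limit)
{
	assert(b != NULL);

	b->start = raw_alloc(size, limit);
	if (!b->start) {
		/* Never derive an end pointer from a NULL start. */
		b->end = NULL;
		return -1;
	}
	b->end = b->start + size;
	return 0;
}
```

Whether a failure here should be fatal depends on whether the caller has a
fallback path, which is a decision for each call site.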
 
> Found in total three problematic hunks in that patch:
> 
> @@ -48,6 +53,11 @@ static phys_addr_t __init kasan_alloc_raw_page(int node)
>  	void *p = memblock_alloc_try_nid_raw(PAGE_SIZE, PAGE_SIZE,
>  						__pa(MAX_DMA_ADDRESS),
>  						MEMBLOCK_ALLOC_KASAN, node);
> +	if (!p)
> +		panic("%s: Failed to allocate %lu bytes align=0x%lx nid=%d from=%llx\n",
> +		      __func__, PAGE_SIZE, PAGE_SIZE, node,
> +		      __pa(MAX_DMA_ADDRESS));
> +
>  	return __pa(p);
>  }
> 
> @@ -211,6 +211,9 @@ static int __init iob_init(struct device_node *dn)
>  	iob_l2_base = memblock_alloc_try_nid_raw(1UL << 21, 1UL << 21,
>  					MEMBLOCK_LOW_LIMIT, 0x80000000,
>  					NUMA_NO_NODE);
> +	if (!iob_l2_base)
> +		panic("%s: Failed to allocate %lu bytes align=0x%lx max_addr=%x\n",
> +		      __func__, 1UL << 21, 1UL << 21, 0x80000000);
> 
>  	pr_info("IOBMAP L2 allocated at: %p\n", iob_l2_base);
> 
> 
> @@ -425,6 +436,10 @@ static void __init sparse_buffer_init(unsigned long
> size, int nid)
>  		memblock_alloc_try_nid_raw(size, PAGE_SIZE,
>  						__pa(MAX_DMA_ADDRESS),
>  						MEMBLOCK_ALLOC_ACCESSIBLE, nid);
> +	if (!sparsemap_buf)
> +		panic("%s: Failed to allocate %lu bytes align=0x%lx nid=%d from=%lx\n",
> +		      __func__, size, PAGE_SIZE, nid, __pa(MAX_DMA_ADDRESS));
> +
>  	sparsemap_buf_end = sparsemap_buf + size;
>  }
> 
> 
> 
> Christophe
> 

-- 
Sincerely yours,
Mike.
