Message-ID: <87lgai9bt5.fsf@concordia.ellerman.id.au>
Date: Wed, 11 Jul 2018 22:49:58 +1000
From: Michael Ellerman <mpe@...erman.id.au>
To: akpm@...ux-foundation.org, broonie@...nel.org, mhocko@...e.cz,
sfr@...b.auug.org.au, linux-next@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, mm-commits@...r.kernel.org,
linuxppc-dev@...ts.ozlabs.org, bhe@...hat.com,
pasha.tatashin@...cle.com,
"Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
Anshuman Khandual <khandual@...ux.vnet.ibm.com>
Subject: Boot failures with "mm/sparse: Remove CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER" on powerpc (was Re: mmotm 2018-07-10-16-50 uploaded)
akpm@...ux-foundation.org writes:
> The mm-of-the-moment snapshot 2018-07-10-16-50 has been uploaded to
>
> http://www.ozlabs.org/~akpm/mmotm/
...
> * mm-sparse-add-a-static-variable-nr_present_sections.patch
> * mm-sparsemem-defer-the-ms-section_mem_map-clearing.patch
> * mm-sparsemem-defer-the-ms-section_mem_map-clearing-fix.patch
> * mm-sparse-add-a-new-parameter-data_unit_size-for-alloc_usemap_and_memmap.patch
> * mm-sparse-optimize-memmap-allocation-during-sparse_init.patch
> * mm-sparse-optimize-memmap-allocation-during-sparse_init-checkpatch-fixes.patch
> * mm-sparse-remove-config_sparsemem_alloc_mem_map_together.patch
This seems to be breaking my powerpc pseries qemu boots.
The boot log, with some extra debug output, shows e.g.:
$ make pseries_le_defconfig
$ qemu-system-ppc64 -nographic -vga none -M pseries -m 2G -kernel vmlinux
...
vmemmap_populate f000000000000000..f000000000004000, node 0
* f000000000000000..f000000001000000 allocated at c00000007e000000
hash__vmemmap_create_mapping: start 0xf000000000000000 size 0x1000000 phys 0x7e000000
vmemmap_populate f000000000000000..f000000000008000, node 0
* f000000000000000..f000000001000000 allocated at c00000007d000000
hash__vmemmap_create_mapping: start 0xf000000000000000 size 0x1000000 phys 0x7d000000
vmemmap_populate f000000000000000..f00000000000c000, node 0
* f000000000000000..f000000001000000 allocated at c00000007c000000
hash__vmemmap_create_mapping: start 0xf000000000000000 size 0x1000000 phys 0x7c000000
vmemmap_populate f000000000000000..f000000000010000, node 0
* f000000000000000..f000000001000000 allocated at c00000007b000000
hash__vmemmap_create_mapping: start 0xf000000000000000 size 0x1000000 phys 0x7b000000
vmemmap_populate f000000000000000..f000000000014000, node 0
* f000000000000000..f000000001000000 allocated at c00000007a000000
hash__vmemmap_create_mapping: start 0xf000000000000000 size 0x1000000 phys 0x7a000000
vmemmap_populate f000000000000000..f000000000018000, node 0
* f000000000000000..f000000001000000 allocated at c000000079000000
hash__vmemmap_create_mapping: start 0xf000000000000000 size 0x1000000 phys 0x79000000
vmemmap_populate f000000000000000..f00000000001c000, node 0
* f000000000000000..f000000001000000 allocated at c000000078000000
hash__vmemmap_create_mapping: start 0xf000000000000000 size 0x1000000 phys 0x78000000
vmemmap_populate f000000000000000..f000000000020000, node 0
* f000000000000000..f000000001000000 allocated at c000000077000000
hash__vmemmap_create_mapping: start 0xf000000000000000 size 0x1000000 phys 0x77000000
vmemmap_populate f000000000000000..f000000000024000, node 0
* f000000000000000..f000000001000000 allocated at c000000076000000
hash__vmemmap_create_mapping: start 0xf000000000000000 size 0x1000000 phys 0x76000000
hash__vmemmap_create_mapping: failed -1
<repeated many times>
Then there are lots of other warnings about bad page states, and
eventually we hit a NULL dereference and panic().
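As a rough sanity check on the log above: each vmemmap_populate() call ends 0x4000 bytes past the previous one, yet every call falls inside the same 0x1000000-byte vmemmap page. The sketch below reproduces those numbers under assumed values that are not stated in this message (64K base pages, 16M memory sections, a 64-byte struct page). The TOY_* constants and toy_* helpers are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Assumed values, not taken from the message. */
#define TOY_PAGE_SIZE        (64UL * 1024)        /* assumed base page size */
#define TOY_SECTION_SIZE     (16UL * 1024 * 1024) /* assumed section size */
#define TOY_STRUCT_PAGE_SIZE 64UL                 /* assumed sizeof(struct page) */

/* From the log: "size 0x1000000" in hash__vmemmap_create_mapping. */
#define TOY_VMEMMAP_PAGE_SIZE 0x1000000UL

/* Bytes of memmap (struct page array) needed to describe one section. */
unsigned long toy_memmap_bytes_per_section(void)
{
	return (TOY_SECTION_SIZE / TOY_PAGE_SIZE) * TOY_STRUCT_PAGE_SIZE;
}

/* Number of sections whose memmap fits in a single vmemmap page. */
unsigned long toy_sections_per_vmemmap_page(void)
{
	unsigned long per_section = toy_memmap_bytes_per_section();

	assert(per_section != 0);
	return TOY_VMEMMAP_PAGE_SIZE / per_section;
}
```

Under these assumptions the memmap for one section is 0x4000 bytes, which matches the step between successive calls in the log, and many sections share one vmemmap page, so mapping that page once should be enough for all of them.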
The problem seems to be that we're now calling down into
hash__vmemmap_create_mapping() for every call to vmemmap_populate(),
whereas previously we would only call hash__vmemmap_create_mapping()
once, because our vmemmap_populated() would return true for the
subsequent calls.
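To illustrate the ordering dependency, here is a minimal userspace sketch. It is not the kernel code: all toy_* names are hypothetical, and it assumes that vmemmap_populated() only reports true once at least one section backed by the vmemmap page has been initialised. The nine sections simply mirror the nine calls in the log above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_SECTIONS 9 /* sections sharing one vmemmap page in this toy */

/* Toy stand-in for the per-section state the populated check consults. */
static bool toy_section_initialised[TOY_NR_SECTIONS];

/* Models vmemmap_populated(): true if any section in the page is set up. */
bool toy_vmemmap_populated(void)
{
	for (size_t i = 0; i < TOY_NR_SECTIONS; i++)
		if (toy_section_initialised[i])
			return true;
	return false;
}

/* Models vmemmap_populate() for one section.
 * Returns 1 if it would have created a new mapping, 0 otherwise. */
int toy_vmemmap_populate(size_t section)
{
	assert(section < TOY_NR_SECTIONS);
	if (toy_vmemmap_populated())
		return 0;
	return 1; /* stands in for hash__vmemmap_create_mapping() */
}

/* Counts mappings created for all sections.
 * init_after_each_populate == true:  initialise each section right after
 *                                    populating it (old ordering).
 * init_after_each_populate == false: populate everything first, then
 *                                    initialise (new ordering). */
int toy_count_mappings(bool init_after_each_populate)
{
	int created = 0;
	size_t i;

	for (i = 0; i < TOY_NR_SECTIONS; i++)
		toy_section_initialised[i] = false;

	for (i = 0; i < TOY_NR_SECTIONS; i++) {
		created += toy_vmemmap_populate(i);
		if (init_after_each_populate)
			toy_section_initialised[i] = true;
	}

	if (!init_after_each_populate)
		for (i = 0; i < TOY_NR_SECTIONS; i++)
			toy_section_initialised[i] = true;

	return created;
}
```

In this model toy_count_mappings(true) creates a single mapping, while toy_count_mappings(false) creates one per section, which is the pattern of repeated allocations and mappings of the same range seen in the log.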
There's actually a comment in sparse_init() that says:
* powerpc need to call sparse_init_one_section right after each
* sparse_early_mem_map_alloc, so allocate usemap_map at first.
So changing that behaviour does seem to be the problem.
I assume that comment is talking about the fact that we use pfn_valid()
in vmemmap_populated().
I'm not clear on how to fix it though.
Any ideas?
cheers