Message-ID: <20200730191618.GA703407@carbon.DHCP.thefacebook.com>
Date: Thu, 30 Jul 2020 12:16:18 -0700
From: Roman Gushchin <guro@...com>
To: Mike Kravetz <mike.kravetz@...cle.com>
CC: <linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>,
Marek Szyprowski <m.szyprowski@...sung.com>,
Michal Nazarewicz <mina86@...a86.com>,
Kyungmin Park <kyungmin.park@...sung.com>,
Barry Song <song.bao.hua@...ilicon.com>,
Andrew Morton <akpm@...ux-foundation.org>,
<stable@...r.kernel.org>
Subject: Re: [PATCH] cma: don't quit at first error when activating reserved areas
On Thu, Jul 30, 2020 at 09:31:23AM -0700, Mike Kravetz wrote:
> The routine cma_init_reserved_areas is designed to activate all
> reserved cma areas. It quits when it first encounters an error.
> This can leave some areas in a state where they are reserved but
> not activated. There is no feedback to code which performed the
> reservation. Attempting to allocate memory from areas in such a
> state will result in a BUG.
>
> Modify cma_init_reserved_areas to always attempt to activate all
> areas. The called routine, cma_activate_area, is responsible for
> leaving the area in a valid state. No one is making active use
> of returned error codes, so change the routine to void.
>
> How to reproduce: This example uses kernelcore, hugetlb and cma
> as an easy way to reproduce. However, this is a more general cma
> issue.
>
> Two node x86 VM 16GB total, 8GB per node
> Kernel command line parameters, kernelcore=4G hugetlb_cma=8G
> Related boot time messages,
> hugetlb_cma: reserve 8192 MiB, up to 4096 MiB per node
> cma: Reserved 4096 MiB at 0x0000000100000000
> hugetlb_cma: reserved 4096 MiB on node 0
> cma: Reserved 4096 MiB at 0x0000000300000000
> hugetlb_cma: reserved 4096 MiB on node 1
> cma: CMA area hugetlb could not be activated
>
> # echo 8 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
>
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP PTI
> ...
> Call Trace:
> bitmap_find_next_zero_area_off+0x51/0x90
> cma_alloc+0x1a5/0x310
> alloc_fresh_huge_page+0x78/0x1a0
> alloc_pool_huge_page+0x6f/0xf0
> set_max_huge_pages+0x10c/0x250
> nr_hugepages_store_common+0x92/0x120
> ? __kmalloc+0x171/0x270
> kernfs_fop_write+0xc1/0x1a0
> vfs_write+0xc7/0x1f0
> ksys_write+0x5f/0xe0
> do_syscall_64+0x4d/0x90
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Fixes: c64be2bb1c6e ("drivers: add Contiguous Memory Allocator")
> Signed-off-by: Mike Kravetz <mike.kravetz@...cle.com>
> Cc: <stable@...r.kernel.org>
Makes total sense to me!
Reviewed-by: Roman Gushchin <guro@...com>
Thanks!
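
For readers following the thread, the control-flow change described in the quoted patch can be illustrated with a small userspace sketch. This is not the kernel code: `struct area`, `activate_area`, `init_reserved_areas`, `area_alloc` and the `fail_activation` knob are hypothetical stand-ins, and the sketch assumes that "leaving the area in a valid state" means marking a failed area as empty so that later allocations are refused instead of touching a NULL bitmap.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reserved area; not the kernel's struct cma. */
struct area {
    const char *name;
    unsigned long count;    /* pages in the area; 0 means "not usable" */
    unsigned char *bitmap;  /* one byte per page in this toy; NULL until activated */
    int fail_activation;    /* test knob (assumption): force activation to fail */
};

/*
 * Activate one area. Returns void: on failure the area is left in a
 * valid, empty state rather than reporting an error to the caller.
 */
static void activate_area(struct area *a)
{
    assert(a != NULL);

    a->bitmap = a->fail_activation ? NULL : calloc(a->count, 1);
    if (!a->bitmap) {
        a->count = 0;  /* assumed meaning of "valid state": nothing to allocate */
        fprintf(stderr, "area %s could not be activated\n", a->name);
    }
}

/* Always attempt every area; a failure in one does not stop the loop. */
static void init_reserved_areas(struct area *areas, int n)
{
    for (int i = 0; i < n; i++)
        activate_area(&areas[i]);
}

/* Allocate one page index, or return -1 if the area is empty or full. */
static int area_alloc(struct area *a)
{
    if (!a || !a->count)
        return -1;  /* refuses a failed area instead of reading a NULL bitmap */

    for (unsigned long i = 0; i < a->count; i++) {
        if (!a->bitmap[i]) {
            a->bitmap[i] = 1;
            return (int)i;
        }
    }
    return -1;
}
```

With three areas where the second is forced to fail, calling `init_reserved_areas` still activates the third, and `area_alloc` on the failed area returns -1 rather than dereferencing a NULL bitmap. An early return from the loop would instead leave the third area reserved but never activated, which is the situation the quoted description says ends in a BUG.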