Message-ID: <YYQTKRrDIJbkcplr@kernel.org>
Date: Thu, 4 Nov 2021 19:06:49 +0200
From: Mike Rapoport <rppt@...nel.org>
To: Qian Cai <quic_qiancai@...cinc.com>
Cc: Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-arm-kernel@...ts.infradead.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] arm64: Track no early_pgtable_alloc() for kmemleak

On Thu, Nov 04, 2021 at 11:56:23AM -0400, Qian Cai wrote:
> After switching the page size from 64KB to 4KB on several arm64 servers here,
> kmemleak started to run out of its early memory pool due to the huge number
> of early_pgtable_alloc() calls:
>
> kmemleak_alloc_phys()
> memblock_alloc_range_nid()
> memblock_phys_alloc_range()
> early_pgtable_alloc()
> init_pmd()
> alloc_init_pud()
> __create_pgd_mapping()
> __map_memblock()
> paging_init()
> setup_arch()
> start_kernel()
>
> Increasing the default value of DEBUG_KMEMLEAK_MEM_POOL_SIZE by 4 times
> won't be enough for a server with 200GB+ memory. There isn't much of
> interest in checking those early page tables for memory leaks, and those
> early memory mappings should not reference other memory. Hence, skipping
> them causes no kmemleak false positives, and we can safely skip tracking
> those early allocations in kmemleak, as we did in commit fed84c785270
> ("mm/memblock.c: skip kmemleak for kasan_init()"), without needing to
> introduce complications to automatically scale the value depending on the
> runtime memory size etc. After the patch, the default value of
> DEBUG_KMEMLEAK_MEM_POOL_SIZE becomes sufficient again.
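For a rough sense of scale, here is a back-of-the-envelope sketch (not part
of the patch). It assumes 8-byte descriptors, a linear map built entirely at
page granularity, and one kmemleak object per early_pgtable_alloc() call as
in the backtrace above. pte_table_pages() is a hypothetical helper, not a
kernel function:

```c
#include <stdint.h>

/*
 * Number of last-level (PTE) table pages needed to map mem_bytes using
 * base pages of page_size bytes. Assumes 8-byte descriptors and that
 * everything is mapped at page granularity (no block mappings).
 */
static uint64_t pte_table_pages(uint64_t mem_bytes, uint64_t page_size)
{
	uint64_t entries_per_table = page_size / 8;
	uint64_t bytes_per_table = entries_per_table * page_size;

	/* Round up so a partially covered range still gets a table. */
	return (mem_bytes + bytes_per_table - 1) / bytes_per_table;
}
```

Under those assumptions, 200GB needs about 102400 last-level table pages
with 4KB pages but only about 400 with 64KB pages, which would be consistent
with the pool only overflowing after the switch. If the default pool size is
16000 objects (my recollection of the Kconfig default, so treat it as an
assumption), four times that is still well below 102400.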
>
> Signed-off-by: Qian Cai <quic_qiancai@...cinc.com>
> ---
> arch/arm64/mm/mmu.c | 3 ++-
> include/linux/memblock.h | 1 +
> mm/memblock.c | 10 +++++++---
> 3 files changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index d77bf06d6a6d..4d3cfbaa92a7 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -96,7 +96,8 @@ static phys_addr_t __init early_pgtable_alloc(int shift)
> phys_addr_t phys;
> void *ptr;
>
> - phys = memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
> + phys = memblock_phys_alloc_range(PAGE_SIZE, PAGE_SIZE, 0,
> + MEMBLOCK_ALLOC_PGTABLE);
> if (!phys)
> panic("Failed to allocate page table page\n");
>
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index 7df557b16c1e..de903055b01c 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -390,6 +390,7 @@ static inline int memblock_get_region_node(const struct memblock_region *r)
> #define MEMBLOCK_ALLOC_ANYWHERE (~(phys_addr_t)0)
> #define MEMBLOCK_ALLOC_ACCESSIBLE 0
> #define MEMBLOCK_ALLOC_KASAN 1
> +#define MEMBLOCK_ALLOC_PGTABLE 2
>
> /* We are using top down, so it is safe to use 0 here */
> #define MEMBLOCK_LOW_LIMIT 0
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 659bf0ffb086..13bc56a641c0 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -287,7 +287,8 @@ static phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size,
> {
> /* pump up @end */
> if (end == MEMBLOCK_ALLOC_ACCESSIBLE ||
> - end == MEMBLOCK_ALLOC_KASAN)
> + end == MEMBLOCK_ALLOC_KASAN ||
> + end == MEMBLOCK_ALLOC_PGTABLE)
I think it would be better to rename MEMBLOCK_ALLOC_KASAN to, say,
MEMBLOCK_ALLOC_NOKMEMLEAK and use that for both the KASAN and page table cases.
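Something along these lines, as an untested userspace model of just the
sentinel handling. MEMBLOCK_ALLOC_NOKMEMLEAK is only the proposed name, and
resolve_end() and should_track_in_kmemleak() are hypothetical helpers that
stand in for the two checks touched by the patch:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;

#define MEMBLOCK_ALLOC_ANYWHERE   (~(phys_addr_t)0)
#define MEMBLOCK_ALLOC_ACCESSIBLE 0
/* Proposed single sentinel replacing MEMBLOCK_ALLOC_KASAN. */
#define MEMBLOCK_ALLOC_NOKMEMLEAK 1

/* Models the "pump up @end" step: both sentinels mean "current limit". */
static phys_addr_t resolve_end(phys_addr_t end, phys_addr_t current_limit)
{
	if (end == MEMBLOCK_ALLOC_ACCESSIBLE ||
	    end == MEMBLOCK_ALLOC_NOKMEMLEAK)
		return current_limit;
	return end;
}

/* Models the check before registering the allocation with kmemleak. */
static bool should_track_in_kmemleak(phys_addr_t end)
{
	return end != MEMBLOCK_ALLOC_NOKMEMLEAK;
}
```

With a single sentinel, the next high-volume caller would only need to pass
MEMBLOCK_ALLOC_NOKMEMLEAK and neither condition would grow again.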
But more generally, we are going to hit this again and again.
Couldn't we add a memblock allocation as a means to get more memory to
kmemleak::mem_pool_alloc()?
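To illustrate the idea only, here is a toy userspace model, not kmemleak's
actual code. The pool sizes, struct obj, pool_alloc() and early_alloc() are
all made up for the example, and calloc() merely stands in for an early
allocator such as memblock:

```c
#include <stddef.h>
#include <stdlib.h>

struct obj {
	char payload[64];	/* placeholder for per-object metadata */
};

#define STATIC_POOL_SIZE 4	/* tiny on purpose, to force a refill */
#define REFILL_BATCH     8

static struct obj static_pool[STATIC_POOL_SIZE];
static size_t static_left = STATIC_POOL_SIZE;

static struct obj *refill_chunk;
static size_t refill_left;
static unsigned int refill_count;

/* Stand-in for an early boot allocator (assumption: calloc here). */
static void *early_alloc(size_t size)
{
	return calloc(1, size);
}

/* Hand out objects from the static pool, then from refilled batches. */
static struct obj *pool_alloc(void)
{
	if (static_left)
		return &static_pool[--static_left];

	if (!refill_left) {
		refill_chunk = early_alloc(REFILL_BATCH * sizeof(*refill_chunk));
		if (!refill_chunk)
			return NULL;
		refill_left = REFILL_BATCH;
		refill_count++;
	}
	return &refill_chunk[--refill_left];
}
```

A real version would at least have to keep the refill allocation itself from
being reported back to kmemleak and would only work while memblock is still
usable, so this is a starting point for discussion rather than a proposal.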
> end = memblock.current_limit;
>
> /* avoid allocating the first page */
> @@ -1387,8 +1388,11 @@ phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
> return 0;
>
> done:
> - /* Skip kmemleak for kasan_init() due to high volume. */
> - if (end != MEMBLOCK_ALLOC_KASAN)
> + /*
> + * Skip kmemleak for kasan_init() and early_pgtable_alloc() due to high
> + * volume.
> + */
> + if (end != MEMBLOCK_ALLOC_KASAN && end != MEMBLOCK_ALLOC_PGTABLE)
> /*
> * The min_count is set to 0 so that memblock allocated
> * blocks are never reported as leaks. This is because many
> --
> 2.30.2
>
--
Sincerely yours,
Mike.