Message-ID: <CACi5LpPn4QUjC692G=5UxLchpi+ZL+xFCcxqLbFvgvvcso28ww@mail.gmail.com>
Date:   Fri, 3 Jul 2020 00:52:08 +0530
From:   Bhupesh Sharma <bhsharma@...hat.com>
To:     Will Deacon <will@...nel.org>
Cc:     cgroups@...r.kernel.org, linux-mm@...ck.org,
        linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
        Bhupesh SHARMA <bhupesh.linux@...il.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Michal Hocko <mhocko@...nel.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        James Morse <james.morse@....com>,
        Mark Rutland <mark.rutland@....com>,
        Catalin Marinas <catalin.marinas@....com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        kexec mailing list <kexec@...ts.infradead.org>
Subject: Re: [PATCH 2/2] arm64: Allocate crashkernel always in ZONE_DMA

Hi Will,

On Thu, Jul 2, 2020 at 1:20 PM Will Deacon <will@...nel.org> wrote:
>
> On Thu, Jul 02, 2020 at 03:44:20AM +0530, Bhupesh Sharma wrote:
> > commit bff3b04460a8 ("arm64: mm: reserve CMA and crashkernel in
> > ZONE_DMA32") allocates crashkernel for arm64 in the ZONE_DMA32.
> >
> > However as reported by Prabhakar, this breaks kdump kernel booting in
> > ThunderX2 like arm64 systems. I have noticed this on another ampere
> > arm64 machine. The OOM log in the kdump kernel looks like this:
> >
> >   [    0.240552] DMA: preallocated 128 KiB GFP_KERNEL pool for atomic allocations
> >   [    0.247713] swapper/0: page allocation failure: order:1, mode:0xcc1(GFP_KERNEL|GFP_DMA), nodemask=(null),cpuset=/,mems_allowed=0
> >   <..snip..>
> >   [    0.274706] Call trace:
> >   [    0.277170]  dump_backtrace+0x0/0x208
> >   [    0.280863]  show_stack+0x1c/0x28
> >   [    0.284207]  dump_stack+0xc4/0x10c
> >   [    0.287638]  warn_alloc+0x104/0x170
> >   [    0.291156]  __alloc_pages_slowpath.constprop.106+0xb08/0xb48
> >   [    0.296958]  __alloc_pages_nodemask+0x2ac/0x2f8
> >   [    0.301530]  alloc_page_interleave+0x20/0x90
> >   [    0.305839]  alloc_pages_current+0xdc/0xf8
> >   [    0.309972]  atomic_pool_expand+0x60/0x210
> >   [    0.314108]  __dma_atomic_pool_init+0x50/0xa4
> >   [    0.318504]  dma_atomic_pool_init+0xac/0x158
> >   [    0.322813]  do_one_initcall+0x50/0x218
> >   [    0.326684]  kernel_init_freeable+0x22c/0x2d0
> >   [    0.331083]  kernel_init+0x18/0x110
> >   [    0.334600]  ret_from_fork+0x10/0x18
> >
> > This patch limits the crashkernel allocation to the first 1GB of
> > the RAM accessible (ZONE_DMA), as otherwise we might run into OOM
> > issues when crashkernel is executed, as it might have been originally
> > allocated from either a ZONE_DMA32 memory or mixture of memory chunks
> > belonging to both ZONE_DMA and ZONE_DMA32.
>
> How does this interact with this ongoing series:
>
> https://lore.kernel.org/r/20200628083458.40066-1-chenzhou10@huawei.com
>
> (patch 4, in particular)

Many thanks for having a look at this patchset. I was not aware that
Chen had sent out a new version.
I had noted in the v9 review of the high/low range allocation
<https://lists.gt.net/linux/kernel/3726052#3726052> that I was working
on a generic solution (irrespective of the crashkernel low and high
range allocation), which resulted in this patchset.

The issue is two-fold: an Oops in the memcg layer (PATCH 1/2, which has
been Acked by the memcg maintainer) and an OOM in the kdump kernel due to
the crashkernel being allocated in ZONE_DMA32 region(s), which is
addressed by this patch.
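
The OOM half of the problem can be sketched in userspace C. This is a
minimal model, not kernel code: it assumes RAM starts at physical
address 0 and that ZONE_DMA ends at 1GB, as in the patch description.
bytes_below() and kdump_has_zone_dma() are hypothetical helper names
used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed limit, for illustration only: ZONE_DMA covers the first 1GB,
 * with RAM starting at physical address 0. */
#define SZ_1G          (1ULL << 30)
#define ZONE_DMA_LIMIT (1 * SZ_1G)

/* Hypothetical helper: bytes of [base, base + size) that lie below limit. */
static uint64_t bytes_below(uint64_t base, uint64_t size, uint64_t limit)
{
	assert(size <= UINT64_MAX - base);	/* no address wrap-around */
	if (base >= limit)
		return 0;
	uint64_t end = base + size;
	return (end < limit ? end : limit) - base;
}

/* The kdump kernel only sees the reserved region as RAM, so a GFP_DMA
 * request can only be served if part of that region is in ZONE_DMA.
 * A non-zero overlap is necessary, not sufficient. */
static bool kdump_has_zone_dma(uint64_t crash_base, uint64_t crash_size)
{
	return bytes_below(crash_base, crash_size, ZONE_DMA_LIMIT) > 0;
}
```

In this model a 512MB region based at 2GB has no bytes in ZONE_DMA,
which matches the GFP_DMA failure under dma_atomic_pool_init() in the
quoted log, while a 512MB region based at 256MB lies entirely below
1GB.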

I will have a closer look at the v10 patchset Chen shared, but it seems
to need some rework as per the review comments Dave shared today.
IMO, in the meantime this patchset can be used to fix the existing
kdump issue with the upstream kernel.

> > Fixes: bff3b04460a8 ("arm64: mm: reserve CMA and crashkernel in ZONE_DMA32")
> > Cc: Johannes Weiner <hannes@...xchg.org>
> > Cc: Michal Hocko <mhocko@...nel.org>
> > Cc: Vladimir Davydov <vdavydov.dev@...il.com>
> > Cc: James Morse <james.morse@....com>
> > Cc: Mark Rutland <mark.rutland@....com>
> > Cc: Will Deacon <will@...nel.org>
> > Cc: Catalin Marinas <catalin.marinas@....com>
> > Cc: cgroups@...r.kernel.org
> > Cc: linux-mm@...ck.org
> > Cc: linux-arm-kernel@...ts.infradead.org
> > Cc: linux-kernel@...r.kernel.org
> > Cc: kexec@...ts.infradead.org
> > Reported-by: Prabhakar Kushwaha <pkushwaha@...vell.com>
> > Signed-off-by: Bhupesh Sharma <bhsharma@...hat.com>
> > ---
> >  arch/arm64/mm/init.c | 16 ++++++++++++++--
> >  1 file changed, 14 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> > index 1e93cfc7c47a..02ae4d623802 100644
> > --- a/arch/arm64/mm/init.c
> > +++ b/arch/arm64/mm/init.c
> > @@ -91,8 +91,15 @@ static void __init reserve_crashkernel(void)
> >       crash_size = PAGE_ALIGN(crash_size);
> >
> >       if (crash_base == 0) {
> > -             /* Current arm64 boot protocol requires 2MB alignment */
> > -             crash_base = memblock_find_in_range(0, arm64_dma32_phys_limit,
> > +             /* Current arm64 boot protocol requires 2MB alignment.
> > +              * Also limit the crashkernel allocation to the first
> > +              * 1GB of the RAM accessible (ZONE_DMA), as otherwise we
> > +              * might run into OOM issues when crashkernel is executed,
> > +              * as it might have been originally allocated from
> > +              * either a ZONE_DMA32 memory or mixture of memory
> > +              * chunks belonging to both ZONE_DMA and ZONE_DMA32.
> > +              */
>
> This comment needs help. Why does putting the crashkernel in ZONE_DMA
> prevent "OOM issues"?

Sure, I can work on adding more details in the comment so that it
explains the potential OOM issue(s) better.
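
For reference, the effect of lowering the search limit in the hunk
quoted above can be modelled in userspace. This is only a sketch under
stated assumptions: find_top_down() is a hypothetical stand-in that
searches a single free range top-down, it is not memblock, and the 1GB
limit is taken from the patch description because the replacement call
is not visible in the quoted diff.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M (2ULL << 20)	/* alignment named in the quoted comment */
#define SZ_1G (1ULL << 30)

/* Hypothetical model of a top-down search over ONE free range
 * [free_start, free_end): returns the highest align-aligned base such
 * that [base, base + size) fits below min(free_end, limit), or 0 if
 * nothing fits. A valid base of 0 is indistinguishable from failure
 * in this simplified model. */
static uint64_t find_top_down(uint64_t free_start, uint64_t free_end,
			      uint64_t limit, uint64_t size, uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);	/* power of two */
	uint64_t end = free_end < limit ? free_end : limit;

	if (end < free_start || end - free_start < size)
		return 0;

	uint64_t base = (end - size) & ~(align - 1);	/* round down */
	return base >= free_start ? base : 0;
}
```

With 8GB of free RAM and a 512MB request, a 4GB limit places the region
at 3.5GB in this model, outside the first 1GB, whereas a 1GB limit
places it at 512MB, entirely inside it.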

Thanks,
Bhupesh