[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMj1kXEKKcnPE5crMYbuFpDJBqmgjFwna84MzAZkfp-mM3B7vA@mail.gmail.com>
Date: Wed, 8 Jun 2022 12:12:18 +0200
From: Ard Biesheuvel <ardb@...nel.org>
To: David Hildenbrand <david@...hat.com>
Cc: Mike Rapoport <rppt@...nel.org>, mawupeng <mawupeng1@...wei.com>,
corbet@....net, will@...nel.org, catalin.marinas@....com,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
dvhart@...radead.org, andy@...radead.org,
akpm@...ux-foundation.org, paul.walmsley@...ive.com,
palmer@...belt.com, aou@...s.berkeley.edu, paulmck@...nel.org,
keescook@...omium.org, songmuchun@...edance.com,
rdunlap@...radead.org, damien.lemoal@...nsource.wdc.com,
swboyd@...omium.org, wei.liu@...nel.org, robin.murphy@....com,
anshuman.khandual@....com, thunder.leizhen@...wei.com,
wangkefeng.wang@...wei.com, gpiccoli@...lia.com,
chenhuacai@...nel.org, geert@...ux-m68k.org, chenzhou10@...wei.com,
vijayb@...ux.microsoft.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-efi@...r.kernel.org, platform-driver-x86@...r.kernel.org,
linux-mm@...ck.org, linux-riscv@...ts.infradead.org
Subject: Re: [PATCH v3 5/6] mm: Add mirror flag back on initrd memory
On Wed, 8 Jun 2022 at 12:08, David Hildenbrand <david@...hat.com> wrote:
>
> On 08.06.22 12:02, Mike Rapoport wrote:
> > On Wed, Jun 08, 2022 at 03:27:09PM +0800, mawupeng wrote:
> >>
> >> 在 2022/6/7 22:49, Ard Biesheuvel 写道:
> >>> On Tue, 7 Jun 2022 at 14:22, David Hildenbrand <david@...hat.com> wrote:
> >>>>
> >>>> On 07.06.22 11:38, Wupeng Ma wrote:
> >>>>> From: Ma Wupeng <mawupeng1@...wei.com>
> >>>>>
> >>>>> Initrd memory will be removed and then added in arm64_memblock_init() and this
> >>>>> will cause it to lose all of its memblock flags. The lost of MEMBLOCK_MIRROR
> >>>>> flag will lead to error log printed by find_zone_movable_pfns_for_nodes if
> >>>>> the lower 4G range has some non-mirrored memory.
> >>>>>
> >>>>> In order to solve this problem, the lost MEMBLOCK_MIRROR flag will be
> >>>>> reinstalled if the origin memblock has this flag.
> >>>>>
> >>>>> Signed-off-by: Ma Wupeng <mawupeng1@...wei.com>
> >>>>> ---
> >>>>> arch/arm64/mm/init.c | 9 +++++++++
> >>>>> include/linux/memblock.h | 1 +
> >>>>> mm/memblock.c | 20 ++++++++++++++++++++
> >>>>> 3 files changed, 30 insertions(+)
> >>>>>
> >>>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> >>>>> index 339ee84e5a61..11641f924d08 100644
> >>>>> --- a/arch/arm64/mm/init.c
> >>>>> +++ b/arch/arm64/mm/init.c
> >>>>> @@ -350,9 +350,18 @@ void __init arm64_memblock_init(void)
> >>>>> "initrd not fully accessible via the linear mapping -- please check your bootloader ...\n")) {
> >>>>> phys_initrd_size = 0;
> >>>>> } else {
> >>>>> + int flags, ret;
> >>>>> +
> >>>>> + ret = memblock_get_flags(base, &flags);
> >>>>> + if (ret)
> >>>>> + flags = 0;
> >>>>> +
> >>>>> memblock_remove(base, size); /* clear MEMBLOCK_ flags */
> >>>>> memblock_add(base, size);
> >>>>> memblock_reserve(base, size);
> >>>>
> >>>> Can you explain why we're removing+re-adding here exactly? Is it just to
> >>>> clear flags as the comment indicates?
> >>>>
> >>>
> >>> This should only happen if the placement of the initrd conflicts with
> >>> a mem= command line parameter or it is not covered by memblock for
> >>> some other reason.
> >>>
> >>> IOW, this should never happen, and if re-memblock_add'ing this memory
> >>> unconditionally is causing problems, we should fix that instead of
> >>> working around it.
> >>
> >> This will happen if we use initrdmem=3G,100M to reserve initrd memory below
> >> the 4G limit to test this scenario(just for testing, I have trouble to boot
> >> qemu with initrd enabled and memory below 4G are all mirror memory).
> >>
> >> Re-memblock_add'ing this memory unconditionally seems fine but clear all
> >> flags(especially MEMBLOCK_MIRROR) may lead to some error log.
> >>
> >>>
> >>>> If it's really just about clearing flags, I wonder if we rather want to
> >>>> have an interface that does exactly that, and hides the way this is
> >>>> actually implemented (obtain flags, remove, re-add ...), internally.
> >>>>
> >>>> But most probably there is more magic in the code and clearing flags
> >>>> isn't all it ends up doing.
> >>>>
> >>>
> >>> I don't remember exactly why we needed to clear the flags, but I think
> >>> it had to do with some corner case we hit when the initrd was
> >>> partially covered.
> >> If "mem=" is set in command line, memblock_mem_limit_remove_map() will
> >> remove all memory block without MEMBLOCK_NOMAP. Maybe this will bring the
> >> memory back if this initrd mem has the MEMBLOCK_NOMAP flag?
> >>
> >> The rfc version [1] introduce and use memblock_clear_nomap() to clear the
> >> MEMBLOCK_NOMAP of this initrd memblock.
> >> So maybe the usage of memblock_remove() is just to avoid introducing new
> >> function(memblock_clear_nomap)?
> >>
> >> Since commit 4c546b8a3469 ("memblock: add memblock_clear_nomap()") already
> >> introduced memblock_clear_nomap(). Can we use this to remove flag MEMBLOCK_NOMAP
> >> to solve this problem rather than bring flag MEMBLOCK_MIRROR back?
> >
> > AFAICT, there are two corner cases that re-adding initrd memory covers:
> > * initrd memory is not a part of the memory reported to memblock, either
> > because of firmware weirdness or because it was cut out with mem=
> > * initrd memory overlaps a NOMAP region
> >
> > So to make sure initrd memory is mapped properly and retains
> > MEMBLOCK_MIRROR I think the best we can do is
> >
> > memblock_add();
> > memblock_clear_nomap();
> > memblock_reserve();
>
> Would simply detect+rejecting to boot on such setups be an option? The
> replies so far indicate to me that this is rather a corner case than a
> reasonable use case.
>
The sad reality is that mem= is known to be used in production for
limiting the amount of memory that the kernel takes control of, in
order to allow the remainder to be used in platform specific ways.
Of course, there are much better ways to achieve that, but given that
we currently support it, I don't think we can easily back that out.
I do think that there is no need to go out of our way to make this
case work seamlessly with mirrored memory, though. So I'd prefer to
make the remove+re-add conditional on there actually being a need to
do so. That way, we don't break the old use case or mirrored memory,
and whatever happens when the two are combined is DONTCARE.
Powered by blists - more mailing lists