Message-ID: <aH9KfV8XM5fNsR/Y@kernel.org>
Date: Tue, 22 Jul 2025 11:23:25 +0300
From: Mike Rapoport <rppt@...nel.org>
To: mawupeng <mawupeng1@...wei.com>
Cc: akpm@...ux-foundation.org, ardb@...nel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: ignore nomap memory during mirror init

On Mon, Jul 21, 2025 at 10:11:11AM +0800, mawupeng wrote:
> On 2025/7/20 20:38, Mike Rapoport wrote:
> > On Fri, Jul 18, 2025 at 09:37:48AM +0800, mawupeng wrote:
> >>
> >>
> >> On 2025/7/17 21:37, Mike Rapoport wrote:
> >>> On Thu, Jul 17, 2025 at 07:06:52PM +0800, mawupeng wrote:
> >>>>
> >>>> On 2025/7/17 18:29, Mike Rapoport wrote:
> >>>>> On Thu, Jul 17, 2025 at 04:57:23PM +0800, Wupeng Ma wrote:
> >>>>>> When memory mirroring is enabled, the BIOS may reserve memory regions
> >>>>>> at the start of the physical address space without the MR flag. This
> >>>>>> causes zone_movable_pfn to be updated to the start of these reserved
> >>>>>> regions, resulting in subsequent mirrored memory being ignored.
> >>>>>>
> >>>>>> Here is the log with efi=debug enabled:
> >>>>>> efi: 0x084004000000-0x0842bf37ffff [Conventional| | |MR|...|WB|WT|WC| ]
> >>>>>> efi: 0x0842bf380000-0x0842c21effff [Loader Code | | |MR|...|WB|WT|WC| ]
> >>>>>> efi: 0x0842c21f0000-0x0847ffffffff [Conventional| | |MR|...|WB|WT|WC| ]
> >>>>>> efi: 0x085000000000-0x085fffffffff [Conventional| | | |...|WB|WT|WC| ]
> >>>>>> ...
> >>>>>> efi: 0x084000000000-0x084003ffffff [Reserved | | | |...|WB|WT|WC| ]
> >>>>>>
> >>>>>> Since this kind of memory cannot be used by the kernel, ignore nomap memory to fix
> >>>>>> this issue.
> >>>>
> >>>> Since the first non-mirror pfn of this node is 0x084000000000, zone_movable_pfn
> >>>> for this node will be updated to that value. This causes the mirrored regions
> >>>> - 0x084004000000-0x0842bf37ffff
> >>>> - 0x0842bf380000-0x0842c21effff
> >>>> - 0x0842c21f0000-0x0847ffffffff
> >>>> to be seen as non-mirror memory, since zone_movable_pfn will be the start_pfn of this node
> >>>> in adjust_zone_range_for_zone_movable().
> >>>
> >>> What do you mean by "seen as non-mirror memory"?
> >>
> >> It means these memory ranges will be added to the movable zone.
> >>
> >>>
> >>> What is the problem with having movable zone on that node start at
> >>> 0x084000000000?
> >>>
> >>> Can you post the kernel log up to "Memory: nK/mK available" line for more
> >>> context?
> >>
> >> The "Memory: nK/mK available" line cannot show the problem here, since there is nothing wrong
> >> with the total memory. However, the problem can be seen with lsmem --output-all:
> >
> > I didn't ask for that particular line but for *up to that line*.
> >
> >> w/o this patch
> >> [root@...alhost ~]# lsmem --output-all
> >> RANGE SIZE STATE REMOVABLE BLOCK NODE ZONES
> >> 0x0000084000000000-0x00000847ffffffff 32G online yes 67584-67839 0 Movable
> >> 0x0000085000000000-0x0000085fffffffff 64G online yes 68096-68607 0 Movable
> >>
> >> w/ this patch
> >> [root@...alhost ~]# lsmem --output-all
> >> RANGE SIZE STATE REMOVABLE BLOCK NODE ZONES
> >> 0x0000084000000000-0x00000847ffffffff 32G online yes 8448-8479 0 Normal
> >> 0x0000085000000000-0x0000085fffffffff 64G online yes 8512-8575 0 Movable
> >
> > As I see the problem, you have problematic firmware that fails to report
> > memory as mirrored because it is reserved for the firmware's own use. This
> > causes non-mirrored memory to appear before mirrored memory. And this breaks
> > an assumption in find_zone_movable_pfns_for_nodes() that mirrored memory
> > always has lower addresses than non-mirrored memory, and you end up with
> > all the memory in the movable zone.
>
> Yes.
>
> >
> > So to workaround this firmware issue you propose a hack that would skip
> > NOMAP regions while calculating zone_movable_pfn because your particular
> > firmware reports the reserved mirrored memory as NOMAP.
> >
> > Why don't you simply pass "kernelcore=32G" on the command line? You'll
> > get the same result.
>
> Since mirrored memory is present on every node, not only one, "kernelcore=32G"
> cannot fix this problem.

I don't see other nodes in the lsmem output. And I asked for the kernel log
exactly to see how the kernel sees the memory on the system.

Another question is: do you really need ZONE_MOVABLE? Most of the time the MM
core operates at pageblock granularity, and even if all the memory is
in ZONE_NORMAL the pageblocks are still movable.

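
For readers following the thread, the ordering assumption in
find_zone_movable_pfns_for_nodes() discussed above can be modelled in a few
lines of user-space C. This is only a simplified sketch, not the kernel code:
struct region, movable_start() and the skip_nomap switch are hypothetical
names made up for illustration, it works on physical addresses instead of
pfns, handles a single node, and leaves out the kernel's special handling of
memory below 4G. The region table is copied from the efi=debug log quoted
earlier, with the Reserved range assumed to be the one reported as NOMAP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory region; not the kernel's memblock type. */
struct region {
	uint64_t start;   /* physical start address */
	uint64_t end;     /* physical end address, inclusive */
	bool mirrored;    /* MR attribute set in the EFI memory map */
	bool nomap;       /* assumed: reserved by firmware and marked NOMAP */
};

/*
 * Return the lowest start address among the non-mirrored regions, or 0 if
 * every region is mirrored. This models where the movable zone would begin
 * on a node. skip_nomap models the behaviour proposed in the patch.
 */
static uint64_t movable_start(const struct region *r, size_t n, bool skip_nomap)
{
	uint64_t lowest = 0;

	assert(r != NULL);

	for (size_t i = 0; i < n; i++) {
		if (r[i].mirrored)
			continue;
		if (skip_nomap && r[i].nomap)
			continue;
		if (lowest == 0 || r[i].start < lowest)
			lowest = r[i].start;
	}
	return lowest;
}

/* Regions taken from the efi=debug log quoted in this thread. */
static const struct region efi_log[] = {
	{ 0x084004000000ULL, 0x0842bf37ffffULL, true,  false }, /* Conventional, MR */
	{ 0x0842bf380000ULL, 0x0842c21effffULL, true,  false }, /* Loader Code, MR */
	{ 0x0842c21f0000ULL, 0x0847ffffffffULL, true,  false }, /* Conventional, MR */
	{ 0x085000000000ULL, 0x085fffffffffULL, false, false }, /* Conventional */
	{ 0x084000000000ULL, 0x084003ffffffULL, false, true  }, /* Reserved */
};
```

In this model, movable_start(efi_log, 5, false) lands on the Reserved range
at 0x084000000000, below all of the mirrored memory, while
movable_start(efi_log, 5, true) lands on 0x085000000000. That corresponds to
the two lsmem outputs quoted above.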
--
Sincerely yours,
Mike.