Message-ID: <CAO7dBbUdZg7Lp+7bHzpQSpP7PX4YO=isTEDeA3X8N+VihkCusw@mail.gmail.com>
Date: Tue, 7 Oct 2025 12:34:19 +1300
From: Tao Liu <ltao@...hat.com>
To: David Hildenbrand <dhildenb@...hat.com>
Cc: Breno Leitao <leitao@...ian.org>, kas@...nel.org, Jiri Bohac <jbohac@...e.cz>, riel@...riel.com,
vbabka@...e.cz, nphamcs@...il.com, Baoquan He <bhe@...hat.com>,
Vivek Goyal <vgoyal@...hat.com>, Dave Young <dyoung@...hat.com>, kexec@...ts.infradead.org,
akpm@...ux-foundation.org, Philipp Rudo <prudo@...hat.com>,
Donald Dutile <ddutile@...hat.com>, Pingfan Liu <piliu@...hat.com>, linux-kernel@...r.kernel.org,
Michal Hocko <mhocko@...e.cz>
Subject: Re: [PATCH v5 0/5] kdump: crashkernel reservation from CMA
Hi David,
On Tue, Oct 7, 2025 at 5:45 AM David Hildenbrand <dhildenb@...hat.com> wrote:
>
> On 06.10.25 18:25, Breno Leitao wrote:
> > On Mon, Oct 06, 2025 at 10:16:26AM +0200, David Hildenbrand wrote:
> >> On 03.10.25 17:51, Breno Leitao wrote:
> >>> Hello Jiri,
> >>>
> >>> On Thu, Jun 12, 2025 at 12:11:19PM +0200, Jiri Bohac wrote:
> >>>
> >>>> Currently this is only the case for memory ballooning and zswap. Such movable
> >>>> memory will be missing from the vmcore. User data is typically not dumped by
> >>>> makedumpfile.
> >>>
> >>> For zswap and zsmalloc pages, I'm wondering whether these pages will be missing
> >>> from the vmcore, or if there's a possibility they might be present but
> >>> corrupted—especially since they could reside in the CMA region, which may be
> >>> overwritten by the kdump environment.
> >>
> >> That's not different to ordinary user pages residing on these areas, right?
> >
> > Will zsmalloc on CMA pages be marked as "userpages"?
>
> No, but they should have the zsmalloc page type set.
>
> >
> > makedumpfile iterates over the PFNs and checks a few flags before
> > "copying" them to disk.
> >
> > In makedumpfile, userpages are basically discarded if they are anonymous
> > pages:
> > #define isAnon(mapping, flags, _mapcount) \
> > 	(((unsigned long)mapping & PAGE_MAPPING_ANON) != 0 && !isSlab(flags, _mapcount))
> >
> > https://github.com/makedumpfile/makedumpfile/blob/master/makedumpfile.h#L164
> >
> > called from:
> > https://github.com/makedumpfile/makedumpfile/blob/master/makedumpfile.c#L6671
> >
> > For zsmalloc pages in the CMA, the page struct (pfn) is marked with the old
> > page struct (from the first kernel), but the content has changed
> > (replaced by the kdump environment - 2nd kernel).
> >
> > So, whatever decision makedumpfile makes based on the PFN, it will dump
> > incorrect data, given that the page content does not match the data
> > anymore.
>
> Right.
>
> >
> > If my understanding is valid, we don't want to dump any page that points
> > to the PFN, because they will probably have garbage.
>
> My theory is that barely anybody will go ahead and check compressed page
> content, but I agree. We should filter them out.
>
> >
> > That said, I see two options:
> >
> > 1) Ignore the CMA area completely in makedumpfile.
> > - I don't think there is any way to find that area today. The kernel
> > might need to print the CMA region somewhere (/proc/iomem?)
>
> /proc/iomem in the new kernel should indicate the memory region as System
> RAM (for the new kernel). That can just be filtered out:
> dumping memory of the new kernel does not make sense in any case.
>
> >
> > 2) Given that most of the memory in CMA will be anonymous memory, and
> > already discarded by other rules, just add an additional entry for
> > zsmalloc pages.
> >
> > Talking to Kirill offline, it seems we can piggyback on the MovableOps
> > page flag.
>
> We should likely check the page type instead if we go down that path.
If choosing a proper page type/flag is hard, an in-progress makedumpfile
feature may help with that. In short: if we can get a workable page flag
for the CMA pages so they get filtered, then we proceed as usual. If we
cannot, then we can use eppic/btf/kallsyms[1] in makedumpfile to
determine the page type programmatically and filter it out. See the
program for determining amdgpu's mm pages[2].
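
To make the two filtering options from this thread concrete, here is a minimal sketch in C. It is not makedumpfile code. It assumes the page type is stored in the top 8 bits of page->page_type and uses 0xf6 as a placeholder for the zsmalloc type; both depend on the kernel version and would have to be read from the dump's debug info instead of being hard-coded. All names here (pfn_range, page_has_type, pfn_in_ranges, should_exclude, HYP_PGTY_ZSMALLOC) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: the page type sits in the top 8 bits of page->page_type. */
#define PAGE_TYPE_SHIFT 24
/* ASSUMPTION: placeholder value; the real one must come from the dump's debug info. */
#define HYP_PGTY_ZSMALLOC 0xf6u

/* Half-open PFN range [start, end), e.g. a crashkernel CMA region. */
struct pfn_range {
	uint64_t start;
	uint64_t end;
};

/* Option 2: match on the page type rather than on a page flag. */
static bool page_has_type(uint32_t page_type, uint32_t type)
{
	return (page_type >> PAGE_TYPE_SHIFT) == type;
}

/* Option 1: skip every PFN that falls inside a known region. */
static bool pfn_in_ranges(uint64_t pfn, const struct pfn_range *r, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (pfn >= r[i].start && pfn < r[i].end)
			return true;
	return false;
}

/* Exclude a page if either rule matches. */
static bool should_exclude(uint64_t pfn, uint32_t page_type,
			   const struct pfn_range *cma, size_t n)
{
	return pfn_in_ranges(pfn, cma, n) ||
	       page_has_type(page_type, HYP_PGTY_ZSMALLOC);
}
```

If the region boundaries are available, the range check alone covers everything in the CMA area. The type check is the fallback when they are not.
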
[1]: https://lore.kernel.org/kexec/20250610095743.18073-1-ltao@redhat.com/T/#m901bf1413b844648c86e8a84d75b66d0531b9f92
[2]: https://lore.kernel.org/kexec/20250610095743.18073-1-ltao@redhat.com/T/#m38362d258e3b0bdc14a64e54a6acd5b85810ca26
Cheers,
Tao Liu
>
> --
> Cheers
>
> David / dhildenb
>