[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aab5e2af-04d6-485f-bf81-557583f2ae4b@redhat.com>
Date: Mon, 25 Aug 2025 15:58:37 +0200
From: David Hildenbrand <david@...hat.com>
To: Eugen Hristev <eugen.hristev@...aro.org>, Michal Hocko <mhocko@...e.com>
Cc: linux-kernel@...r.kernel.org, linux-arm-msm@...r.kernel.org,
linux-arch@...r.kernel.org, linux-mm@...ck.org, tglx@...utronix.de,
andersson@...nel.org, pmladek@...e.com,
linux-arm-kernel@...ts.infradead.org, linux-hardening@...r.kernel.org,
corbet@....net, mojha@....qualcomm.com, rostedt@...dmis.org,
jonechou@...gle.com, tudor.ambarus@...aro.org,
Christoph Hellwig <hch@...radead.org>,
Sergey Senozhatsky <senozhatsky@...omium.org>
Subject: Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
On 25.08.25 15:36, Eugen Hristev wrote:
>
>
> On 8/25/25 16:20, David Hildenbrand wrote:
>>
>>>>
>>>> IIRC, kernel/vmcore_info.c is never built as a module, as it also
>>>> accesses non-exported symbols.
>>>
>>> Hello David,
>>>
>>> I am looking again into this, and there are some things which in my
>>> opinion would be difficult to achieve.
>>> For example I looked into my patch #11 , which adds the `runqueues` into
>>> kmemdump.
>>>
>>> The runqueues is a variable of `struct rq` which is defined in
>>> kernel/sched/sched.h , which is not supposed to be included outside of
>>> sched.
>>> Now moving all the struct definition outside of sched.h into another
>>> public header would be rather painful and I don't think it's a really
>>> good option (The struct would be needed to compute the sizeof inside
>>> vmcoreinfo). Secondly, it would also imply moving all the nested struct
>>> definitions outside as well. I doubt this is something that we want for
>>> the sched subsys. How the subsys is designed, out of my understanding,
>>> is to keep these internal structs opaque outside of it.
>>
>> All the kmemdump module needs is a start and a length, correct? So the
>> only tricky part is getting the length.
>
> I also have in mind the kernel user case. How would a kernel programmer
> want to add some kernel structs/info/buffers into kmemdump such that the
> dump would contain their data ? Having "KMEMDUMP_VAR(...)" looks simple
> enough.
The other way around, why should anybody have a saying in adding their
data to kmemdump? Why do we have that all over the kernel?
Is your mechanism really so special?
A single composer should take care of that, and it's really just start +
len of physical memory areas.
> Otherwise maybe the programmer has to write helpers to compute lengths
> etc, and stitch them into kmemdump core.
> I am not saying it's impossible, but just tiresome perhaps.
In your patch set, how many of these instances did you encounter where
that was a problem?
>>
>> One could just add a const variable that holds this information, or even
>> better, a simple helper function to calculate that.
>>
>> Maybe someone else reading along has a better idea.
>
> This could work, but it requires again adding some code into the
> specific subsystem. E.g. struct_rq_get_size()
> I am open to ideas , and thank you very much for your thoughts.
>
>>
>> Interestingly, runqueues is a percpu variable, which makes me wonder if
>> what you had would work as intended (maybe it does, not sure).
>
> I would not really need to dump the runqueues. But the crash tool which
> I am using for testing, requires it. Without the runqueues it will not
> progress further to load the kernel dump.
> So I am not really sure what it does with the runqueues, but it works.
> Perhaps using crash/gdb more, to actually do something with this data,
> would give more insight about its utility.
> For me, it is a prerequisite to run crash, and then to be able to
> extract the log buffer from the dump.
I have the faint recollection that percpu vars might not be stored in a
single contiguous physical memory area, but maybe my memory is just
wrong, that's why I was raising it.
>
>>
>>>
>>> From my perspective it's much simpler and cleaner to just add the
>>> kmemdump annotation macro inside the sched/core.c as it's done in my
>>> patch. This macro translates to a noop if kmemdump is not selected.
>>
>> I really don't like how we are spreading kmemdump all over the kernel,
>> and adding complexity with __section when really, all we need is a place
>> to obtain a start and a length.
>>
>
> I understand. The section idea was suggested by Thomas. Initially I was
> skeptic, but I like how it turned out.
Yeah, I don't like it. Taste differs ;)
I am in particular unhappy about custom memblock wrappers.
[...]
>>>
>>> To have this working outside of printk, it would be required to walk
>>> through all the printk structs/allocations and select the required info.
>>> Is this something that we want to do outside of printk ?
>>
>> I don't follow, please elaborate.
>>
>> How is e.g., log_buf_len_get() + log_buf_addr_get() not sufficient,
>> given that you run your initialization after setup_log_buf() ?
>>
>>
>
> My initial thought was the same. However I got some feedback from Petr
> Mladek here :
>
> https://lore.kernel.org/lkml/aBm5QH2p6p9Wxe_M@localhost.localdomain/
>
> Where he explained how to register the structs correctly.
> It can be that setup_log_buf is called again at a later time perhaps.
>
setup_log_buf() is a __init function, so there is only a certain time
frame where it can be called.
In particular, once the buddy is up, memblock allocations are impossible
and it would be deeply flawed to call this function again.
Let's not over-engineer this.
Peter is on CC, so hopefully he can share his thoughts.
--
Cheers
David / dhildenb
Powered by blists - more mailing lists