Message-ID: <26cc3478-8f65-44bb-8ebe-24a28a858dab@linaro.org>
Date: Fri, 9 May 2025 18:19:40 +0300
From: Eugen Hristev <eugen.hristev@...aro.org>
To: Bjorn Andersson <andersson@...nel.org>
Cc: linux-kernel@...r.kernel.org, linux-arm-msm@...r.kernel.org,
linux-doc@...r.kernel.org, corbet@....net, tglx@...utronix.de,
mingo@...hat.com, rostedt@...dmis.org, john.ogness@...utronix.de,
senozhatsky@...omium.org, pmladek@...e.com, peterz@...radead.org,
mojha@....qualcomm.com, linux-arm-kernel@...ts.infradead.org,
vincent.guittot@...aro.org, konradybcio@...nel.org,
dietmar.eggemann@....com, juri.lelli@...hat.com
Subject: Re: [RFC][PATCH 00/14] introduce kmemdump
Hello Bjorn,
On 5/7/25 19:54, Bjorn Andersson wrote:
> On Tue, Apr 22, 2025 at 02:31:42PM +0300, Eugen Hristev wrote:
>> kmemdump is a mechanism which allows the kernel to mark specific memory
>> areas for dumping or specific backend usage.
>> Once regions are marked, kmemdump keeps an internal list with the regions
>> and registers them in the backend.
>> Further, depending on the backend driver, these regions can be dumped using
>> firmware or a different hardware block.
>> Because regions are marked beforehand, while the system is up and running,
>> there is no need for, nor dependency on, a panic handler or a working
>> kernel that can dump the debug information.
>> The kmemdump approach works when pstore, kdump, or another mechanism do not.
>> Pstore relies on persistent storage, a dedicated RAM area or flash, which
>> has the disadvantage of having the memory reserved all the time, or another
>> specific non-volatile memory. Some devices cannot keep the RAM contents on
>> reboot so ramoops does not work. Some devices do not allow kexec to run
>> another kernel to debug the crashed one.
>> For such devices, that have another mechanism to help debugging, like
>> firmware, kmemdump is a viable solution.
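
To illustrate the flow described above (mark a region, keep it in an
internal list, hand it to a backend), here is a minimal userspace sketch.
It is not the code from the series: every name in it (kmd_register,
kmd_set_backend, struct kmd_backend, the toy backend) is hypothetical and
only models the idea. It assumes a fixed-size table and a single backend
that may attach after some regions have already been marked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical names throughout; this is NOT the API from the series. */
#define KMD_MAX_REGIONS 8

struct kmd_region {
	const char *name;
	void *addr;
	size_t size;
	int used;
};

struct kmd_backend {
	const char *name;
	int (*register_region)(const struct kmd_region *r);
};

static struct kmd_region kmd_regions[KMD_MAX_REGIONS];
static const struct kmd_backend *kmd_backend;

/* Mark a region: store it in the table, forward it if a backend exists. */
static int kmd_register(const char *name, void *addr, size_t size)
{
	assert(addr && size);

	for (int i = 0; i < KMD_MAX_REGIONS; i++) {
		if (kmd_regions[i].used)
			continue;
		kmd_regions[i] = (struct kmd_region){ name, addr, size, 1 };
		if (kmd_backend)
			kmd_backend->register_region(&kmd_regions[i]);
		return i;
	}
	return -1; /* table full */
}

/* Attach a backend and replay every region that was marked earlier. */
static int kmd_set_backend(const struct kmd_backend *be)
{
	int replayed = 0;

	kmd_backend = be;
	for (int i = 0; i < KMD_MAX_REGIONS; i++) {
		if (!kmd_regions[i].used)
			continue;
		be->register_region(&kmd_regions[i]);
		replayed++;
	}
	return replayed;
}

/* Toy backend: only records how many bytes it was asked to cover. */
static size_t toy_bytes;

static int toy_register(const struct kmd_region *r)
{
	toy_bytes += r->size;
	printf("toy backend: %s, %zu bytes\n", r->name, r->size);
	return 0;
}

static const struct kmd_backend toy_backend = { "toy", toy_register };

/* One region marked before the backend attaches, one after. */
static size_t kmd_demo(void)
{
	static char log_buf[64];
	static int counters[4];

	kmd_register("log", log_buf, sizeof(log_buf));
	kmd_set_backend(&toy_backend);
	kmd_register("counters", counters, sizeof(counters));
	return toy_bytes;
}
```

Calling kmd_demo() once from a main() or a test harness shows both paths:
the "log" region is replayed when the backend attaches, and "counters" is
forwarded immediately. Nothing here depends on a panic path, which is the
point made in the quoted text.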
>>
>> kmemdump can create a core image, similar to /proc/vmcore, with only
>> the registered regions included. This can be loaded into the crash tool or
>> gdb and analyzed.
>> For this to work, specific information from the kernel is registered.
>> This is done at kmemdump init time; the kmemdump user does not need to
>> do anything.
>>
>> The implementation is based on the initial Pstore/directly mapped zones
>> published as an RFC here:
>> https://lore.kernel.org/all/20250217101706.2104498-1-eugen.hristev@linaro.org/
>>
>> The back-end implementation for qcom_smem is based on the minidump
>> patch series and driver written by Mukesh Ojha, thanks:
>> https://lore.kernel.org/lkml/20240131110837.14218-1-quic_mojha@quicinc.com/
>>
>> I appreciate the feedback on this series, I know it is a longshot, and there
>> is a lot to improve, but I hope I am on the right track.
>>
>> Thanks,
>> Eugen
>>
>> PS. Here is how crash tool reports the dump:
>>
>> KERNEL: /home/eugen/linux-minidump/vmlinux [TAINTED]
>> DUMPFILE: /home/eugen/eee
>
> Can you please describe the steps taken to acquire/generate this
> file and how to invoke crash?
>
Thank you for looking into this.
Next week, on Friday, 16th of May, there will be a talk about this patch
series at Linaro Connect in Lisbon. In that talk I will also show a demo
that covers the whole process of acquiring the core dump and loading it
into crash.
I will be traveling in the following days. If I get the time, I will send
the steps as a reply to this email; if not, I will certainly send them
after the talk in Lisbon.
Eugen
> Regards,
> Bjorn
>
>> CPUS: 8 [OFFLINE: 7]
>> DATE: Thu Jan 1 02:00:00 EET 1970
>> UPTIME: 00:00:28
>> NODENAME: qemuarm64
>> RELEASE: 6.14.0-rc5-next-20250303-00014-g011eb2aaf7b6-dirty
>> VERSION: #169 SMP PREEMPT Thu Apr 17 14:12:21 EEST 2025
>> MACHINE: aarch64 (unknown Mhz)
>> MEMORY: 0
>> PANIC: ""
>>
>> crash> log
>> [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd4b2]
>> [ 0.000000] Linux version 6.14.0-rc5-next-20250303-00014-g011eb2aaf7b6-dirty (eugen@...en-station) (aarch64-none-linux-gnu-gcc (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 13.3.1 20240614, GNU ld (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 2.42.0.20240614) #169 SMP PREEMPT Thu Apr 17 14:12:21 EEST 2025
>> [ 0.000000] KASLR enabled
>> [...]
>>
>> Eugen Hristev (14):
>> Documentation: add kmemdump
>> kmemdump: introduce kmemdump
>> kmemdump: introduce qcom-md backend driver
>> soc: qcom: smem: add minidump device
>> Documentation: kmemdump: add section for coreimage ELF
>> kmemdump: add coreimage ELF layer
>> printk: add kmsg_kmemdump_register
>> kmemdump: coreimage: add kmsg registration
>> genirq: add irq_kmemdump_register
>> kmemdump: coreimage: add irq registration
>> panic: add panic_kmemdump_register
>> kmemdump: coreimage: add panic registration
>> sched: add sched_kmemdump_register
>> kmemdump: coreimage: add sched registration
>>
>> Documentation/debug/index.rst | 17 ++
>> Documentation/debug/kmemdump.rst | 83 +++++
>> drivers/Kconfig | 2 +
>> drivers/Makefile | 2 +
>> drivers/debug/Kconfig | 39 +++
>> drivers/debug/Makefile | 5 +
>> drivers/debug/kmemdump.c | 197 ++++++++++++
>> drivers/debug/kmemdump_coreimage.c | 293 ++++++++++++++++++
>> drivers/debug/qcom_md.c | 467 +++++++++++++++++++++++++++++
>> drivers/soc/qcom/smem.c | 10 +
>> include/linux/irqnr.h | 1 +
>> include/linux/kmemdump.h | 77 +++++
>> include/linux/kmsg_dump.h | 6 +
>> include/linux/panic.h | 1 +
>> include/linux/sched.h | 1 +
>> kernel/irq/irqdesc.c | 7 +
>> kernel/panic.c | 8 +
>> kernel/printk/printk.c | 13 +
>> kernel/sched/core.c | 7 +
>> 19 files changed, 1236 insertions(+)
>> create mode 100644 Documentation/debug/index.rst
>> create mode 100644 Documentation/debug/kmemdump.rst
>> create mode 100644 drivers/debug/Kconfig
>> create mode 100644 drivers/debug/Makefile
>> create mode 100644 drivers/debug/kmemdump.c
>> create mode 100644 drivers/debug/kmemdump_coreimage.c
>> create mode 100644 drivers/debug/qcom_md.c
>> create mode 100644 include/linux/kmemdump.h
>>
>> --
>> 2.43.0
>>