[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <mtetvagzj2xltczutahqj7ijgpr3fy7ppbp5w6rmn2w2zmukux@tawhrkyoqjli>
Date: Wed, 7 May 2025 09:54:39 -0700
From: Bjorn Andersson <andersson@...nel.org>
To: Eugen Hristev <eugen.hristev@...aro.org>
Cc: linux-kernel@...r.kernel.org, linux-arm-msm@...r.kernel.org,
linux-doc@...r.kernel.org, corbet@....net, tglx@...utronix.de, mingo@...hat.com,
rostedt@...dmis.org, john.ogness@...utronix.de, senozhatsky@...omium.org,
pmladek@...e.com, peterz@...radead.org, mojha@....qualcomm.com,
linux-arm-kernel@...ts.infradead.org, vincent.guittot@...aro.org, konradybcio@...nel.org,
dietmar.eggemann@....com, juri.lelli@...hat.com
Subject: Re: [RFC][PATCH 00/14] introduce kmemdump
On Tue, Apr 22, 2025 at 02:31:42PM +0300, Eugen Hristev wrote:
> kmemdump is a mechanism which allows the kernel to mark specific memory
> areas for dumping or specific backend usage.
> Once regions are marked, kmemdump keeps an internal list with the regions
> and registers them in the backend.
> Further, depending on the backend driver, these regions can be dumped using
> firmware or different hardware block.
> Regions being marked beforehand, when the system is up and running, there
> is no need nor dependency on a panic handler, or a working kernel that can
> dump the debug information.
> The kmemdump approach works when pstore, kdump, or another mechanism do not.
> Pstore relies on persistent storage, a dedicated RAM area or flash, which
> has the disadvantage of having the memory reserved all the time, or another
> specific non volatile memory. Some devices cannot keep the RAM contents on
> reboot so ramoops does not work. Some devices do not allow kexec to run
> another kernel to debug the crashed one.
> For such devices, that have another mechanism to help debugging, like
> firmware, kmemdump is a viable solution.
>
> kmemdump can create a core image, similar with /proc/vmcore, with only
> the registered regions included. This can be loaded into crash tool/gdb and
> analyzed.
> To have this working, specific information from the kernel is registered,
> and this is done at kmemdump init time, no need for the kmemdump user to
> do anything.
>
> The implementation is based on the initial Pstore/directly mapped zones
> published as an RFC here:
> https://lore.kernel.org/all/20250217101706.2104498-1-eugen.hristev@linaro.org/
>
> The back-end implementation for qcom_smem is based on the minidump
> patch series and driver written by Mukesh Ojha, thanks:
> https://lore.kernel.org/lkml/20240131110837.14218-1-quic_mojha@quicinc.com/
>
> I appreciate the feedback on this series, I know it is a longshot, and there
> is a lot to improve, but I hope I am on the right track.
>
> Thanks,
> Eugen
>
> PS. Here is how crash tool reports the dump:
>
> KERNEL: /home/eugen/linux-minidump/vmlinux [TAINTED]
> DUMPFILE: /home/eugen/eee
Can you please describe the steps taken to get acquire/generate this
file and how to invoke crash?
Regards,
Bjorn
> CPUS: 8 [OFFLINE: 7]
> DATE: Thu Jan 1 02:00:00 EET 1970
> UPTIME: 00:00:28
> NODENAME: qemuarm64
> RELEASE: 6.14.0-rc5-next-20250303-00014-g011eb2aaf7b6-dirty
> VERSION: #169 SMP PREEMPT Thu Apr 17 14:12:21 EEST 2025
> MACHINE: aarch64 (unknown Mhz)
> MEMORY: 0
> PANIC: ""
>
> crash> log
> [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd4b2]
> [ 0.000000] Linux version 6.14.0-rc5-next-20250303-00014-g011eb2aaf7b6-dirty (eugen@...en-station) (aarch64-none-linux-gnu-gcc (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 13.3.1 20240614, GNU ld (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 2.42.0.20240614) #169 SMP PREEMPT Thu Apr 17 14:12:21 EEST 2025
> [ 0.000000] KASLR enabled
> [...]
>
> Eugen Hristev (14):
> Documentation: add kmemdump
> kmemdump: introduce kmemdump
> kmemdump: introduce qcom-md backend driver
> soc: qcom: smem: add minidump device
> Documentation: kmemdump: add section for coreimage ELF
> kmemdump: add coreimage ELF layer
> printk: add kmsg_kmemdump_register
> kmemdump: coreimage: add kmsg registration
> genirq: add irq_kmemdump_register
> kmemdump: coreimage: add irq registration
> panic: add panic_kmemdump_register
> kmemdump: coreimage: add panic registration
> sched: add sched_kmemdump_register
> kmemdump: coreimage: add sched registration
>
> Documentation/debug/index.rst | 17 ++
> Documentation/debug/kmemdump.rst | 83 +++++
> drivers/Kconfig | 2 +
> drivers/Makefile | 2 +
> drivers/debug/Kconfig | 39 +++
> drivers/debug/Makefile | 5 +
> drivers/debug/kmemdump.c | 197 ++++++++++++
> drivers/debug/kmemdump_coreimage.c | 293 ++++++++++++++++++
> drivers/debug/qcom_md.c | 467 +++++++++++++++++++++++++++++
> drivers/soc/qcom/smem.c | 10 +
> include/linux/irqnr.h | 1 +
> include/linux/kmemdump.h | 77 +++++
> include/linux/kmsg_dump.h | 6 +
> include/linux/panic.h | 1 +
> include/linux/sched.h | 1 +
> kernel/irq/irqdesc.c | 7 +
> kernel/panic.c | 8 +
> kernel/printk/printk.c | 13 +
> kernel/sched/core.c | 7 +
> 19 files changed, 1236 insertions(+)
> create mode 100644 Documentation/debug/index.rst
> create mode 100644 Documentation/debug/kmemdump.rst
> create mode 100644 drivers/debug/Kconfig
> create mode 100644 drivers/debug/Makefile
> create mode 100644 drivers/debug/kmemdump.c
> create mode 100644 drivers/debug/kmemdump_coreimage.c
> create mode 100644 drivers/debug/qcom_md.c
> create mode 100644 include/linux/kmemdump.h
>
> --
> 2.43.0
>
Powered by blists - more mailing lists