Message-ID: <20250826171447.6w77day5wddppy3s@hu-mojha-hyd.qualcomm.com>
Date: Tue, 26 Aug 2025 22:44:47 +0530
From: Mukesh Ojha <mukesh.ojha@....qualcomm.com>
To: Eugen Hristev <eugen.hristev@...aro.org>
Cc: linux-kernel@...r.kernel.org, linux-arm-msm@...r.kernel.org,
linux-arch@...r.kernel.org, linux-mm@...ck.org, tglx@...utronix.de,
andersson@...nel.org, pmladek@...e.com,
linux-arm-kernel@...ts.infradead.org, linux-hardening@...r.kernel.org,
corbet@....net, mojha@....qualcomm.com, rostedt@...dmis.org,
jonechou@...gle.com, tudor.ambarus@...aro.org
Subject: Re: [RFC][PATCH v2 00/29] introduce kmemdump
On Thu, Jul 24, 2025 at 04:54:43PM +0300, Eugen Hristev wrote:
> kmemdump is a mechanism which allows the kernel to mark specific memory
> areas for dumping or specific backend usage.
> Once regions are marked, kmemdump keeps an internal list with the regions
> and registers them in the backend.
> Further, depending on the backend driver, these regions can be dumped using
> firmware or different hardware block.
> Because regions are marked beforehand, while the system is up and running,
> there is no need for, or dependency on, a panic handler or a working kernel
> that can dump the debug information.
> The kmemdump approach works when pstore, kdump, or another mechanism does not.
> Pstore relies on persistent storage, a dedicated RAM area or flash, which
> has the disadvantage of having the memory reserved all the time, or another
> specific non volatile memory. Some devices cannot keep the RAM contents on
> reboot so ramoops does not work. Some devices do not allow kexec to run
> another kernel to debug the crashed one.
> For such devices, that have another mechanism to help debugging, like
> firmware, kmemdump is a viable solution.
>
> kmemdump can create a core image, similar to /proc/vmcore, with only
> the registered regions included. This can be loaded into crash tool/gdb and
> analyzed.
> To have this working, specific information from the kernel is registered,
> and this is done at kmemdump init time, no need for the kmemdump user to
> do anything.
>
> This version of the kmemdump patch series includes two backend drivers:
> one is the Qualcomm Minidump backend, and the other one is the Debug Kinfo
> backend for Android devices, reworked from this source here:
> https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/android/debug_kinfo.c
> written originally by Jone Chou <jonechou@...gle.com>
>
> Initial version of kmemdump and discussion is available here:
> https://lore.kernel.org/lkml/20250422113156.575971-1-eugen.hristev@linaro.org/
>
> Kmemdump has been presented and discussed at Linaro Connect 2025,
> including motivation, scope, usability and feasibility.
> Video of the recording is available here for anyone interested:
> https://www.youtube.com/watch?v=r4gII7MX9zQ&list=PLKZSArYQptsODycGiE0XZdVovzAwYNwtK&index=14
>
> The implementation is based on the initial Pstore/directly mapped zones
> published as an RFC here:
> https://lore.kernel.org/all/20250217101706.2104498-1-eugen.hristev@linaro.org/
>
> The back-end implementation for qcom_minidump is based on the minidump
> patch series and driver written by Mukesh Ojha, thanks:
> https://lore.kernel.org/lkml/20240131110837.14218-1-quic_mojha@quicinc.com/
>
> *** How to use kmemdump with minidump backend on Qualcomm platform guide ***
>
> Prerequisites:
> Crash tool with target=ARM64 and minor changes required for usual crash mode
> (minimal mode works without the patch)
> A patch can be applied from here https://p.calebs.dev/49a048
> This patch will be eventually sent in a reworked way to crash tool.
>
> Target kernel must be built with :
> CONFIG_DEBUG_INFO_REDUCED=n ; this will have vmlinux include all the debugging
> information needed for crash tool.
>
> In addition, the kernel requires these options:
> CONFIG_KMEMDUMP, CONFIG_KMEMDUMP_COREIMAGE, and the backend
> CONFIG_KMEMDUMP_QCOM_MINIDUMP_BACKEND
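
The option list above can be checked against a built kernel's .config before
booting. The following is a minimal sketch, not part of the series: the
function names are hypothetical, and it assumes the usual .config syntax
(`CONFIG_FOO=y` or `=m` for enabled options, `# CONFIG_FOO is not set` for
disabled ones). Only the option names come from the text above.

```python
import re

# Options named in the cover letter that must be enabled (=y or =m).
REQUIRED = (
    "CONFIG_KMEMDUMP",
    "CONFIG_KMEMDUMP_COREIMAGE",
    "CONFIG_KMEMDUMP_QCOM_MINIDUMP_BACKEND",
)
# Options that must be disabled so vmlinux keeps full debug information.
MUST_BE_OFF = ("CONFIG_DEBUG_INFO_REDUCED",)

_SET_RE = re.compile(r"^(CONFIG_\w+)=(.*)$")


def parse_config(text):
    """Return {option: value} for every 'CONFIG_X=value' line in a .config."""
    values = {}
    for line in text.splitlines():
        match = _SET_RE.match(line.strip())
        if match:
            values[match.group(1)] = match.group(2)
    return values


def check_config(text):
    """Return a list of human-readable problems; an empty list means OK."""
    values = parse_config(text)
    problems = []
    for opt in REQUIRED:
        if values.get(opt) not in ("y", "m"):
            problems.append(f"{opt} is not enabled")
    for opt in MUST_BE_OFF:
        # '# CONFIG_X is not set' lines never match, so the option is absent.
        if values.get(opt) in ("y", "m"):
            problems.append(f"{opt} must be disabled")
    return problems
```

Usage would be along the lines of `check_config(open(".config").read())`,
printing whatever problems come back.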
>
> Kernel arguments:
> Kernel firmware must be set to mode 'mini' by kernel module parameter
> like this : qcom_scm.download_mode=mini
>
> After the kernel boots, and qcom_minidump module is loaded, everything is ready for
> a possible crash.
>
> Once the crash happens, the firmware will kick in and you will see on
> the console the message saying Sahara init, etc, that the firmware is
> waiting in download mode. (this is subject to firmware supporting this
> mode, I am using sa8775p-ride board)
>
> Example of log on the console:
> "
> [...]
> B - 1096414 - usb: init start
> B - 1100287 - usb: qusb_dci_platform , 0x19
> B - 1105686 - usb: usb3phy: PRIM success: lane_A , 0x60
> B - 1107455 - usb: usb2phy: PRIM success , 0x4
> B - 1112670 - usb: dci, chgr_type_det_err
> B - 1117154 - usb: ID:0x260, value: 0x4
> B - 1121942 - usb: ID:0x108, value: 0x1d90
> B - 1124992 - usb: timer_start , 0x4c4b40
> B - 1129140 - usb: vbus_det_pm_unavail
> B - 1133136 - usb: ID:0x252, value: 0x4
> B - 1148874 - usb: SUPER , 0x900e
> B - 1275510 - usb: SUPER , 0x900e
> B - 1388970 - usb: ID:0x20d, value: 0x0
> B - 1411113 - usb: ENUM success
> B - 1411113 - Sahara Init
> B - 1414285 - Sahara Open
> "
>
> Once the board is in download mode, you can use the qdl tool (I
> personally use edl , have not tried qdl yet), to get all the regions as
> separate files.
> The tool from the host computer will list the regions in the order they
> were downloaded.
>
> Once you have all the files simply use `cat` to put them all together,
> in the order of the indexes.
> For my kernel config and setup, here is my cat command : (you can use a script
> or something, I haven't done that so far):
>
> `cat memory/md_KELF1.BIN memory/md_Kvmcorein2.BIN memory/md_Kconfig3.BIN \
> memory/md_Kmemsect4.BIN memory/md_Ktotalram5.BIN memory/md_Kcpu_poss6.BIN \
> memory/md_Kcpu_pres7.BIN memory/md_Kcpu_onli8.BIN memory/md_Kcpu_acti9.BIN \
> memory/md_Kjiffies10.BIN memory/md_Klinux_ba11.BIN memory/md_Knr_threa12.BIN \
> memory/md_Knr_irqs13.BIN memory/md_Ktainted_14.BIN memory/md_Ktaint_fl15.BIN \
> memory/md_Kmem_sect16.BIN memory/md_Knode_dat17.BIN memory/md_Knode_sta18.BIN \
> memory/md_K__per_cp19.BIN memory/md_Knr_swapf20.BIN memory/md_Kinit_uts21.BIN \
> memory/md_Kprintk_r22.BIN memory/md_Kprintk_r23.BIN memory/md_Kprb24.BIN \
> memory/md_Kprb_desc25.BIN memory/md_Kprb_info26.BIN memory/md_Kprb_data27.BIN \
> memory/md_Krunqueue28.BIN memory/md_Khigh_mem29.BIN memory/md_Kinit_mm30.BIN \
> memory/md_Kinit_mm_31.BIN memory/md_Kunknown32.BIN memory/md_Kunknown33.BIN \
> memory/md_Kunknown34.BIN memory/md_Kunknown35.BIN memory/md_Kunknown36.BIN \
> memory/md_Kunknown37.BIN memory/md_Kunknown38.BIN memory/md_Kunknown39.BIN \
> memory/md_Kunknown40.BIN memory/md_Kunknown41.BIN memory/md_Kunknown42.BIN \
> memory/md_Kunknown43.BIN memory/md_Kunknown44.BIN memory/md_Kunknown45.BIN \
> memory/md_Kunknown46.BIN memory/md_Kunknown47.BIN memory/md_Kunknown50.BIN \
> memory/md_Kunknown51.BIN memory/md_Kunknown52.BIN \
> memory/md_Kunknown53.BIN > ~/minidump_image`
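
The manual `cat` above could be scripted. The following is a minimal sketch
under stated assumptions: the region files follow the md_K<name><index>.BIN
pattern seen in the command above, and the trailing digits before .BIN are the
download index. A region whose name itself ends in a digit would be
misordered. The function names are hypothetical and do not belong to qdl, edl
or crash.

```python
import re
import shutil
import sys
from pathlib import Path

# Assumption: the download index is the run of digits right before ".BIN".
_INDEX_RE = re.compile(r"(\d+)\.BIN$")


def region_index(path):
    """Extract the trailing numeric index from a region file name."""
    match = _INDEX_RE.search(Path(path).name)
    if match is None:
        raise ValueError(f"no trailing index in {path}")
    return int(match.group(1))


def ordered_regions(directory):
    """List md_K*.BIN files sorted numerically by index (gaps are fine)."""
    return sorted(Path(directory).glob("md_K*.BIN"), key=region_index)


def concatenate(directory, output):
    """Write all region files, in index order, into a single image file."""
    regions = ordered_regions(directory)
    with open(output, "wb") as out:
        for region in regions:
            with open(region, "rb") as src:
                shutil.copyfileobj(src, out)


if __name__ == "__main__" and len(sys.argv) == 3:
    # e.g. python3 join_regions.py memory ~/minidump_image
    concatenate(sys.argv[1], sys.argv[2])
```

Sorting numerically matters here: a plain shell glob would place index 10
before index 2.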
>
> Once you have the resulting file, use the `crash` tool to load it, like this:
> `./crash --no_modules --no_panic --no_kmem_cache --zero_excluded vmlinux minidump_image`
>
> There is also a --minimal mode for ./crash that would work without any patch applied
> to crash tool, but you can't inspect symbols, etc.
Unfortunately, on my side the 'log' was visible only with the --minimal option.
./crash --no_modules --no_panic --no_kmem_cache --zero_excluded vmlinux minidump_image
WARNING: kernel version inconsistency between vmlinux and dumpfile
crash: read error: kernel virtual address: ffffff8ed7f380d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7f510d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7f6a0d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7f830d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7f9c0d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7fb50d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7fce0d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7fe70d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffffc0817c5d80 type: "maple_init read mt_slots"
crash: read error: kernel virtual address: ffffffc0817c5d78 type: "maple_init read mt_pivots"
crash: read error: kernel virtual address: ffffff8efb89e2c0 type: "memory section root table"
It looks like you are using something more in your setup to make it work.
-Mukesh
>
> Once you load crash you will see something like this :
>
> KERNEL: /home/eugen/linux-minidump/vmlinux [TAINTED]
> DUMPFILE: /home/eugen/new
> CPUS: 8 [OFFLINE: 7]
> DATE: Thu Jan 1 02:00:00 EET 1970
> UPTIME: 00:00:29
> TASKS: 0
> NODENAME: qemuarm64
> RELEASE: 6.16.0-rc7-next-20250721-00029-gf8cffdbf0479-dirty
> VERSION: #5 SMP PREEMPT Tue Jul 22 18:44:57 EEST 2025
> MACHINE: aarch64 (unknown Mhz)
> MEMORY: 34.2 GB
> PANIC: ""
> crash> log
> [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd4b2]
> [ 0.000000] Linux version 6.16.0-rc7-next-20250721-00029-gf8cffdbf0479-dirty (eugen@...en-station) (aarch64-none-linux-gnu-gcc (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 13.3.1 20240614, GNU ld (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 2.42.0.20240614) #5 SMP PREEMPT Tue Jul 22 18:44:57 EEST 2025
>
>
> *** Debug Kinfo backend driver ***
> I don't have any device to actually test this. So I have not.
> I hacked the driver to just use a kmalloc'ed area to save things instead
> of the shared memory, and dumped everything there and checked whether it looks
> sane. If someone is willing to try it out, thanks ! and let me know.
> I know there is no binding documentation for the compatible either.
>
> Thanks to everyone reviewing and bringing ideas into the discussion.
>
> Eugen
>
> Changelog since the v1 of the RFC:
> - Reworked the whole minidump implementation based on suggestions from Thomas Gleixner.
> This means new API, macros, new way to store the regions inside kmemdump
> (ditched the IDR, moved to static allocation, have a static default backend, etc)
> - Reworked qcom_minidump driver based on review from Bjorn Andersson
> - Reworked printk log buffer registration based on review from Petr Mladek
>
> I apologize if I missed any review comments. I know there is still lots of work
> on this series and hope I will improve it more and more.
> Patches are sent on top of next-20250721
>
> Eugen Hristev (29):
> kmemdump: introduce kmemdump
> Documentation: add kmemdump
> kmemdump: add coreimage ELF layer
> Documentation: kmemdump: add section for coreimage ELF
> kmemdump: introduce qcom-minidump backend driver
> soc: qcom: smem: add minidump device
> init/version: Annotate static information into Kmemdump
> cpu: Annotate static information into Kmemdump
> genirq/irqdesc: Annotate static information into Kmemdump
> panic: Annotate static information into Kmemdump
> sched/core: Annotate static information into Kmemdump
> timers: Annotate static information into Kmemdump
> kernel/fork: Annotate static information into Kmemdump
> mm/page_alloc: Annotate static information into Kmemdump
> mm/init-mm: Annotate static information into Kmemdump
> mm/show_mem: Annotate static information into Kmemdump
> mm/swapfile: Annotate static information into Kmemdump
> mm/percpu: Annotate static information into Kmemdump
> mm/mm_init: Annotate static information into Kmemdump
> printk: Register information into Kmemdump
> kernel/configs: Register dynamic information into Kmemdump
> mm/numa: Register information into Kmemdump
> mm/sparse: Register information into Kmemdump
> kernel/vmcore_info: Register dynamic information into Kmemdump
> kmemdump: Add additional symbols to the coreimage
> init/version: Annotate init uts name separately into Kmemdump
> kallsyms: Annotate static information into Kmemdump
> mm/init-mm: Annotate additional information into Kmemdump
> kmemdump: Add Kinfo backend driver
>
> Documentation/debug/index.rst | 17 ++
> Documentation/debug/kmemdump.rst | 104 +++++++++
> MAINTAINERS | 18 ++
> drivers/Kconfig | 4 +
> drivers/Makefile | 2 +
> drivers/debug/Kconfig | 55 +++++
> drivers/debug/Makefile | 6 +
> drivers/debug/kinfo.c | 304 +++++++++++++++++++++++++
> drivers/debug/kmemdump.c | 239 +++++++++++++++++++
> drivers/debug/kmemdump_coreimage.c | 223 ++++++++++++++++++
> drivers/debug/qcom_minidump.c | 353 +++++++++++++++++++++++++++++
> drivers/soc/qcom/smem.c | 10 +
> include/asm-generic/vmlinux.lds.h | 13 ++
> include/linux/kmemdump.h | 219 ++++++++++++++++++
> init/version.c | 6 +
> kernel/configs.c | 6 +
> kernel/cpu.c | 5 +
> kernel/fork.c | 2 +
> kernel/irq/irqdesc.c | 2 +
> kernel/kallsyms.c | 10 +
> kernel/panic.c | 4 +
> kernel/printk/printk.c | 28 ++-
> kernel/sched/core.c | 2 +
> kernel/time/timer.c | 3 +-
> kernel/vmcore_info.c | 3 +
> mm/init-mm.c | 12 +
> mm/mm_init.c | 2 +
> mm/numa.c | 5 +-
> mm/page_alloc.c | 2 +
> mm/percpu.c | 3 +
> mm/show_mem.c | 2 +
> mm/sparse.c | 16 +-
> mm/swapfile.c | 2 +
> 33 files changed, 1670 insertions(+), 12 deletions(-)
> create mode 100644 Documentation/debug/index.rst
> create mode 100644 Documentation/debug/kmemdump.rst
> create mode 100644 drivers/debug/Kconfig
> create mode 100644 drivers/debug/Makefile
> create mode 100644 drivers/debug/kinfo.c
> create mode 100644 drivers/debug/kmemdump.c
> create mode 100644 drivers/debug/kmemdump_coreimage.c
> create mode 100644 drivers/debug/qcom_minidump.c
> create mode 100644 include/linux/kmemdump.h
>
> --
> 2.43.0
>
--
-Mukesh Ojha