Message-ID:
<MRWPR09MB8022AE6B2BED3373D55E31DB8F56A@MRWPR09MB8022.eurprd09.prod.outlook.com>
Date: Wed, 16 Jul 2025 11:05:47 +0000
From: Pnina Feder <PNINA.FEDER@...ileye.com>
To: Alexandre Ghiti <alex@...ti.fr>, Björn Töpel
<bjorn@...nel.org>
CC: Gregory Greenman <Gregory.Greenman@...ileye.com>, "bjorn@...osinc.com"
<bjorn@...osinc.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "linux-riscv@...ts.infradead.org"
<linux-riscv@...ts.infradead.org>, "mick@....forth.gr" <mick@....forth.gr>,
"palmer@...belt.com" <palmer@...belt.com>, "paul.walmsley@...ive.com"
<paul.walmsley@...ive.com>, Vladimir Kondratiev
<Vladimir.Kondratiev@...ileye.com>
Subject: RE: [PATCH 0/1] Fix for riscv vmcore issue
Hi Alex,
Thank you for your response!
>Hi Pnina,
>
>On 7/14/25 14:00, Pnina Feder wrote:
>>>>> Hi Pnina,
>>>>>> Pnina!
>>>>>>
>>>>>> Pnina Feder <pnina.feder@...ileye.com> writes:
>>>>>>
>>>>>>> We are creating a vmcore using kexec on a Linux 6.15 RISC-V
>>>>>>> system and analyzing it with the crash tool on the host. This
>>>>>>> workflow used to work on Linux 6.14 but is now broken in 6.15.
>>>>>> Thanks for reporting this!
>>>>>>
>>>>>>> The issue is caused by a change in the kernel:
>>>>>>> In Linux 6.15, certain memblock sections are now marked as
>>>>>>> Reserved in /proc/iomem. The kexec tool excludes all Reserved
>>>>>>> regions when generating the vmcore, so these sections are missing from the dump.
>>>>>> How are you collecting the /proc/vmcore file? A full set of commands would be helpful.
>>>>> We’ve defined in our system that when a process crashes, we call panic().
>>>>> To handle crash recovery, we're using kexec with the following command:
>>>>> kexec -p /Image --initrd=/rootfs.cpio --append "console=${con} earlycon=${earlycon} no4lvl"
>>>>>
>>>>> To simulate a crash, we trigger it using:
>>>>> sleep 100 & kill -6 $!
>>>>>
>>>>> This boots into the crash kernel (kdump), where we then copy the /proc/vmcore file back to the host for analysis.
>>>>>
>>>>>>> However, the kernel still uses addresses in these regions—for
>>>>>>> example, for IRQ pointers. Since the crash tool needs access to
>>>>>>> these memory areas to function correctly, their exclusion breaks the analysis.
>>>>>> Wdym with "IRQ pointers"? Also, what version (sha1) of crash are you using?
>>>>>>
>>>>> We are currently using crash-utility version 9.0.0 (master).
>>>>> From the crash analysis logs, we observed errors like:
>>>>>
>>>>> "......
>>>>> IRQ stack pointer[0] is ffffffd6fbdcc068
>>>>> crash: read error: kernel virtual address: ffffffd6fbdcc068 type: "IRQ stack pointer"
>>>>> .....
>>>>>
>>>>> <read_kdump: addr: ffffffff80edf1cc paddr: 8010df1cc cnt: 4>
>>>>> <readmem: ffffffd6fbdd6880, KVADDR, "runqueues entry (per_cpu)", 3456, (FOE), 55acf03963e0>
>>>>> <read_kdump: addr: ffffffd6fbdd6880 paddr: 8fbdd6880 cnt: 1920>
>>>>> crash: read error: kernel virtual address: ffffffd6fbdd6880 type: "runqueues entry (per_cpu)"
>>>>
>>>> I can't reproduce this issue on qemu, booting with sv39. I'm using the latest kexec-tools (which recently merged riscv support), crash 9.0.0 and kernel 6.16.0-rc4. Note that I'm using crash in qemu.
>>>>
>>>> Are you able to reproduce this on qemu too?
>>> Yes, I am using qemu too on main and crash kernel, with latest
>>> kexec-tools, crash 9.0.0 and kernel 6.15
>>>
>>>
>>>> Maybe that's related to the config, can you share your config?
>>> this is my dev_config
>>>
>>> CONFIG_SYSVIPC=y
>>> CONFIG_POSIX_MQUEUE=y
>>> CONFIG_AUDIT=y
>>> CONFIG_NO_HZ_IDLE=y
>>> CONFIG_HIGH_RES_TIMERS=y
>>> CONFIG_BPF_SYSCALL=y
>>> CONFIG_PREEMPT_RT=y
>>> CONFIG_TASKSTATS=y
>>> CONFIG_TASK_DELAY_ACCT=y
>>> CONFIG_PSI=y
>>> CONFIG_IKCONFIG=y
>>> CONFIG_IKCONFIG_PROC=y
>>> CONFIG_CGROUPS=y
>>> CONFIG_MEMCG=y
>>> CONFIG_CGROUP_SCHED=y
>>> CONFIG_CFS_BANDWIDTH=y
>>> CONFIG_RT_GROUP_SCHED=y
>>> CONFIG_CGROUP_PIDS=y
>>> CONFIG_CGROUP_FREEZER=y
>>> CONFIG_CGROUP_HUGETLB=y
>>> CONFIG_CPUSETS=y
>>> CONFIG_CGROUP_DEVICE=y
>>> CONFIG_CGROUP_CPUACCT=y
>>> CONFIG_CGROUP_PERF=y
>>> CONFIG_CGROUP_BPF=y
>>> CONFIG_NAMESPACES=y
>>> CONFIG_USER_NS=y
>>> CONFIG_CHECKPOINT_RESTORE=y
>>> CONFIG_BLK_DEV_INITRD=y
>>> CONFIG_EXPERT=y
>>> CONFIG_PROFILING=y
>>> CONFIG_KEXEC=y
>>> CONFIG_ARCH_VIRT=y
>>> CONFIG_NONPORTABLE=y
>>> CONFIG_SMP=y
>>> CONFIG_NR_CPUS=32
>>> CONFIG_HZ_1000=y
>>> CONFIG_CPU_IDLE=y
>>> CONFIG_MODULES=y
>>> CONFIG_MODULE_UNLOAD=y
>>> CONFIG_IOSCHED_BFQ=y
>>> CONFIG_PAGE_REPORTING=y
>>> CONFIG_PERCPU_STATS=y
>>> CONFIG_NET=y
>>> CONFIG_PACKET=y
>>> CONFIG_UNIX=y
>>> CONFIG_XFRM_USER=m
>>> CONFIG_INET=y
>>> CONFIG_IP_MULTICAST=y
>>> CONFIG_IP_ADVANCED_ROUTER=y
>>> CONFIG_INET_ESP=m
>>> CONFIG_NETWORK_SECMARK=y
>>> CONFIG_NETFILTER=y
>>> CONFIG_IP_NF_IPTABLES=y
>>> CONFIG_IP_NF_FILTER=y
>>> CONFIG_BRIDGE=m
>>> CONFIG_BRIDGE_VLAN_FILTERING=y
>>> CONFIG_VLAN_8021Q=m
>>> CONFIG_NET_SCHED=y
>>> CONFIG_NET_CLS_CGROUP=m
>>> CONFIG_NETLINK_DIAG=y
>>> CONFIG_NET_L3_MASTER_DEV=y
>>> CONFIG_CGROUP_NET_PRIO=y
>>> CONFIG_FAILOVER=y
>>> CONFIG_DEVTMPFS=y
>>> CONFIG_DEVTMPFS_MOUNT=y
>>> CONFIG_MTD=y
>>> CONFIG_MTD_BLOCK=y
>>> CONFIG_MTD_CFI=y
>>> CONFIG_MTD_CFI_INTELEXT=y
>>> CONFIG_MTD_PHYSMAP=y
>>> CONFIG_MTD_PHYSMAP_OF=y
>>> CONFIG_BLK_DEV_LOOP=y
>>> CONFIG_BLK_DEV_LOOP_MIN_COUNT=0
>>> CONFIG_VIRTIO_BLK=y
>>> CONFIG_SCSI=y
>>> CONFIG_BLK_DEV_SD=y
>>> CONFIG_SCSI_VIRTIO=y
>>> CONFIG_MD=y
>>> CONFIG_BLK_DEV_DM=y
>>> CONFIG_NETDEVICES=y
>>> CONFIG_MACB=y
>>> CONFIG_PCS_XPCS=m
>>> CONFIG_SERIO_LIBPS2=y
>>> CONFIG_VT_HW_CONSOLE_BINDING=y
>>> CONFIG_LEGACY_PTY_COUNT=16
>>> CONFIG_SERIAL_8250=y
>>> CONFIG_SERIAL_8250_CONSOLE=y
>>> CONFIG_SERIAL_OF_PLATFORM=y
>>> CONFIG_SERIAL_EARLYCON_RISCV_SBI=y
>>> CONFIG_VIRTIO_CONSOLE=y
>>> CONFIG_HW_RANDOM=y
>>> CONFIG_HW_RANDOM_VIRTIO=y
>>> CONFIG_I2C=y
>>> CONFIG_I2C_DESIGNWARE_CORE=y
>>> CONFIG_SPI=y
>>> CONFIG_PINCTRL=y
>>> CONFIG_PINCTRL_SINGLE=y
>>> CONFIG_GPIOLIB=y
>>> CONFIG_GPIO_SYSFS=y
>>> CONFIG_GPIO_DWAPB=y
>>> CONFIG_GPIO_SIFIVE=y
>>> CONFIG_POWER_SUPPLY=y
>>> CONFIG_WATCHDOG=y
>>> CONFIG_WATCHDOG_CORE=y
>>> CONFIG_REGULATOR=y
>>> CONFIG_REGULATOR_FIXED_VOLTAGE=y
>>> CONFIG_BACKLIGHT_CLASS_DEVICE=m
>>> CONFIG_SCSI_UFSHCD=y
>>> CONFIG_SCSI_UFSHCD_PLATFORM=y
>>> CONFIG_SCSI_UFS_DWC_TC_PLATFORM=y
>>> CONFIG_RTC_CLASS=y
>>> CONFIG_RTC_DRV_M41T80=y
>>> CONFIG_DMADEVICES=y
>>> CONFIG_SYNC_FILE=y
>>> CONFIG_COMMON_CLK_EYEQ=y
>>> CONFIG_RPMSG_CHAR=y
>>> CONFIG_RPMSG_CTRL=y
>>> CONFIG_RPMSG_VIRTIO=y
>>> CONFIG_RESET_CONTROLLER=y
>>> CONFIG_RESET_SIMPLE=y
>>> CONFIG_GENERIC_PHY=y
>>> CONFIG_EXT4_FS=y
>>> CONFIG_EXT4_FS_POSIX_ACL=y
>>> CONFIG_EXT4_FS_SECURITY=y
>>> CONFIG_MSDOS_FS=y
>>> CONFIG_VFAT_FS=y
>>> CONFIG_TMPFS=y
>>> CONFIG_TMPFS_POSIX_ACL=y
>>> CONFIG_HUGETLBFS=y
>>> CONFIG_KEYS=y
>>> CONFIG_SECURITY=y
>>> CONFIG_SECURITYFS=y
>>> CONFIG_SECURITY_NETWORK=y
>>> CONFIG_SECURITY_PATH=y
>>> CONFIG_CRYPTO_RSA=y
>>> CONFIG_CRYPTO_ECB=y
>>> CONFIG_CRYPTO_BLAKE2B=m
>>> CONFIG_CRYPTO_XXHASH=m
>>> CONFIG_CRYPTO_USER_API_HASH=y
>>> CONFIG_CRC_CCITT=m
>>> CONFIG_CRC_ITU_T=y
>>> CONFIG_CRC7=y
>>> CONFIG_LIBCRC32C=m
>>> CONFIG_PRINTK_TIME=y
>>> CONFIG_DYNAMIC_DEBUG=y
>>> CONFIG_DEBUG_INFO_DWARF5=y
>>> CONFIG_DEBUG_FS=y
>>> CONFIG_DEBUG_PAGEALLOC=y
>>> CONFIG_PTDUMP_DEBUGFS=y
>>> CONFIG_SCHED_STACK_END_CHECK=y
>>> CONFIG_DEBUG_VM=y
>>> CONFIG_DEBUG_VM_PGFLAGS=y
>>> CONFIG_DEBUG_MEMORY_INIT=y
>>> CONFIG_DEBUG_PER_CPU_MAPS=y
>>> CONFIG_SOFTLOCKUP_DETECTOR=y
>>> CONFIG_WQ_WATCHDOG=y
>>> CONFIG_DEBUG_RT_MUTEXES=y
>>> CONFIG_DEBUG_SPINLOCK=y
>>> CONFIG_DEBUG_ATOMIC_SLEEP=y
>>> CONFIG_DEBUG_LIST=y
>>> CONFIG_DEBUG_PLIST=y
>>> CONFIG_DEBUG_SG=y
>>> CONFIG_RCU_EQS_DEBUG=y
>>> CONFIG_MEMTEST=y
>>>
>>>>> These failures occur consistently for addresses in the 0xffffffd000000000 region.
>>>>
>>>> FYI, this region is the direct mapping (see Documentation/arch/riscv/vm-layout.rst).
>>>>
>>>> Thanks,
>>>>
>>>> Alex
>>>>
>> Hi Alex!
>>
>> Is there anything I can try, or any way I can help move this issue forward?
>> Could you share your config so that I can try it on my system?
>> Is there any more information I can share?
>
>
>So I'm able to reproduce your issue with your config, it only happens with kexec_load(), not kexec_file_load().
>
>Your patch does not fix the problem for me, makedumpfile still fails. I spent quite some time looking for the code that parses the memory regions and exposes them as PT_LOAD segments in vmcore, but I did not find it, do you know where that happens for kexec_load()?
>
>Thanks,
>
>Alex
>
>
The code that parses the memory regions is in kexec-tools; for RISC-V, the relevant function is to_be_excluded() in kexec/arch/riscv/kexec-riscv.c.
To fix the missing memory regions in the vmcore, kexec-tools must not exclude the Reserved-memblock sections when it generates the ELF headers for kexec_load().
I’ve written a patch that does this, and I plan to submit it upstream to kexec-tools once the approach is confirmed and approved.
Kexec-tools patch:
---
diff --git a/kexec/arch/riscv/kexec-riscv.c b/kexec/arch/riscv/kexec-riscv.c
index f34b468..1a93b51 100644
--- a/kexec/arch/riscv/kexec-riscv.c
+++ b/kexec/arch/riscv/kexec-riscv.c
@@ -421,8 +421,11 @@ static bool to_be_excluded(char *str, unsigned long long start, unsigned long lo
 	    !strncmp(str, KERNEL_CODE, strlen(KERNEL_CODE)) ||
 	    !strncmp(str, KERNEL_DATA, strlen(KERNEL_DATA)))
 		return false;
-	else
-		return true;
+
+	if (!strncmp(str, "Reserved-memblock", strlen("Reserved-memblock")))
+		return false;
+
+	return true;
 }
 
 int get_memory_ranges(struct memory_range **range, int *num_ranges,
---
With this patch, kexec-tools no longer excludes the Reserved-memblock regions, so the crash tool can access the memory areas it needs for analysis.
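For illustration only, here is a minimal standalone sketch of the name check that the patch changes. It is not the kexec-tools code: the helper names (has_prefix, sketch_to_be_excluded) are hypothetical, the label strings are assumed values (the real KERNEL_CODE/KERNEL_DATA definitions in kexec-tools may differ), and the earlier branches of to_be_excluded() that the diff context does not show are left out.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed label strings; the real kexec-tools definitions may differ. */
#define KERNEL_CODE       "Kernel code"
#define KERNEL_DATA       "Kernel data"
#define RESERVED_MEMBLOCK "Reserved-memblock"

/* Hypothetical helper: true if `name` begins with `prefix`. */
static bool has_prefix(const char *name, const char *prefix)
{
	return strncmp(name, prefix, strlen(prefix)) == 0;
}

/*
 * Simplified stand-in for the patched check: a region is kept in the
 * dump (returns false) if its /proc/iomem label starts with one of the
 * known prefixes, and excluded (returns true) otherwise.
 */
static bool sketch_to_be_excluded(const char *name)
{
	if (has_prefix(name, KERNEL_CODE) || has_prefix(name, KERNEL_DATA))
		return false;

	/* New in the patch: keep memblock-reserved regions as well. */
	if (has_prefix(name, RESERVED_MEMBLOCK))
		return false;

	return true;
}
```

Because the comparison is a prefix match against the full "Reserved-memblock" string, a region labelled plain "Reserved" is still excluded in this sketch; only the memblock-reserved regions are added to the dump.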
Thanks,
Pnina
>>
>> Thanks a lot,
>> Pnina
>>
>>>>> Upon inspection, we confirmed that the physical addresses corresponding to those virtual addresses are not present in the vmcore, as they fall under Reserved memory sections.
>>>>> We tested a patch to kexec-tools that prevents exclusion of the Reserved-memblock section from the vmcore. With this patch, the issue no longer occurs, and crash analysis succeeds.
>>>>> Note: I suspect the same issue exists on ARM64, as both the signal.c and kexec-tools implementations are similar.
>>>>>
>>>>>> Thanks!
>>>>>> Björn