Message-ID: <202410231429.b91daa36-oliver.sang@intel.com>
Date: Wed, 23 Oct 2024 14:54:39 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Qi Zheng <zhengqi.arch@...edance.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
<david@...hat.com>, <hughd@...gle.com>, <willy@...radead.org>,
<mgorman@...e.de>, <muchun.song@...ux.dev>, <vbabka@...nel.org>,
<akpm@...ux-foundation.org>, <zokeefe@...gle.com>, <rientjes@...gle.com>,
<jannh@...gle.com>, <peterx@...hat.com>, <linux-mm@...ck.org>,
<x86@...nel.org>, Qi Zheng <zhengqi.arch@...edance.com>,
<oliver.sang@...el.com>
Subject: Re: [PATCH v1 7/7] x86: select ARCH_SUPPORTS_PT_RECLAIM if X86_64
Hello,
With this commit, the following two configs are enabled:
--- /pkg/linux/x86_64-rhel-8.3/gcc-12/c9f9931196ccc64ec25268538edc327c3add08de/.config 2024-10-20 21:40:11.559320920 +0800
+++ /pkg/linux/x86_64-rhel-8.3/gcc-12/2e22ca3c1f2a6d64740f7b875d869d1f80f78ce8/.config 2024-10-20 06:02:46.008212911 +0800
@@ -1207,6 +1207,8 @@ CONFIG_IOMMU_MM_DATA=y
CONFIG_EXECMEM=y
CONFIG_NUMA_MEMBLKS=y
CONFIG_NUMA_EMU=y
+CONFIG_ARCH_SUPPORTS_PT_RECLAIM=y
+CONFIG_PT_RECLAIM=y
We then noticed various issues that we do not observe on the parent commit.
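To check whether a particular build has both options set, the two `.config` files can be compared programmatically. The following is a minimal sketch, not part of the LKP tooling; the helper name `enabled_options` is hypothetical, and it assumes the usual `CONFIG_NAME=value` line format with `#` comment lines.

```python
def enabled_options(config_text: str) -> dict[str, str]:
    """Return a mapping of CONFIG_* names to values from .config text.

    Comment lines such as '# CONFIG_FOO is not set' are skipped, so an
    option that is absent from the result is treated as disabled.
    """
    opts = {}
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.startswith("CONFIG_"):
            opts[key] = value
    return opts


def newly_enabled(parent_text: str, commit_text: str) -> list[str]:
    """List options present in the commit's config but not in the parent's."""
    parent = enabled_options(parent_text)
    commit = enabled_options(commit_text)
    return sorted(name for name in commit if name not in parent)
```

Feeding it the two `.config` files from the diff above should list `CONFIG_ARCH_SUPPORTS_PT_RECLAIM` and `CONFIG_PT_RECLAIM` if those are the only additions.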
kernel test robot noticed "BUG:Bad_rss-counter_state_mm:#type:MM_FILEPAGES_val" on:
commit: 2e22ca3c1f2a6d64740f7b875d869d1f80f78ce8 ("[PATCH v1 7/7] x86: select ARCH_SUPPORTS_PT_RECLAIM if X86_64")
url: https://github.com/intel-lab-lkp/linux/commits/Qi-Zheng/mm-khugepaged-retract_page_tables-use-pte_offset_map_lock/20241017-174953
patch link: https://lore.kernel.org/all/0f6e7fb7fb21431710f28df60738f8be98fe9dd9.1729157502.git.zhengqi.arch@bytedance.com/
patch subject: [PATCH v1 7/7] x86: select ARCH_SUPPORTS_PT_RECLAIM if X86_64
in testcase: boot
config: x86_64-rhel-8.3
compiler: gcc-12
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
(please refer to attached dmesg/kmsg for entire log/backtrace)
+-----------------------------------------------------------+------------+------------+
|                                                           | c9f9931196 | 2e22ca3c1f |
+-----------------------------------------------------------+------------+------------+
| boot_failures                                             | 0          | 6          |
| BUG:Bad_rss-counter_state_mm:#type:MM_FILEPAGES_val       | 0          | 5          |
| BUG:Bad_rss-counter_state_mm:#type:MM_ANONPAGES_val       | 0          | 6          |
| BUG:Bad_page_cache_in_process                             | 0          | 3          |
| segfault_at_ip_sp_error                                   | 0          | 5          |
| Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode= | 0          | 4          |
+-----------------------------------------------------------+------------+------------+
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202410231429.b91daa36-oliver.sang@intel.com
[ 9.153217][ T1] BUG: Bad rss-counter state mm:000000006dcf9cdd type:MM_FILEPAGES val:40
[ 9.153929][ T1] BUG: Bad rss-counter state mm:000000006dcf9cdd type:MM_ANONPAGES val:1
...
[ 9.444419][ T214] systemd[214]: segfault at 0 ip 0000000000000000 sp 00000000f6b1c2ec error 14 likely on CPU 1 (core 1, socket 0)
[ 9.445388][ T214] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
Code starting with the faulting instruction
===========================================
[ OK ] Started LKP bootstrap.
[ 9.453331][ T1] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 9.454023][ T1] CPU: 1 UID: 0 PID: 1 Comm: systemd Not tainted 6.12.0-rc3-next-20241016-00007-g2e22ca3c1f2a #1
[ 9.454818][ T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 9.455601][ T1] Call Trace:
[ 9.455906][ T1] <TASK>
[ 9.456184][ T1] panic (kernel/panic.c:354)
[ 9.456522][ T1] do_exit (include/linux/audit.h:327 kernel/exit.c:920)
[ 9.456884][ T1] do_group_exit (kernel/exit.c:1069)
[ 9.457252][ T1] get_signal (kernel/signal.c:2917)
[ 9.457615][ T1] arch_do_signal_or_restart (arch/x86/kernel/signal.c:337)
[ 9.458053][ T1] syscall_exit_to_user_mode (kernel/entry/common.c:113 include/linux/entry-common.h:328 kernel/entry/common.c:207 kernel/entry/common.c:218)
[ 9.458495][ T1] __do_fast_syscall_32 (arch/x86/entry/common.c:391)
[ 9.458904][ T1] do_fast_syscall_32 (arch/x86/entry/common.c:411)
[ 9.459299][ T1] entry_SYSENTER_compat_after_hwframe (arch/x86/entry/entry_64_compat.S:127)
[ 9.459788][ T1] RIP: 0023:0xf7fbf589
[ 9.460130][ T1] Code: 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
All code
========
0: 03 74 d8 01 add 0x1(%rax,%rbx,8),%esi
...
20: 00 51 52 add %dl,0x52(%rcx)
23: 55 push %rbp
24:* 89 e5 mov %esp,%ebp <-- trapping instruction
26: 0f 34 sysenter
28: cd 80 int $0x80
2a: 5d pop %rbp
2b: 5a pop %rdx
2c: 59 pop %rcx
2d: c3 retq
2e: 90 nop
2f: 90 nop
30: 90 nop
31: 90 nop
32: 8d b4 26 00 00 00 00 lea 0x0(%rsi,%riz,1),%esi
39: 8d b4 26 00 00 00 00 lea 0x0(%rsi,%riz,1),%esi
Code starting with the faulting instruction
===========================================
0: 5d pop %rbp
1: 5a pop %rdx
2: 59 pop %rcx
3: c3 retq
4: 90 nop
5: 90 nop
6: 90 nop
7: 90 nop
8: 8d b4 26 00 00 00 00 lea 0x0(%rsi,%riz,1),%esi
f: 8d b4 26 00 00 00 00 lea 0x0(%rsi,%riz,1),%esi
[ 9.461528][ T1] RSP: 002b:00000000ff837680 EFLAGS: 00200206 ORIG_RAX: 0000000000000006
[ 9.462212][ T1] RAX: 0000000000000000 RBX: 0000000000000038 RCX: 0000000000000002
[ 9.462834][ T1] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000f731e6cc
[ 9.463462][ T1] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 9.464083][ T1] R10: 0000000000000000 R11: 0000000000200206 R12: 0000000000000000
[ 9.464714][ T1] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 9.465343][ T1] </TASK>
[ 9.465677][ T1] Kernel Offset: 0x4600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
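The numeric fields in the log above can be decoded mechanically. The sketch below is illustrative only and not part of the report; the function names are hypothetical. It assumes that the `error` field of the segfault line is printed in hexadecimal (so `error 14` is 0x14), that the bit names follow the kernel's X86_PF_* page-fault flags, and that the panic `exitcode` follows the usual wait-status layout with the terminating signal in the low 7 bits.

```python
import re
import signal

# x86 page-fault error code bits (names follow the kernel's X86_PF_* flags).
PF_BITS = {
    0: "PROT",   # set: protection violation; clear: page not present
    1: "WRITE",  # set: write access; clear: read access
    2: "USER",   # set: fault raised while in user mode
    3: "RSVD",   # set: reserved bit found in a page-table entry
    4: "INSTR",  # set: fault on an instruction fetch
}

# Matches lines such as:
#   BUG: Bad rss-counter state mm:000000006dcf9cdd type:MM_FILEPAGES val:40
RSS_RE = re.compile(r"Bad rss-counter state mm:(\w+) type:(\w+) val:(-?\d+)")


def decode_pf_error(code: int) -> list[str]:
    """Return the names of the page-fault flags set in `code`."""
    return [name for bit, name in PF_BITS.items() if code & (1 << bit)]


def decode_exitcode(code: int) -> str:
    """Name the terminating signal, or report the plain exit status."""
    signo = code & 0x7F
    if signo:
        return signal.Signals(signo).name
    return f"exit status {code >> 8}"


def parse_rss_lines(log: str) -> list[tuple[str, str, int]]:
    """Extract (mm, counter type, leftover value) from rss-counter BUG lines."""
    return [(mm, typ, int(val)) for mm, typ, val in RSS_RE.findall(log)]
```

Under those assumptions, `decode_pf_error(0x14)` gives `['USER', 'INSTR']`, which is consistent with a user-mode instruction fetch from a non-present page at `ip 0000000000000000`, and `decode_exitcode(0x0000000b)` gives `'SIGSEGV'`, which would match init being killed by the segfault reported just before the panic.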
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241023/202410231429.b91daa36-oliver.sang@intel.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki