Message-ID: <CA+FbhJN9-rbPSpHbgLmjcgV3=1QmqUVsHS70KpnL5i_DxMp4bg@mail.gmail.com>
Date: Fri, 29 Sep 2023 12:36:56 +0200
From: Marcus Seyfarth <m.seyfarth@...il.com>
To: Tor Vic <torvic9@...lbox.org>
Cc: Bagas Sanjaya <bagasdotme@...il.com>,
"Paul E. McKenney" <paulmck@...nel.org>,
Frederic Weisbecker <frederic@...nel.org>,
Neeraj Upadhyay <quic_neeraju@...cinc.com>,
Joel Fernandes <joel@...lfernandes.org>,
Josh Triplett <josh@...htriplett.org>,
Boqun Feng <boqun.feng@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Lai Jiangshan <jiangshanlai@...il.com>,
Zqiang <qiang.zhang1211@...il.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux RCU <rcu@...r.kernel.org>
Subject: Re: Fwd: [6.5.5] System slowdown during compilation workload and RIP: lazy_rcu_shrink_scan
> This seems to be a heavily patched kernel.
> Does this problem also appear with a vanilla 6.5 kernel?
Indeed, CachyOS ships with additional patches. I haven't found an easy
way to try out a vanilla kernel yet (there is
https://aur.archlinux.org/packages/linux-mainline, but that is
already on 6.6-rc3). As CachyOS also makes use of ananicy and uksmd,
I'm not sure it is a good idea for system stability to test with a
vanilla kernel that doesn't support these extra features.
I can offer a new data point, however, which looks quite different:
I've attached a new trace from today of an OOM that the system
recovered from successfully (that is, without the freezes afterwards),
with the same kernel in use. The relevant part is:
[29. Sep 12:14] Qt bearer threa invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=200
[ +0,000007] CPU: 25 PID: 1126 Comm: Qt bearer threa Tainted: G O 6.5.5-2.1-cachyos-lto #1 ae9643c86e4447bdd5b0d7da31c14411335d3e8d
[ +0,000003] Hardware name: LENOVO GAMING TF/X99-TF Gaming, BIOS CX99DE26 10/10/2020
[ +0,000001] Call Trace:
[ +0,000002] <TASK>
[ +0,000003] dump_header+0x51/0x260
[ +0,000005] oom_kill_process+0x92/0x1a0
[ +0,000003] out_of_memory+0x227/0x320
[ +0,000002] __folio_alloc+0x2e46/0x6ee0
[ +0,000005] ? blk_mq_flush_plug_list+0xaa/0xa00
[ +0,000005] __filemap_get_folio+0x1e2/0x460
[ +0,000002] filemap_fault+0x56c/0x1260
[ +0,000004] do_pte_missing+0x194/0x2da0
[ +0,000004] ? ____fput+0x550/0x2d60
[ +0,000002] ? rtnl_dump_all+0xff/0x120
[ +0,000004] ? free_unref_page+0x237/0xc20
[ +0,000003] ? __wake_up+0xe4/0x1c0
[ +0,000004] handle_mm_fault+0x976/0xe00
[ +0,000003] do_user_addr_fault+0x8ca/0x2f80
[ +0,000002] ? do_syscall_64+0x68/0x80
[ +0,000005] exc_page_fault+0x66/0x160
[ +0,000003] asm_exc_page_fault+0x22/0x30
[ +0,000005] RIP: 0033:0x7f5243289003
[ +0,000014] Code: Unable to access opcode bytes at 0x7f5243288fd9.
[ +0,000001] RSP: 002b:00007f52227fae98 EFLAGS: 00010206
[ +0,000002] RAX: 00007f5272534d40 RBX: 00007f520c012930 RCX: 0000000000000055
[ +0,000002] RDX: 0000000000000005 RSI: 00007f520c00e270 RDI: 00007f520c012950
[ +0,000001] RBP: 0000557650c0c690 R08: 00007f520c011460 R09: 00000007f520c00e
[ +0,000001] R10: 00007f520c000058 R11: 0000000000000003 R12: 00007f52227faf28
[ +0,000001] R13: 00007f520c00f4e0 R14: 0000000000000000 R15: 00007f520c00f4f8
[ +0,000001] </TASK>
[ +0,000001] Mem-Info:
[ +0,000001] active_anon:1474442 inactive_anon:9717963 isolated_anon:0
active_file:10365 inactive_file:6709 isolated_file:0
unevictable:0 dirty:0 writeback:0
slab_reclaimable:100405 slab_unreclaimable:112907
mapped:2811 shmem:1465 pagetables:58902
sec_pagetables:0 bounce:0
kernel_misc_reclaimable:0
free:57924 free_pcp:4870 free_cma:0
[ +0,000004] Node 0 active_anon:5897768kB inactive_anon:38871852kB active_file:41460kB inactive_file:26836kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:11244kB dirty:0>
[ +0,000003] DMA free:15360kB boost:0kB min:8kB low:200kB high:392kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepen>
[ +0,000003] lowmem_reserve[]: 0 1762 47954 47954
[ +0,000003] DMA32 free:185848kB boost:0kB min:1056kB low:24244kB high:47432kB reserved_highatomic:64KB active_anon:204464kB inactive_anon:1429212kB active_file:0kB inactive_file:0kB un>
[ +0,000002] lowmem_reserve[]: 0 0 46191 46191
[ +0,000002] Normal free:30488kB boost:0kB min:26964kB low:618252kB high:1209540kB reserved_highatomic:4352KB active_anon:5693304kB inactive_anon:37442640kB active_file:41652kB inactive>
[ +0,000003] lowmem_reserve[]: 0 0 0 0
[ +0,000002] DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (M) 1*2048kB (M) 3*4096kB (M) = 15360kB
[ +0,000006] DMA32: 176*4kB (UM) 99*8kB (UM) 486*16kB (UME) 174*32kB (UME) 134*64kB (UME) 65*128kB (UME) 8*256kB (UME) 5*512kB (M) 40*1024kB (ME) 11*2048kB (UM) 21*4096kB (M) = 185848kB
[ +0,000008] Normal: 308*4kB (UME) 301*8kB (UME) 344*16kB (UME) 409*32kB (UME) 35*64kB (UME) 7*128kB (UM) 3*256kB (U) 1*512kB (U) 0*1024kB 0*2048kB 0*4096kB = 26648kB
[ +0,000007] 20391 total pagecache pages
[ +0,000001] 104 pages in swap cache
[ +0,000001] Free swap = 64kB
[ +0,000000] Total swap = 12293372kB
[ +0,000001] 12542844 pages RAM
[ +0,000000] 0 pages HighMem/MovableOnly
[ +0,000001] 249387 pages reserved
[ +0,000000] 0 pages hwpoisoned
[ +0,000001] Tasks state (memory values in pages):
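For anyone cross-checking the numbers in the trace above, here is a
minimal sketch (not part of the original report) that converts the
Mem-Info page counts to kB and re-adds a per-zone free-list line. It
assumes 4 KiB pages, which is the usual x86-64 default; the helper
names pages_to_kb and buddy_total_kb are made up for illustration.

```python
import re

PAGE_SIZE_KB = 4  # assumption: 4 KiB pages (x86-64 default)


def pages_to_kb(pages: int) -> int:
    """Convert a Mem-Info page count to kB."""
    return pages * PAGE_SIZE_KB


def buddy_total_kb(line: str) -> int:
    """Sum the 'count*sizekB' terms of a per-zone free-list line.

    The trailing '= NNNkB' total has no '*', so it is not matched.
    """
    terms = re.findall(r"(\d+)\*(\d+)kB", line)
    return sum(int(count) * int(size) for count, size in terms)


# Values copied from the trace above.
active_anon_pages = 1474442
normal_zone = (
    "308*4kB (UME) 301*8kB (UME) 344*16kB (UME) 409*32kB (UME) "
    "35*64kB (UME) 7*128kB (UM) 3*256kB (U) 1*512kB (U) "
    "0*1024kB 0*2048kB 0*4096kB = 26648kB"
)

print(pages_to_kb(active_anon_pages))  # matches Node 0 active_anon in kB
print(buddy_total_kb(normal_zone))     # matches the '= ...kB' total
```

With these inputs the first value agrees with the "Node 0
active_anon:5897768kB" figure and the second with the "= 26648kB"
total printed for the Normal zone.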
> What if you disable RCU_LAZY?
I will try that out over the next few days; it is enabled by default
on CachyOS.
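To confirm whether a given kernel build has the option on before and
after rebuilding, something like the following sketch could be used.
It assumes the kernel exposes its configuration at /proc/config.gz
(which requires CONFIG_IKCONFIG_PROC) and that the Kconfig symbol is
CONFIG_RCU_LAZY; the function name kconfig_value is hypothetical.

```python
import gzip
import os
from typing import Optional

CONFIG_PATH = "/proc/config.gz"  # assumption: CONFIG_IKCONFIG_PROC is enabled


def kconfig_value(config_text: str, symbol: str) -> Optional[str]:
    """Return the value of a Kconfig symbol from .config-style text.

    Returns "n" for '# SYMBOL is not set' and None if the symbol is absent.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(symbol + "="):
            return line.split("=", 1)[1]
        if line == f"# {symbol} is not set":
            return "n"
    return None


if __name__ == "__main__":
    if os.path.exists(CONFIG_PATH):
        try:
            with gzip.open(CONFIG_PATH, "rt") as fh:
                print("CONFIG_RCU_LAZY =", kconfig_value(fh.read(), "CONFIG_RCU_LAZY"))
        except OSError as exc:
            print("could not read", CONFIG_PATH, "-", exc)
    else:
        print(CONFIG_PATH, "not available on this system")
```

A result of None simply means the symbol does not appear in the
exposed configuration, for example on a kernel that predates the
option.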
View attachment "dmesg2.txt" of type "text/plain" (111506 bytes)