Message-ID: <CAKPOu+9i69YEnSNJpeFffh6_+nONWnFRMk=SS4sBJP9-3nLD0g@mail.gmail.com>
Date: Thu, 28 Nov 2024 11:00:31 +0100
From: Max Kellermann <max.kellermann@...os.com>
To: Gao Xiang <hsiangkao@...ux.alibaba.com>
Cc: Christoph Hellwig <hch@....de>, Suren Baghdasaryan <surenb@...gle.com>, Johannes Weiner <hannes@...xchg.org>,
Peter Zijlstra <peterz@...radead.org>, linux-kernel@...r.kernel.org
Subject: Re: Bad psi_group_cpu.tasks[NR_MEMSTALL] counter
On Thu, Nov 21, 2024 at 2:18 PM Gao Xiang <hsiangkao@...ux.alibaba.com> wrote:
> Just saw this. I guess your _recent_ 6.11.9 bug is actually
> related to EROFS since EROFS uses readahead_expand(). I think
> your recent report was introduced by a recent backport fix
> commit 9e2f9d34dd12 ("erofs: handle overlapped pclusters out of crafted images properly")
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v6.11.9&id=9cfa199bcbbbba31cbf97b2786f44f4464f3f29a
>
> bio can be NULL after this patch and causes
> unbalanced psi_memstall_{enter,leave}(). It can be fixed as
> (the diff below could be damaged due to my email client):
With your patch applied, the PSI warning (from Suren's debugging patch)
fired again last night, which means there may be other instances of
this bug left.
------------[ cut here ]------------
Stall from readahead_expand+0xca/0x1d0 was never cleared
WARNING: CPU: 133 PID: 91645 at kernel/sched/psi.c:989 psi_task_switch+0x126/0x230
Modules linked in:
CPU: 133 UID: 3221274747 PID: 91645 Comm: php-cgi8.1 Tainted: G W 6.11.10-cm4all2-es+ #267
Tainted: [W]=WARN
Hardware name: Dell Inc. PowerEdge R7615/0G9DHV, BIOS 1.6.10 12/08/2023
RIP: 0010:psi_task_switch+0x126/0x230
Code: f6 75 e6 41 f6 44 24 18 80 74 36 41 f6 84 24 d0 08 00 00 02 74 2b 49 8b b4 24 d8 08 00 00 48 c7 c7 20 c8 8d a8 e8 fa 1f f9 ff <0f> 0b 41 f6 44 24 18 80 74 0d 41 f6 84 24 d0 08 00 00 02 74 02 0f
RSP: 0018:ffff96be9c28b9a8 EFLAGS: 00010086
RAX: 0000000000000000 RBX: 0000000000000085 RCX: 0000000000000027
RDX: ffff8997b995c8c8 RSI: 0000000000000001 RDI: ffff8997b995c8c0
RBP: 000000000000001c R08: 00000000ffff7fff R09: 0000000000000058
R10: 00000000ffff7fff R11: ffff899abd2a1000 R12: ffff891db3b85c00
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8997b9940000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f26d7aba480 CR3: 000000c07c61a006 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
<TASK>
? __warn+0x93/0x140
? psi_task_switch+0x126/0x230
? report_bug+0x174/0x1a0
? handle_bug+0x53/0x90
? exc_invalid_op+0x17/0x70
? asm_exc_invalid_op+0x16/0x20
? psi_task_switch+0x126/0x230
? psi_task_switch+0x126/0x230
__schedule+0x980/0x10f0
do_task_dead+0x3e/0x40
do_exit+0x6ed/0x970
do_group_exit+0x2c/0x80
__x64_sys_exit_group+0x14/0x20
x64_sys_call+0x15aa/0x17b0
do_syscall_64+0x64/0x100
? srso_alias_return_thunk+0x5/0xfbef5
? get_page_from_freelist+0x60e/0x1140
? cgroup_rstat_updated+0x88/0x210
? srso_alias_return_thunk+0x5/0xfbef5
? __mod_memcg_lruvec_state+0x91/0x140
? srso_alias_return_thunk+0x5/0xfbef5
? __lruvec_stat_mod_folio+0x80/0xd0
? srso_alias_return_thunk+0x5/0xfbef5
? folio_add_file_rmap_ptes+0x37/0xb0
? srso_alias_return_thunk+0x5/0xfbef5
? set_pte_range+0xb7/0x280
? srso_alias_return_thunk+0x5/0xfbef5
? next_uptodate_folio+0x83/0x270
? srso_alias_return_thunk+0x5/0xfbef5
? filemap_map_pages+0x4a2/0x590
? srso_alias_return_thunk+0x5/0xfbef5
? do_fault+0x291/0x4d0
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? __handle_mm_fault+0x31c/0x1060
? srso_alias_return_thunk+0x5/0xfbef5
? __count_memcg_events+0x53/0xf0
? srso_alias_return_thunk+0x5/0xfbef5
? handle_mm_fault+0xb6/0x280
? srso_alias_return_thunk+0x5/0xfbef5
? do_user_addr_fault+0x386/0x610
? srso_alias_return_thunk+0x5/0xfbef5
? exc_page_fault+0x6f/0x120
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f26dad48349
Code: Unable to access opcode bytes at 0x7f26dad4831f.
RSP: 002b:00007ffcd05a7848 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007f26dae429e0 RCX: 00007f26dad48349
RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
RBP: 0000000000000000 R08: fffffffffffffd48 R09: 000055c238e82190
R10: 00007f26d8a781a8 R11: 0000000000000246 R12: 00007f26dae429e0
R13: 00007f26dae482e0 R14: 000000000000001e R15: 00007f26dae482c8
</TASK>
---[ end trace 0000000000000000 ]---