Message-ID: <CAKPOu+9i69YEnSNJpeFffh6_+nONWnFRMk=SS4sBJP9-3nLD0g@mail.gmail.com>
Date: Thu, 28 Nov 2024 11:00:31 +0100
From: Max Kellermann <max.kellermann@...os.com>
To: Gao Xiang <hsiangkao@...ux.alibaba.com>
Cc: Christoph Hellwig <hch@....de>, Suren Baghdasaryan <surenb@...gle.com>, Johannes Weiner <hannes@...xchg.org>, 
	Peter Zijlstra <peterz@...radead.org>, linux-kernel@...r.kernel.org
Subject: Re: Bad psi_group_cpu.tasks[NR_MEMSTALL] counter

On Thu, Nov 21, 2024 at 2:18 PM Gao Xiang <hsiangkao@...ux.alibaba.com> wrote:
> Just saw this. I guess your _recent_ 6.11.9 bug is actually
> related to EROFS since EROFS uses readahead_expand().  I think
> your recent report was introduced by a recent backport fix
> commit 9e2f9d34dd12 ("erofs: handle overlapped pclusters out of crafted images properly")
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v6.11.9&id=9cfa199bcbbbba31cbf97b2786f44f4464f3f29a
>
> bio can be NULL after this patch and causes
> unbalanced psi_memstall_{enter,leave}().  It can be fixed as
> (the diff below could be damaged due to my email client):

With your patch applied, the PSI warning (from Suren's debugging patch)
fired again last night, which means there may be other instances of
this bug left.

 ------------[ cut here ]------------
 Stall from readahead_expand+0xca/0x1d0 was never cleared
 WARNING: CPU: 133 PID: 91645 at kernel/sched/psi.c:989 psi_task_switch+0x126/0x230
 Modules linked in:
 CPU: 133 UID: 3221274747 PID: 91645 Comm: php-cgi8.1 Tainted: G W          6.11.10-cm4all2-es+ #267
 Tainted: [W]=WARN
 Hardware name: Dell Inc. PowerEdge R7615/0G9DHV, BIOS 1.6.10 12/08/2023
 RIP: 0010:psi_task_switch+0x126/0x230
 Code: f6 75 e6 41 f6 44 24 18 80 74 36 41 f6 84 24 d0 08 00 00 02 74 2b 49 8b b4 24 d8 08 00 00 48 c7 c7 20 c8 8d a8 e8 fa 1f f9 ff <0f> 0b 41 f6 44 24 18 80 74 0d 41 f6 84 24 d0 08 00 00 02 74 02 0f
 RSP: 0018:ffff96be9c28b9a8 EFLAGS: 00010086
 RAX: 0000000000000000 RBX: 0000000000000085 RCX: 0000000000000027
 RDX: ffff8997b995c8c8 RSI: 0000000000000001 RDI: ffff8997b995c8c0
 RBP: 000000000000001c R08: 00000000ffff7fff R09: 0000000000000058
 R10: 00000000ffff7fff R11: ffff899abd2a1000 R12: ffff891db3b85c00
 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffff8997b9940000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f26d7aba480 CR3: 000000c07c61a006 CR4: 0000000000770ef0
 PKRU: 55555554
 Call Trace:
  <TASK>
  ? __warn+0x93/0x140
  ? psi_task_switch+0x126/0x230
  ? report_bug+0x174/0x1a0
  ? handle_bug+0x53/0x90
  ? exc_invalid_op+0x17/0x70
  ? asm_exc_invalid_op+0x16/0x20
  ? psi_task_switch+0x126/0x230
  ? psi_task_switch+0x126/0x230
  __schedule+0x980/0x10f0
  do_task_dead+0x3e/0x40
  do_exit+0x6ed/0x970
  do_group_exit+0x2c/0x80
  __x64_sys_exit_group+0x14/0x20
  x64_sys_call+0x15aa/0x17b0
  do_syscall_64+0x64/0x100
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? get_page_from_freelist+0x60e/0x1140
  ? cgroup_rstat_updated+0x88/0x210
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? __mod_memcg_lruvec_state+0x91/0x140
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? __lruvec_stat_mod_folio+0x80/0xd0
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? folio_add_file_rmap_ptes+0x37/0xb0
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? set_pte_range+0xb7/0x280
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? next_uptodate_folio+0x83/0x270
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? filemap_map_pages+0x4a2/0x590
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? do_fault+0x291/0x4d0
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? __handle_mm_fault+0x31c/0x1060
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? __count_memcg_events+0x53/0xf0
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? handle_mm_fault+0xb6/0x280
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? do_user_addr_fault+0x386/0x610
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? exc_page_fault+0x6f/0x120
  entry_SYSCALL_64_after_hwframe+0x76/0x7e
 RIP: 0033:0x7f26dad48349
 Code: Unable to access opcode bytes at 0x7f26dad4831f.
 RSP: 002b:00007ffcd05a7848 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
 RAX: ffffffffffffffda RBX: 00007f26dae429e0 RCX: 00007f26dad48349
 RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
 RBP: 0000000000000000 R08: fffffffffffffd48 R09: 000055c238e82190
 R10: 00007f26d8a781a8 R11: 0000000000000246 R12: 00007f26dae429e0
 R13: 00007f26dae482e0 R14: 000000000000001e R15: 00007f26dae482c8
  </TASK>
 ---[ end trace 0000000000000000 ]---
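
For anyone following along, here is a rough userspace sketch of the
class of bug being discussed: a memstall "enter" whose matching "leave"
is skipped when the bio pointer is NULL, so the task still carries the
stall flag when it exits. This is only a toy model, not the EROFS or
readahead code. All toy_* names are made up for illustration, and the
real psi_memstall_enter()/psi_memstall_leave() operate on the current
task rather than on an explicit argument.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the per-task memstall state (hypothetical, not kernel code). */
struct toy_task {
	bool in_memstall;
};

/* Toy stand-in for a bio; only its NULL-ness matters here. */
struct toy_bio {
	int unused;
};

/* Set the stall flag and remember whether this call was the one that set it. */
static void toy_memstall_enter(struct toy_task *t, bool *entered)
{
	*entered = !t->in_memstall;
	t->in_memstall = true;
}

/* Clear the stall flag only if the paired enter call had set it. */
static void toy_memstall_leave(struct toy_task *t, bool *entered)
{
	if (*entered)
		t->in_memstall = false;
	*entered = false;
}

/* Unbalanced variant: the leave is only reached when a bio exists. */
static void toy_submit_buggy(struct toy_task *t, struct toy_bio *bio)
{
	bool entered = false;

	toy_memstall_enter(t, &entered);
	if (bio) {
		/* the I/O would be submitted here */
		toy_memstall_leave(t, &entered);
	}
}

/* Balanced variant: the leave runs on every path, bio or not. */
static void toy_submit_balanced(struct toy_task *t, struct toy_bio *bio)
{
	bool entered = false;

	toy_memstall_enter(t, &entered);
	if (bio) {
		/* the I/O would be submitted here */
	}
	toy_memstall_leave(t, &entered);
}

/* Rough analogue of a "stall was never cleared" check at task exit. */
static bool toy_stall_leaked_at_exit(const struct toy_task *t)
{
	return t->in_memstall;
}
```

In this model, calling toy_submit_buggy() with a NULL bio leaves
in_memstall set, which is the condition the debugging warning above
would flag at exit; toy_submit_balanced() does not. Whether the
remaining real-world instance has this exact shape is not established
by the trace.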
