linux-kernel - Re: Oops in netfs_rreq_unlock_folios

Message-ID: <CAKPOu+8iNpKkduNqOg4kfbnOBren58xx5hQ78DAs5FjD+FysHA@mail.gmail.com>
Date: Tue, 12 Nov 2024 08:39:05 +0100
From: Max Kellermann <max.kellermann@...os.com>
To: David Howells <dhowells@...hat.com>, netfs@...ts.linux.dev, linux-nfs@...r.kernel.org, 
	linux-kernel@...r.kernel.org, Jeff Layton <jlayton@...nel.org>
Subject: Re: Oops in netfs_rreq_unlock_folios_pgpriv2

David,

It has been two weeks since my crash bug report. Our servers are still
crashing all the time. Instead of going back to 6.6, I have enabled
`panic_on_oops` so the servers reboot automatically, hoping that you
would come up with a fix for the netfs regression soon, but I cannot
keep this up for much longer. Please help!

(Are we really the only ones experiencing this bug?)

Max

On Wed, Oct 30, 2024 at 11:23 AM Max Kellermann
<max.kellermann@...os.com> wrote:
>
> David,
>
> Meanwhile, our servers crash many times a day due to this bug.
>
> The code was added by your commit 7b589a9b45a ("netfs: Fix handling of
> USE_PGPRIV2 and WRITE_TO_CACHE flags") after I found Ceph-related bugs
> in your commit 2ff1e97587f4 ("netfs: Replace PG_fscache by setting
> folio->private and marking dirty"). Since that fix, the Ceph problems
> have been gone, but yesterday, out of the blue, the NFS-using server
> started crashing (after running stably on 6.11.5 for 5 days).
>
> We can't go back to 6.10 (EOL); the newest kernel prior to your
> refactoring was 6.9 which has been EOL since July, leaving a downgrade
> to 6.6 LTS as the only remaining option.
>
> Max
>
>
>
> On Tue, Oct 29, 2024 at 9:02 AM Max Kellermann <max.kellermann@...os.com> wrote:
> >
> > Hi David,
> >
> > maybe this crash is related to your recent netfs refactoring work; it is on
> > a server with heavy NFS traffic (with fscache enabled). The kernel is
> > 6.11.5 plus a dozen patches that are not relevant for NFS/netfs/fscache.
> >
> >  BUG: unable to handle page fault for address: 0000025882015121
> >  #PF: supervisor read access in kernel mode
> >  #PF: error_code(0x0000) - not-present page
> >  PGD 0 P4D 0
> >  Oops: Oops: 0000 [#1] SMP PTI
> >  CPU: 11 UID: 0 PID: 247837 Comm: kworker/u193:32 Not tainted 6.11.5-cm4all1-hp+ #219
> >  Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 10/17/2018
> >  Workqueue: nfsiod rpc_async_release
> >  RIP: 0010:netfs_rreq_unlock_folios_pgpriv2+0xd2/0x360
> >  Code: 4c 8b 04 24 48 85 c0 49 89 c5 0f 84 38 01 00 00 49 81 fd 06 04 00 00 0f 84 f2 00 00 00 49 81 fd 02 04 00 00 0f 84 35 02 00 00 <49> 8b 45 20 ba 00 10 00 00 49 8b 4d 00 48 c1 e0 0c 83 e1 40 74 08
> >  RSP: 0018:ffffb0056373fc90 EFLAGS: 00010216
> >  RAX: 000000000000002d RBX: ffff89de0d2a6780 RCX: 0000000000000001
> >  RDX: 00000000000000ad RSI: 0000000000000001 RDI: ffff89deb02e7b50
> >  RBP: 0000000000000000 R08: ffff89de3c9e9400 R09: 000000000000002c
> >  R10: 0000000000000008 R11: 0000000000000001 R12: 00000000000000b7
> >  R13: 0000025882015101 R14: 0000000000000000 R15: ffffb0056373fd28
> >  FS:  0000000000000000(0000) GS:ffff89f51fac0000(0000) knlGS:0000000000000000
> >  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >  CR2: 0000025882015121 CR3: 000000005942e006 CR4: 00000000001706f0
> >  Call Trace:
> >   <TASK>
> >   ? __die+0x1f/0x60
> >   ? page_fault_oops+0x15c/0x450
> >   ? search_extable+0x22/0x30
> >   ? netfs_rreq_unlock_folios_pgpriv2+0xd2/0x360
> >   ? search_module_extables+0xe/0x40
> >   ? exc_page_fault+0x5e/0x100
> >   ? asm_exc_page_fault+0x22/0x30
> >   ? netfs_rreq_unlock_folios_pgpriv2+0xd2/0x360
> >   ? select_task_rq_fair+0x1ed/0x1370
> >   netfs_rreq_unlock_folios+0x40c/0x4b0
> >   netfs_rreq_assess+0x348/0x580
> >   netfs_subreq_terminated+0x193/0x2a0
> >   nfs_netfs_read_completion+0x97/0xb0
> >   nfs_read_completion+0x12e/0x200
> >   rpc_free_task+0x39/0x60
> >   rpc_async_release+0x2b/0x40
> >   process_one_work+0x134/0x2e0
> >   worker_thread+0x299/0x3a0
> >   ? __pfx_worker_thread+0x10/0x10
> >   kthread+0xba/0xe0
> >   ? __pfx_kthread+0x10/0x10
> >   ret_from_fork+0x30/0x50
> >   ? __pfx_kthread+0x10/0x10
> >   ret_from_fork_asm+0x1a/0x30
> >   </TASK>
> >  Modules linked in:
> >  CR2: 0000025882015121
> >  ---[ end trace 0000000000000000 ]---
> >  ERST: [Firmware Warn]: Firmware does not respond in time.
> >  pstore: backend (erst) writing error (-5)
> >  RIP: 0010:netfs_rreq_unlock_folios_pgpriv2+0xd2/0x360
> >  Code: 4c 8b 04 24 48 85 c0 49 89 c5 0f 84 38 01 00 00 49 81 fd 06 04 00 00 0f 84 f2 00 00 00 49 81 fd 02 04 00 00 0f 84 35 02 00 00 <49> 8b 45 20 ba 00 10 00 00 49 8b 4d 00 48 c1 e0 0c 83 e1 40 74 08
> >  RSP: 0018:ffffb0056373fc90 EFLAGS: 00010216
> >  RAX: 000000000000002d RBX: ffff89de0d2a6780 RCX: 0000000000000001
> >  RDX: 00000000000000ad RSI: 0000000000000001 RDI: ffff89deb02e7b50
> >  RBP: 0000000000000000 R08: ffff89de3c9e9400 R09: 000000000000002c
> >  R10: 0000000000000008 R11: 0000000000000001 R12: 00000000000000b7
> >  R13: 0000025882015101 R14: 0000000000000000 R15: ffffb0056373fd28
> >  FS:  0000000000000000(0000) GS:ffff89f51fac0000(0000) knlGS:0000000000000000
> >  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >  CR2: 0000025882015121 CR3: 000000005942e006 CR4: 00000000001706f0
> >  note: kworker/u193:32[247837] exited with irqs disabled
> >
> >  (gdb) p netfs_rreq_unlock_folios_pgpriv2+0xd2
> >  $1 = (void (*)(struct netfs_io_request *, size_t *)) 0xffffffff813d80c2 <netfs_rreq_unlock_folios_pgpriv2+210>
> >  (gdb) disassemble netfs_rreq_unlock_folios_pgpriv2+0xd2
> >  Dump of assembler code for function netfs_rreq_unlock_folios_pgpriv2:
> >  [...]
> >    0xffffffff813d8093 <+163>: call   0xffffffff81f0ec70 <xas_find>
> >    0xffffffff813d8098 <+168>: mov    (%rsp),%r8
> >    0xffffffff813d809c <+172>: test   %rax,%rax
> >    0xffffffff813d809f <+175>: mov    %rax,%r13
> >    0xffffffff813d80a2 <+178>: je     0xffffffff813d81e0 <netfs_rreq_unlock_folios_pgpriv2+496>
> >    0xffffffff813d80a8 <+184>: cmp    $0x406,%r13
> >    0xffffffff813d80af <+191>: je     0xffffffff813d81a7 <netfs_rreq_unlock_folios_pgpriv2+439>
> >    0xffffffff813d80b5 <+197>: cmp    $0x402,%r13
> >    0xffffffff813d80bc <+204>: je     0xffffffff813d82f7 <netfs_rreq_unlock_folios_pgpriv2+775>
> >    0xffffffff813d80c2 <+210>: mov    0x20(%r13),%rax
> >    0xffffffff813d80c6 <+214>: mov    $0x1000,%edx
> >    0xffffffff813d80cb <+219>: mov    0x0(%r13),%rcx
> >    0xffffffff813d80cf <+223>: shl    $0xc,%rax
> >    0xffffffff813d80d3 <+227>: and    $0x40,%ecx
> >    0xffffffff813d80d6 <+230>: je     0xffffffff813d80e0 <netfs_rreq_unlock_folios_pgpriv2+240>
> >    0xffffffff813d80d8 <+232>: movzbl 0x40(%r13),%ecx
> >    0xffffffff813d80dd <+237>: shl    %cl,%rdx
> >
> >
> > Right now, the machine is running and I have an unstripped kernel, just in
> > case you need more information from /proc/kcore.
> >
> > Max
> >
> > [resent as text/plain only - damn you, gmail!]
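
For readers following the quoted oops, the numbers in the register dump and the disassembly can be cross-checked with a small userspace sketch. It is not kernel code and not a diagnosis. The helper names below are hypothetical; the entry encodings are the ones in include/linux/xarray.h, and the pointer check assumes 4-level paging on x86-64. It shows that the two constants compared against %r13 after xas_find() are consistent with the xarray retry and zero sentinels, that the faulting address in CR2 is exactly R13 plus the 0x20 displacement of the faulting load, and that R13 has bit 0 set, which in xarray encoding marks a value entry rather than a pointer.

```python
# Userspace sketch only: re-derives numbers visible in the quoted oops.
# Helper names are hypothetical; encodings follow include/linux/xarray.h.

def xa_mk_internal(v: int) -> int:
    """Internal xarray entries are encoded as (v << 2) | 2."""
    return (v << 2) | 2


XA_RETRY_ENTRY = xa_mk_internal(256)  # compared as $0x402 at +197
XA_ZERO_ENTRY = xa_mk_internal(257)   # compared as $0x406 at +184


def xa_is_value(entry: int) -> bool:
    """xarray value entries (not pointers) have bit 0 set."""
    return bool(entry & 1)


def looks_like_kernel_pointer(addr: int) -> bool:
    """Assumption: 4-level paging, so kernel addresses start at
    0xffff800000000000 (the upper canonical half)."""
    return addr >= 0xFFFF800000000000


# Values copied from the oops above.
R13 = 0x0000025882015101           # return value of xas_find()
CR2 = 0x0000025882015121           # faulting address
RBX = 0xFFFF89DE0D2A6780           # an ordinary kernel pointer, for contrast
DISPLACEMENT = 0x20                # from "mov 0x20(%r13),%rax" at +210

if __name__ == "__main__":
    print(f"XA_RETRY_ENTRY = {XA_RETRY_ENTRY:#x}")
    print(f"XA_ZERO_ENTRY  = {XA_ZERO_ENTRY:#x}")
    print(f"R13 + 0x20     = {R13 + DISPLACEMENT:#x}, CR2 = {CR2:#x}")
    print(f"R13 is a value entry by encoding: {xa_is_value(R13)}")
    print(f"R13 looks like a kernel pointer:  {looks_like_kernel_pointer(R13)}")
    print(f"RBX looks like a kernel pointer:  {looks_like_kernel_pointer(RBX)}")
```

The sketch only confirms that the faulting load dereferenced whatever xas_find() returned, and that the returned value does not have the shape of a kernel pointer. Why such an entry was found at that index is not something these numbers can answer.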