linux-kernel - Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)

Message-ID: <ZIaqMpGISWKgHLK6@dread.disaster.area>
Date:   Mon, 12 Jun 2023 15:16:34 +1000
From:   Dave Chinner <david@...morbit.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     "Darrick J. Wong" <djwong@...nel.org>,
        Zorro Lang <zlang@...hat.com>, linux-xfs@...r.kernel.org,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Mike Christie <michael.christie@...cle.com>,
        "Michael S. Tsirkin" <mst@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report]
 fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)

On Sun, Jun 11, 2023 at 08:14:25PM -0700, Linus Torvalds wrote:
> On Sun, Jun 11, 2023 at 7:22 PM Dave Chinner <david@...morbit.com> wrote:
> >
> > I guess the regression fix needs a regression fix....
> 
> Yup.
> 
> From the description of the problem, it sounds like this happens on
> real hardware, no vhost anywhere?
>
> Or maybe Darrick (who doesn't see the issue) is running on raw
> hardware, and you and Zorro are running in a virtual environment?

I'm testing inside VMs and seeing it; I can't speak for anyone else.

....

> So *maybe* this attached patch might fix it? I haven't thought very
> deeply about this, but vhost workers most definitely shouldn't call
> do_coredump(), since they are then not counted.
> 
> (And again, I think we should just check that PF_IO_WORKER bit, not
> use this more complex test, but that's a separate and bigger change).
> 
>               Linus

>  kernel/signal.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 2547fa73bde5..a1e11ee8537c 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2847,6 +2847,10 @@ bool get_signal(struct ksignal *ksig)
>  		 */
>  		current->flags |= PF_SIGNALED;
>  
> +		/* vhost workers don't participate in core dups */
> +		if ((current->flags & (PF_IO_WORKER | PF_USER_WORKER)) != PF_USER_WORKER)
> +			goto out;
> +
>  		if (sig_kernel_coredump(signr)) {
>  			if (print_fatal_signals)
>  				print_fatal_signal(ksig->info.si_signo);


That appears to make things worse. mkfs.xfs hung in Z (zombie) state
on exit and never returned to the shell. Also, multiple processes are
livelocked like this:

 Sending NMI from CPU 0 to CPUs 1-3:
 NMI backtrace for cpu 2
 CPU: 2 PID: 3409 Comm: pmlogger_farm Not tainted 6.4.0-rc5-dgc+ #1822
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
 RIP: 0010:uprobe_deny_signal+0x5/0x90
 Code: 48 c7 c1 c4 64 62 82 48 c7 c7 d1 64 62 82 e8 b2 39 ec ff e9 70 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 <55> 31 4
 RSP: 0018:ffffc900023abdf0 EFLAGS: 00000202
 RAX: 0000000000000004 RBX: ffff888103b127c0 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 0000000000000296 RDI: ffffc900023abe70
 RBP: ffffc900023abe60 R08: 0000000000000001 R09: 0000000000000001
 R10: 0000000000000000 R11: ffff88813bd2ccf0 R12: ffff888103b127c0
 R13: ffffc900023abe70 R14: ffff888110413700 R15: ffff888103d26e80
 FS:  00007f35497a4740(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
 CR2: 00007ffd4ca0ce80 CR3: 000000010f7d1000 CR4: 00000000000006e0
 Call Trace:
  <NMI>
  ? show_regs+0x61/0x70
  ? nmi_cpu_backtrace+0x88/0xf0
  ? nmi_cpu_backtrace_handler+0x11/0x20
  ? nmi_handle+0x57/0x150
  ? default_do_nmi+0x49/0x240
  ? exc_nmi+0xf4/0x110
  ? end_repeat_nmi+0x16/0x31
  ? uprobe_deny_signal+0x5/0x90
  ? uprobe_deny_signal+0x5/0x90
  ? uprobe_deny_signal+0x5/0x90
  </NMI>
  <TASK>
  ? get_signal+0x94/0x9b0
  ? signal_setup_done+0x66/0x190
  arch_do_signal_or_restart+0x2f/0x260
  exit_to_user_mode_prepare+0x181/0x1c0
  syscall_exit_to_user_mode+0x16/0x40
  do_syscall_64+0x40/0x80
  entry_SYSCALL_64_after_hwframe+0x63/0xcd
 RIP: 0023:0xffff888103b127c0
 Code: Unable to access opcode bytes at 0xffff888103b12796.
 RSP: 002b:00007ffd4ca0d0ac EFLAGS: 00000202 ORIG_RAX: 000000000000003d
 RAX: 0000000000000009 RBX: 0000000000000000 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 00007ffd4d20bb9c RDI: 00000000ffffffff
 RBP: 00007ffd4d20bb9c R08: 0000000000000002 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
 R13: 00007ffd4d20bba0 R14: 00005604571fc380 R15: 0000000000000001
  </TASK>
 NMI backtrace for cpu 3
 CPU: 3 PID: 3526 Comm: pmlogger_check Not tainted 6.4.0-rc5-dgc+ #1822
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
 RIP: 0010:fixup_exception+0x72/0x260
 Code: 14 0f 87 03 02 00 00 ff 24 d5 98 67 22 82 31 c0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 41 81 cd 00 00 00 40 4d 63 ed 4d 89 6c 24 50 <31> c0 9
 RSP: 0018:ffffc9000275bb58 EFLAGS: 00000083
 RAX: 000000000000000f RBX: ffffffff827d0a4c RCX: ffffffff810c5f95
 RDX: 000000000000000f RSI: ffffffff827d0a4c RDI: ffffc9000275bb28
 RBP: ffffc9000275bb80 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: ffffc9000275bc78
 R13: 000000000000000e R14: 000000008f5ded3f R15: 0000000000000000
 FS:  00007f56a36de740(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
 CR2: 000000008f5ded3f CR3: 000000010dcde000 CR4: 00000000000006e0
 Call Trace:
  <NMI>
  ? show_regs+0x61/0x70
  ? nmi_cpu_backtrace+0x88/0xf0
  ? nmi_cpu_backtrace_handler+0x11/0x20
  ? nmi_handle+0x57/0x150
  ? default_do_nmi+0x49/0x240
  ? exc_nmi+0xf4/0x110
  ? end_repeat_nmi+0x16/0x31
  ? copy_fpstate_to_sigframe+0x1c5/0x3a0
  ? fixup_exception+0x72/0x260
  ? fixup_exception+0x72/0x260
  ? fixup_exception+0x72/0x260
  </NMI>
  <TASK>
  kernelmode_fixup_or_oops+0x49/0x120
  __bad_area_nosemaphore+0x15a/0x230
  ? __bad_area+0x57/0x80
  bad_area_nosemaphore+0x16/0x20
  exc_page_fault+0x323/0x880
  asm_exc_page_fault+0x27/0x30
 RIP: 0010:copy_fpstate_to_sigframe+0x1c5/0x3a0
 Code: 45 89 bc 24 40 25 00 00 f0 41 80 64 24 01 bf e9 f5 fe ff ff be 3c 00 00 00 48 c7 c7 77 9c 5f 82 e8 00 2a 23 00 31 c0 0f 1f 00 <49> 0f 1
 RSP: 0018:ffffc9000275bd28 EFLAGS: 00010246
 RAX: 000000000000000e RBX: 000000008f5de7ec RCX: ffffc9000275bda8
 RDX: 000000008f5ded40 RSI: 000000000000003c RDI: ffffffff825f9c77
 RBP: ffffc9000275bd98 R08: ffffc9000275be30 R09: 0000000000000001
 R10: 0000000000000000 R11: ffffc90000138ff8 R12: ffff8881106527c0
 R13: 000000008f5deb40 R14: ffff888110654d40 R15: ffff88810a653f40
  ? copy_fpstate_to_sigframe+0x1c0/0x3a0
  ? __might_sleep+0x42/0x70
  get_sigframe+0xcd/0x2b0
  ia32_setup_frame+0x61/0x230
  arch_do_signal_or_restart+0x1d1/0x260
  exit_to_user_mode_prepare+0x181/0x1c0
  irqentry_exit_to_user_mode+0x9/0x30
  irqentry_exit+0x33/0x40
  exc_page_fault+0x1b6/0x880
  asm_exc_page_fault+0x27/0x30
 RIP: 0023:0x106527c0
 Code: Unable to access opcode bytes at 0x10652796.
 RSP: 002b:000000008f5ded6c EFLAGS: 00010202
 RAX: 000000000000000b RBX: 0000000000000000 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 00007ffd8f5df2ec RDI: 00000000ffffffff
 RBP: 00007ffd8f5df2ec R08: 0000000000000000 R09: 00005558962eb526
 R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
 R13: 00007ffd8f5df2f0 R14: 00005558962b5e60 R15: 0000000000000001
  </TASK>
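
For reference, the flag test added by the quoted hunk can be evaluated in
isolation. The sketch below is a userspace illustration, not kernel code:
the helper name takes_goto_out() is hypothetical, and the two flag values
are assumptions that only need to be distinct bits (the real definitions
are in include/linux/sched.h). It also assumes that a vhost worker has
PF_USER_WORKER set without PF_IO_WORKER, that an io_uring worker has both,
and that an ordinary task has neither.

```c
#include <stdbool.h>

/* Illustrative flag bits; assumed values, only required to be distinct. */
#define PF_IO_WORKER   0x00000010u
#define PF_USER_WORKER 0x00004000u

/*
 * Same expression as the condition in the quoted hunk.
 * Returns true when a task with these flags would take "goto out".
 */
bool takes_goto_out(unsigned int flags)
{
	return (flags & (PF_IO_WORKER | PF_USER_WORKER)) != PF_USER_WORKER;
}
```

Under those assumptions, takes_goto_out(0) is true for an ordinary task and
takes_goto_out(PF_IO_WORKER | PF_USER_WORKER) is true for an io_uring
worker. Only takes_goto_out(PF_USER_WORKER) is false. So the condition as
written sends everything except the vhost-style worker to "goto out", which
is the opposite of what the patch comment describes. That may be related to
the behaviour reported above, but the traces alone do not confirm it.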


Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com