Message-ID: <2024041557-stereo-wafer-1551@gregkh>
Date: Mon, 15 Apr 2024 10:47:56 +0200
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: Kent Overstreet <kent.overstreet@...ux.dev>
Cc: Johannes Berg <johannes.berg@...el.com>, linux-bcachefs@...r.kernel.org,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: debugfs revoke broken?

On Sun, Apr 14, 2024 at 06:32:07PM -0400, Kent Overstreet wrote:
> I recently started seeing test failures like the following; from the
> "shutdown complete" line we're well after the point where we called
> debugfs_remove_recursive() - yet from the backtrace we were still able
> to call into debugfs.
> 
> And I see from the history the remove path has been getting tweaked,
> so...
> 
> 00091 ========= TEST   generic/001
> 00091 
> 00092 Setting up swapspace version 1, size = 2 GiB (2147479552 bytes)
> 00092 no label, UUID=73a80295-2b03-4512-aae1-785187926ce3
> 00092 Adding 2097148k swap on /dev/vde.  Priority:-2 extents:1 across:2097148k 
> 00092 configuration error - unknown item 'NONEXISTENT' (notify administrator)
> 00092 configuration error - unknown item 'PREVENT_NO_AUTH' (notify administrator)
> 00094 configuration error - unknown item 'NONEXISTENT' (notify administrator)
> 00094 configuration error - unknown item 'PREVENT_NO_AUTH' (notify administrator)
> 00094 configuration error - unknown item 'NONEXISTENT' (notify administrator)
> 00094 configuration error - unknown item 'PREVENT_NO_AUTH' (notify administrator)
> 00101 building 001... done
> 00101 bcachefs (vdb): mounting version 1.7: mi_btree_bitmap
> 00101 bcachefs (vdb): initializing new filesystem
> 00101 bcachefs (vdb): going read-write
> 00101 bcachefs (vdb): marking superblocks
> 00101 bcachefs (vdb): initializing freespace
> 00101 bcachefs (vdb): done initializing freespace
> 00101 bcachefs (vdb): reading snapshots table
> 00101 bcachefs (vdb): reading snapshots done
> 00101 bcachefs (vdb): done starting filesystem
> 00102 FSTYP         -- bcachefs
> 00102 PLATFORM      -- Linux/aarch64 Debian-1103-bullseye-arm64-base-kvm 6.9.0-rc2-ktest-g2719f811ae24 #18142 SMP Sun Apr 14 16:26:05 NZST 2024
> 00102 MKFS_OPTIONS  -- --encrypted --no_passphrase /dev/vdc
> 00102 MOUNT_OPTIONS -- /dev/vdc /mnt/scratch
> 00102 
> 00102 bcachefs (vdc): mounting version 1.7: mi_btree_bitmap
> 00102 bcachefs (vdc): initializing new filesystem
> 00102 bcachefs (vdc): going read-write
> 00102 bcachefs (vdc): marking superblocks
> 00102 bcachefs (vdc): initializing freespace
> 00102 bcachefs (vdc): done initializing freespace
> 00102 bcachefs (vdc): reading snapshots table
> 00102 bcachefs (vdc): reading snapshots done
> 00102 bcachefs (vdc): done starting filesystem
> 00102 bcachefs (vdc): shutting down
> 00102 bcachefs (vdc): going read-only
> 00102 bcachefs (vdc): finished waiting for writes to stop
> 00102 bcachefs (vdc): flushing journal and stopping allocators, journal seq 3
> 00102 bcachefs (vdc): flushing journal and stopping allocators complete, journal seq 5
> 00102 bcachefs (vdc): shutdown complete, journal seq 6
> 00102 bcachefs (vdc): marking filesystem clean
> 00102 bcachefs (vdc): shutdown complete
> 00102 bcachefs (vdb): shutting down
> 00102 bcachefs (vdb): going read-only
> 00102 bcachefs (vdb): finished waiting for writes to stop
> 00102 bcachefs (vdb): flushing journal and stopping allocators, journal seq 6
> 00102 bcachefs (vdb): flushing journal and stopping allocators complete, journal seq 7
> 00102 bcachefs (vdb): shutdown complete, journal seq 8
> 00102 bcachefs (vdb): marking filesystem clean
> 00102 bcachefs (vdb): shutdown complete
> 00102 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
> 00102 Mem abort info:
> 00102   ESR = 0x0000000096000004
> 00102   EC = 0x25: DABT (current EL), IL = 32 bits
> 00102   SET = 0, FnV = 0
> 00102   EA = 0, S1PTW = 0
> 00102   FSC = 0x04: level 0 translation fault
> 00102 Data abort info:
> 00102   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
> 00102   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> 00102   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> 00102 user pgtable: 4k pages, 48-bit VAs, pgdp=000000011585c000
> 00102 [0000000000000010] pgd=0000000000000000, p4d=0000000000000000
> 00102 Internal error: Oops: 0000000096000004 [#1] SMP
> 00102 Modules linked in:
> 00102 CPU: 7 PID: 1805 Comm: cat Not tainted 6.9.0-rc2-ktest-g2719f811ae24 #18142
> 00102 Hardware name: linux,dummy-virt (DT)
> 00102 pstate: 00001005 (nzcv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--)
> 00102 pc : bch2_journal_seq_pins_to_text+0x100/0x208
> 00102 lr : bch2_journal_seq_pins_to_text+0xf0/0x208
> 00102 sp : ffff0000d6dd3c80
> 00102 x29: ffff0000d6dd3c80 x28: ffff0000ca361f00 x27: 0000000000000000
> 00102 x26: 0000000000000000 x25: ffff0000da0002c0 x24: ffff0000da0002f0
> 00102 x23: ffff0000d50668c0 x22: ffff800080998950 x21: ffff0000da0002c0
> 00102 x20: ffff0000c46165c0 x19: 0000000000000000 x18: 00000000fffffffe
> 00102 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> 00102 x14: ffffffffffffffff x13: ffff0000c0ada1df x12: ffff0000c0ada1d9
> 00102 x11: 0000000000000000 x10: 0000000000000000 x9 : ffff800080400ec8
> 00102 x8 : 0000000000000000 x7 : 20746e756f63203a x6 : 0000000000000000
> 00102 x5 : 0000000000000020 x4 : 000000000000000d x3 : ffff0000c0ada1d0
> 00102 x2 : 0000000000000010 x1 : ffff0000c0ada1d0 x0 : 0000000000000012
> 00102 Call trace:
> 00102  bch2_journal_seq_pins_to_text+0x100/0x208
> 00102  bch2_journal_pins_read+0x48/0xd0
> 00102  full_proxy_read+0x64/0xb8
> 00102  vfs_read+0xd0/0x2d0
> 00102  ksys_read+0x5c/0xe0
> 00102  __arm64_sys_read+0x20/0x30
> 00102  invoke_syscall.constprop.0+0x50/0xe0
> 00102  do_el0_svc+0x44/0xc8
> 00102  el0_svc+0x18/0x58
> 00102  el0t_64_sync_handler+0xb8/0xc0
> 00102  el0t_64_sync+0x14c/0x150

So this file was open when debugfs_remove() was called?

Any chance you can bisect?  We just fixed some issues here in
952c3fce297f ("debugfs: fix wait/cancellation handling during remove")
that I thought should have handled this.  If you revert that commit,
do things work again?  And/or what about commit 8c88a474357e
("debugfs: add API to allow debugfs operations cancellation")?  Maybe we
need to go back to not having completions at all in the debugfs remove
path and just live with waiting for the files to be removed before
continuing (which should, at the least, resolve the issue you see here,
while slowing things down a bit).
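
To make the "just wait" option concrete, here is a rough userspace model in
Python rather than kernel C. It is only a sketch of the idea, not the debugfs
code: the class RevocableFile and its read()/remove() methods are hypothetical
names made up for illustration. remove() marks the file as gone, then blocks
until every read that was already in flight has returned; any read attempted
afterwards fails instead of reaching the callback.

```python
import threading


class RevocableFile:
    """Toy model (hypothetical, not the debugfs API): removal waits for
    in-flight reads to finish and rejects reads that start afterwards."""

    def __init__(self, read_fn):
        self._read_fn = read_fn          # stands in for the file's read callback
        self._lock = threading.Lock()
        self._drained = threading.Condition(self._lock)
        self._active = 0                 # number of reads currently in flight
        self._removed = False

    def read(self):
        with self._lock:
            if self._removed:
                # After removal, never call into the owner's callback again.
                raise OSError("file has been removed")
            self._active += 1
        try:
            return self._read_fn()
        finally:
            with self._lock:
                self._active -= 1
                if self._active == 0:
                    self._drained.notify_all()

    def remove(self):
        with self._lock:
            self._removed = True
            # Block until every read that got in before removal has returned.
            while self._active:
                self._drained.wait()
```

In this model the owner can only tear down the data the callback touches once
remove() has returned, which is the ordering the oops above suggests was
violated. The cost is the one mentioned above: remove() is as slow as the
slowest reader still inside the callback.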

thanks,

greg k-h
