lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 10 Aug 2023 19:52:05 +0200
From:   Jan Kara <jack@...e.cz>
To:     Kent Overstreet <kent.overstreet@...ux.dev>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-bcachefs@...r.kernel.org, djwong@...nel.org,
        dchinner@...hat.com, sandeen@...hat.com, willy@...radead.org,
        josef@...icpanda.com, tytso@....edu, bfoster@...hat.com,
        jack@...e.cz, andreas.gruenbacher@...il.com, brauner@...nel.org,
        peterz@...radead.org, akpm@...ux-foundation.org,
        dhowells@...hat.com, snitzer@...nel.org, axboe@...nel.dk
Subject: Re: [GIT PULL] bcachefs

On Thu 10-08-23 11:54:53, Kent Overstreet wrote:
> > And there clearly is something very strange going on with superblock
> > handling
> 
> This deserves an explanation because sget() is a bit nutty.
> 
> The way sget() is conventionally used for block device filesystems, the
> block device open _isn't actually exclusive_ - sure, FMODE_EXCL is used,
> but the holder is the fs type pointer, so it won't exclude with other
> opens of the same fs type.
> 
> That means the only protection from multiple opens scribbling over each
> other is sget() itself - but if the bdev handle ever outlives the
> superblock we're completely screwed; that's a silent data corruption bug
> that we can't easily catch, and if the filesystem teardown path has any
> asynchronous stuff going on (and of course it does) that's not a hard
> mistake to make. I've observed at least one bug that looked suspiciously
> like that, but I don't think I quite pinned it down at the time.

This is just being changed - check Christian's VFS tree. There are patches
that make sget() use superblock pointer as a bdev holder so the reuse
you're speaking about isn't a problem anymore.

> It also forces the caller to separate opening of the block devices from
> the rest of filesystem initialization, which is a bit less than ideal.
> 
> Anyways, bcachefs just wants to be able to do real exclusive opens of
> the block devices, and we do all filesystem bringup with a single
> bch2_fs_open() call. I think this could be made to work with the way
> sget() wants to work, but it'd require reworking the locking in
> sget() - it does everything, including the test() and set() calls, under
> a single spinlock.

Yeah. Maybe the current upstream changes aren't enough to make your life
easier for bcachefs, btrfs does its special thing as well after all because
mount also involves multiple devices for it. I just wanted to mention that
the exclusive bdev open thing is changing.

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ