[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230811081042.4zgtvemgtocfsthz@quack3>
Date: Fri, 11 Aug 2023 10:10:42 +0200
From: Jan Kara <jack@...e.cz>
To: Kent Overstreet <kent.overstreet@...ux.dev>
Cc: Jan Kara <jack@...e.cz>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-bcachefs@...r.kernel.org, djwong@...nel.org,
dchinner@...hat.com, sandeen@...hat.com, willy@...radead.org,
josef@...icpanda.com, tytso@....edu, bfoster@...hat.com,
andreas.gruenbacher@...il.com, brauner@...nel.org,
peterz@...radead.org, akpm@...ux-foundation.org,
dhowells@...hat.com, snitzer@...nel.org, axboe@...nel.dk
Subject: Re: [GIT PULL] bcachefs
On Thu 10-08-23 22:47:03, Kent Overstreet wrote:
> On Thu, Aug 10, 2023 at 07:52:05PM +0200, Jan Kara wrote:
> > On Thu 10-08-23 11:54:53, Kent Overstreet wrote:
> > > > And there clearly is something very strange going on with superblock
> > > > handling
> > >
> > > This deserves an explanation because sget() is a bit nutty.
> > >
> > > The way sget() is conventionally used for block device filesystems, the
> > > block device open _isn't actually exclusive_ - sure, FMODE_EXCL is used,
> > > but the holder is the fs type pointer, so it won't exclude with other
> > > opens of the same fs type.
> > >
> > > That means the only protection from multiple opens scribbling over each
> > > other is sget() itself - but if the bdev handle ever outlives the
> > > superblock we're completely screwed; that's a silent data corruption bug
> > > that we can't easily catch, and if the filesystem teardown path has any
> > > asynchronous stuff going on (and of course it does) that's not a hard
> > > mistake to make. I've observed at least one bug that looked suspiciously
> > > like that, but I don't think I quite pinned it down at the time.
> >
> > This is just being changed - check Christian's VFS tree. There are patches
> > that make sget() use superblock pointer as a bdev holder so the reuse
> > you're speaking about isn't a problem anymore.
>
> So then the question is what do you use for identifying the superblock,
> and you're switching to the dev_t - interesting.
>
> Are we 100% sure that will never break, that a dev_t will always
> identify a unique block_device? Namespacing has been changing things.
Yes, dev_t is a unique identifier of the device, we rely on that in
multiple places, block device open comes to mind as the first. You're
right namespacing changes things but we implement that as changing what
gets presented to userspace via some mapping layer while the kernel keeps
using globally unique identifiers.
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists