Message-ID: <20230811-golden-shoppen-dd2f14d64cda@brauner>
Date: Fri, 11 Aug 2023 10:13:55 +0200
From: Christian Brauner <brauner@...nel.org>
To: Jan Kara <jack@...e.cz>
Cc: Kent Overstreet <kent.overstreet@...ux.dev>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-bcachefs@...r.kernel.org, djwong@...nel.org,
dchinner@...hat.com, sandeen@...hat.com, willy@...radead.org,
josef@...icpanda.com, tytso@....edu, bfoster@...hat.com,
andreas.gruenbacher@...il.com, peterz@...radead.org,
akpm@...ux-foundation.org, dhowells@...hat.com, snitzer@...nel.org,
axboe@...nel.dk
Subject: Re: [GIT PULL] bcachefs

On Fri, Aug 11, 2023 at 10:10:42AM +0200, Jan Kara wrote:
> On Thu 10-08-23 22:47:03, Kent Overstreet wrote:
> > On Thu, Aug 10, 2023 at 07:52:05PM +0200, Jan Kara wrote:
> > > On Thu 10-08-23 11:54:53, Kent Overstreet wrote:
> > > > > And there clearly is something very strange going on with superblock
> > > > > handling
> > > >
> > > > This deserves an explanation because sget() is a bit nutty.
> > > >
> > > > The way sget() is conventionally used for block device filesystems, the
> > > > block device open _isn't actually exclusive_ - sure, FMODE_EXCL is used,
> > > > but the holder is the fs type pointer, so it won't exclude with other
> > > > opens of the same fs type.
> > > >
> > > > That means the only protection from multiple opens scribbling over each
> > > > other is sget() itself - but if the bdev handle ever outlives the
> > > > superblock we're completely screwed; that's a silent data corruption bug
> > > > that we can't easily catch, and if the filesystem teardown path has any
> > > > asynchronous stuff going on (and of course it does) that's not a hard
> > > > mistake to make. I've observed at least one bug that looked suspiciously
> > > > like that, but I don't think I quite pinned it down at the time.
> > >
> > > This is just being changed - check Christian's VFS tree. There are patches
> > > that make sget() use superblock pointer as a bdev holder so the reuse
> > > you're speaking about isn't a problem anymore.
> >
> > So then the question is what do you use for identifying the superblock,
> > and you're switching to the dev_t - interesting.
> >
> > Are we 100% sure that will never break, that a dev_t will always
> > identify a unique block_device? Namespacing has been changing things.
>
> Yes, dev_t is a unique identifier of the device, we rely on that in
> multiple places, block device open comes to mind as the first. You're
> right namespacing changes things but we implement that as changing what
> gets presented to userspace via some mapping layer while the kernel keeps
> using globally unique identifiers.

Full device namespacing is not on the horizon at all. We looked into
this years ago and it would be a giant effort that would affect nearly
everything if done properly. So even if it happened, there would be so
many changes required that reliance on dev_t in the VFS would be the
least of our problems.