Message-ID: <kj5vcqbx5ztolv5y3g4csc6te4qmi7y7kmqfora2sxbobnrbrm@rcuffqncku74>
Date: Fri, 23 Aug 2024 22:13:55 -0400
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: linux-bcachefs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [GIT PULL] bcachefs fixes for 6.11-rc5
On Sat, Aug 24, 2024 at 09:23:00AM GMT, Linus Torvalds wrote:
> On Sat, 24 Aug 2024 at 02:54, Kent Overstreet <kent.overstreet@...ux.dev> wrote:
> >
> > Hi Linus, big one this time...
>
> Yeah, no, enough is enough. The last pull was already big.
>
> This is too big, it touches non-bcachefs stuff, and it's not even
> remotely some kind of regression.
>
> At some point "fix something" just turns into development, and this is
> that point.
>
> Nobody sane uses bcachefs and expects it to be stable, so every single
> user is an experimental site.
Eh?
Universal consensus has been that bcachefs is _definitely_ more
trustworthy than btrfs, in terms of "will this filesystem ever go
unrecoverable or lose my data" - I've seen many reports of people who've
put it through the same situations where btrfs falls.
I've even seen people compare bcachefs's robustness in positive terms
vs. /xfs/; and that's the result of a *hell* of a lot of work with the
#1 goal of having a robust filesystem that _never_ loses data.
The syzbot dashboard bears this out too: bcachefs is starting to look
better than btrfs there as well...
(Peanut gallery: Please don't rush out and switch to bcachefs just yet.
I still have a backlog of bugs and issues - some of them serious, as in
your filesystem will go emergency read-only - and I don't want people
getting bit. There's still a ton to do; I'm not taking EXPERIMENTAL off
until at least the fuzz testing for on disk corruption is in play).
Look, I've been doing this for a long time, I've had people running my
code in production for a long time, and I'm working with my users on a
daily basis to address issues. I don't throw code over the wall; I do
everything I can to support it and make sure it's working well.
And - the "srcu held for 10+s" warnings really were bad; there is going
to be a long tail of those that need to be fixed - to get to the rest,
we need the primary causes fixed first.
And when I ship code, I'm _always_ weighing "how much do we want this"
vs. "risk of regression/risk in general" - I'm not just throwing out
whatever I feel like.
Look, this is the filesystem you're all going to want to be running in -
knock on wood - just a year or two, because I'm working to make it
more robust and reliable than xfs and ext4 (and yes, it will be) with
_end to end data integrity_.
We need this. There are still tons of people with "btrfs just crapped
itself and now I'm fucked" horror stories, and running a
non-checksumming filesystem is like buying non-ECC RAM. I've got users with
100+ TB filesystems who trust my code, and I haven't lost anyone's
filesystem who was patient and willing to work with me.
But I've got to get this done, and right now that does mean moving fast
and grinding through a lot of issues.
(again for the peanut gallery: _please_ do not rush to install it yet
unless you are willing and able to report issues, I'll say when the bugs
have been worked through and the hardening is done).