[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ecj7k7b7buss7f7vhfzzsvnq6hdka3wziytih42g76yt2zgh3j@lo5f2n2lujk3>
Date: Fri, 25 Oct 2024 19:02:23 -0400
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: John Stoffel <john@...ffel.org>
Cc: manas18244@...td.ac.in, linux-bcachefs@...r.kernel.org,
linux-kernel@...r.kernel.org, Anup Sharma <anupnewsmail@...il.com>,
Shuah Khan <skhan@...uxfoundation.org>, syzbot+e8eff054face85d7ea41@...kaller.appspotmail.com
Subject: Re: [PATCH] Revert "bcachefs: Add asserts to
bch2_dev_btree_bitmap_marked_sectors()"
On Fri, Oct 25, 2024 at 05:26:19PM -0400, John Stoffel wrote:
> >>>>> "Kent" == Kent Overstreet <kent.overstreet@...ux.dev> writes:
>
> > On Thu, Oct 24, 2024 at 02:30:34PM -0400, John Stoffel wrote:
> >> >>>>> "Kent" == Kent Overstreet <kent.overstreet@...ux.dev> writes:
> >>
> >> > On Mon, Oct 21, 2024 at 10:18:57PM +0530, Manas via B4 Relay wrote:
> >> >> From: Manas <manas18244@...td.ac.in>
> >> >>
> >> >> This reverts commit 60f2b1bcf519416dbffee219132aa949d0c39d0e.
> >> >>
> >> >> This syzbot bug[1] is triggered due to the BUG_ON assertions added in
> >> >> __bch2_dev_btree_bitmap_mark. During runtime, m->btree_bitmap_shift is
> >> >> 63 '?'. This triggers both the assertions.
> >>
> >> > The BUG_ON() doesn't need to be deleted; we just need to fix the
> >> > validation so it doesn't fire (it doesn't particularly matter if it's
> >> > removed or not, ubsan would catch it without the BUG_ON()).
> >>
> >> Shouldn't the BUG_ON() be replaced with making the filesystem readonly
> >> instead if you're going to keep it? I'd rather have the filesystem
> >> still be mounted and able to be read, but not writeable, instead of
> >> having my system crash before I can do anything.
>
> > Not in this case. In general, if there's a chance of the BUG_ON()
> > hitting in the wild after the code has passed testing then it
> > shouldn't be a BUG_ON(), but this is low level validation that we're
> > relying on.
>
> So I'm having a hard time parsing this reply. And I don't think you
> make a good case for leaving or even having a BUG_ON() here at all.
> If there's a chance of it hitting in the wild, it should be removed.
> But you don't want to remove it because it will never hit? That's
> just lazy... :-) I just don't see why a filesystem should have the
> opportunity to kill an entire system because that one filesystem has
> problems.
No, it's not lazy.
Code with assertions is much more robust and reliable, because bugs
shake out faster. An assertion pop is _much, much_ easier to debug than
wandering off into undefined behaviour land.
Assertions are also documentation that can't go out of date.
They make later refactoring much safer and easier, as well.
Powered by blists - more mailing lists