[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <p2qtcygjvkhyq5t4rqnmhscw5kwlm74ectyukmnc2eexiz35ue@xpmu2hlm7wnt>
Date: Sun, 15 Dec 2024 01:20:21 -0500
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Gianfranco Trad <gianf.trad@...il.com>
Cc: linux-bcachefs@...r.kernel.org, linux-kernel@...r.kernel.org,
skhan@...uxfoundation.org, syzbot+8689d10f1894eedf774d@...kaller.appspotmail.com
Subject: Re: [PATCH] bcachefs: zero-init move_bucket struct in
bch2_copygc_get_buckets()
On Sun, Dec 15, 2024 at 02:58:26AM +0100, Gianfranco Trad wrote:
> Hi Kent,
>
> I wanted to follow up on this patch. Over the last few days, I've
> investigated it further and observed the following that might be of help:
>
> 1- zero-initing whole b struct (as the first patch version) leads to a clean
> log [1].
> While if trying to memset to 0 only b.k field log reports [2]:
>
> bcachefs (da441363-bb6a-4ab9-999b-c1f40db4fee2): filesystem UUID already
> open
> bcachefs (da441363-bb6a-4ab9-999b-c1f40db4fee2): shutdown complete
> bcachefs: bch2_fs_get_tree() error: EINVAL
>
>
> 2- Given both versions of the patch didn't trigger the uninit issue anymore
> I checked whether inner fields of b.k.bucket are correctly inited, just
> before the bug triggers.
> b.k.bucket fields seeming to look correctly inited: snapshot = 0, offset =
> 34, inode = 0, gen = 0.
>
> 3- The first of the 2 reproducers causes a segfault:
>
> ==9335== Invalid[ 264.802101][ T9346] read of size 1
> ==9335== at 0x483BC39: strnlen (vg_replace_strmem.c:426)
> ==9335== by 0x1098F0: netlink_query_family_id (repro.c:176)
> ==9335== by 0x109A51: syz_genetlink_get_family_id (repro.c:211)
> ==9335== by 0x10B476: execute_one (repro.c:2071)
> ==9335== by 0x10B1A5: loop (repro.c:745)
> ==9335== by 0x10B52F: main (repro.c:2088)
> ==9335== Address 0x0 is not stack'd, malloc'd or (recently) free'd
>
> As of now, it seems unrelated to the root cause of the reported syzbot bug.
>
>
> At this point, zero-initializing the entire struct seems to work reliably,
> thought I'm still trying to get the full picture on this bug.
I think this might be a padding issue.
C struct literals are supposed to initialize the whole struct, but the
compiler folks in their infinite wisdom decided that didn't apply to
padding.
I think this needs to be raised with the KMSAN and compiler folks, as
inserting memsets all over the place would be a sad workaround.
Powered by blists - more mailing lists