[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <jqj6do7lodrrvpjmk6vlhasdigs23jkyvznniudhebcizstsn7@6cetkluh4ehl>
Date: Fri, 24 May 2024 13:30:09 -0400
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: Brian Foster <bfoster@...hat.com>, Kees Cook <keescook@...omium.org>,
linux-kernel <linux-kernel@...r.kernel.org>, linux-bcachefs@...r.kernel.org,
Alexander Potapenko <glider@...gle.com>, Marco Elver <elver@...gle.com>,
Dmitry Vyukov <dvyukov@...gle.com>, kasan-dev@...glegroups.com,
Nathan Chancellor <nathan@...nel.org>, Nick Desaulniers <ndesaulniers@...gle.com>,
Bill Wendling <morbo@...gle.com>, Justin Stitt <justinstitt@...gle.com>, llvm@...ts.linux.dev
Subject: Re: Use of zero-length arrays in bcachefs structures inner fields
On Fri, May 24, 2024 at 12:04:11PM -0400, Mathieu Desnoyers wrote:
> On 2024-05-24 11:35, Mathieu Desnoyers wrote:
> > [ Adding clang/llvm and KMSAN maintainers/reviewers in CC. ]
> >
> > On 2024-05-24 11:28, Kent Overstreet wrote:
> > > On Thu, May 23, 2024 at 01:53:42PM -0400, Mathieu Desnoyers wrote:
> > > > Hi Kent,
> > > >
> > > > Looking around in the bcachefs code for possible causes of this KMSAN
> > > > bug report:
> > > >
> > > > https://lore.kernel.org/lkml/000000000000fd5e7006191f78dc@google.com/
> > > >
> > > > I notice the following pattern in the bcachefs structures: zero-length
> > > > arrays members are inserted in structures (not always at the end),
> > > > seemingly to achieve a result similar to what could be done with a
> > > > union:
> > > >
> > > > fs/bcachefs/bcachefs_format.h:
> > > >
> > > > struct bkey_packed {
> > > > __u64 _data[0];
> > > >
> > > > /* Size of combined key and value, in u64s */
> > > > __u8 u64s;
> > > > [...]
> > > > };
> > > >
> > > > likewise:
> > > >
> > > > struct bkey_i {
> > > > __u64 _data[0];
> > > >
> > > > struct bkey k;
> > > > struct bch_val v;
> > > > };
> > > >
> > > > (and there are many more examples of this pattern in bcachefs)
> > > >
> > > > AFAIK, the C11 standard states that array declarator constant expression
> > > >
> > > > Effectively, we can verify that this code triggers an undefined behavior
> > > > with:
> > > >
> > > > #include <stdio.h>
> > > >
> > > > struct z {
> > > > int x[0];
> > > > int y;
> > > > int z;
> > > > } __attribute__((packed));
> > > >
> > > > int main(void)
> > > > {
> > > > struct z a;
> > > >
> > > > a.y = 1;
> > > > printf("%d\n", a.x[0]);
> > > > }
> > > > delimited by [ ] shall have a value greater than zero.
> > >
> > > Yet another example of the C people going absolutely nutty with
> > > everything being undefined. Look, this isn't ok, we need to get work
> > > done, and I've already wasted entirely too much time on ZLA vs. flex
> > > array member nonsense.
> > >
> > > There's a bunch of legit uses for zero length arrays, and your example,
> > > where we're not even _assigning_ to x, is just batshit. Someone needs to
> > > get his head examined.
>
> Notice how a.y is first set to 1, then a.x[0] is loaded, expecting to
> alias with a.y.
>
> This is the same aliasing pattern found in bcachefs, for instance here:
>
> bcachefs_format.h:
>
> struct jset {
> [...]
> __u8 encrypted_start[0];
>
> __le16 _read_clock; /* no longer used */
> __le16 _write_clock;
>
> /* Sequence number of oldest dirty journal entry */
> __le64 last_seq;
>
>
> struct jset_entry start[0];
> __u64 _data[];
> } __packed __aligned(8);
>
> where struct jset last_seq field is set by jset_validate():
>
> jset->last_seq = jset->seq;
>
> and where journal_read_bucket() uses the encrypted_start member as input:
>
> ret = bch2_encrypt(c, JSET_CSUM_TYPE(j), journal_nonce(j),
> j->encrypted_start,
> vstruct_end(j) - (void *) j->encrypted_start);
Except we're just using it as a marker for an offset into the struct,
the same "aliasing" issue would apply if we were just using offsetof()
to calculate the offsets directly.
Powered by blists - more mailing lists