linux-kernel - Re: [REGRESSION][BISECTED] erroneous buffer overflow detected in bch2_xattr

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Zv8RIs-htdc-PtXB@archlinux>
Date: Thu, 3 Oct 2024 23:48:18 +0200
From: Jan Hendrik Farr <kernel@...rr.cc>
To: Kees Cook <kees@...nel.org>
Cc: Thorsten Blum <thorsten.blum@...lux.com>, kent.overstreet@...ux.dev,
	regressions@...ts.linux.dev, linux-bcachefs@...r.kernel.org,
	linux-hardening@...r.kernel.org, linux-kernel@...r.kernel.org,
	ardb@...nel.org, morbo@...gle.com
Subject: Re: [REGRESSION][BISECTED] erroneous buffer overflow detected in
 bch2_xattr_validate

On 03 14:28:01, Kees Cook wrote:
> On Thu, Oct 03, 2024 at 05:17:08PM +0200, Jan Hendrik Farr wrote:
> > gcc currently says that the __bdos of struct containing a flexible array
> > member is:
> > 
> > sizeof(<whole struct>) + sizeof(<flexible array element>) * <count>
> > 
> > clang however does the following:
> > 
> > max(sizeof(<whole struct>), offsetof(<flexible array member>) + sizeof(<flexible array element>) * <count>)
> 
> Clang's calculation seems very wrong. I would expect it to match GCC's.
> 

I was on the very same train of thought, but I have since changed my
mind a bit. A struct containing a flexible array member can be allocated in
two ways:

(1):

struct posix_acl *acl = malloc(sizeof(struct posix_acl) + sizeof(struct posix_acl_entry) * 1);
acl.a_count = 1;

or (2):

struct posix_acl *acl = malloc(offsetof(struct posix_acl, a_entries) + sizeof(struct posix_acl_entry) * 1);
acl.a_count = 1;

Both are valid ways to allocate it. __bdos does not know which of these
methods was used to allocate the struct whose size it has to determine,
so it's giving the lower bound that doesn't include the (potential)
padding at the end.

So it comes down to false positives vs false negatives...
More details here:
https://github.com/llvm/llvm-project/pull/111015

Clangs current behavior would essentially force kernel code to always
assume option (2) is used. So

struct posix_acl *
posix_acl_clone(const struct posix_acl *acl, gfp_t flags)
{
	struct posix_acl *clone = NULL;

	if (acl) {
		int size = sizeof(struct posix_acl) + acl->a_count *
		           sizeof(struct posix_acl_entry);
		clone = kmemdup(acl, size, flags);
		if (clone)
			refcount_set(&clone->a_refcount, 1);
	}
	return clone;
}
EXPORT_SYMBOL_GPL(posix_acl_clone);

from linux/fs/posix_acl.c would have to turn into something like:

struct posix_acl *
posix_acl_clone(const struct posix_acl *acl, gfp_t flags)
{
	struct posix_acl *clone = NULL;

	if (acl) {
		int size = offsetof(struct posix_acl, a_entries) + acl->a_count *
		           sizeof(struct posix_acl_entry);
		clone = kmemdup(acl, size, flags);
		if (clone)
			refcount_set(&clone->a_refcount, 1);
	}
	return clone;
}
EXPORT_SYMBOL_GPL(posix_acl_clone);

Which is actually safer, because can you actually be sure this posix_acl
wasn't allocated using method (2)?

After looking at the assembly produced by gcc more, it actually looks
like it's using the allocation size if it's known in the current context
(for example if the struct was just malloced in the same function)
and otherwise returns INT_MAX for the __bdos of a struct containing a
flexible array member. It's only returning the size based on the
__counted_by attribute of you ask it for the __bdos of the flexible
array member itself.