[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJfpegvZ+4SNnkOEkS=7D44bZNQBovA7SU7etChN6Bh_B9f3dQ@mail.gmail.com>
Date: Thu, 21 Sep 2023 09:34:14 +0200
From: Miklos Szeredi <miklos@...redi.hu>
To: Matthew House <mattlloydhouse@...il.com>
Cc: Christian Brauner <brauner@...nel.org>,
Miklos Szeredi <mszeredi@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-api@...r.kernel.org, linux-man@...r.kernel.org,
linux-security-module@...r.kernel.org, Karel Zak <kzak@...hat.com>,
Ian Kent <raven@...maw.net>,
David Howells <dhowells@...hat.com>,
Al Viro <viro@...iv.linux.org.uk>,
Christian Brauner <christian@...uner.io>,
Amir Goldstein <amir73il@...il.com>
Subject: Re: [RFC PATCH 2/3] add statmnt(2) syscall
On Wed, 20 Sept 2023 at 15:26, Matthew House <mattlloydhouse@...il.com> wrote:
> The declared type of a variable *is* one of the different types, as far as
> the aliasing rules are concerned. In C17, section 6.5 ("Expressions"):
>
> > The *effective type* of an object for an access to its stored value is
> > the declared type of the object, if any. [More rules about objects with
> > no declared type, i.e., those created with malloc(3) or realloc(3)...]
> >
> > An object shall have its stored value accessed only by an lvalue
> > expression that has one of the following types:
> >
> > -- a type compatible with the effective type of the object,
> >
> > -- a qualified version of a type compatible with the effective type of
> > the object,
> >
> > -- a type that is the signed or unsigned type corresponding to the
> > effective type of the object,
> >
> > -- a type that is the signed or unsigned type corresponding to a
> > qualified version of the effective type of the object,
> >
> > -- an aggregate or union type that includes one of the aforementioned
> > types among its members (including, recursively, a member of a
> > subaggregate or contained union), or
> >
> > -- a character type.
>
> In this case, buf is declared in the program as a char[10000] array, so the
> declared type of each element is char, and the effective type of each
> element is also char. If we want to access, say, st->mnt_id, the lvalue
> expression has type __u64, and it tries to access 8 of the char objects.
> However, the integer type that __u64 expands to doesn't meet any of those
> criteria, so the aliasing rules are violated and the behavior is undefined.
Some of the above is new information for me.
However for all practical purposes the code doesn't violate aliasing
rules. Even the most aggressive "-Wstrict-aliasing=1" doesn't trigger
a warning. I guess this is because gcc takes the definition to be
symmetric, i.e. anything may safely be aliased to a char pointer and a
char pointer may safely be aliased to anything. I'm not saying that
that is what the language definition says, just that gcc interprets
the language definition that way. Also plain "-Wstrict-aliasing"
doesn't trigger even if the type of the array is not char, because gcc
tries hard not to warn about cases where there's no dereference of the
aliased pointer. This is consistent with what I said and what the gcc
manpage says: only accesses count, declarations don't.
>
> I've always felt that capacity doubling is a bit wasteful, but it's
> definitely something I can live with, especially if providing size feedback
> is as complex as you suggest. Still, I'm not a big fan of single-buffer
> interfaces in general, with how poorly they tend to interact with C's
> aliasing rules. (Also, those kinds of interfaces also invite alignment
> errors: for instance, your snippet above is missing the necessary union to
> prevent the buffer from being misaligned, which would cause UB when you
> cast it to a struct statmnt *.)
Okay, alignment is a different story. I'll note this in the man page.
Thanks,
Miklos
Powered by blists - more mailing lists