Message-ID: <ZabSG4R45sC0s23d@1wt.eu>
Date: Tue, 16 Jan 2024 19:59:39 +0100
From: Willy Tarreau <w@....eu>
To: Ammar Faizi <ammarfaizi2@...weeb.org>
Cc: Charles Mirabile <cmirabil@...hat.com>, linux-kernel@...r.kernel.org,
Thomas Weißschuh <linux@...ssschuh.net>
Subject: Re: [PATCH] nolibc/stdlib: Improve `getauxval(3)` implementation

On Tue, Jan 16, 2024 at 07:58:09PM +0100, Willy Tarreau wrote:
> On Wed, Jan 17, 2024 at 01:52:06AM +0700, Ammar Faizi wrote:
> > On Tue, Jan 16, 2024 at 01:11:47PM -0500, Charles Mirabile wrote:
> > > At least on x86-64, the ABI only specifies that one more long will be
> > > present with value 0 (type AT_NULL) after the pairs of auxv entries.
> > > Whether or not it has a corresponding value is unspecified. This value is
> > > present on linux, but there is no reason to check it as simply seeing an
> > > auxv entry whose type value is AT_NULL should be enough.
> >
> > Yeah, I agree with that. I just read the ABI and confirmed that the
> > 'a_un' member is ignored when the type is `AT_NULL`. Let's stop relying
> > on an unspecified value.
> >
> > For others who want to check, see page 37 and 38:
> > https://gitlab.com/x86-psABIs/x86-64-ABI/-/wikis/uploads/221b09355dd540efcbe61b783b6c0ece/x86-64-psABI-2023-09-26.pdf
> >
> > > This is a matter of taste, but I think processing the data in a structured
> > > way by coercing it into an array of type value pairs, using multiple
> > > return style, and a for loop with a clear exit condition is more readable
> > > than the existing infinite loop with multiple exit points and a return
> > > value variable.
> >
> > Ok. It's more readable using your way. One thing that bothers me a bit
> > is the type of 'a_type'. On page 37, the ABI defines the auxv type-val
> > pair as:
> >
> >   typedef struct
> >   {
> >       int a_type;
> >       union {
> >           long a_val;
> >           void *a_ptr;
> >           void (*a_fnc)();
> >       } a_un;
> >   } auxv_t;
> >
> > Assume the arch is x86-64 Linux. Note that 'a_type' is an 'int', which
> > is 4 bytes in size, but we use 'unsigned long' instead of 'int' to
> > represent it. However, since 'a_un' needs to be 8-byte aligned, the
> > compiler will put 4 bytes of padding between 'a_type' and 'a_un', so it
> > ends up just fine (on x86-64).
> >
> > What do you think about other architectures? Will it potentially be
> > misinterpreted?
>
> Indeed, it would fail on a 64-bit big endian architecture. Let's
> just declare the local variable the same way as it is in the spec,
> it will be much cleaner and more reliable.

With that said, if the previous code used to work on such architectures,
maybe the definition above is only for x86_64 and differs on other
archs. Maybe it's really defined as two longs?

Willy