[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Zw/SZad0QR8eyNnI@1wt.eu>
Date: Wed, 16 Oct 2024 16:49:09 +0200
From: Willy Tarreau <w@....eu>
To: Thomas Weißschuh <thomas.weissschuh@...utronix.de>
Cc: Thomas Weißschuh <linux@...ssschuh.net>,
"Paul E. McKenney" <paulmck@...nel.org>, linux-kernel@...r.kernel.org,
stable@...r.kernel.org
Subject: Re: [PATCH] tools/nolibc/stdlib: fix getenv() with empty environment
On Wed, Oct 16, 2024 at 03:01:08PM +0200, Thomas Weißschuh wrote:
> Ah, environ being assignable is something I did not consider.
Yes, it is. Long ago, in nolibc when there were no global variables,
I used to have this in my programs:
char **environ;
int main(int argc, char **argv, char **envp)
{
environ = envp;
...
}
And it used to work pretty well with any libc.
> > > int idx, i;
> > >
> > > - if (environ) {
> > > + if (*environ) {
> > > for (idx = 0; environ[idx]; idx++) {
> > > for (i = 0; name[i] && name[i] == environ[idx][i];)
> > > i++;
> >
> > However as a quick note, if we decide we don't care about environ being
> > NULL, and since this is essentially a cleanup, why not even get rid of
> > the whole "if" condition, since the loop takes care of it ?
>
> It's not only a cleanup.
OK.
> Without this patch I see crashes due to illegal memory accesses.
> Not reliably, only under special conditions and only on s390, but
> crashes nevertheless.
But I don't understand how the patch could make them disappear, as it
removes an extra check. So if environ was bad before, and was not null,
it remains bad and continues to be dereferenced. And if it was null,
it wouldn't enter the block but will now.
> It's the same binary with the same kernel that sometimes works and
> sometimes crashes.
> The proposed fix makes the issue go away.
Maybe it's related to the size of the executable or code optimization
with some offending parts that could be eliminated by the compiler in
fact.
It's also possible we've subtly broke something in the s390 init code
in a way that slightly violates the official ABI (stack alignment etc)
and that could explain the randomness.
> But my original analysis looks wrong, I'll investigate some more.
OK!
> User process fault: interruption code 0010 ilc:2 in test_nanosleep[43c4,1000000+8000]
> Failing address: 0000000000000000 TEID: 0000000000000800
(...)
Unfortunately I'm not fluent in s390 :-/
Willy
Powered by blists - more mailing lists