Date:	Mon, 6 Jul 2015 09:59:02 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Ingo Molnar <mingo@...nel.org>
Cc:	Andy Lutomirski <luto@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	X86 ML <x86@...nel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Jan Kara <jack@...e.cz>, Borislav Petkov <bp@...en8.de>,
	Denys Vlasenko <dvlasenk@...hat.com>
Subject: Re: [PATCH] x86: Fix detection of GCC -mpreferred-stack-boundary support

On Mon, Jul 6, 2015 at 6:44 AM, Ingo Molnar <mingo@...nel.org> wrote:
>
> * Andy Lutomirski <luto@...nel.org> wrote:
>
>> As per https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53383, GCC only
>> allows -mpreferred-stack-boundary=3 on x86_64 if -mno-sse is set.
>> That means that cc-option will not detect
>> -mpreferred-stack-boundary=3 support, because we test for it before
>> setting -mno-sse.
>>
>> Fix it by reordering the Makefile bits.

...

>
> So the 'stack boundary' is the RSP that GCC generates before it calls another
> function from within an existing function, right?
>

I think so.  Certainly the "incoming stack boundary" (which is exactly
the same as the preferred stack boundary unless explicitly changed) is
the RSP alignment that GCC expects on entry.

> So looking at this I question the choice of -mpreferred-stack-boundary=3. Why not
> do -mpreferred-stack-boundary=2?
>

Easy answer: we can't:

$ gcc -c -mno-sse -mpreferred-stack-boundary=2 empty.c
empty.c:1:0: error: -mpreferred-stack-boundary=2 is not between 3 and 12
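The Makefile reordering the patch describes can be sketched roughly like this (a hypothetical fragment with assumed placement, not the actual arch/x86/Makefile): since cc-option probes the compiler with the flags already accumulated in KBUILD_CFLAGS, -mno-sse has to land there before the stack-boundary probe, or GCC rejects the probe and the flag is silently dropped.

```make
# Hypothetical fragment -- illustrates the ordering, not the real file.
# -mno-sse must already be in KBUILD_CFLAGS when cc-option runs, because
# GCC only accepts -mpreferred-stack-boundary=3 on x86_64 with SSE off.
KBUILD_CFLAGS += -mno-sse
KBUILD_CFLAGS += $(call cc-option,-mpreferred-stack-boundary=3)
```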

> My reasoning: on modern uarchs there's no penalty for 32-bit misalignment of
> 64-bit variables, only if they cross 64-byte cache lines, which should be rare
> with a chance of 1:16. This small penalty (of at most +1 cycle in some
> circumstances IIRC) should be more than counterbalanced by the compression of the
> stack by 5% on average.
>

I'll counter with: what's the benefit?  There are no operations that
will naturally change RSP by anything that isn't a multiple of 8
(there's no pushl in 64-bit mode, or at least not on AMD chips -- the
Intel manual is a bit vague on this point), so we'll end up with RSP
being a multiple of 8 regardless.  Even if we somehow shaved 4 bytes
off in asm, that still wouldn't buy us anything, as a dangling 4 bytes
at the bottom of the stack isn't useful for anything.

--Andy