Message-Id: <20231012193233.207857-1-paulmck@kernel.org>
Date: Thu, 12 Oct 2023 12:32:15 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: linux-kernel@...r.kernel.org
Cc: gwml@...r.gnuweeb.org, kernel-team@...a.com, w@....eu,
Ammar Faizi <ammarfaizi2@...weeb.org>,
Zhangjin Wu <falcon@...ylab.org>,
Nicholas Rosenberg <inori@...x.org>,
Thomas Weißschuh <linux@...ssschuh.net>,
Alviro Iskandar Setiawan <alviro.iskandar@...weeb.org>,
Willy Tarreau <w@....eu>
Subject: [PATCH nolibc 01/19] tools/nolibc: i386: Fix a stack misalign bug on _start
From: Ammar Faizi <ammarfaizi2@...weeb.org>
The ABI mandates that the %esp register must be a multiple of 16 when
executing a 'call' instruction.
Commit 2ab446336b17 ("tools/nolibc: i386: shrink _start with _start_c")
simplified the _start function, but it didn't take care of the %esp
alignment, causing SIGSEGV in SSE and AVX programs that use aligned move
instructions (e.g., movdqa, movaps, and vmovdqa).
The 'and $-16, %esp' aligns %esp to a multiple of 16. The following
'push %eax' then subtracts 4 from %esp, which breaks the 16-byte
alignment. Subtract 12 before the push so that %esp is a multiple of 16
again when the 'call' executes.
Extra:
Add 'add $12, %esp' before the 'and $-16, %esp' to avoid lowering %esp
further than needed in particular cases, as suggested by Willy.
A test program to validate the %esp alignment on _start can be found at:
https://lore.kernel.org/lkml/ZOoindMFj1UKqo+s@biznet-home.integral.gnuweeb.org
Cc: Zhangjin Wu <falcon@...ylab.org>
Fixes: 2ab446336b17aad362c6decee29b4efd83a01979 ("tools/nolibc: i386: shrink _start with _start_c")
Reported-by: Nicholas Rosenberg <inori@...x.org>
Acked-by: Thomas Weißschuh <linux@...ssschuh.net>
Signed-off-by: Ammar Faizi <ammarfaizi2@...weeb.org>
Reviewed-by: Alviro Iskandar Setiawan <alviro.iskandar@...weeb.org>
Signed-off-by: Willy Tarreau <w@....eu>
Signed-off-by: Thomas Weißschuh <linux@...ssschuh.net>
---
tools/include/nolibc/arch-i386.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/include/nolibc/arch-i386.h b/tools/include/nolibc/arch-i386.h
index 64415b9fac77..28c26a00a762 100644
--- a/tools/include/nolibc/arch-i386.h
+++ b/tools/include/nolibc/arch-i386.h
@@ -167,7 +167,9 @@ void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) __no_
__asm__ volatile (
"xor %ebp, %ebp\n" /* zero the stack frame */
"mov %esp, %eax\n" /* save stack pointer to %eax, as arg1 of _start_c */
- "and $-16, %esp\n" /* last pushed argument must be 16-byte aligned */
+ "add $12, %esp\n" /* avoid over-estimating after the 'and' & 'sub' below */
+ "and $-16, %esp\n" /* the %esp must be 16-byte aligned on 'call' */
+ "sub $12, %esp\n" /* sub 12 to keep it aligned after the push %eax */
"push %eax\n" /* push arg1 on stack to support plain stack modes too */
"call _start_c\n" /* transfer to c runtime */
"hlt\n" /* ensure it does not return */
--
2.40.1