lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220324075728.GC18586@1wt.eu>
Date:   Thu, 24 Mar 2022 08:57:28 +0100
From:   Willy Tarreau <w@....eu>
To:     Ammar Faizi <ammarfaizi2@...weeb.org>
Cc:     "Paul E. McKenney" <paulmck@...nel.org>,
        Alviro Iskandar Setiawan <alviro.iskandar@...weeb.org>,
        Nugraha <richiisei@...il.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        GNU/Weeb Mailing List <gwml@...r.gnuweeb.org>,
        David Laight <David.Laight@...LAB.COM>
Subject: Re: [PATCH v1 04/11] tools/nolibc: x86-64: Use appropriate register
 constraints if exist

On Thu, Mar 24, 2022 at 02:30:32PM +0700, Ammar Faizi wrote:
> Use appropriate register constraints if exist. Don't use register
> variables for all inputs.
> 
> Register variables with "r" constraint should be used when we need to
> pass data through a specific register to extended inline assembly that
> doesn't have a specific register constraint associated with it (anything
> outside %rax, %rbx, %rcx, %rdx, %rsi, %rdi).
> 
> It also simplifies the macro definition.

I'm a bit bothered by this one because I went the exact opposite route
in the early design precisely because I found that the current one was
simpler. E.g, I personally find this one:

> -#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)                  \
> -({                                                                            \
> -	long _ret;                                                            \
> -	register long _num  __asm__("rax") = (num);                           \
> -	register long _arg1 __asm__("rdi") = (long)(arg1);                    \
> -	register long _arg2 __asm__("rsi") = (long)(arg2);                    \
> -	register long _arg3 __asm__("rdx") = (long)(arg3);                    \
> -	register long _arg4 __asm__("r10") = (long)(arg4);                    \
> -	register long _arg5 __asm__("r8")  = (long)(arg5);                    \
> -	register long _arg6 __asm__("r9")  = (long)(arg6);                    \
> -	                                                                      \
> -	__asm__ volatile (                                                    \
> -		"syscall\n"                                                   \
> -		: "=a"(_ret)                                                  \
> -		: "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5), \
> -		  "r"(_arg6), "0"(_num)                                       \
> -		: "rcx", "r11", "memory", "cc"                                \
> -	);                                                                    \
> -	_ret;                                                                 \

Easier to grasp than this one:

> +#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)	\
> +({								\
> +	long _ret = (num);					\
> +	register long _arg4 __asm__("r10") = (long)(arg4);	\
> +	register long _arg5 __asm__("r8")  = (long)(arg5);	\
> +	register long _arg6 __asm__("r9")  = (long)(arg6);	\
> +	__asm__ volatile (					\
> +		"syscall\n"					\
> +		: "+a"(_ret)	/* %rax */			\
> +		: "D"(arg1),	/* %rdi */			\
> +		  "S"(arg2),	/* %rsi */			\
> +		  "d"(arg3),	/* %rdx */			\
> +		  "r"(_arg4),	/* %r10 */			\
> +		  "r"(_arg5),	/* %r8  */			\
> +		  "r"(_arg6)	/* %r9  */			\
> +		: "rcx", "r11", "memory", "cc"			\
> +	);							\
> +	_ret;							\
>  })

where only half of the registers are described at one place and the
other half is described in comments at another place. But admittedly
it's a matter of taste. However the same approach was adopted for all
other archs (arm, aarch64, mips, riscv) and I tend to find it easier
to compare the ABI between architectures when all registers are
described at once than when two of them (i386 and x86_64) make an
exception.

I'd say that if there is any technical benefit in doing this (occasional
code improvement or better support for older or exotic compilers), I'd say
"ok go for it", but if it's only a matter of taste, I'm not convinced at
all and am rather seeing this as a regression. Now if there's rough
consensus around this approach I'll abide, but then I'd request that other
archs are adapted as well so that we don't keep a different approach only
for these two ones.

Thanks,
Willy

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ