lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zre8cPOKetof24nJ@biznet-home.integral.gnuweeb.org>
Date: Sun, 11 Aug 2024 02:16:00 +0700
From: Ammar Faizi <ammarfaizi2@...weeb.org>
To: Thomas Weißschuh <linux@...ssschuh.net>
Cc: Willy Tarreau <w@....eu>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] tools/nolibc: x86_64: wrap asm functions in functions

On Sat, Aug 10, 2024 at 06:04:36PM +0200, Thomas Weißschuh wrote:
> On 2024-08-10 16:35:56+0000, Willy Tarreau wrote:
> > See how rdi and rdx were expected to be preserved after the first call
> > but were not? This was with gcc-9.5 (which supports naked but it's for
> > illustration purposes of the risk of leaving unconstrained asm like this).
> 
> To be honest, I don't see it. Not enough asm experience I guess, but
> I'll look at it some more.

Let me shed some light on what is going wrong in that form.

First, let's revisit the System V ABI x86-64 calling convention:

  - %rdi holds the 1st argument
  - %rsi holds the 2nd argument
  - %rdx holds the 3rd argument
  - %rax is used for the return value

Take a look at the assembly code that Willy sent. The second memmove()
call is receiving incorrect arguments from the previous memmove() call.

The issue arises because your memmove() function doesn't inform the
compiler that it modifies %rdi and %rdx. As far as the compiler is
concerned, your memmove() function:

  Does not alter any registers.

Consequently, the compiler assumes that the values in %rdi and %rdx
remain unchanged after the memmove() function returns. With this
assumption, and since memmove() is within the same translation unit as
the caller, the compiler optimizes the register moves without preserving
the values in %rdi and %rdx.

To properly inform the compiler that certain registers are being
modified, you need to use constraints and a clobber list. Here's an
example (note: this is untested, only compiled on Godbolt):

Link: https://godbolt.org/z/WTz1nvn1h

```
void *memmove(void *dst, const void *src, size_t len)
{
	void *rax;

	__asm__ goto volatile (
		"movq	%%rdx, %%rcx\n\t"
		"movq	%%rdi, %%rax\n\t"
		"movq	%%rdi, %%rdx\n\t"
		"subq	%%rsi, %%rdx\n\t"
		"cmpq	%%rcx, %%rdx\n\t"
		"jb	.Lbackward_copy\n\t"
		"rep movsb\n\t"
		"jmp	%[out]\n\t"
	".Lbackward_copy:\n\t"
		"leaq	-1(%%rdi, %%rcx, 1), %%rdi\n\t"
		"leaq	-1(%%rsi, %%rcx, 1), %%rsi\n\t"
		"std\n\t"
		"rep movsb\n\t"
		"cld"
		: "+D"(dst), "+S"(src), "+c"(len), "=a"(rax)
		:
		: "rdx", "memory", "cc"
		: out
	);

out:
	return rax;
}
```

This approach helps the compiler correctly handle the register
modifications, ensuring that the generated code behaves as expected.

Notes:

  constraints (binding C variables):
    - "+D": Read and Write %rdi
    - "+S": Read and Write %rsi
    - "+c": Read and Write %rcx
    - "=a": Write %rax

  clobber list:
    - "rdx": Write %rdx
    - "memory": Read and Write memory (pointed to by dst and src)
    - "cc": Write %rflags (This might be optional for x86-64)

I believe we can do better code than that. I can't pull Linux git tree
as well for now. I'll be fully available for work on Tuesday morning
(13 August 2024).

-- 
Ammar Faizi


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ