Date: Sun, 23 Jun 2024 20:41:18 +0200
From: Uros Bizjak <ubizjak@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: kernel test robot <lkp@...el.com>, oe-kbuild-all@...ts.linux.dev, 
	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...nel.org>, 
	Borislav Petkov <bp@...en8.de>, Peter Zijlstra <peterz@...radead.org>
Subject: Re: arch/x86/include/asm/cmpxchg_32.h:149:9: error: inline assembly
 requires more registers than available

On Sun, Jun 23, 2024 at 8:25 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:

> Ah, good. This is something I'm a bit sensitive about, just because
> there's been so many arguments over it over the years, so now I go
> into "preemptive nuclear mode" when the regression issue comes up.
>
> Sorry.

Apologies from my side as well if my disagreement came across as
criticism of the kernel development process. I'm totally OK with it
(though ignored patches do cause a bit of frustration...).

> > I'm OK with the revert, but it won't fix the underlying problem.
> > Please see the definition of __arch_cmpxchg64_emu - it forces the
> > address into the %esi register in the same way as __arch_try_cmpxchg64_emu.
> > Effectively, the compiler allocates 5 input registers just for the
> > instruction.
>
> Oh, I entirely agree that this is a "random compiler implementation"
> issue, and then the code around it makes all the difference.
>
> > > Now, from having looked a bit at this, I can point you to the
> > > differences introduced by having to have the emulation fallback.
> >
> > Yes, I know this - I also (runtime!) tested the emulation, but with GCC only.
>
> Yeah, crossed emails, I started out just doing the "let's see what the
> config difference is", and only after that realized that I had looked
> at the wrong code for cmpxchg (ie I had looked at the simpler native
> case).
>
> > This can be achieved by implementing atomic64_{and,or,xor} as an
> > outline function.
>
> Yes, but then a lot of the whole point of commit 95ece48165c1 goes
> away, doesn't it?

True, because this commit pushed one of the compilers over the edge.

> Or were you suggesting the out-of-line code only for the emulation
> case? That would work.

I am suggesting simply following the approach of
arch_atomic64_{add,sub}{_return} in atomic64_32.h. These functions are
used extensively in the kernel, and since they haven't caused any
problems, we can reasonably expect that the new ones won't either. This
approach will relieve the register pressure, so we won't be expecting
magic from the compiler.
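
To illustrate the general idea only, here is a rough sketch in portable
C11. It is not the kernel code: the real helpers in atomic64_32.h use
inline assembly, and the names sketch_atomic64_t and
sketch_atomic64_fetch_and below are made up for this example. The point
is that the compare-exchange loop sits in its own non-inlined function,
so its register needs are settled there instead of being added to
whatever the caller already has live.

```c
#include <stdint.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the kernel's atomic64_t (assumption). */
typedef struct {
	_Atomic int64_t counter;
} sketch_atomic64_t;

/*
 * Out-of-line AND: the compare-exchange loop is kept in a separate,
 * non-inlined function. The caller only has to pass two arguments, so
 * the registers needed by the loop are allocated here, not in the
 * middle of the caller's code.
 */
__attribute__((noinline))
int64_t sketch_atomic64_fetch_and(sketch_atomic64_t *v, int64_t mask)
{
	int64_t old = atomic_load_explicit(&v->counter, memory_order_relaxed);

	/* On failure, 'old' is refreshed with the current value; retry. */
	while (!atomic_compare_exchange_weak(&v->counter, &old, old & mask))
		;

	return old;	/* value observed before the AND was applied */
}
```

The trade-off is a function call on every operation in exchange for
predictable register allocation at each call site.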

I can provide a patch series with the revert and a fix in a couple of
days (I'll be away from the keyboard for a short time).

Uros.
