Message-ID: <CAFULd4bxS0y1b7XZ1_J3yHF84Ghzoi1OWoZGfrLWvNaygVCWTQ@mail.gmail.com>
Date: Sun, 23 Jun 2024 20:41:18 +0200
From: Uros Bizjak <ubizjak@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: kernel test robot <lkp@...el.com>, oe-kbuild-all@...ts.linux.dev,
linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...nel.org>,
Borislav Petkov <bp@...en8.de>, Peter Zijlstra <peterz@...radead.org>
Subject: Re: arch/x86/include/asm/cmpxchg_32.h:149:9: error: inline assembly
requires more registers than available
On Sun, Jun 23, 2024 at 8:25 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> Ah, good. This is something I'm a bit sensitive about, just because
> there's been so many arguments over it over the years, so now I go
> into "preemptive nuclear mode" when the regression issue comes up.
>
> Sorry.

My apologies as well if my disagreement came across as criticism of
the kernel development process. I'm totally OK with it (although
ignored patches do cause a bit of frustration...).

> > I'm OK with the revert, but it won't fix the underlying problem.
> > Please see the definition of __arch_cmpxchg64_emu - it forces the
> > address to %esi registers in the same way as __arch_try_cmpxchg64_emu.
> > Effectively, the compiler allocates 5 input registers just for the
> > instruction.
>
> Oh, I entirely agree that this is a "random compiler implementation"
> issue, and then the code around it makes all the difference.
>
> > > Now, from having looked a bit at this, I can point you to the
> > > differences introduced by having to have the emulation fallback.
> >
> > Yes, I know this - I also (runtime!) tested the emulation, but with GCC only.
>
> Yeah, crossed emails, I started out just doing the "let's see what the
> config difference is", and only after that realized that I had looked
> at the wrong code for cmpxchg (ie I had looked at the simpler native
> case).
>
> > This can be achieved by implementing atomic64_{and,or,xor} as an
> > outline function.
>
> Yes, but then a lot of the whole point of commit 95ece48165c1 goes
> away, doesn't it?

True, because this commit pushed one of the compilers over the edge.

> Or were you suggesting the out-of-line code only for the emulation
> case? That would work.

I am suggesting simply following the approach of
arch_atomic64_{add,sub}{_return} in atomic64_32.h. These functions are
used extensively in the kernel, and since they haven't caused any
problems, we can reasonably expect that the new ones won't either. This
approach will relieve the register pressure, so we won't have to expect
magic from the compiler.

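As a rough user-space illustration of the out-of-line idea (not the kernel's actual code): the compare-exchange loop lives in a separate, non-inlined function, so its register requirements are confined to that small function instead of being added to every caller. The names sketch_atomic64_t and sketch_atomic64_fetch_and are hypothetical, and the GCC/Clang __atomic builtins stand in for the kernel's cmpxchg8b-based primitives.

```c
#include <stdint.h>

/* Hypothetical user-space stand-in for the kernel's atomic64_t. */
typedef struct {
	int64_t counter;
} sketch_atomic64_t;

/*
 * Deliberately out of line: the 64-bit compare-exchange loop is
 * register-hungry on 32-bit x86, so keeping it in its own function
 * means callers only pay for an ordinary function call.
 */
__attribute__((noinline))
int64_t sketch_atomic64_fetch_and(sketch_atomic64_t *v, int64_t mask)
{
	int64_t old = __atomic_load_n(&v->counter, __ATOMIC_RELAXED);

	/* On failure the builtin reloads the current value into 'old'. */
	while (!__atomic_compare_exchange_n(&v->counter, &old, old & mask,
					    0, __ATOMIC_SEQ_CST,
					    __ATOMIC_RELAXED))
		;

	return old;	/* value before the AND was applied */
}
```

The same shape would apply to the or and xor variants by changing the operator in the new-value expression.
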
I can provide a patch series with the revert and a fix in a couple of
days (I'll be away from the keyboard for a short time).

Uros.