Message-ID: <CAFULd4Y5JocT9yRwS0Zkro-pAHihmOHP2D8fMLR29j8_Gy_nNA@mail.gmail.com>
Date: Fri, 13 Oct 2023 11:38:01 +0200
From: Uros Bizjak <ubizjak@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Nadav Amit <namit@...are.com>,
"the arch/x86 maintainers" <x86@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Andy Lutomirski <luto@...nel.org>,
Brian Gerst <brgerst@...il.com>,
Denys Vlasenko <dvlasenk@...hat.com>,
"H . Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Nick Desaulniers <ndesaulniers@...gle.com>
Subject: Re: [PATCH v2 -tip] x86/percpu: Use C for arch_raw_cpu_ptr()
On Thu, Oct 12, 2023 at 8:01 PM Uros Bizjak <ubizjak@...il.com> wrote:
>
> On Thu, Oct 12, 2023 at 7:47 PM Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > On Thu, 12 Oct 2023 at 10:10, Linus Torvalds
> > <torvalds@...ux-foundation.org> wrote:
> > >
> > > The fix seems to be a simple one-liner, ie just
> > >
> > > - asm(__pcpu_op2_##size(op, __percpu_arg(P[var]), "%[val]") \
> > > + asm(__pcpu_op2_##size(op, __percpu_arg(a[var]), "%[val]") \
> >
> > Nope. That doesn't work at all.
> >
> > It turns out that we're not the only ones that didn't know about the
> > 'a' modifier.
> >
> > clang has also never heard of it in this context, and the above
> > one-liner results in an endless sea of errors, with
> >
> > error: invalid operand in inline asm: 'movq %gs:${1:a}, $0'
> >
> > Looking around, I think it's X86AsmPrinter::PrintAsmOperand() that is
> > supposed to handle these things, and while it does have some handling
> > for 'a', the comment around it says
> >
> > case 'a': // This is an address. Currently only 'i' and 'r' are expected.
> >
> > and I think our use ends up just confusing the heck out of clang. Of
> > course, clang also does this:
> >
> > case 'P': // This is the operand of a call, treat specially.
> > PrintPCRelImm(MI, OpNo, O);
> > return false;
> >
> > so clang *already* generates those 'current' accesses as PCrelative, and I see
> >
> > movq %gs:pcpu_hot(%rip), %r13
> >
> > in the generated code.
> >
> > End result: clang actually generates what we want just using 'P', and
> > the whole "P vs a" is only a gcc thing.
>
> Ugh, this isn't exactly following Clang's claim that "In general,
> Clang is highly compatible with the GCC inline assembly extensions,
> allowing the same set of constraints, modifiers and operands as GCC
> inline assembly."
For added fun I obtained some old clang:

$ clang --version
clang version 11.0.0 (Fedora 11.0.0-3.fc33)

and tried to compile this:

int m;
__seg_gs int n;

void foo (void)
{
  asm ("# %a0 %a1" :: "p" (&m), "p" (&n));
  asm ("# %P0 %P1" :: "p" (&m), "p" (&n));
}

clang-11:

  # m n
  # m n

clang-11 -fpie:

  # m(%rip) n(%rip)
  # m n

clang-11 -m32:

  # m n
  # m n

gcc:

  # m(%rip) n(%rip)
  # m n

gcc -fpie:

  # m(%rip) n(%rip)
  # m n

gcc -m32:

  # m n
  # m n
Please find attached a patch that should bring some order to this
issue. The patch includes two demonstration sites: the generated code
for mem_encrypt_identity.c does not change, while the change in
percpu.h brings the expected 4kB code size reduction.
Uros.
View attachment "memref.diff.txt" of type "text/plain" (2614 bytes)