Message-ID: <CAKwvOdn0q77vT8XCRwdgi5LU2yFVEqyhG1Se72gGNR4Tr173_w@mail.gmail.com>
Date:   Tue, 26 May 2020 10:54:35 -0700
From:   Nick Desaulniers <ndesaulniers@...gle.com>
To:     Brian Gerst <brgerst@...il.com>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>, Borislav Petkov <bp@...en8.de>,
        "H . Peter Anvin" <hpa@...or.com>,
        Andy Lutomirski <luto@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH 2/7] x86/percpu: Clean up percpu_to_op()

On Thu, May 21, 2020 at 6:06 AM Brian Gerst <brgerst@...il.com> wrote:
>
> On Wed, May 20, 2020 at 1:26 PM Nick Desaulniers
> <ndesaulniers@...gle.com> wrote:
> >
> > On Mon, May 18, 2020 at 8:38 PM Brian Gerst <brgerst@...il.com> wrote:
> > >
> > > On Mon, May 18, 2020 at 5:15 PM Nick Desaulniers
> > > <ndesaulniers@...gle.com> wrote:
> > > >
> > > > On Sun, May 17, 2020 at 8:29 AM Brian Gerst <brgerst@...il.com> wrote:
> > > > >
> > > > > The core percpu macros already have a switch on the data size, so the switch
> > > > > in the x86 code is redundant and produces more dead code.
> > > > >
> > > > > Also use appropriate types for the width of the instructions.  This avoids
> > > > > errors when compiling with Clang.
> > > > >
> > > > > Signed-off-by: Brian Gerst <brgerst@...il.com>
> > > > > ---
> > > > >  arch/x86/include/asm/percpu.h | 90 ++++++++++++++---------------------
> > > > >  1 file changed, 35 insertions(+), 55 deletions(-)
> > > > >
> > > > > diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
> > > > > index 89f918a3e99b..233c7a78d1a6 100644
> > > > > --- a/arch/x86/include/asm/percpu.h
> > > > > +++ b/arch/x86/include/asm/percpu.h
> > > > > @@ -117,37 +117,17 @@ extern void __bad_percpu_size(void);
> > > > >  #define __pcpu_reg_imm_4(x) "ri" (x)
> > > > >  #define __pcpu_reg_imm_8(x) "re" (x)
> > > > >
> > > > > -#define percpu_to_op(qual, op, var, val)               \
> > > > > -do {                                                   \
> > > > > -       typedef typeof(var) pto_T__;                    \
> > > > > -       if (0) {                                        \
> > > > > -               pto_T__ pto_tmp__;                      \
> > > > > -               pto_tmp__ = (val);                      \
> > > > > -               (void)pto_tmp__;                        \
> > > > > -       }                                               \
> > > > > -       switch (sizeof(var)) {                          \
> > > > > -       case 1:                                         \
> > > > > -               asm qual (op "b %1,"__percpu_arg(0)     \
> > > > > -                   : "+m" (var)                        \
> > > > > -                   : "qi" ((pto_T__)(val)));           \
> > > > > -               break;                                  \
> > > > > -       case 2:                                         \
> > > > > -               asm qual (op "w %1,"__percpu_arg(0)     \
> > > > > -                   : "+m" (var)                        \
> > > > > -                   : "ri" ((pto_T__)(val)));           \
> > > > > -               break;                                  \
> > > > > -       case 4:                                         \
> > > > > -               asm qual (op "l %1,"__percpu_arg(0)     \
> > > > > -                   : "+m" (var)                        \
> > > > > -                   : "ri" ((pto_T__)(val)));           \
> > > > > -               break;                                  \
> > > > > -       case 8:                                         \
> > > > > -               asm qual (op "q %1,"__percpu_arg(0)     \
> > > > > -                   : "+m" (var)                        \
> > > > > -                   : "re" ((pto_T__)(val)));           \
> > > > > -               break;                                  \
> > > > > -       default: __bad_percpu_size();                   \
> > > > > -       }                                               \
> > > > > +#define percpu_to_op(size, qual, op, _var, _val)                       \
> > > > > +do {                                                                   \
> > > > > +       __pcpu_type_##size pto_val__ = __pcpu_cast_##size(_val);        \
> > > > > +       if (0) {                                                        \
> > > > > +               typeof(_var) pto_tmp__;                                 \
> > > > > +               pto_tmp__ = (_val);                                     \
> > > > > +               (void)pto_tmp__;                                        \
> > > > > +       }                                                               \
> > > >
> > > > Please replace the whole `if (0)` block with:
> > > > ```c
> > > > __same_type(_var, _val);
> > > > ```
> > > > from include/linux/compiler.h.
> > >
> > > The problem with __builtin_types_compatible_p() is that it considers
> > > unsigned long and u64 (aka unsigned long long) as different types even
> > > though they are the same width on x86-64.  While this may be a good
> > > cleanup to look at in the future, it's not a simple drop-in
> > > replacement.
> >
> > Does it trigger errors in this case?
>
> Yes, see boot_init_stack_canary().  That code looks a bit sketchy but
> it's not wrong, for x86-64 at least.
>
> It also doesn't seem to like "void *" compared to any other pointer type:
>
> In function ‘fpregs_deactivate’,
>     inlined from ‘fpu__drop’ at arch/x86/kernel/fpu/core.c:285:3:
> ./include/linux/compiler.h:379:38: error: call to
> ‘__compiletime_assert_317’ declared with attribute error: BUILD_BUG_ON
> failed: !__same_type((fpu_fpregs_owner_ctx), ((void *)0))
>   379 |  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
>       |                                      ^
> ./include/linux/compiler.h:360:4: note: in definition of macro
> ‘__compiletime_assert’
>   360 |    prefix ## suffix();    \
>       |    ^~~~~~
> ./include/linux/compiler.h:379:2: note: in expansion of macro
> ‘_compiletime_assert’
>   379 |  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
>       |  ^~~~~~~~~~~~~~~~~~~
> ./include/linux/build_bug.h:39:37: note: in expansion of macro
> ‘compiletime_assert’
>    39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
>       |                                     ^~~~~~~~~~~~~~~~~~
> ./include/linux/build_bug.h:50:2: note: in expansion of macro ‘BUILD_BUG_ON_MSG’
>    50 |  BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
>       |  ^~~~~~~~~~~~~~~~
> ./arch/x86/include/asm/percpu.h:105:2: note: in expansion of macro
> ‘BUILD_BUG_ON’
>   105 |  BUILD_BUG_ON(!__same_type(_var, _val));    \
>       |  ^~~~~~~~~~~~
> ./arch/x86/include/asm/percpu.h:338:37: note: in expansion of macro
> ‘percpu_to_op’
>   338 | #define this_cpu_write_8(pcp, val)  percpu_to_op(8, volatile,
> "mov", (pcp), val)
>       |                                     ^~~~~~~~~~~~
> ./include/linux/percpu-defs.h:380:11: note: in expansion of macro
> ‘this_cpu_write_8’
>   380 |   case 8: stem##8(variable, __VA_ARGS__);break;  \
>       |           ^~~~
> ./include/linux/percpu-defs.h:508:34: note: in expansion of macro
> ‘__pcpu_size_call’
>   508 | #define this_cpu_write(pcp, val)
> __pcpu_size_call(this_cpu_write_, pcp, val)
>       |                                  ^~~~~~~~~~~~~~~~
> ./arch/x86/include/asm/fpu/internal.h:525:2: note: in expansion of
> macro ‘this_cpu_write’
>   525 |  this_cpu_write(fpu_fpregs_owner_ctx, NULL);
>       |  ^~~~~~~~~~~~~~
>
> >
> > It's interesting to know how this trick differs from
> > __builtin_types_compatible_p().  Might even be helpful to wrap this
> > pattern in a macro with a comment with the pros/cons of this approach
> > vs __same_type.
>
> I think the original code is more to catch a mismatch between pointers
> and integers.  It doesn't seem to care about truncation.
>
> > On the other hand, the use of `long` seems tricky in x86 code as x86
> > (32b) is ILP32 but x86_64 (64b) is LP64.  So the use of `long` is
> > ambiguous in the sense that it's a different size depending on the
> > target ABI.  Wouldn't it potentially be a bug for x86 kernel code to
> > use `long` percpu variables (or rather mix `long` and `long long` in
> > the same operation) in that case, since the sizes of the two would be
> > different for i386?
>
> Not necessarily.  Some things like registers are naturally 32-bit on a
> 32-bit kernel and 64-bit on a 64-bit kernel, so 'long' is appropriate
> there.
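
For anyone following along, the difference between the two checks
discussed above can be reproduced in userspace.  This is only a sketch,
assuming GCC or Clang: same_type() and check_assignable() are
hypothetical stand-ins for the kernel's __same_type() and for the
`if (0)` block in the patch, not the kernel definitions themselves.

```c
/* Hypothetical userspace stand-in for the kernel's __same_type(). */
#define same_type(a, b) \
        __builtin_types_compatible_p(__typeof__(a), __typeof__(b))

/*
 * Hypothetical name for the `if (0)` pattern: the assignment is never
 * executed, but the compiler still type-checks it.
 */
#define check_assignable(var, val)              \
do {                                            \
        if (0) {                                \
                __typeof__(var) tmp__;          \
                tmp__ = (val);                  \
                (void)tmp__;                    \
        }                                       \
} while (0)

/* Distinct integer types: assignable, but not "the same type". */
int ulong_vs_ull(void)
{
        unsigned long a = 0;
        unsigned long long b = 0;

        check_assignable(a, b);
        return same_type(a, b);
}

/* A typed pointer against a void pointer constant: same story. */
int ptr_vs_void_ptr(void)
{
        int *p = 0;

        check_assignable(p, (void *)0);
        return same_type(p, (void *)0);
}

/* Identical types are reported as compatible. */
int ulong_vs_ulong(void)
{
        unsigned long a = 0, b = 0;

        check_assignable(a, b);
        return same_type(a, b);
}
```

The first two functions return 0 and the third returns 1, while the
assignment form compiles cleanly in all three.  Mixing a pointer with a
plain integer in check_assignable() should draw a diagnostic instead
(a warning or an error depending on the compiler version), which matches
the observation that the original code mostly guards against
pointer/integer mismatches.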

Sorry for not getting back to this sooner; it's amazing how fast emails
get buried in an inbox.

Interesting findings.  It feels almost as if a _Static_assert that the
sizes of these types match might be more straightforward, but I don't
need to nitpick pre-existing code that this patch simply carries
forward.  I realized I never signed off on this.
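
To make the _Static_assert idea concrete, here is a rough userspace
sketch, not something this patch needs.  assert_same_size() and the two
functions are hypothetical names made up for illustration, and the
example assumes a C11 compiler.

```c
#include <stdint.h>

/*
 * Hypothetical macro, not from the patch: reject a value whose size
 * differs from the variable's, regardless of the type names involved.
 */
#define assert_same_size(var, val)                              \
        _Static_assert(sizeof(var) == sizeof(val),              \
                       "size mismatch between variable and value")

/*
 * On LP64 glibc targets uint64_t is typically unsigned long, so the two
 * types below are distinct but equally wide; the size check passes.
 */
uint64_t store_canary(unsigned long long canary)
{
        uint64_t slot;

        assert_same_size(slot, canary);
        slot = canary;
        return slot;
}

/* Stand-in for a per-CPU pointer variable being cleared. */
static int *owner_ctx;

int clear_owner(void)
{
        /* All object pointers here have the same size as void *. */
        assert_same_size(owner_ctx, (void *)0);
        owner_ctx = (void *)0;
        return owner_ctx == (void *)0;
}
```

Both cases that tripped __same_type() pass here.  The trade-off is that
a size-only check would also accept a pointer/integer mix on targets
where long and pointers are the same width, which the existing
assignment trick does flag.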

Reviewed-by: Nick Desaulniers <ndesaulniers@...gle.com>

It looks like there's still an outstanding issue with patch 4/7?
https://lore.kernel.org/lkml/CAKwvOdnVU3kZnGzkYjEFJWMPuVjOmAHuRSB8FJ-Ks+FWzX2M_Q@mail.gmail.com/
-- 
Thanks,
~Nick Desaulniers