lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAFULd4aTY002A7NHRCX21aTpYOE=tnpouBk6hkoeWND=LnT4ww@mail.gmail.com>
Date:   Wed, 18 Oct 2023 22:51:59 +0200
From:   Uros Bizjak <ubizjak@...il.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Nadav Amit <namit@...are.com>,
        "the arch/x86 maintainers" <x86@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Andy Lutomirski <luto@...nel.org>,
        Brian Gerst <brgerst@...il.com>,
        Denys Vlasenko <dvlasenk@...hat.com>,
        "H . Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Nick Desaulniers <ndesaulniers@...gle.com>
Subject: Re: [PATCH v2 -tip] x86/percpu: Use C for arch_raw_cpu_ptr()

On Wed, Oct 18, 2023 at 10:34 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> On Wed, 18 Oct 2023 at 13:22, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > And yes, sometimes we use actual volatile accesses for them
> > (READ_ONCE() and WRITE_ONCE()) but those are *horrendous* in general,
> > and are much too strict. Not only does gcc generally lose its mind
> > when it sees volatile (ie it stops doing various sane combinations
> > that would actually be perfectly valid), but it obviously also stops
> > doing CSE on the loads (as it has to).
>
> Note, in case you wonder what I mean by "lose its mind", try this
> (extremely stupid) test program:
>
>     void a(volatile int *i) { ++*i; }
>     void b(int *i) { ++*i; }
>
> and note that the non-volatile version does
>
>     addl $1, (%rdi)
>
> but the volatile version then refuses to combine the read+write into a
> rmw instruction, and generates
>
>     movl (%rdi), %eax
>     addl $1, %eax
>     movl %eax, (%rdi)
>
> instead.
>
> Sure, it's correct, but it's an example of how 'volatile' ends up
> disabling a lot of other optimizations than just the "don't remove the
> access".
>
> Doing the volatile as one rmw instruction would still have been very
> obviously valid - it's still doing a read and a write. You don't need
> two instructions for that.

FYI: This is the reason RMW instructions in percpu.h are not (blindly)
converted to C ops.  They will remain in their (volatile or not) asm
form because of the above reason, and due to the fact that they don't
combine with anything.

> I'm not complaining, and I understand *why* it happens - compiler
> writers very understandably go "oh, I'm not touching that".
>
> I'm just trying to point out that volatile really screws up code
> generation even aside from the "access _exactly_ once" issue.
>
> So using inline asm and relying on gcc doing (minimal) CSE will then
> generate better code than volatile ever could, even when we just use a
> simple 'mov" instruction. At least you get that basic combining
> effect, even if it's not great.

Actually, RMW insns are better written in asm, while simple "mov"
should be converted to (volatile or not) memory access. On x86 "mov"s
from memory (reads) will combine nicely with almost all other
instructions.

> And for memory ops, *not* using volatile is dangerous when they aren't stable.

True.

Uros.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ