Date:   Sun, 6 Jun 2021 14:41:50 -0400
From:   Alan Stern <stern@...land.harvard.edu>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Segher Boessenkool <segher@...nel.crashing.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Will Deacon <will@...nel.org>,
        Andrea Parri <parri.andrea@...il.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Nick Piggin <npiggin@...il.com>,
        David Howells <dhowells@...hat.com>,
        Jade Alglave <j.alglave@....ac.uk>,
        Luc Maranget <luc.maranget@...ia.fr>,
        Akira Yokosawa <akiyks@...il.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-toolchains@...r.kernel.org,
        linux-arch <linux-arch@...r.kernel.org>
Subject: Re: [RFC] LKMM: Add volatile_if()

On Sat, Jun 05, 2021 at 08:41:00PM -0700, Linus Torvalds wrote:
> On Sat, Jun 5, 2021 at 6:29 PM Alan Stern <stern@...land.harvard.edu> wrote:
> >
> > Interesting.  And changing one of the branches from barrier() to __asm__
> > __volatile__("nop": : :"memory") also causes a branch to be emitted.  So
> > even though the compiler doesn't "look inside" assembly code, it does
> > compare two pieces at least textually and apparently assumes if they are
> > identical then they do the same thing.
> 
> That's actually a feature in some cases, ie the ability to do CSE on
> asm statements (ie the "always has the same output" optimization that
> the docs talk about).
> 
> So gcc has always looked at the asm string for that reason, afaik.
> 
> I think it's something of a bug when it comes to "asm volatile", but
> the documentation isn't exactly super-specific.
> 
> There is a statement of "Under certain circumstances, GCC may
> duplicate (or remove duplicates of) your assembly code when
> optimizing" and a suggestion of using "%=" to generate a unique
> instance of an asm.
> 
> Which might actually be a good idea for "barrier()", just in case.
> However, the problem with that is that I don't think we are guaranteed
> to have a universal comment character for asm statements.
> 
> IOW, it might be a good idea to do something like
> 
>    #define barrier() \
>         __asm__ __volatile__("# barrier %=": : :"memory")
> 
> but I'm not 100% convinced that '#' is always a comment in asm code,
> so the above might not actually build everywhere.
> 
> However, *testing* the above (in my config, where '#' does work as a
> comment character) shows that gcc doesn't actually consider them to be
> distinct EVEN THEN, and will still merge two barrier statements.
> 
> That's distressing.
> 
> So the gcc docs are actively wrong, and %= does nothing - it will
> still compare as the exact same inline asm, because the string
> equality testing is apparently done before any expansion.
> 
> Something like this *does* seem to work:
> 
>    #define ____barrier(id) __asm__ __volatile__("#" #id: : :"memory")
>    #define __barrier(id) ____barrier(id)
>    #define barrier() __barrier(__COUNTER__)
> 
> which is "interesting" or "disgusting" depending on how you happen to feel.
> 
> And again - the above works only as long as "#" is a valid comment
> character in the assembler. And I have this very dim memory of us
> having comments in inline asm, and it breaking certain configurations
> (for when the assembler that the compiler uses is a special
> human-unfriendly one that only accepts compiler output).
> 
> You could make even more disgusting hacks, and have it generate something like
> 
>     .pushsection .discard.barrier
>     .long #id
>     .popsection
> 
> instead of a comment. We already expect that to work and have generic
> inline asm cases that generate code like that.
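
As a rough illustration of that last suggestion, here is a minimal sketch of
how the .pushsection variant could be spelled out as a macro. It is only a
sketch under assumptions: an ELF target with a GNU as-compatible assembler,
and a made-up function name (two_barriers) for the test. In an ordinary
userspace build the section is simply emitted and left unused rather than
discarded.

```c
/*
 * Sketch only. Assumes GCC or Clang, an ELF target, and an assembler
 * that understands .pushsection/.popsection.
 * Each barrier() expansion gets a different __COUNTER__ value, so the
 * asm strings differ textually between call sites.
 */
#define ____barrier(id) \
    __asm__ __volatile__(".pushsection .discard.barrier\n\t" \
                         ".long " #id "\n\t" \
                         ".popsection" : : : "memory")
#define __barrier(id) ____barrier(id)   /* extra level so __COUNTER__ expands */
#define barrier() __barrier(__COUNTER__)

int x, y;

/* Hypothetical test function: both arms carry their own barrier. */
int two_barriers(void)
{
    if (x) {
        barrier();
        y = 1;
    } else {
        barrier();
        y = 1;
    }
    return y;
}
```

Whether the two arms are kept separate still has to be checked in the
generated assembly (for example with gcc -O2 -S); the sketch only shows
that the strings are distinct.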

I tried the experiment with this code:

#define READ_ONCE(x) (*(volatile typeof(x) *)&(x))
#define WRITE_ONCE(x, val) (READ_ONCE(x) = (val))
#define barrier() __asm__ __volatile__("": : :"memory")

int x, y;

int main(int argc, char *argv[])
{
    if (READ_ONCE(x)) {
        barrier();
        y = 1;
    } else {
        y = 1;
    }
    return 0;
}

The output from gcc -O2 is:

main:
        mov     eax, DWORD PTR x[rip]
        test    eax, eax
        je      .L2
.L2:
        mov     DWORD PTR y[rip], 1

The output from clang is essentially the same (the mov and test are 
replaced by a cmp).

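This does what we want (the test and the conditional branch are still
emitted, even though the branch target is the very next instruction), but
I wouldn't bet against a future optimization pass getting rid of the
"useless" test and branch.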
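
For anyone who wants to repeat the experiment with the __COUNTER__ version
of barrier() quoted above in both arms of the if, a minimal sketch follows.
It assumes GCC or Clang on x86-64, where '#' starts an assembler comment.
The function name check_x() is made up for illustration, and __typeof__ is
used instead of typeof so that it also builds in strict ISO modes. Whether
the compiler keeps the test and branch has to be checked by reading the
gcc -O2 -S output; nothing in the code guarantees it.

```c
/*
 * Sketch only. Assumes GCC or Clang on a target whose assembler treats
 * '#' as a comment character (true for x86-64 with GNU as).
 */
#define READ_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Each expansion embeds a different number in the asm string. */
#define ____barrier(id) __asm__ __volatile__("#" #id: : :"memory")
#define __barrier(id) ____barrier(id)
#define barrier() __barrier(__COUNTER__)

int x, y;

/* Hypothetical test function: same store in both arms, distinct barriers. */
int check_x(void)
{
    if (READ_ONCE(x)) {
        barrier();
        y = 1;
    } else {
        barrier();
        y = 1;
    }
    return y;
}
```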

Alan
