Message-ID: <YHCfgHwDtT7m4ffq@hirez.programming.kicks-ass.net>
Date:   Fri, 9 Apr 2021 20:40:00 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     David Malcolm <dmalcolm@...hat.com>
Cc:     Ard Biesheuvel <ardb@...nel.org>, linux-toolchains@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Jason Baron <jbaron@...mai.com>,
        "Steven Rostedt (VMware)" <rostedt@...dmis.org>
Subject: Re: static_branch/jump_label vs branch merging

On Fri, Apr 09, 2021 at 09:48:33AM -0400, David Malcolm wrote:
> You tried __pure on arch_static_branch; did you try it on
> static_branch_unlikely?

static_branch_unlikely() is a CPP macro that expands to a statement
expression or, as with the later patch, a _Generic(). I'm not sure how
to apply the attribute to either of them, since it is a function
attribute.

I was hoping the attribute would percolate through, so to speak.
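
To illustrate the problem, here is a minimal sketch in GNU C. Everything
named demo_* is hypothetical and is not the kernel's actual definition: a
statement-expression macro has no declaration for a function attribute to
attach to, so the attribute can only sit on a function that wraps it.
Whether the attribute then survives inlining around an asm goto is exactly
what is unclear here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a static key; NOT the kernel's struct static_key. */
struct demo_key {
	bool enabled;
};

/*
 * Statement-expression macro (GNU C extension). There is no function
 * declaration here, so a function attribute has nowhere to attach.
 */
#define demo_branch_unlikely(k)					\
	({							\
		bool v_ = (k)->enabled;				\
		(bool)__builtin_expect(v_, 0);			\
	})

/*
 * Wrapping the macro in a function gives the attribute a declaration to
 * sit on. 'pure' means: no side effects, may read memory.
 */
__attribute__((pure)) static inline bool demo_key_on(const struct demo_key *k)
{
	return demo_branch_unlikely(k);
}
```

Callers would then test demo_key_on(&key) instead of using the macro
directly.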

> With the caveat that my knowledge of GCC's middle-end is mostly about
> implementing warnings, rather than optimization, I did some
> experimentation, with gcc trunk on x86_64 FWIW.
> 
> Given:
> 
> int __attribute__((pure)) foo(void);
> 
> int t(void)
> {
>   int a = 0;
>   if (foo())
>     a++;
>   if (foo())
>     a++;
>   if (foo())
>     a++;
>   return a;
> }
> 
> At -O1 and above this is optimized to a single call to foo, returning 0
> or 3 accordingly.
> 
> -fdump-tree-all shows that it's the "fre1" pass that eliminates the
> subsequent calls to foo, replacing them with reuses of the result of
> the first call.
> 
> This is in gcc/tree-ssa-sccvn.c, a value-numbering pass.
> 
> I think you want to somehow "teach" the compiler that:
>   static_branch_unlikely(&sched_schedstats)
> is "pure-ish", that for some portion of the surrounding code that you
> want the result to be treated as pure - though I suspect compiler
> maintainers with more experience than me are thinking "but which
> portion? what is it safe to assume, and what will users be annoyed
> about if we optimize away? what if t itself is inlined somewhere?" and
> similar concerns.

Right, pure or even const. As to the scope: as wide as possible. It
literally is a global constant; the value returned is the same
everywhere.

All we need GCC to do for the static_branch construct is to emit both
branches; that is, it must not treat the result as a constant and elide
the other branches. But it can consider consecutive calls (as far and
wide as it wants) to return the same value.
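
As a rough sketch of the transformation being asked for (plain C, with a
variable read standing in for the patched jump label; all demo_* names are
hypothetical): two consecutive tests of the same key should be collapsible
into one test, with both outcomes still emitted.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the patched branch. In the kernel this would
 * be code patching, not a variable read.
 */
static bool demo_stats_enabled;

void demo_set_stats(bool on)
{
	demo_stats_enabled = on;
}

/* 'pure': no side effects, result depends only on readable memory. */
__attribute__((pure)) static bool demo_stats_on(void)
{
	return demo_stats_enabled;
}

/* As written in the source: two consecutive tests of the same key. */
int demo_unmerged(int x)
{
	if (demo_stats_on())
		x += 1;
	if (demo_stats_on())
		x += 2;
	return x;
}

/* The desired result: a single test, both outcomes still present. */
int demo_merged(int x)
{
	if (demo_stats_on())
		return x + 3;
	return x;
}
```

The two functions return the same value for any input and either key
state; the second is simply what the first should compile to.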

> Or maybe the asm stmt itself could somehow be marked as pure??? (with
> similar concerns about semantics as above)

Yeah, not sure; someone with more clue will have to tell us what, if
anything, is required beyond marking it either pure or const. Perhaps
that attribute is sufficient and the compiler just isn't optimizing for
an unrelated reason.

Regards,

Peter
