lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 09 Apr 2021 09:48:33 -0400
From:   David Malcolm <dmalcolm@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Ard Biesheuvel <ardb@...nel.org>, linux-toolchains@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Jason Baron <jbaron@...mai.com>,
        "Steven Rostedt (VMware)" <rostedt@...dmis.org>
Subject: Re: static_branch/jump_label vs branch merging

On Fri, 2021-04-09 at 15:13 +0200, Peter Zijlstra wrote:
> On Fri, Apr 09, 2021 at 03:01:50PM +0200, Peter Zijlstra wrote:
> > On Fri, Apr 09, 2021 at 02:03:46PM +0200, Peter Zijlstra wrote:
> > > On Fri, Apr 09, 2021 at 07:55:42AM -0400, David Malcolm wrote:
> > 
> > > > Sorry if this is a dumb question, but does the function
> > > > attribute:
> > > >   __attribute__ ((pure)) 
> > > > help here?  It's meant to allow multiple calls to a predicate
> > > > to be
> > > > merged - though I'd be nervous of using it here, the predicate
> > > > isn't
> > > > 100% pure, since AIUI the whole point of what you've built is
> > > > for
> > > > predicates that very rarely change - but can change
> > > > occasionally.
> > > 
> > > I actually tried that, but it doesn't seem to work. Given the
> > > function
> > > arguments are all compile time constants it should DTRT AFAICT,
> > > but
> > > alas.
> > 
> > FWIW, I tried the below patch and GCC-10.2.1 on current tip/master.
> 
> I also just tried __attribute__((__const__)), which is stronger still
> than __pure__ and that's also not working :/
> 
> I then also tried to replace the __buildin_types_compatible_p() magic
> in
> static_branch_unlikely() with _Generic(), but still no joy.
> 

[..snip...]

You tried __pure on arch_static_branch; did you try it on
static_branch_unlikely?

With the caveat that my knowledge of GCC's middle-end is mostly about
implementing warnings, rather than optimization, I did some
experimentation, with gcc trunk on x86_64 FWIW.

Given:

int __attribute__((pure)) foo(void);

int t(void)
{
  int a;
  if (foo())
    a++;
  if (foo())
    a++;
  if (foo())
    a++;
  return a;
}

At -O1 and above this is optimized to a single call to foo, returning 0
or 3 accordingly.

-fdump-tree-all shows that it's the "fre1" pass that eliminates the
subsequent calls to foo, replacing them with reuses of the result of
the first call.

This is in gcc/tree-ssa-sccvn.c, a value-numbering pass.

I think you want to somehow "teach" the compiler that:
  static_branch_unlikely(&sched_schedstats)
is "pure-ish", that for some portion of the surrounding code that you
want the result to be treated as pure - though I suspect compiler
maintainers with more experience than me are thinking "but which
portion? what is it safe to assume, and what will users be annoyed
about if we optimize away? what if t itself is inlined somewhere?" and
similar concerns.

Or maybe the asm stmt itself could somehow be marked as pure??? (with
similar concerns about semantics as above)

Hope this is constructive
Dave


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ