lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wjfmmYPw0wX1BW6gi_KAhdZi+81or024JFcRYHiQh-jpw@mail.gmail.com>
Date:   Tue, 17 Jan 2023 10:33:41 -0800
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Nick Desaulniers <ndesaulniers@...gle.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Andy Lutomirski <luto@...nel.org>, x86@...nel.org,
        Kostya Serebryany <kcc@...gle.com>,
        Andrey Ryabinin <ryabinin.a.a@...il.com>,
        Andrey Konovalov <andreyknvl@...il.com>,
        Alexander Potapenko <glider@...gle.com>,
        Taras Madan <tarasmadan@...gle.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        "H . J . Lu" <hjl.tools@...il.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Rick Edgecombe <rick.p.edgecombe@...el.com>,
        Bharata B Rao <bharata@....com>,
        Jacob Pan <jacob.jun.pan@...ux.intel.com>,
        Ashok Raj <ashok.raj@...el.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        Sami Tolvanen <samitolvanen@...gle.com>,
        joao@...rdrivepizza.com
Subject: Re: [PATCHv14 08/17] x86/mm: Reduce untagged_addr() overhead until
 the first LAM user

On Tue, Jan 17, 2023 at 10:26 AM Nick Desaulniers
<ndesaulniers@...gle.com> wrote:
>
> On Tue, Jan 17, 2023 at 9:29 AM Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > Side note: that's not something new or unusual. It's been the case
> > since I started testing clang - we have several code-paths where we
> > use "unlikely()" to try to get very unlikely cases to be out-of-line,
> > and clang just mostly ignores it, or treats it as a very weak hint. I
> > think the only way to get clang to treat it as a *strong* hint is to
> > use PGO.
>
> I'd be surprised if that were intentional or by design.
>
> Do you guys have a bug report we could look at?

Heh. I actually sent you an example long ago. Let me go fish it out of
my mail archives and quote some of it below so that you can find it in
yours..

              Linus

[ Time passes. Found this email to you and Bill Wendling from Feb 16,
2020, Message-ID
CAHk-=wigVshsByCMjkUiZyQSR5N5zi2aAeQc+VJCzQV=nm8E7g@...l.gmail.com ]:

  Anyway, I'm looking at clang code generation, and comparing it with
  gcc on one of my "this has been optimized to hell and back" functions:
  __d_lookup_rcu().

  It looks like clang does frame pointers, and ignores our
  likely/unlikely annotations.

  So this code:

                if (unlikely(parent->d_flags & DCACHE_OP_COMPARE)) {
                        int tlen;
                        const char *tname;
                        ......

  doesn't actually jump out of line, but instead generates the unlikely
  case as the fallthrough:

        testb   $2, (%r12)
        je      .LBB50_9
        ... unlikely code goes here...

  and then the likely case ends up having unfortunate reloads inside the
  hot loop. Possibly because it has one fewer free registers than gcc
  because of the frame pointer.

  I didn't look into _why_ clang generates frame pointers but gcc
  doesn't. It may be just a compiler default, I think we don't end up
  explicitly asking either way.

[ And then Bill replied with this ]

  It's not a no-op. We add branch probabilities to the IR, whether
  they're honored or not depends on the transformation. But they
  *should* be honored when available. I've seen in the past that instead
  of moving unlikely blocks out of the way (like gcc, which moves them
  below the function's "ret" instruction), LLVM will do something like
  this:

    <normal code>
    <jmp to loop conditional test>
        <unlikely code>
        <more likely code>
    <loop conditional test>
    <...>

  I.e. the loop is rotated and the unlikely code is first and the hotter
  code is closer together but between the unlikely and conditional test.
  Could this be what's going on? Otherwise, maybe clang decided that
  it's not beneficial to move the code out-of-line because the benefit
  was minimal? (I'm guessing here.)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ