Message-ID: <Y8blowOi0AfbdUbF@hirez.programming.kicks-ass.net>
Date: Tue, 17 Jan 2023 19:14:59 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Andy Lutomirski <luto@...nel.org>, x86@...nel.org,
Kostya Serebryany <kcc@...gle.com>,
Andrey Ryabinin <ryabinin.a.a@...il.com>,
Andrey Konovalov <andreyknvl@...il.com>,
Alexander Potapenko <glider@...gle.com>,
Taras Madan <tarasmadan@...gle.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
"H . J . Lu" <hjl.tools@...il.com>,
Andi Kleen <ak@...ux.intel.com>,
Rick Edgecombe <rick.p.edgecombe@...el.com>,
Bharata B Rao <bharata@....com>,
Jacob Pan <jacob.jun.pan@...ux.intel.com>,
Ashok Raj <ashok.raj@...el.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
Sami Tolvanen <samitolvanen@...gle.com>,
ndesaulniers@...gle.com, joao@...rdrivepizza.com
Subject: Re: [PATCHv14 08/17] x86/mm: Reduce untagged_addr() overhead until
the first LAM user
On Tue, Jan 17, 2023 at 09:18:01AM -0800, Linus Torvalds wrote:
> The reason clang seems to generate saner code is that clang seems to
> largely ignore the whole "__builtin_expect()", at least not to the
> point where it tries to make the unlikely case be out-of-line.
So in this case there is only a 'likely' hint; we're explicitly trying
to keep the thing in-line so we can jump over it.
It is GCC that generated an implicit else (and marked it 'unlikely' --
which we didn't ask for), but worse, it failed to spot that the else case
is in fact shared with the normal case, so it could've simply hoisted that
mov instruction above the jump.
That is, instead of this:
0003 23b3: eb 76 jmp 242b <write_ok_or_segv+0x7b>
0005 23b5: 65 48 8b 0d 00 00 00 00 mov %gs:0x0(%rip),%rcx # 23bd <write_ok_or_segv+0xd> 23b9: R_X86_64_PC32 tlbstate_untag_mask-0x4
000d 23bd: 48 89 f8 mov %rdi,%rax
0010 23c0: 48 c1 f8 3f sar $0x3f,%rax
0014 23c4: 48 09 c8 or %rcx,%rax
0017 23c7: 48 21 f8 and %rdi,%rax
001a 23ca: 48 b9 00 f0 ff ff ff 7f 00 00 movabs $0x7ffffffff000,%rcx
007b 242b: 48 89 f8 mov %rdi,%rax
007e 242e: eb 9a jmp 23ca <write_ok_or_segv+0x1a>
It could've just done:
0003 48 89 f8 mov %rdi,%rax
0006 eb 76 jmp +18
0008 65 48 8b 0d 00 00 00 00 mov %gs:0x0(%rip),%rcx # 23bd <write_ok_or_segv+0xd> 23b9: R_X86_64_PC32 tlbstate_untag_mask-0x4
0010 48 c1 f8 3f sar $0x3f,%rax
0014 48 09 c8 or %rcx,%rax
0017 48 21 f8 and %rdi,%rax
001a 48 b9 00 f0 ff ff ff 7f 00 00 movabs $0x7ffffffff000,%rcx
and everything would've been good. In every case where I've seen it do
this, it was the same: a silly mov placed out of line that is also
part of the regular branch.
That is, I like __builtin_expect() to be a strong hint. If I don't want
things out of line, I shouldn't have put unlikely on it. What I don't
like is that implicit else branches get the opposite strong hint.
What I like even less is that it found it needed that else branch at
all.
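
For anyone wanting to poke at this outside the kernel, here is a minimal
userspace sketch of the source shape being discussed: a single 'likely'
hint on the in-line path and no explicit else. The names untag_sketch,
lam_enabled and untag_mask are hypothetical stand-ins, not the kernel's
untagged_addr() or its per-CPU mask; the arithmetic simply mirrors the
sar/or/and sequence in the listings above.

```c
#include <stdint.h>

#define likely(x) __builtin_expect(!!(x), 1)

/* Hypothetical stand-ins for illustration; not kernel symbols. */
static int lam_enabled;
static uint64_t untag_mask = ~0ULL;

static uint64_t untag_sketch(uint64_t addr)
{
	if (likely(lam_enabled)) {
		/*
		 * sar $0x3f: all ones for addresses with bit 63 set,
		 * zero otherwise. Relies on GCC/clang doing an
		 * arithmetic right shift on signed values.
		 */
		uint64_t sign = (uint64_t)((int64_t)addr >> 63);

		/* or mask, and addr: bit-63 addresses pass through unmasked. */
		addr &= untag_mask | sign;
	}
	/* Implicit else: addr is returned as-is. */
	return addr;
}
```

Building this with something like "gcc -O2 -S" and comparing against
clang is one way to look at where each compiler puts the not-taken
path; whether a given compiler version reproduces the out-of-line mov
is not something this sketch guarantees.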