Date:   Fri, 13 Oct 2023 14:02:26 -0700
From:   Sean Christopherson <seanjc@...gle.com>
To:     Uros Bizjak <ubizjak@...il.com>
Cc:     x86@...nel.org, linux-kernel@...r.kernel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Nadav Amit <namit@...are.com>, Ingo Molnar <mingo@...nel.org>,
        Andy Lutomirski <luto@...nel.org>,
        Brian Gerst <brgerst@...il.com>,
        Denys Vlasenko <dvlasenk@...hat.com>,
        "H . Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Josh Poimboeuf <jpoimboe@...hat.com>
Subject: Re: [PATCH tip] x86/percpu: Rewrite arch_raw_cpu_ptr()

On Fri, Oct 13, 2023, Uros Bizjak wrote:
> On Fri, Oct 13, 2023 at 6:04 PM Sean Christopherson <seanjc@...gle.com> wrote:
> >
> > On Wed, Oct 11, 2023, Uros Bizjak wrote:
> > > Additionally, the patch introduces 'rdgsbase' alternative for CPUs with
> > > X86_FEATURE_FSGSBASE. The rdgsbase instruction *probably* will end up
> > > only decoding in the first decoder etc. But we're talking single-cycle
> > > kind of effects, and the rdgsbase case should be much better from
> > > a cache perspective and might use fewer memory pipeline resources to
> > > offset the fact that it uses an unusual front end decoder resource...
> >
> > The switch to RDGSBASE should be a separate patch, and should come with actual
> > performance numbers.
> 
> This *is* the patch to switch to RDGSBASE. The propagation of
> arguments is a nice side effect of the patch, due to the explicit
> addition of the offset addend to the %gs base. This patch is
> an alternative implementation of [1]
> 
> [1] x86/percpu: Use C for arch_raw_cpu_ptr(),
> https://lore.kernel.org/lkml/20231010164234.140750-1-ubizjak@gmail.com/

Me confused, can't you first switch to MOV with tcp_ptr__ += (unsigned long)(ptr),
and then introduce the RDGSBASE alternative?
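
To make the two-step split concrete, here is a minimal userspace sketch of the
first step only. It is a model, not the kernel macro: this_cpu_off_model and
raw_cpu_ptr_model() are hypothetical names, and a plain load from a global stands
in for the %gs-relative MOV of this_cpu_off. The point it illustrates is that the
offset is added in C, where the compiler can see and fold it.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the per-CPU base. In the kernel this value
 * would be read with a %gs-relative MOV; a plain global is used here so
 * the sketch builds and runs in userspace.
 */
static uintptr_t this_cpu_off_model;

/*
 * Step 1 of the split: load the base once, then add the per-CPU offset
 * in C instead of inside an asm statement. Because the addition is
 * ordinary C, the compiler is free to combine it with later address
 * arithmetic.
 */
static inline void *raw_cpu_ptr_model(const void *ptr)
{
	uintptr_t tcp_ptr__ = this_cpu_off_model;	/* models the MOV */

	tcp_ptr__ += (uintptr_t)ptr;			/* addend visible to the compiler */
	return (void *)tcp_ptr__;
}
```

With the base set to the start of a buffer, passing an offset of 16 yields the
address of byte 16 in that buffer. A follow-up change could then swap only the
load of the base for an RDGSBASE alternative, leaving the C addition untouched.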

> Unfortunately, I have no idea how to measure the impact of such a
> low-level feature, so I'll at least need some guidance. The "gut
> feeling" says that a special instruction, intended to support the
> feature, is always better than emulating said feature with a memory
> access.

AIUI, {RD,WR}{FS,GS}BASE were added as faster alternatives to {RD,WR}MSR, not to
accelerate actual accesses to per-CPU data, TLS, etc.  E.g. loading a 64-bit base
via a MOV to FS/GS is impossible.  And presumably saving a userspace-controlled
base by actually accessing FS/GS is dangerous for one reason or another.

The instructions are guarded by a CR4 bit, and the ucode cost just to check
CR4.FSGSBASE is probably non-trivial.
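
As a rough model of that guard (a sketch of the architectural conditions, not of
what the microcode actually does or costs), the check can be written as a pure
function over the two relevant bits. The helper name rdgsbase_would_fault() is
hypothetical; the bit positions are CPUID.(EAX=7,ECX=0):EBX[0] for feature
enumeration and CR4 bit 16 for the enable.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_7_0_EBX_FSGSBASE	(UINT32_C(1) << 0)	/* CPUID.(EAX=7,ECX=0):EBX[0] */
#define CR4_FSGSBASE		(UINT64_C(1) << 16)	/* CR4.FSGSBASE */

/*
 * Hypothetical helper: returns true if RDGSBASE would raise #UD given
 * the CPUID leaf 7 EBX value and the current CR4 contents. The 64-bit
 * mode requirement is not modeled here.
 */
static inline bool rdgsbase_would_fault(uint32_t cpuid7_ebx, uint64_t cr4)
{
	bool supported = (cpuid7_ebx & CPUID_7_0_EBX_FSGSBASE) != 0;
	bool enabled = (cr4 & CR4_FSGSBASE) != 0;

	return !(supported && enabled);
}
```

The CR4 half of this check has to happen on every execution of the instruction,
which is the part whose cost only real measurements can settle.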
