Message-ID: <CAHk-=wiLyA0g3BvQ_nsF2PWi-FDtcNS5+4-ai1FX-xFzTBeTzg@mail.gmail.com>
Date:   Wed, 11 Oct 2023 12:51:56 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Uros Bizjak <ubizjak@...il.com>
Cc:     x86@...nel.org, linux-kernel@...r.kernel.org,
        Nadav Amit <namit@...are.com>,
        Andy Lutomirski <luto@...nel.org>,
        Brian Gerst <brgerst@...il.com>,
        Denys Vlasenko <dvlasenk@...hat.com>,
        "H . Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Josh Poimboeuf <jpoimboe@...hat.com>
Subject: Re: [PATCH v2 -tip] x86/percpu: Use C for arch_raw_cpu_ptr()

On Wed, 11 Oct 2023 at 11:42, Uros Bizjak <ubizjak@...il.com> wrote:
>
> The attached patch was tested on a target with fsgsbase CPUID and
> without it. It works!

.. I should clearly read all my emails before answering some of them.

Yes, that patch looks good to me, and I'm happy to hear that you
actually tested it unlike my "maybe something like this".

> The patch improves amd_pmu_enable_virt() in the same way as reported
> in the original patch submission and also reduces the number of percpu
> offset reads (either from this_cpu_off or with rdgsbase) from 1663 to
> 1571.

Do you have any actual performance numbers? The patch looks good to
me, and I *think* rdgsbase ends up being faster in practice due to
avoiding a memory access, but that's very much a gut feel.

> The only drawback is a larger binary size:
>
>     text    data    bss      dec     hex filename
> 25546594 4387686 808452 30742732 1d518cc vmlinux-new.o
> 25515256 4387814 808452 30711522 1d49ee2 vmlinux-old.o
>
> that increases by 31k (0.123%), probably due to 1578 rdgsbase alternatives.

I'm actually surprised that it increases the text size. The 'rdgsbase'
instruction should be smaller than a 'mov %gs', so I would have
expected the *data* size to increase due to the alternatives tables,
but not the text size.

[ Looks around ]

Oh. It's because we put the altinstructions into the text section.
That's kind of silly, but whatever.

So I think that increase in text-size is not "real" - yes, it
increases our binary size because we obviously have two instructions,
but the actual *executable* part likely stays the same, and it's just
that we grow the altinstruction metadata.
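
As a rough sanity check of the quoted numbers, here is a minimal sketch in
plain C. The helper names are made up for illustration and are not kernel
code; the only inputs are the 'text' column values and the "1578
alternatives" count from the quoted size output. The per-site figure is
plain arithmetic and assumes the whole increase comes from those sites,
which is only the guess made above.

```c
/*
 * Hypothetical helpers (not from the kernel tree): simple arithmetic on
 * the 'text' column of two size(1) reports.
 */

/* Absolute growth of the text section, in bytes. */
static long text_delta(long new_text, long old_text)
{
	return new_text - old_text;
}

/* Relative growth of the text section, as a percentage of the old size. */
static double text_growth_percent(long new_text, long old_text)
{
	return 100.0 * (double)(new_text - old_text) / (double)old_text;
}

/*
 * Average growth per patched site, ASSUMING the whole delta is caused by
 * the given number of alternatives (an assumption, not a measured fact).
 */
static double bytes_per_site(long delta, long nr_sites)
{
	return (double)delta / (double)nr_sites;
}
```

Feeding in 25546594 and 25515256 gives a delta of 31338 bytes, which matches
the "31k (0.123%)" above; dividing by 1578 gives a bit under 20 bytes per
site on average.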

                  Linus
