Date:   Wed, 15 Nov 2023 22:26:30 -0500
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     David Howells <dhowells@...hat.com>
Cc:     Borislav Petkov <bp@...en8.de>,
        kernel test robot <oliver.sang@...el.com>,
        oe-lkp@...ts.linux.dev, lkp@...el.com,
        linux-kernel@...r.kernel.org,
        Christian Brauner <brauner@...nel.org>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>,
        Christian Brauner <christian@...uner.io>,
        Matthew Wilcox <willy@...radead.org>,
        David Laight <David.Laight@...lab.com>, ying.huang@...el.com,
        feng.tang@...el.com, fengwei.yin@...el.com
Subject: Re: [linus:master] [iov_iter] c9eec08bac: vm-scalability.throughput
 -16.9% regression

On Wed, 15 Nov 2023 at 18:00, David Howells <dhowells@...hat.com> wrote:
>
> And using __memcpy() rather than memcpy():

Yeah, that's just sad. It might indeed be that you're running on a
Haswell core, and the retpoline overhead just kills that entirely. You
could try building the kernel without mitigations (or booting with
them off, which isn't quite as good) to verify.

> A disassembly of _copy_from_iter() for the latter is attached.  Note that the
> UBUF/IOVEC still uses "rep movsb"

Well, yes and no.

User copies do that X86_FEATURE_FSRM alternatives dance, so the code
gets generated with "rep movs", but you'll note that there are several
'nops' after it.

Some of the nops are because we'll be inserting STAC/CLAC (three bytes
each, I think) instructions around user accesses for SMAP-capable
CPU's.

But some of the nops are because we'll be rewriting that "rep movsb"
(two bytes, iirc) as "call rep_movs_alternative" (5 bytes) on CPU's
that don't do FSRM like yours. So your CPU won't actually be executing
that 'rep movsb' sequence.
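
To make the byte counting concrete, here is a rough user-space sketch of
what that rewrite amounts to. It is an illustration only: the names
patch_copy_site and site_template are made up for this example, the
padding is shown as plain one-byte nops, and the real kernel does this
through its alternatives machinery rather than anything like this.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SITE_LEN 5	/* enough room for a 5-byte "call rel32" */

/*
 * What the build emits at the copy site: "rep movsb" (f3 a4) followed
 * by three bytes of nop padding (assumed to be 0x90 here).
 */
static const uint8_t site_template[SITE_LEN] = {
	0xf3, 0xa4, 0x90, 0x90, 0x90
};

/*
 * Hypothetical stand-in for the boot-time patching step, not a kernel
 * API. With FSRM the inline "rep movsb" stays; without it the whole
 * site becomes "call rel32" (opcode e8 plus a 32-bit displacement).
 */
static void patch_copy_site(uint8_t site[SITE_LEN], int cpu_has_fsrm,
			    int32_t rel32)
{
	assert(sizeof(rel32) == SITE_LEN - 1);

	memcpy(site, site_template, SITE_LEN);
	if (cpu_has_fsrm)
		return;

	site[0] = 0xe8;				/* call rel32 */
	memcpy(&site[1], &rel32, sizeof(rel32));	/* displacement */
}
```

Calling patch_copy_site(site, 0, rel) models a CPU without FSRM: the
two-byte "rep movsb" and its padding are gone, which is why a
disassembly of the unpatched object still shows an instruction that CPU
never executes.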

And yes, the '__x86_return_thunk' overhead can be pretty horrific. It
will get rewritten to the appropriate thing by "apply_returns". But
like the "rep movs" and the missing STAC/CLAC, you won't see that in
the objdump, you only see it in the code as patched at boot.

                    Linus
