linux-kernel - Re: [linus:master] [iov_iter] c9eec08bac: vm-scalability.throughput -16.9% regression


Message-ID: <CAHk-=wiRQHD5xnB8H9Lwk9fJPDpfVNAwPS4KLnfrcrU3zbMAdQ@mail.gmail.com>
Date:   Fri, 17 Nov 2023 13:57:32 -0800
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Borislav Petkov <bp@...en8.de>
Cc:     David Howells <dhowells@...hat.com>,
        kernel test robot <oliver.sang@...el.com>,
        oe-lkp@...ts.linux.dev, lkp@...el.com,
        linux-kernel@...r.kernel.org,
        Christian Brauner <brauner@...nel.org>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>,
        Christian Brauner <christian@...uner.io>,
        Matthew Wilcox <willy@...radead.org>,
        David Laight <David.Laight@...lab.com>, ying.huang@...el.com,
        feng.tang@...el.com, fengwei.yin@...el.com
Subject: Re: [linus:master] [iov_iter] c9eec08bac: vm-scalability.throughput
 -16.9% regression

On Fri, 17 Nov 2023 at 11:13, Borislav Petkov <bp@...en8.de> wrote:
>
> I wouldn't want to optimize some weird loads. Especially if you have
> weird loads which perform differently depending on what uarch
> "optimizations" they sport.
>
> I guess optimizing for the majority of machines - modern FSRM ones which
> can do "rep; movsb" just fine - is one way to put it. And the rest is
> best effort.

Yeah, we shouldn't optimize for microbenchmarks in particular.

The kernel robot performance reports have been interesting, because
they do end up often pointing to real issues. But we've had these
kinds of things too, where the benchmark is just odd and clearly
happens to trigger something that is just very machine-specific.

So I don't think we should use either of these benchmarks as a "we
need to optimize for *this*", but it is another example of how much
memcpy() does matter. Even if the end result is then "but different
microarchitectures react so differently that we can't please
everybody".

            Linus