Message-ID: <CAHk-=wjYOZf2wPj_=arATJ==DQQAQwh0ki=Za0RcE542rWBGFw@mail.gmail.com>
Date:   Sun, 3 Sep 2023 14:05:34 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Mateusz Guzik <mjguzik@...il.com>
Cc:     linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
        bp@...en8.de
Subject: Re: [PATCH v2] x86: bring back rep movsq for user access on CPUs
 without ERMS

On Sun, 3 Sept 2023 at 13:49, Mateusz Guzik <mjguzik@...il.com> wrote:
>
> "real fstat" is syscall(5, fd, &sb).
>
> Sapphire Rapids, will-it-scale, ops/s
>
> stock fstat     5088199
> patched fstat   7625244 (+49%)
> real fstat      8540383 (+67% / +12%)
>
> It dodges lockref et al, but it does not dodge SMAP which accounts for
> the difference.
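
For readers who want to reproduce the "real fstat" case quoted above, here is a minimal sketch of calling the system call directly instead of going through the glibc wrapper. It assumes x86-64 Linux, where SYS_fstat is number 5 and the glibc struct stat matches the kernel layout. The helper names real_fstat() and fstat_paths_agree() are made up for this illustration and appear nowhere in the thread.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Assumption: x86-64 Linux, where SYS_fstat == 5 and the kernel fills in
 * the same struct stat layout that glibc exposes. On other architectures
 * the number and the layout may differ.
 */
static int real_fstat(int fd, struct stat *sb)
{
	/* Invoke the syscall directly, skipping the glibc fstat() wrapper. */
	return (int)syscall(SYS_fstat, fd, sb);
}

/*
 * Hypothetical sanity check: returns 1 if the wrapper and the raw syscall
 * report the same device and inode for 'path', 0 if they disagree, and -1
 * if any call fails.
 */
static int fstat_paths_agree(const char *path)
{
	struct stat via_libc, via_raw;
	int fd, ret = -1;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	if (fstat(fd, &via_libc) == 0 && real_fstat(fd, &via_raw) == 0)
		ret = via_libc.st_dev == via_raw.st_dev &&
		      via_libc.st_ino == via_raw.st_ino;

	close(fd);
	return ret;
}
```

Calling fstat_paths_agree("/") should return 1 on a system matching the assumptions above. A benchmark would call real_fstat() in a loop; the check is only there to confirm both paths fill in the same structure.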

Side note, since I was looking at this, I hacked up a quick way for
architectures to do their own optimized cp_new_stat() that avoids the
double-buffering.

Sadly it *is* architecture-specific due to padding and
architecture-specific field sizes (and thus EOVERFLOW rules), but it
is what it is.
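
To illustrate why field sizes drag EOVERFLOW rules along with them, here is a rough sketch of the kind of check involved: when the user-visible field is narrower than the value the kernel holds, the copy has to fail rather than truncate silently. Both structures and the function name are hypothetical stand-ins for this example. They are not the real struct kstat or struct stat, and this is not the attached patch.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical wide record, standing in for what the kernel tracks. */
struct wide_stat {
	uint64_t ino;
	uint64_t size;
};

/* Hypothetical user-visible layout with a narrower inode field and padding. */
struct narrow_stat {
	uint32_t ino;
	uint32_t pad;
	uint64_t size;
};

/*
 * Fill 'dst' from 'src'. Returns 0 on success, or -EOVERFLOW if the inode
 * number does not survive the round trip through the narrower field.
 * 'dst' is left untouched on failure.
 */
static int fill_narrow_stat(struct narrow_stat *dst, const struct wide_stat *src)
{
	uint32_t ino = (uint32_t)src->ino;

	if (ino != src->ino)
		return -EOVERFLOW;

	dst->ino = ino;
	dst->pad = 0;	/* padding must be zeroed, not leaked */
	dst->size = src->size;
	return 0;
}
```

Which fields are narrow, and where the padding sits, varies by architecture, so a check like this cannot be written once for all of them.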

I don't know how much it matters, but it might make a difference. And
'stat()' is most certainly worth optimizing for, even if glibc has
made our life more difficult.

Want to try out another entirely untested patch? Attached.

                Linus

View attachment "patch.diff" of type "text/x-patch" (3087 bytes)
