Message-ID: <CAHk-=wg7MDktwhh9FPFRTEQOLEgFxNcNhm+znsMevSyY1+aLyw@mail.gmail.com>
Date:   Fri, 1 Sep 2023 08:29:09 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Mateusz Guzik <mjguzik@...il.com>
Cc:     linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
        bp@...en8.de
Subject: Re: [PATCH v2] x86: bring back rep movsq for user access on CPUs
 without ERMS

On Fri, 1 Sept 2023 at 08:20, Mateusz Guzik <mjguzik@...il.com> wrote:
>
> cp_new_stat and the counterpart for statx can dodge this rep mov by
> filling user memory directly.

Yeah, they could be made to use the "unsafe_put_user()" machinery
these days, and we could go back to the good old days of avoiding the
extra temp buffer.
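
For readers outside the kernel, the difference between the two approaches can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct src_attrs`, `struct out_stat`, `fill_via_temp()` and `fill_direct()` are hypothetical names, not the kernel's `cp_new_stat` code, and ordinary stores stand in for the user-access primitives. In the kernel, the direct stores would be `unsafe_put_user()` calls bracketed by `user_access_begin()` and `user_access_end()`, and the bulk copy would be `copy_to_user()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; NOT the kernel's struct kstat / struct stat. */
struct src_attrs {
    uint64_t ino;
    uint64_t size;
    uint32_t mode;
    uint32_t nlink;
};

struct out_stat {
    uint64_t st_ino;
    uint64_t st_size;
    uint32_t st_mode;
    uint32_t st_nlink;
};

/* Temp-buffer style: build the whole struct locally, then bulk-copy it. */
void fill_via_temp(struct out_stat *dst, const struct src_attrs *a)
{
    struct out_stat tmp;

    assert(dst != NULL && a != NULL);
    memset(&tmp, 0, sizeof(tmp));
    tmp.st_ino = a->ino;
    tmp.st_size = a->size;
    tmp.st_mode = a->mode;
    tmp.st_nlink = a->nlink;
    /* Stands in for copy_to_user(); this is the extra bulk copy. */
    memcpy(dst, &tmp, sizeof(tmp));
}

/* Direct style: store each field straight into the destination. */
void fill_direct(struct out_stat *dst, const struct src_attrs *a)
{
    assert(dst != NULL && a != NULL);
    /* Each store stands in for one unsafe_put_user() call. */
    dst->st_ino = a->ino;
    dst->st_size = a->size;
    dst->st_mode = a->mode;
    dst->st_nlink = a->nlink;
}
```

Both functions leave the same field values in the destination; the direct variant simply never materializes the intermediate copy. A real conversion would also have to zero any padding or reserved fields explicitly, which the temp-buffer variant gets from its `memset()`.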

> I'm going to patch this, but first I want to address the bigger
> problem of glibc implementing fstat as newfstatat, demolishing perf of
> that op. In their defense currently they have no choice as this is the
> only exporter of the "new" struct stat. I'll be sending a long email
> to fsdevel soon(tm) with a proposed fix.

I wouldn't mind re-instating the "copy directly to user space rather
than go through a temporary buffer", for the stat family of functions,
so please do.

> So I was wondering if rep movsq is any worse than ERMS'ed rep movsb
> when there is no tail to handle and the buffer is aligned to a page,
> or more to the point if clear_page gets any benefit for going with
> movsb.

Hard to tell. 'movsq' is *historically* better, and likely on all
current microarchitectures.

But 'movsb' is actually in many ways easier for the CPU to optimize,
because there's no question of the sub-chunking if anything is not
aligned just right.
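
As a rough illustration of the "tail" and chunking question, here is a portable C model of the two copy shapes. It is a sketch only: the function names are hypothetical, the loops merely mimic the unit each instruction works in (8-byte words for 'rep movsq', single bytes for 'rep movsb'), and nothing here reflects how a given microarchitecture actually executes either instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Models a 'rep movsq' style copy: the caller has to split the length
 * into whole 8-byte words plus a byte tail that is handled separately.
 * Returns the number of tail bytes that needed the extra step.
 */
size_t copy_qwords_then_tail(unsigned char *dst, const unsigned char *src,
                             size_t len)
{
    size_t qwords = len / 8;
    size_t tail = len % 8;
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < qwords; i++) {
        uint64_t w;

        /* memcpy keeps the 8-byte access well-defined for any alignment. */
        memcpy(&w, src + 8 * i, sizeof(w));
        memcpy(dst + 8 * i, &w, sizeof(w));
    }
    for (i = 0; i < tail; i++)
        dst[8 * qwords + i] = src[8 * qwords + i];
    return tail;
}

/* Models a 'rep movsb' style copy: the byte count is used as is. */
void copy_bytes(unsigned char *dst, const unsigned char *src, size_t len)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < len; i++)
        dst[i] = src[i];
}
```

For a page-sized, page-aligned buffer such as the clear_page case, the length is a multiple of 8 and the tail step disappears, so the software-visible difference between the two shapes vanishes and only the hardware behaviour remains in question.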

             Linus
