Message-ID: <CAHk-=wgjyGX3OVDtzJW6Oh2ukviXtJYi9+7eJW75DgX+d673iw@mail.gmail.com>
Date:   Mon, 4 Sep 2023 10:28:12 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Mateusz Guzik <mjguzik@...il.com>
Cc:     Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org,
        linux-arch@...r.kernel.org, bp@...en8.de
Subject: Re: [PATCH v2] x86: bring back rep movsq for user access on CPUs
 without ERMS

On Sun, 3 Sept 2023 at 23:03, Mateusz Guzik <mjguzik@...il.com> wrote:
>
> Worst case if the 64 bit structs differ one can settle for
> user-accessing INIT_STRUCT_STAT_PADDING.

As I said, try it. I think you'll find that you are wrong.  It's
_hard_ to get the padding right. The "use a temporary" model of the
current code makes the fallback easy - just clear it before copying
the fields. Without that, you have to get every architecture's padding
right manually.
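
A minimal userspace sketch of the "use a temporary" model described above.
Every demo_* name and the struct layout are hypothetical, and
demo_copy_out() is only a stand-in for copy_to_user(); this is an
illustration under those assumptions, not the kernel's actual stat code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical kernel-internal data (not the real struct kstat). */
struct demo_kstat {
    uint64_t ino;
    uint32_t mode;
    uint64_t size;
};

/* Hypothetical user-visible layout. On typical 64-bit ABIs the compiler
 * inserts 4 bytes of implicit padding after st_mode. */
struct demo_user_stat {
    uint64_t st_ino;
    uint32_t st_mode;
    uint64_t st_size;
    uint32_t reserved[2];   /* explicit padding, must read back as zero */
};

/* Stand-in for copy_to_user(); in userspace this is a plain memcpy. */
static int demo_copy_out(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    return 0;
}

/* The "temporary" model: zero the whole struct once, fill in the named
 * fields, then copy it out in one go. No per-architecture knowledge of
 * where the padding lives is needed. */
static int demo_cp_stat(const struct demo_kstat *k, struct demo_user_stat *ubuf)
{
    struct demo_user_stat tmp;

    assert(k != NULL && ubuf != NULL);
    memset(&tmp, 0, sizeof(tmp));   /* clears implicit and explicit padding */
    tmp.st_ino = k->ino;
    tmp.st_mode = k->mode;
    tmp.st_size = k->size;
    return demo_copy_out(ubuf, &tmp, sizeof(tmp));
}

/* Helper for checking the result: returns 1 if every byte between
 * st_mode and st_size (the implicit padding, if any) is zero. */
static int demo_gap_is_zero(const struct demo_user_stat *u)
{
    unsigned char raw[sizeof(*u)];
    size_t start = offsetof(struct demo_user_stat, st_mode) + sizeof(u->st_mode);
    size_t end = offsetof(struct demo_user_stat, st_size);

    memcpy(raw, u, sizeof(raw));
    for (size_t i = start; i < end; i++)
        if (raw[i] != 0)
            return 0;
    return 1;
}
```

Writing each field straight to user memory instead would skip the
temporary, but then each padding region (the gap after st_mode and the
reserved[] array in this sketch) has to be zeroed by hand for each layout.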

You almost inevitably end up with "one function for the fallback case,
a completely different function for the unsafe_put_user() case, and a
fairly painful macro for architectures that get converted".

And even then, you'll get non-optimal code, because you won't get the
order of the stores right to get nice contiguous stores. That
admittedly only matters for architectures with bad store coalescing,
which is hopefully not any of the ones we care about (and those kinds
of microarchitectures usually also want the loads done first, so
ld-ld-ld-ld-st-st-st-st patterns).

But just giving up on that, and using a weak fallback function, and
then an optimal one for the (single) architecture that anybody will do
this for, makes it all much simpler.

Feel free to send me a patch to prove your point. Because without a
patch, I claim you are just blowing hot air.

               Linus
