lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=whd82fzhEbFRw9d_EMtR1SeefOJabjCHcm4-6jzeqqd3g@mail.gmail.com>
Date: Thu, 20 Mar 2025 12:23:38 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: x86@...nel.org, hkrzesin@...hat.com, tglx@...utronix.de, mingo@...hat.com, 
	bp@...en8.de, dave.hansen@...ux.intel.com, hpa@...or.com, olichtne@...hat.com, 
	atomasov@...hat.com, aokuliar@...hat.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86: handle the tail in rep_movs_alternative() with an
 overlapping store

On Thu, 20 Mar 2025 at 12:06, Mateusz Guzik <mjguzik@...il.com> wrote:
>
> Sizes ranged <8,64> are copied 8 bytes at a time with a jump out to a
> 1 byte at a time loop to handle the tail.

I definitely do not mind this patch, but I think it doesn't go far enough.

It gets rid of the byte-at-a-time loop at the end, but only for the
short-copy case of 8-63 bytes.

The .Llarge_movsq ends up still doing

        testl %ecx,%ecx
        jne .Lcopy_user_tail
        RET

and while that is only triggered by the non-ERMS case, that's what
most older AMD CPU's will trigger, afaik.

So I think that if we do this, we should do it properly.

               Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ