Message-ID: <CAHk-=wgemNj9GBepSEJXS5N99rr9wLkL668UC9TsKH45NnJ7Mg@mail.gmail.com>
Date: Tue, 29 Aug 2023 13:03:54 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
bp@...en8.de
Subject: Re: [PATCH] x86: bring back rep movsq for user access on CPUs without ERMS
On Tue, 29 Aug 2023 at 12:45, Mateusz Guzik <mjguzik@...il.com> wrote:
>
> So I think I know how to fix it, but I'm going to sleep on it.
I think you can just skip the %r8 games, and do that

        leaq (%rax,%rcx,8),%rcx

in the exception fixup code, since %rax will have the low bits of the
byte count, and %rcx will have the remaining qword count.
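Roughly (a sketch, not the actual patch -- label names are illustrative,
and it assumes the movsq path keeps the trailing byte count in %rax, the
remaining qword count in %rcx, and falls back to the byte-at-a-time tail
on a fault):

        /* copy qwords; %rcx = qword count, %rax = trailing byte count */
        0:      rep movsq
                ...
        /* fault fixup: fold the qwords we didn't copy back into bytes */
        1:      leaq (%rax,%rcx,8),%rcx
                jmp .Lcopy_user_tail

                _ASM_EXTABLE_UA(0b, 1b)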
We should also have some test-case for partial reads somewhere, but I
have to admit that when I did the cleanup patches I just wrote some
silly test myself (i.e. just doing a 'mmap()' and then reading/writing
into the end of that mmap at different offsets).
I didn't save that hacky thing, I'm afraid.
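But a throwaway test along those lines could look something like the
sketch below (illustrative only -- whether userspace sees a short count
or -EFAULT depends on the code path, but it makes the user copy fault at
varying offsets, so the fixup's byte accounting actually matters):

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	long page = sysconf(_SC_PAGESIZE);
	char *map = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (map == MAP_FAILED)
		return 1;
	/* Second page is inaccessible, so copies into it must fault. */
	mprotect(map + page, page, PROT_NONE);

	for (long off = 1; off <= 64; off++) {
		int p[2];
		char src[256];

		memset(src, 'x', sizeof(src));
		if (pipe(p))
			return 1;
		write(p[1], src, sizeof(src));

		/*
		 * The destination ends 'off' bytes before the PROT_NONE
		 * page, but we ask for more, so copy_to_user() can only
		 * do a partial copy before it faults.
		 */
		ssize_t n = read(p[0], map + page - off, off + 128);

		printf("off=%ld read=%zd\n", off, n);
		close(p[0]);
		close(p[1]);
	}
	return 0;
}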
I also tried to figure out if there is any CPU we should care about
that doesn't like 'rep movsq', but I think you are right that there
really isn't. The "good enough" rep things were introduced in the PPro
if I recall correctly, and while you could disable them in the BIOS,
by the time Intel did 64-bit in Prescott (?) it was pretty much
standard.
So yeah, no reason to have the unrolled loop at all, and I think your
patch is fine conceptually, just needs fixing and testing for the
partial success case.
Oh, and you should also remove the clobbers of r8-r11 in the
copy_user_generic() inline asm in <asm/uaccess_64.h> when you've fixed
the exception handling. The only reason for those clobbers was that
unrolled register use.
So only %rax ends up being a clobber for the rep_movs_alternative
case, as far as I can tell.
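With the unrolled loop gone, the wrapper would end up looking roughly
like this (a sketch of the current <asm/uaccess_64.h> code with the
clobber list shrunk; ALTERNATIVE/_ASM_EXTABLE_UA/stac/clac are the
existing kernel macros, not anything new):

static __always_inline __must_check unsigned long
copy_user_generic(void *to, const void *from, unsigned long len)
{
	stac();
	/*
	 * CPUs with FSRM just use 'rep movsb'; everybody else calls
	 * rep_movs_alternative.
	 */
	asm volatile(
		"1:\n\t"
		ALTERNATIVE("rep movsb",
			    "call rep_movs_alternative",
			    ALT_NOT(X86_FEATURE_FSRM))
		"2:\n"
		_ASM_EXTABLE_UA(1b, 2b)
		:"+c" (len), "+D" (to), "+S" (from), ASM_CALL_CONSTRAINT
		: /* no inputs */
		/* only %rax left as a clobber once the unrolled loop is gone */
		: "memory", "rax");
	clac();
	return len;
}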
Linus