Message-ID: <CAGudoHFNPb8CHGNK+msCE70bPZkKyCom7g_7Laja+AVoAN5CXQ@mail.gmail.com>
Date:   Tue, 29 Aug 2023 22:20:07 +0200
From:   Mateusz Guzik <mjguzik@...il.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
        bp@...en8.de
Subject: Re: [PATCH] x86: bring back rep movsq for user access on CPUs without ERMS

On 8/29/23, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> On Tue, 29 Aug 2023 at 12:45, Mateusz Guzik <mjguzik@...il.com> wrote:
>>
>> So I think I know how to fix it, but I'm going to sleep on it.
>
> I think you can just skip the %r8 games, and do that
>
>         leal (%rax,%rcx,8),%rcx
>
> in the exception fixup code, since %rax will have the low bits of the
> byte count, and %rcx will have the remaining qword count.
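
(I.e., the fixup just rebuilds the byte count from those two registers.
A sketch of what the exception path could look like -- untested, label
name approximate, and leaq rather than leal to be safe about operand
size; the faulting 0: site would be hooked up via _ASM_EXTABLE_UA:)

	/* Faulted inside 'rep movsq':
	 *   %rcx = qwords not yet copied, %rax = trailing bytes (len & 7).
	 * Rebuild the remaining byte count in %rcx and fall back to the
	 * byte-wise tail copy: */
1:	leaq (%rax,%rcx,8),%rcx		/* bytes left = 8*qwords + tail */
	jmp .Lcopy_user_tail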
>
> We should also have some test-case for partial reads somewhere, but I
> have to admit that when I did the cleanup patches I just wrote some
> silly test myself (i.e. just doing a 'mmap()' and then reading/writing
> into the end of that mmap at different offsets).
>
> I didn't save that hacky thing, I'm afraid.
>

Yeah, I was planning on writing some tests to illustrate that this works
as intended and v1 does not. That's part of why I'm going to take more
time; there is no rush to patch this.
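
Something along these lines, I suppose -- a rough, untested sketch (the
temp-file path and sizes are arbitrary): map two pages, make the second
one PROT_NONE, then read() into a buffer straddling the boundary. A
correct fixup gives a short read of exactly the accessible part rather
than -EFAULT or a bogus count:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	long pg = sysconf(_SC_PAGESIZE);
	char *map = mmap(NULL, 2 * pg, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	int fd, off;

	if (map == MAP_FAILED)
		return 1;

	/* One page of file data to copy from. */
	fd = open("/tmp/copytest", O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return 1;
	write(fd, map, pg);

	/* Punch the hole the copy should trip over. */
	mprotect(map + pg, pg, PROT_NONE);

	for (off = 1; off <= 64; off++) {
		ssize_t n;

		lseek(fd, 0, SEEK_SET);
		/* 128 readable bytes plus 'off' bytes past the fault
		 * boundary: a correct partial-copy fixup gives n == 128. */
		n = read(fd, map + pg - 128, 128 + off);
		printf("off=%d n=%zd (want 128)\n", off, n);
	}
	close(fd);
	return 0;
}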

> I also tried to figure out if there is any CPU we should care about
> that doesn't like 'rep movsq', but I think you are right that there
> really isn't. The "good enough" rep things were introduced in the PPro
> if I recall correctly, and while you could disable them in the BIOS,
> by the time Intel did 64-bit in Prescott (?) it was pretty much
> standard.
>

gcc already inlines rep movsq for fixed-size copies that fit its
inlining thresholds, so....
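
E.g. (illustrative only; the exact output depends on the gcc version
and -mtune, so treat this as an assumption rather than a guarantee):

/* A fixed-size copy like this is commonly emitted as an inline
 * 'rep movsq' by gcc -O2 on x86-64 with generic tuning, rather
 * than a call to memcpy: */
struct blob { unsigned long q[64]; };	/* 512 bytes */

void copy_blob(struct blob *dst, const struct blob *src)
{
	*dst = *src;	/* mov $64,%ecx; rep movsq (or similar) */
}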

On that note I'm going to submit a patch to whack non-rep clear_page as well.

> So yeah, no reason to have the unrolled loop at all, and I think your
> patch is fine conceptually, just needs fixing and testing for the
> partial success case.
>
> Oh, and you should also remove the clobbers of r8-r11 in the
> copy_user_generic() inline asm in <asm/uaccess_64.h> when you've fixed
> the exception handling. The only reason for those clobbers were for
> that unrolled register use.
>
> So only %rax ends up being a clobber for the rep_movs_alternative
> case, as far as I can tell.
>

Ok, I'll patch it up.
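
Roughly this, I expect -- a sketch against the copy_user_generic()
inline asm in <asm/uaccess_64.h>, untested and modulo context:

 		:"+c" (len), "+D" (to), "+S" (from), ASM_CALL_CONSTRAINT
-		: : "memory", "rax", "r8", "r9", "r10", "r11");
+		: : "memory", "rax");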

That label reorg was cosmetic and did not matter. But that bad fixup was
quite avoidable had I checked what the original movsq code was doing,
which I should have done before sending v1. Sorry for the lame patch on
that front. ;) (FWIW, I did multiple kernel builds and whatnot with it,
and nothing blew up.)

Thanks for the review.

-- 
Mateusz Guzik <mjguzik gmail.com>
