Message-ID: <CAGudoHHnCKwObL7Y_4hiX7FmREiX6cGfte5EuyGitbXwe_RhkQ@mail.gmail.com>
Date:   Tue, 29 Aug 2023 21:45:09 +0200
From:   Mateusz Guzik <mjguzik@...il.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
        bp@...en8.de
Subject: Re: [PATCH] x86: bring back rep movsq for user access on CPUs without ERMS

On 8/29/23, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> On Mon, 28 Aug 2023 at 10:07, Mateusz Guzik <mjguzik@...il.com> wrote:
>>
>> The hand-rolled mov loops executed in this case are quite pessimal
>> compared to rep movsq for bigger sizes. While the upper limit depends
>> on the uarch, everyone is well south of 1KB AFAICS, and sizes bigger
>> than that are common. The problem can be easily remedied, so do it.
>
> Ok, looking at the actual code now, and your patch is buggy.
>
>> +.Llarge_movsq:
>> +       movq %rcx,%r8
>> +       movq %rcx,%rax
>> +       shrq $3,%rcx
>> +       andl $7,%eax
>> +6:     rep movsq
>> +       movl %eax,%ecx
>>         testl %ecx,%ecx
>>         jne .Lcopy_user_tail
>>         RET
>
> The fixup code is very very broken:
>
>> +/*
>> + * Recovery after failed rep movsq
>> + */
>> +7:     movq %r8,%rcx
>> +       jmp .Lcopy_user_tail
>> +
>> +       _ASM_EXTABLE_UA( 6b, 7b)
>
> That just copies the original value back into %rcx. That's not at all
> ok. The "rep movsq" may have succeeded partially, and updated %rcx
> (and %rsi/%rdi) accordingly. You will now do the "tail" for entirely
> too much, and return the wrong value.
>
> In fact, if this then races with an mmap() in another thread, the user
> copy might then end up succeeding for the part that used to fail, and
> in that case it can end up copying much more than asked for and
> overrunning the buffers provided.
>
> So all those games with %r8 are entirely bogus. There is no way that
> "save the original length" can ever be relevant or correct.
>
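
To put concrete (made-up) numbers on that: say the copy is 1024 bytes,
so %r8 = 1024, %rcx = 128 qwords and %eax = 0, and the rep movsq faults
after copying 64 qwords. At the fault %rsi/%rdi have advanced by 512
bytes and %rcx is down to the 64 qwords left, but the fixup does:

7:      movq %r8,%rcx           /* %rcx = 1024 again, not the 512 left */
        jmp .Lcopy_user_tail    /* tail retries 1024 bytes starting from
                                 * the already-advanced pointers */

so the return value is off by 512, and if the faulting page gets mapped
in the meantime the byte tail writes 512 bytes past the end of the
destination buffer.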

Huh, pretty obvious now that you mention it; I don't know why I
thought the registers would be rolled back. More importantly, I should
have checked the handling in the now-removed movsq routine
(copy_user_generic_string):

[snip]
        movl %edx,%ecx
        shrl $3,%ecx
        andl $7,%edx
1:      rep movsq
2:      movl %edx,%ecx
3:      rep movsb
        xorl %eax,%eax
        ASM_CLAC
        RET

11:     leal (%rdx,%rcx,8),%ecx
12:     movl %ecx,%edx          /* ecx is zerorest also */
        jmp .Lcopy_user_handle_tail

        _ASM_EXTABLE_CPY(1b, 11b)
        _ASM_EXTABLE_CPY(3b, 12b)
[/snip]

So I think I know how to fix it, but I'm going to sleep on it.
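
Roughly something along these lines, I think (untested sketch; as in
the patch above, %rcx holds the qword count and %eax the tail byte
count going into the rep movsq):

/*
 * Recovery after failed rep movsq: rep movsq decrements %rcx and
 * advances %rsi/%rdi as it goes, so at the fault %rcx holds the
 * qwords still left and the remaining byte count is
 * tail + 8 * qwords-left.
 */
7:      leaq (%rax,%rcx,8),%rcx
        jmp .Lcopy_user_tail

        _ASM_EXTABLE_UA( 6b, 7b)

That would also make the movq %rcx,%r8 save at the top unnecessary.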

-- 
Mateusz Guzik <mjguzik gmail.com>
