Message-ID: <CAGudoHHnCKwObL7Y_4hiX7FmREiX6cGfte5EuyGitbXwe_RhkQ@mail.gmail.com>
Date: Tue, 29 Aug 2023 21:45:09 +0200
From: Mateusz Guzik <mjguzik@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
bp@...en8.de
Subject: Re: [PATCH] x86: bring back rep movsq for user access on CPUs without ERMS
On 8/29/23, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> On Mon, 28 Aug 2023 at 10:07, Mateusz Guzik <mjguzik@...il.com> wrote:
>>
>> Hand-rolled mov loops executing in this case are quite pessimal compared
>> to rep movsq for bigger sizes. While the upper limit depends on the uarch,
>> everyone is well south of 1KB AFAICS, and sizes bigger than that are
>> common. The problem can be easily remedied, so do it.
>
> Ok, looking at the actual code now, and your patch is buggy.
>
>> +.Llarge_movsq:
>> + movq %rcx,%r8
>> + movq %rcx,%rax
>> + shrq $3,%rcx
>> + andl $7,%eax
>> +6: rep movsq
>> + movl %eax,%ecx
>> testl %ecx,%ecx
>> jne .Lcopy_user_tail
>> RET
>
> The fixup code is very very broken:
>
>> +/*
>> + * Recovery after failed rep movsq
>> + */
>> +7: movq %r8,%rcx
>> + jmp .Lcopy_user_tail
>> +
>> + _ASM_EXTABLE_UA( 6b, 7b)
>
> That just copies the original value back into %rcx. That's not at all
> ok. The "rep movsq" may have succeeded partially, and updated %rcx
> (and %rsi/%rdi) accordingly. You will now do the "tail" for entirely
> too much and return the wrong value.
>
> In fact, if this then races with an mmap() in another thread, the user
> copy might then end up succeeding for the part that used to fail, and
> in that case it could copy much more than asked for and overrun the
> buffers provided.
>
> So all those games with %r8 are entirely bogus. There is no way that
> "save the original length" can ever be relevant or correct.
>
Huh, pretty obvious now that you mention it; I don't know why I thought
the registers would be rolled back. More importantly, I should have
checked how the now-removed movsq routine (copy_user_generic_string)
handled this:
[snip]
movl %edx,%ecx
shrl $3,%ecx
andl $7,%edx
1: rep movsq
2: movl %edx,%ecx
3: rep movsb
xorl %eax,%eax
ASM_CLAC
RET
11: leal (%rdx,%rcx,8),%ecx /* remaining = tail bytes + 8 * qwords left */
12: movl %ecx,%edx /* ecx is zerorest also */
jmp .Lcopy_user_handle_tail
_ASM_EXTABLE_CPY(1b, 11b)
_ASM_EXTABLE_CPY(3b, 12b)
[/snip]
So I think I know how to fix it, but I'm going to sleep on it.
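Roughly, I would expect to drop %r8 entirely and recompute the remaining
byte count from the live registers at the fault point, the way the old
leal recovery does. Untested sketch:

.Llarge_movsq:
	movq %rcx,%rax
	shrq $3,%rcx
	andl $7,%eax
6:	rep movsq
	movl %eax,%ecx
	testl %ecx,%ecx
	jne .Lcopy_user_tail
	RET

	/*
	 * On a fault %rcx still holds the number of qwords left to copy and
	 * %rax the byte tail, so the remaining count is 8 * %rcx + %rax.
	 */
7:	leaq (%rax,%rcx,8),%rcx
	jmp .Lcopy_user_tail

	_ASM_EXTABLE_UA( 6b, 7b)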
--
Mateusz Guzik <mjguzik gmail.com>