lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <f0d06d39-3dc8-fb10-4d37-a75ef866cdc8@gmail.com>
Date:   Thu, 15 Jul 2021 15:20:03 +0900
From:   Akira Tsukamoto <akira.tsukamoto@...il.com>
To:     Geert Uytterhoeven <geert@...ux-m68k.org>,
        Guenter Roeck <linux@...ck-us.net>
Cc:     akira.tsukamoto@...il.com,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Palmer Dabbelt <palmer@...belt.com>,
        Albert Ou <aou@...s.berkeley.edu>,
        linux-riscv <linux-riscv@...ts.infradead.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 1/1] riscv: __asm_copy_to-from_user: Optimize unaligned
 memory access and pipeline stall



On 7/14/2021 3:10 AM, Geert Uytterhoeven wrote:
> Hi Günter, Tsukamoto-san,
> 
> On Sat, Jul 10, 2021 at 3:50 AM Guenter Roeck <linux@...ck-us.net> wrote:
>> On Wed, Jun 23, 2021 at 09:40:39PM +0900, Akira Tsukamoto wrote:
>>> This patch will reduce cpu usage dramatically in kernel space especially
>>> for application which use sys-call with large buffer size, such as network
>>> applications. The main reason behind this is that every unaligned memory
>>> access will raise exceptions and switch between s-mode and m-mode causing
>>> large overhead.
>>>
>>> First copy in bytes until reaches the first word aligned boundary in
>>> destination memory address. This is the preparation before the bulk
>>> aligned word copy.
>>>
>>> The destination address is aligned now, but oftentimes the source address
>>> is not in an aligned boundary. To reduce the unaligned memory access, it
>>> reads the data from source in aligned boundaries, which will cause the
>>> data to have an offset, and then combines the data in the next iteration
>>> by fixing offset with shifting before writing to destination. The majority
>>> of the improving copy speed comes from this shift copy.
>>>
>>> In the lucky situation that the both source and destination address are on
>>> the aligned boundary, perform load and store with register size to copy the
>>> data. Without the unrolling, it will reduce the speed since the next store
>>> instruction for the same register using from the load will stall the
>>> pipeline.
>>>
>>> At last, copying the remainder in one byte at a time.
>>>
>>> Signed-off-by: Akira Tsukamoto <akira.tsukamoto@...il.com>
>>
>> This patch causes all riscv32 qemu emulations to stall during boot.
>> The log suggests that something in kernel/user communication may be wrong.
>>
>> Bad case:
>>
>> Starting syslogd: OK
>> Starting klogd: OK
>> /etc/init.d/S02sysctl: line 68: syntax error: EOF in backquote substitution
>> /etc/init.d/S20urandom: line 1: syntax error: unterminated quoted string
>> Starting network: /bin/sh: syntax error: unterminated quoted string
> 
>> # first bad commit: [ca6eaaa210deec0e41cbfc380bf89cf079203569] riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall
> 
> Same here on vexriscv. Bisected to the same commit.
> 
> The actual scripts look fine when using "cat", but contain some garbage
> when executing them using "sh -v".
> 
> Tsukamoto-san: glancing at the patch:
> 
> +       addi    a0, a0, 8*SZREG
> +       addi    a1, a1, 8*SZREG
> 
> I think you forgot about rv32, where registers cover only 4
> bytes each?

Thanks Günter and Geert for the pointing out the errors.
I will send the fixes, probably this weekend.

Akira

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ