lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <a7a801d2-13d2-7b5b-66a5-98e7c95b00cc@gmail.com>
Date:   Mon, 19 Jul 2021 21:51:44 +0900
From:   Akira Tsukamoto <akira.tsukamoto@...il.com>
To:     Palmer Dabbelt <palmer@...belt.com>,
        Guenter Roeck <linux@...ck-us.net>,
        Geert Uytterhoeven <geert@...ux-m68k.org>,
        Qiu Wenbo <qiuwenbo@...inos.com.cn>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Albert Ou <aou@...s.berkeley.edu>,
        Akira Tsukamoto <akira.tsukamoto@...il.com>,
        linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: [PATCH v4 0/1] riscv: improving uaccess with logs from network bench

Hi Guenter, Geert and Qiu,

I fixed the bug which was overrunning the copy when the size was in the
between 8*SZREG to 9*SZREG. The SZREG holds the bytes per register size
which is 4 for RV32 and 8 for RV64.

Do you mind trying this patch? It works OK at my place.

Since I had to respin the patch I added word copy without unrolling when
the size is in the between 2*SZREG to 9*SZREG to reduce the number of byte
copies which has heavy overhead as Palmer has mentioned when he included
this patch to riscv/for-next.


I rewrote the functions but heavily influenced by Garry's memcpy
function [1]. It must be written in assembler to handle page faults
manually inside the function unlike other memcpy functions.

This patch will reduce cpu usage dramatically in kernel space especially
for applications which use sys-call with large buffer size, such as network
applications. The main reason behind this is that every unaligned memory
access will raise exceptions and switch between s-mode and m-mode causing
large overhead.

---
v3 -> v4:
- Fixed overrun copy
- Added word copy without unrolling to reduce byte copy for left over

v2 -> v3:
- Merged all patches

v1 -> v2:
- Added shift copy
- Separated patches for readability of changes in assembler
- Using perf results

[1] https://lkml.org/lkml/2021/2/16/778

Akira Tsukamoto (1):
  riscv: __asm_copy_to-from_user: Optimize unaligned memory access and
    pipeline stall

 arch/riscv/lib/uaccess.S | 218 ++++++++++++++++++++++++++++++++-------
 1 file changed, 183 insertions(+), 35 deletions(-)

-- 
2.17.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ