Date:   Mon, 19 Jul 2021 21:51:44 +0900
From:   Akira Tsukamoto <akira.tsukamoto@...il.com>
To:     Palmer Dabbelt <palmer@...belt.com>,
        Guenter Roeck <linux@...ck-us.net>,
        Geert Uytterhoeven <geert@...ux-m68k.org>,
        Qiu Wenbo <qiuwenbo@...inos.com.cn>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Albert Ou <aou@...s.berkeley.edu>,
        Akira Tsukamoto <akira.tsukamoto@...il.com>,
        linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: [PATCH v4 0/1] riscv: improving uaccess with logs from network bench

Hi Guenter, Geert and Qiu,

I fixed the bug that was overrunning the copy when the size was between
8*SZREG and 9*SZREG. SZREG is the register size in bytes, which is 4 for
RV32 and 8 for RV64.

Would you mind trying this patch? It works fine on my side.

Since I had to respin the patch, I added a word copy without unrolling
for sizes between 2*SZREG and 9*SZREG. This reduces the number of byte
copies, which have heavy overhead, as Palmer mentioned when he included
this patch in riscv/for-next.
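
As a hedged C sketch of that middle tier (illustrative names only,
alignment handling omitted, SZREG as defined in the sketch above; the
actual implementation is assembly):

/*
 * Once any 8-word unrolled chunks are done, sizes between 2*SZREG and
 * 9*SZREG (and the remainder of larger copies) go through a plain word
 * loop, so only the final sub-word tail pays the per-byte overhead.
 */
static void copy_word_tail(unsigned char *dst, const unsigned char *src,
			   size_t len)
{
	size_t done = 0;

	/* word copy, no unrolling */
	for (; len - done >= SZREG; done += SZREG)
		*(uintptr_t *)(dst + done) = *(const uintptr_t *)(src + done);

	/* byte copy only for the last few (< SZREG) bytes */
	for (; done < len; done++)
		dst[done] = src[done];
}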


I rewrote the functions, but they are heavily influenced by Garry's
memcpy function [1]. Unlike other memcpy functions, this one must be
written in assembler to handle page faults manually inside the function.

This patch dramatically reduces CPU usage in kernel space, especially
for applications that make syscalls with large buffers, such as network
applications. The main reason is that every unaligned memory access
raises an exception and switches between S-mode and M-mode, causing
large overhead.
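
The shift copy added in v2 avoids those traps by keeping every load and
store aligned. A rough C sketch of the idea, assuming little-endian as
on RISC-V, with made-up names and the end-of-buffer handling simplified:

/*
 * Copy nwords machine words to an aligned dst from a possibly
 * misaligned src by doing only aligned loads and stitching each
 * destination word together with shifts.
 */
static void shift_copy_words(uintptr_t *dst, const unsigned char *src,
			     size_t nwords)
{
	size_t misalign = (uintptr_t)src & (SZREG - 1);
	const uintptr_t *asrc;
	unsigned int rshift, lshift;
	uintptr_t cur;
	size_t i;

	if (misalign == 0) {
		/* source already word-aligned: plain word copy is enough */
		for (i = 0; i < nwords; i++)
			dst[i] = ((const uintptr_t *)src)[i];
		return;
	}

	asrc = (const uintptr_t *)(src - misalign);
	rshift = misalign * 8;
	lshift = (SZREG - misalign) * 8;
	cur = *asrc++;

	/*
	 * Each destination word is built from two aligned source words,
	 * so no load or store is misaligned.  Note: this sketch reads one
	 * aligned word past the last source byte; the real code has to
	 * trim the loop to avoid that.
	 */
	for (i = 0; i < nwords; i++) {
		uintptr_t next = *asrc++;                    /* aligned load  */
		dst[i] = (cur >> rshift) | (next << lshift); /* aligned store */
		cur = next;
	}
}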

---
v3 -> v4:
- Fixed overrun copy
- Added word copy without unrolling to reduce byte copies for the leftover

v2 -> v3:
- Merged all patches

v1 -> v2:
- Added shift copy
- Separated patches for readability of changes in assembler
- Using perf results

[1] https://lkml.org/lkml/2021/2/16/778

Akira Tsukamoto (1):
  riscv: __asm_copy_to-from_user: Optimize unaligned memory access and
    pipeline stall

 arch/riscv/lib/uaccess.S | 218 ++++++++++++++++++++++++++++++++-------
 1 file changed, 183 insertions(+), 35 deletions(-)

-- 
2.17.1
