Message-ID: <20230624-tycoon-pliable-325806e73a11@spud>
Date: Sat, 24 Jun 2023 11:08:03 +0100
From: Conor Dooley <conor@...nel.org>
To: Evan Green <evan@...osinc.com>
Cc: Palmer Dabbelt <palmer@...osinc.com>,
Simon Hosie <shosie@...osinc.com>,
Albert Ou <aou@...s.berkeley.edu>,
Alexandre Ghiti <alexghiti@...osinc.com>,
Andrew Jones <ajones@...tanamicro.com>,
Andy Chiu <andy.chiu@...ive.com>,
Anup Patel <apatel@...tanamicro.com>,
Conor Dooley <conor.dooley@...rochip.com>,
Greentime Hu <greentime.hu@...ive.com>,
Guo Ren <guoren@...nel.org>,
Heiko Stuebner <heiko.stuebner@...ll.eu>,
Heiko Stuebner <heiko@...ech.de>,
Jisheng Zhang <jszhang@...nel.org>,
Jonathan Corbet <corbet@....net>,
Ley Foon Tan <leyfoon.tan@...rfivetech.com>,
Li Zhengyu <lizhengyu3@...wei.com>,
Masahiro Yamada <masahiroy@...nel.org>,
Palmer Dabbelt <palmer@...belt.com>,
Paul Walmsley <paul.walmsley@...ive.com>,
Randy Dunlap <rdunlap@...radead.org>,
Samuel Holland <samuel@...lland.org>,
Sia Jee Heng <jeeheng.sia@...rfivetech.com>,
Sunil V L <sunilvl@...tanamicro.com>,
Xianting Tian <xianting.tian@...ux.alibaba.com>,
Yangyu Chen <cyy@...self.name>, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-riscv@...ts.infradead.org
Subject: Re: [PATCH 0/2] RISC-V: Probe for misaligned access speed
On Fri, Jun 23, 2023 at 03:20:14PM -0700, Evan Green wrote:
>
> The current setting for the hwprobe bit indicating misaligned access
> speed is controlled by a vendor-specific feature probe function. This is
> essentially a per-SoC table we have to maintain on behalf of each vendor
> going forward. Let's convert that instead to something we detect at
> runtime.
>
> We have two assembly routines at the heart of our probe: one that
> does a bunch of word-sized accesses (without aligning its input buffer),
> and the other that does byte accesses. If we can move a larger number of
> bytes using misaligned word accesses than we can with the same amount of
> time doing byte accesses, then we can declare misaligned accesses as
> "fast".
>
> The tradeoff of reducing this maintenance burden is boot time. We spend
> 4-6 jiffies per core doing this measurement (0-2 on jiffy edge
> alignment, and 4 on measurement). The timing loop was based on
> raid6_choose_gen(), which uses (16+1)*N jiffies (where N is the number
> of algorithms). On my THead C906, I found measurements to be stable
> across several reboots, and looked like this:
>
> [ 0.047582] cpu0: Unaligned word copy 1728 MB/s, byte copy 402 MB/s, misaligned accesses are fast
>
> I don't have a machine where misaligned accesses are slow, but I'd be
> interested to see the results of booting this series if someone did.
Can you elaborate on "results" please? Otherwise,
[ 0.333110] smp: Bringing up secondary CPUs ...
[ 0.370794] cpu1: Unaligned word copy 2 MB/s, byte copy 231 MB/s, misaligned accesses are slow
[ 0.411368] cpu2: Unaligned word copy 2 MB/s, byte copy 231 MB/s, misaligned accesses are slow
[ 0.451947] cpu3: Unaligned word copy 2 MB/s, byte copy 231 MB/s, misaligned accesses are slow
[ 0.462628] smp: Brought up 1 node, 4 CPUs
[ 0.631464] cpu0: Unaligned word copy 2 MB/s, byte copy 229 MB/s, misaligned accesses are slow
btw, why the mixed usage of "unaligned" and "misaligned"?
Cheers,
Conor.