Message-ID: <CAMuHMdVtXGjP8VFMiv-7OMFz1XvfU1cz=Fw4jL3fcp4wO1etzQ@mail.gmail.com>
Date: Wed, 13 Sep 2023 14:36:27 +0200
From: Geert Uytterhoeven <geert@...ux-m68k.org>
To: Evan Green <evan@...osinc.com>
Cc: Palmer Dabbelt <palmer@...osinc.com>,
Heiko Stuebner <heiko@...ech.de>, linux-doc@...r.kernel.org,
Björn Töpel <bjorn@...osinc.com>,
Conor Dooley <conor.dooley@...rochip.com>,
Guo Ren <guoren@...nel.org>,
Jisheng Zhang <jszhang@...nel.org>,
linux-riscv@...ts.infradead.org, Jonathan Corbet <corbet@....net>,
Sia Jee Heng <jeeheng.sia@...rfivetech.com>,
Marc Zyngier <maz@...nel.org>,
Masahiro Yamada <masahiroy@...nel.org>,
Greentime Hu <greentime.hu@...ive.com>,
Simon Hosie <shosie@...osinc.com>,
Andrew Jones <ajones@...tanamicro.com>,
Albert Ou <aou@...s.berkeley.edu>,
Alexandre Ghiti <alexghiti@...osinc.com>,
Ley Foon Tan <leyfoon.tan@...rfivetech.com>,
Paul Walmsley <paul.walmsley@...ive.com>,
Anup Patel <apatel@...tanamicro.com>,
linux-kernel@...r.kernel.org,
Xianting Tian <xianting.tian@...ux.alibaba.com>,
David Laight <David.Laight@...lab.com>,
Palmer Dabbelt <palmer@...belt.com>,
Andy Chiu <andy.chiu@...ive.com>
Subject: Re: [PATCH v4 1/2] RISC-V: Probe for unaligned access speed

Hi Evan,

On Fri, Aug 18, 2023 at 9:44 PM Evan Green <evan@...osinc.com> wrote:
> Rather than deferring unaligned access speed determinations to a vendor
> function, let's probe them and find out how fast they are. If we
> determine that an unaligned word access is faster than N byte accesses,
> mark the hardware's unaligned access as "fast". Otherwise, we mark
> accesses as slow.
>
> The algorithm itself runs for a fixed amount of jiffies. Within each
> iteration it attempts to time a single loop, and then keeps only the best
> (fastest) loop it saw. This algorithm was found to have lower variance from
> run to run than my first attempt, which counted the total number of
> iterations that could be done in that fixed amount of jiffies. By taking
> only the best iteration in the loop, assuming at least one loop wasn't
> perturbed by an interrupt, we eliminate the effects of interrupts and
> other "warm up" factors like branch prediction. The only downside is it
> depends on having an rdtime granular and accurate enough to measure a
> single copy. If we ever manage to complete a loop in 0 rdtime ticks, we
> leave the unaligned setting at UNKNOWN.
>
> There is a slight change in user-visible behavior here. Previously, all
> boards except the THead C906 reported misaligned access speed of
> UNKNOWN. C906 reported FAST. With this change, since we're now measuring
> misaligned access speed on each hart, all RISC-V systems will have this
> key set as either FAST or SLOW.
>
> Currently, we don't have a way to confidently measure the difference between
> SLOW and EMULATED, so we label anything not fast as SLOW. This will
> mislabel some systems that are actually EMULATED as SLOW. When we get
> support for delegating misaligned access traps to the kernel (as opposed
> to the firmware quietly handling it), we can explicitly test in Linux to
> see if unaligned accesses trap. Those systems will start to report
> EMULATED, though older (today's) systems without that new SBI mechanism
> will continue to report SLOW.
>
> I've updated the documentation for those hwprobe values to reflect
> this, specifically: SLOW may or may not be emulated by software, and FAST
> means being faster than equivalent byte accesses. The change
> in documentation is accurate with respect to both the former and current
> behavior.
>
> Signed-off-by: Evan Green <evan@...osinc.com>
> Acked-by: Conor Dooley <conor.dooley@...rochip.com>

Thanks for your patch, which is now commit 584ea6564bcaead2 ("RISC-V:
Probe for unaligned access speed") in v6.6-rc1.
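
For reference, the measurement strategy described in the quoted commit
message (time one loop per iteration, keep only the fastest one seen) can
be sketched in user space roughly as follows. This is not the kernel code,
only an illustration under stated assumptions: now_ns(), best_time_ns(),
struct copy_job and copy_bytes() are hypothetical names, clock_gettime()
stands in for rdtime, and a nanosecond budget stands in for the fixed
number of jiffies.

```c
#define _POSIX_C_SOURCE 199309L
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for rdtime: monotonic clock in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Call fn(arg) repeatedly until budget_ns has elapsed and return the
 * shortest single-call duration observed. Keeping only the minimum
 * discards iterations that were perturbed by interrupts or warm-up
 * effects. A return value of 0 means the timer was too coarse to
 * measure one call, and the caller should treat the result as unknown.
 */
static uint64_t best_time_ns(void (*fn)(void *), void *arg, uint64_t budget_ns)
{
	uint64_t best = UINT64_MAX;
	uint64_t deadline = now_ns() + budget_ns;

	do {
		uint64_t start = now_ns();
		uint64_t delta;

		fn(arg);
		delta = now_ns() - start;
		if (delta < best)
			best = delta;
	} while (now_ns() < deadline);

	return best;
}

/* Hypothetical workload: copy a buffer one byte at a time. */
struct copy_job {
	unsigned char *dst;
	const unsigned char *src;
	size_t len;
};

static void copy_bytes(void *arg)
{
	struct copy_job *job = arg;
	volatile unsigned char *dst = job->dst;

	for (size_t i = 0; i < job->len; i++)
		dst[i] = job->src[i];
}
```

A caller would run best_time_ns() once with a byte-wise copy such as
copy_bytes() and once with a word-wise copy through a misaligned pointer,
then compare the two minima.
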
On the boards I have, I get:

    rzfive:
        cpu0: Ratio of byte access time to unaligned word access is 1.05, unaligned accesses are fast

    icicle:
        cpu1: Ratio of byte access time to unaligned word access is 0.00, unaligned accesses are slow
        cpu2: Ratio of byte access time to unaligned word access is 0.00, unaligned accesses are slow
        cpu3: Ratio of byte access time to unaligned word access is 0.00, unaligned accesses are slow
        cpu0: Ratio of byte access time to unaligned word access is 0.00, unaligned accesses are slow

    k210:
        cpu1: Ratio of byte access time to unaligned word access is 0.02, unaligned accesses are slow
        cpu0: Ratio of byte access time to unaligned word access is 0.02, unaligned accesses are slow

    starlight:
        cpu1: Ratio of byte access time to unaligned word access is 0.01, unaligned accesses are slow
        cpu0: Ratio of byte access time to unaligned word access is 0.02, unaligned accesses are slow

    vexriscv/orangecrab:
        cpu0: Ratio of byte access time to unaligned word access is 0.00, unaligned accesses are slow

I am a bit surprised by the near-zero values. Are these expected?
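
To make the numbers above easier to read, here is a small sketch of how
such a figure could be derived. It assumes (not verified against the
commit here) that the ratio is computed with integer math as hundredths
and truncated, and that "fast" means the unaligned word copy took less
time than the byte copy, as the patch description states. The names
ratio_hundredths(), word_is_fast() and print_ratio() are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Ratio of byte-copy time to word-copy time, in hundredths, truncated.
 * Precondition: word_ticks != 0 (a zero measurement means "unknown").
 */
static uint64_t ratio_hundredths(uint64_t byte_ticks, uint64_t word_ticks)
{
	return byte_ticks * 100 / word_ticks;
}

/* Unaligned word access counts as fast if it beat the byte-wise copy. */
static int word_is_fast(uint64_t byte_ticks, uint64_t word_ticks)
{
	return word_ticks < byte_ticks;
}

/* Print one line in the same shape as the boot messages quoted above. */
static void print_ratio(unsigned int cpu, uint64_t byte_ticks,
			uint64_t word_ticks)
{
	uint64_t ratio;

	if (byte_ticks == 0 || word_ticks == 0)
		return;	/* timer too coarse, leave the result unknown */

	ratio = ratio_hundredths(byte_ticks, word_ticks);
	printf("cpu%u: Ratio of byte access time to unaligned word access is %llu.%02llu, unaligned accesses are %s\n",
	       cpu,
	       (unsigned long long)(ratio / 100),
	       (unsigned long long)(ratio % 100),
	       word_is_fast(byte_ticks, word_ticks) ? "fast" : "slow");
}
```

Under that assumption, a printed 0.00 means the byte copy took less than
1% of the time of the unaligned word copy (for example 1 tick versus 250
ticks), and 0.02 means roughly 2%.
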

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@...ux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds