Message-ID: <CAMuHMdUG3SUVPJHSiNyQNzyqxiJpczUHhBxHN7YqEDcaWYwkFA@mail.gmail.com>
Date: Thu, 19 Oct 2023 08:37:09 +0200
From: Geert Uytterhoeven <geert@...ux-m68k.org>
To: "Lad, Prabhakar" <prabhakar.mahadev-lad.rj@...renesas.com>
Cc: Palmer Dabbelt <palmer@...osinc.com>,
Heiko Stuebner <heiko@...ech.de>, linux-doc@...r.kernel.org,
Björn Töpel <bjorn@...osinc.com>,
Conor Dooley <conor.dooley@...rochip.com>,
Guo Ren <guoren@...nel.org>,
Jisheng Zhang <jszhang@...nel.org>,
linux-riscv@...ts.infradead.org, Jonathan Corbet <corbet@....net>,
Sia Jee Heng <jeeheng.sia@...rfivetech.com>,
Marc Zyngier <maz@...nel.org>,
Masahiro Yamada <masahiroy@...nel.org>,
Greentime Hu <greentime.hu@...ive.com>,
Simon Hosie <shosie@...osinc.com>,
Andrew Jones <ajones@...tanamicro.com>,
Albert Ou <aou@...s.berkeley.edu>,
Alexandre Ghiti <alexghiti@...osinc.com>,
Ley Foon Tan <leyfoon.tan@...rfivetech.com>,
Paul Walmsley <paul.walmsley@...ive.com>,
Anup Patel <apatel@...tanamicro.com>,
linux-kernel@...r.kernel.org,
Xianting Tian <xianting.tian@...ux.alibaba.com>,
David Laight <David.Laight@...lab.com>,
Palmer Dabbelt <palmer@...belt.com>,
Andy Chiu <andy.chiu@...ive.com>,
Evan Green <evan@...osinc.com>
Subject: Re: [PATCH v4 1/2] RISC-V: Probe for unaligned access speed

Hi Prabhakar,

On Thu, Sep 14, 2023 at 9:32 AM Geert Uytterhoeven <geert@...ux-m68k.org> wrote:
> On Wed, Sep 13, 2023 at 7:46 PM Evan Green <evan@...osinc.com> wrote:
> > On Wed, Sep 13, 2023 at 5:36 AM Geert Uytterhoeven <geert@...ux-m68k.org> wrote:
> > > On Fri, Aug 18, 2023 at 9:44 PM Evan Green <evan@...osinc.com> wrote:
> > > > Rather than deferring unaligned access speed determinations to a vendor
> > > > function, let's probe them and find out how fast they are. If we
> > > > determine that an unaligned word access is faster than N byte accesses,
> > > > mark the hardware's unaligned access as "fast". Otherwise, we mark
> > > > accesses as slow.
> > > >
> > > > The algorithm itself runs for a fixed amount of jiffies. Within each
> > > > iteration it attempts to time a single loop, and then keeps only the best
> > > > (fastest) loop it saw. This algorithm was found to have lower variance from
> > > > run to run than my first attempt, which counted the total number of
> > > > iterations that could be done in that fixed amount of jiffies. By taking
> > > > only the best iteration in the loop, assuming at least one loop wasn't
> > > > perturbed by an interrupt, we eliminate the effects of interrupts and
> > > > other "warm up" factors like branch prediction. The only downside is it
> > > > depends on having an rdtime granular and accurate enough to measure a
> > > > single copy. If we ever manage to complete a loop in 0 rdtime ticks, we
> > > > leave the unaligned setting at UNKNOWN.
> > > >
> > > > There is a slight change in user-visible behavior here. Previously, all
> > > > boards except the THead C906 reported misaligned access speed of
> > > > UNKNOWN. C906 reported FAST. With this change, since we're now measuring
> > > > misaligned access speed on each hart, all RISC-V systems will have this
> > > > key set as either FAST or SLOW.
> > > >
> > > > Currently, we don't have a way to confidently measure the difference between
> > > > SLOW and EMULATED, so we label anything not fast as SLOW. This will
> > > > mislabel some systems that are actually EMULATED as SLOW. When we get
> > > > support for delegating misaligned access traps to the kernel (as opposed
> > > > to the firmware quietly handling it), we can explicitly test in Linux to
> > > > see if unaligned accesses trap. Those systems will start to report
> > > > EMULATED, though older (today's) systems without that new SBI mechanism
> > > > will continue to report SLOW.
> > > >
> > > > I've updated the documentation for those hwprobe values to reflect
> > > > this, specifically: SLOW may or may not be emulated by software, and FAST
> > > > means being faster than equivalent byte accesses. The change
> > > > in documentation is accurate with respect to both the former and current
> > > > behavior.
> > > >
> > > > Signed-off-by: Evan Green <evan@...osinc.com>
> > > > Acked-by: Conor Dooley <conor.dooley@...rochip.com>
> > >
> > > Thanks for your patch, which is now commit 584ea6564bcaead2 ("RISC-V:
> > > Probe for unaligned access speed") in v6.6-rc1.
> > >
> > > On the boards I have, I get:
> > >
> > > rzfive:
> > > cpu0: Ratio of byte access time to unaligned word access is
> > > 1.05, unaligned accesses are fast
> >
> > Hrm, I'm a little surprised to be seeing this number come out so close
> > to 1. If you reboot a few times, what kind of variance do you get on
> > this?
>
> Rock-solid at 1.05 (even with increased resolution: 1.05853 on 3 tries)

After upgrading the firmware from [1] to [2], this changed to
"0.00, unaligned accesses are slow".

[1] RZ-Five-ETH
    U-Boot 2020.10-g611c657e43 (Aug 26 2022 - 11:29:06 +0100)

[2] OpenSBI v1.3-75-g3cf0ea4
    U-Boot 2023.01-00209-g1804c8ab17 (Oct 04 2023 - 13:18:01 +0100)
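
For anyone reading along, here is a rough sketch of the decision logic as the quoted commit message describes it: keep only the fastest timed loop for each access type, leave the result at UNKNOWN if a loop took 0 ticks, and report FAST only when the unaligned word copy beats the byte copy. This is not the kernel code. All names below (best_sample, classify, ratio_hundredths, the SPEED_* values) are hypothetical, and the tick samples are assumed to have been collected elsewhere.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels mirroring the UNKNOWN/SLOW/FAST values in the thread. */
enum misaligned_speed { SPEED_UNKNOWN, SPEED_SLOW, SPEED_FAST };

/* Keep only the fastest (smallest) timed loop; returns 0 if n == 0. */
static uint64_t best_sample(const uint64_t *ticks, size_t n)
{
	uint64_t best = 0;

	for (size_t i = 0; i < n; i++)
		if (i == 0 || ticks[i] < best)
			best = ticks[i];
	return best;
}

/*
 * A loop that completed in 0 ticks means the timer was too coarse,
 * so no decision is made. Otherwise unaligned word accesses are
 * "fast" only if they beat the equivalent byte accesses.
 */
static enum misaligned_speed classify(uint64_t word_ticks, uint64_t byte_ticks)
{
	if (word_ticks == 0 || byte_ticks == 0)
		return SPEED_UNKNOWN;
	return word_ticks < byte_ticks ? SPEED_FAST : SPEED_SLOW;
}

/* Ratio of byte time to word time in hundredths (105 reads as "1.05"). */
static uint64_t ratio_hundredths(uint64_t word_ticks, uint64_t byte_ticks)
{
	return word_ticks ? (byte_ticks * 100) / word_ticks : 0;
}
```

With made-up best times of 100 ticks (word) and 105 ticks (byte), this gives a ratio of 1.05 and "fast". If the word copy is several hundred times slower than the byte copy, the integer ratio rounds down to 0.00 and the result is "slow".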
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@...ux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds