linux-kernel - Re: [PATCH v4 1/2] RISC-V: Probe for unaligned access speed

Message-ID: <CAMuHMdVtXGjP8VFMiv-7OMFz1XvfU1cz=Fw4jL3fcp4wO1etzQ@mail.gmail.com>
Date:   Wed, 13 Sep 2023 14:36:27 +0200
From:   Geert Uytterhoeven <geert@...ux-m68k.org>
To:     Evan Green <evan@...osinc.com>
Cc:     Palmer Dabbelt <palmer@...osinc.com>,
        Heiko Stuebner <heiko@...ech.de>, linux-doc@...r.kernel.org,
        Björn Töpel <bjorn@...osinc.com>,
        Conor Dooley <conor.dooley@...rochip.com>,
        Guo Ren <guoren@...nel.org>,
        Jisheng Zhang <jszhang@...nel.org>,
        linux-riscv@...ts.infradead.org, Jonathan Corbet <corbet@....net>,
        Sia Jee Heng <jeeheng.sia@...rfivetech.com>,
        Marc Zyngier <maz@...nel.org>,
        Masahiro Yamada <masahiroy@...nel.org>,
        Greentime Hu <greentime.hu@...ive.com>,
        Simon Hosie <shosie@...osinc.com>,
        Andrew Jones <ajones@...tanamicro.com>,
        Albert Ou <aou@...s.berkeley.edu>,
        Alexandre Ghiti <alexghiti@...osinc.com>,
        Ley Foon Tan <leyfoon.tan@...rfivetech.com>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Anup Patel <apatel@...tanamicro.com>,
        linux-kernel@...r.kernel.org,
        Xianting Tian <xianting.tian@...ux.alibaba.com>,
        David Laight <David.Laight@...lab.com>,
        Palmer Dabbelt <palmer@...belt.com>,
        Andy Chiu <andy.chiu@...ive.com>
Subject: Re: [PATCH v4 1/2] RISC-V: Probe for unaligned access speed

Hi Evan,

On Fri, Aug 18, 2023 at 9:44 PM Evan Green <evan@...osinc.com> wrote:
> Rather than deferring unaligned access speed determinations to a vendor
> function, let's probe them and find out how fast they are. If we
> determine that an unaligned word access is faster than N byte accesses,
> mark the hardware's unaligned access as "fast". Otherwise, we mark
> accesses as slow.
>
> The algorithm itself runs for a fixed amount of jiffies. Within each
> iteration it attempts to time a single loop, and then keeps only the best
> (fastest) loop it saw. This algorithm was found to have lower variance from
> run to run than my first attempt, which counted the total number of
> iterations that could be done in that fixed amount of jiffies. By taking
> only the best iteration in the loop, assuming at least one loop wasn't
> perturbed by an interrupt, we eliminate the effects of interrupts and
> other "warm up" factors like branch prediction. The only downside is it
> depends on having an rdtime granular and accurate enough to measure a
> single copy. If we ever manage to complete a loop in 0 rdtime ticks, we
> leave the unaligned setting at UNKNOWN.
>
> There is a slight change in user-visible behavior here. Previously, all
> boards except the THead C906 reported misaligned access speed of
> UNKNOWN. C906 reported FAST. With this change, since we're now measuring
> misaligned access speed on each hart, all RISC-V systems will have this
> key set as either FAST or SLOW.
>
> Currently, we don't have a way to confidently measure the difference between
> SLOW and EMULATED, so we label anything not fast as SLOW. This will
> mislabel some systems that are actually EMULATED as SLOW. When we get
> support for delegating misaligned access traps to the kernel (as opposed
> to the firmware quietly handling it), we can explicitly test in Linux to
> see if unaligned accesses trap. Those systems will start to report
> EMULATED, though older (today's) systems without that new SBI mechanism
> will continue to report SLOW.
>
> I've updated the documentation for those hwprobe values to reflect
> this, specifically: SLOW may or may not be emulated by software, and FAST
> means being faster than equivalent byte accesses. The change
> in documentation is accurate with respect to both the former and current
> behavior.
>
> Signed-off-by: Evan Green <evan@...osinc.com>
> Acked-by: Conor Dooley <conor.dooley@...rochip.com>

Thanks for your patch, which is now commit 584ea6564bcaead2 ("RISC-V:
Probe for unaligned access speed") in v6.6-rc1.
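
For readers following along, the best-of-N timing described in the quoted
commit message can be sketched roughly as below. This is a simplified
userspace model, not the code from the commit: the enum, the function names
and the idea of passing pre-collected tick samples in arrays are all
assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes, mirroring the UNKNOWN/SLOW/FAST labels above. */
enum probe_speed { PROBE_UNKNOWN, PROBE_SLOW, PROBE_FAST };

/*
 * Return the smallest sample, i.e. the "best (fastest) loop".
 * Returns 0 when there are no samples at all.
 */
uint64_t best_sample(const uint64_t *ticks, size_t n)
{
	uint64_t best = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (i == 0 || ticks[i] < best)
			best = ticks[i];
	return best;
}

/*
 * Compare the best unaligned-word copy against the best byte copy.
 * Keeping only the minimum discards iterations that were perturbed by
 * interrupts or warm-up effects.
 */
enum probe_speed classify(const uint64_t *word_ticks, size_t nword,
			  const uint64_t *byte_ticks, size_t nbyte)
{
	uint64_t word_best = best_sample(word_ticks, nword);
	uint64_t byte_best = best_sample(byte_ticks, nbyte);

	/* A 0-tick best means the timer was too coarse to time one copy. */
	if (word_best == 0 || byte_best == 0)
		return PROBE_UNKNOWN;

	return word_best < byte_best ? PROBE_FAST : PROBE_SLOW;
}
```

In this model a single unperturbed iteration is enough to get a usable
measurement, which matches the stated reason for preferring the minimum over
an iteration count.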

On the boards I have, I get:

    rzfive:
        cpu0: Ratio of byte access time to unaligned word access is 1.05, unaligned accesses are fast

    icicle:
        cpu1: Ratio of byte access time to unaligned word access is 0.00, unaligned accesses are slow
        cpu2: Ratio of byte access time to unaligned word access is 0.00, unaligned accesses are slow
        cpu3: Ratio of byte access time to unaligned word access is 0.00, unaligned accesses are slow
        cpu0: Ratio of byte access time to unaligned word access is 0.00, unaligned accesses are slow

    k210:
        cpu1: Ratio of byte access time to unaligned word access is 0.02, unaligned accesses are slow
        cpu0: Ratio of byte access time to unaligned word access is 0.02, unaligned accesses are slow

    starlight:
        cpu1: Ratio of byte access time to unaligned word access is 0.01, unaligned accesses are slow
        cpu0: Ratio of byte access time to unaligned word access is 0.02, unaligned accesses are slow

    vexriscv/orangecrab:
        cpu0: Ratio of byte access time to unaligned word access is 0.00, unaligned accesses are slow

I am a bit surprised by the near-zero values.  Are these expected?
Thanks!
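
To make the arithmetic behind such a two-decimal ratio concrete, here is a
small sketch. It assumes the ratio is computed with truncating integer math
as byte time * 100 / word time; that formula, the function name
format_ratio() and the sample tick counts are assumptions for illustration,
not taken from the patch.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format byte_ticks / word_ticks with two decimals using integer math only.
 * word_ticks must be nonzero. Returns the snprintf() result.
 */
int format_ratio(char *buf, size_t len, uint64_t byte_ticks,
		 uint64_t word_ticks)
{
	/* Ratio in hundredths, truncated toward zero. */
	uint64_t ratio = byte_ticks * 100 / word_ticks;

	return snprintf(buf, len, "%llu.%02llu",
			(unsigned long long)(ratio / 100),
			(unsigned long long)(ratio % 100));
}
```

Under that assumption, 0.02 corresponds to the unaligned word copy taking
roughly 50 times as long as the byte copy, and 0.00 to it taking more than
100 times as long.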

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@...ux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds