linux-kernel - RE: [PATCH v4 1/2] RISC-V: Probe for unaligned access speed

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <abdde70ac5b947508c8c71d72ec4f294@AcuMS.aculab.com>
Date:   Fri, 15 Sep 2023 07:57:22 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Evan Green' <evan@...osinc.com>
CC:     Geert Uytterhoeven <geert@...ux-m68k.org>,
        Palmer Dabbelt <palmer@...osinc.com>,
        Heiko Stuebner <heiko@...ech.de>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        Björn Töpel <bjorn@...osinc.com>,
        Conor Dooley <conor.dooley@...rochip.com>,
        Guo Ren <guoren@...nel.org>,
        Jisheng Zhang <jszhang@...nel.org>,
        "linux-riscv@...ts.infradead.org" <linux-riscv@...ts.infradead.org>,
        Jonathan Corbet <corbet@....net>,
        "Sia Jee Heng" <jeeheng.sia@...rfivetech.com>,
        Marc Zyngier <maz@...nel.org>,
        "Masahiro Yamada" <masahiroy@...nel.org>,
        Greentime Hu <greentime.hu@...ive.com>,
        "Simon Hosie" <shosie@...osinc.com>,
        Andrew Jones <ajones@...tanamicro.com>,
        "Albert Ou" <aou@...s.berkeley.edu>,
        Alexandre Ghiti <alexghiti@...osinc.com>,
        "Ley Foon Tan" <leyfoon.tan@...rfivetech.com>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Anup Patel <apatel@...tanamicro.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Xianting Tian <xianting.tian@...ux.alibaba.com>,
        Palmer Dabbelt <palmer@...belt.com>,
        "Andy Chiu" <andy.chiu@...ive.com>
Subject: RE: [PATCH v4 1/2] RISC-V: Probe for unaligned access speed

From: Evan Green
> Sent: 14 September 2023 17:37
> 
> On Thu, Sep 14, 2023 at 8:55 AM David Laight <David.Laight@...lab.com> wrote:
> >
> > From: Evan Green
> > > Sent: 14 September 2023 16:01
> > >
> > > On Thu, Sep 14, 2023 at 1:47 AM David Laight <David.Laight@...lab.com> wrote:
> > > >
> > > > From: Geert Uytterhoeven
> > > > > Sent: 14 September 2023 08:33
> > > > ...
> > > > > > >     rzfive:
> > > > > > >         cpu0: Ratio of byte access time to unaligned word access is
> > > > > > > 1.05, unaligned accesses are fast
> > > > > >
> > > > > > Hrm, I'm a little surprised to be seeing this number come out so close
> > > > > > to 1. If you reboot a few times, what kind of variance do you get on
> > > > > > this?
> > > > >
> > > > > Rock-solid at 1.05 (even with increased resolution: 1.05853 on 3 tries)
> > > >
> > > > Would that match zero overhead unless the access crosses a
> > > > cache line boundary?
> > > > (I can't remember whether the test is using increasing addresses.)
> > >
> > > Yes, the test does use increasing addresses, it copies across 4 pages.
> > > We start with a warmup, so caching effects beyond L1 are largely not
> > > taken into account.
> >
> > That seems entirely excessive.
> > If you want to avoid data cache issues (which probably do)
> > then just repeating a single access would almost certainly
> > suffice.
> > Repeatedly using a short buffer (say 256 bytes) won't add
> > much loop overhead.
> > Although you may want to do a test that avoids transfers
> > that cross cache line and especially page boundaries.
> > Either of those could easily be much slower than a read
> > that is entirely within a cache line.
> 
> We won't be faulting on any of these pages, and they should remain in
> the TLB, so I don't expect many page boundary specific effects. If
> there is a steep penalty for misaligned loads across a cache line,
> such that it's worse than doing byte accesses, I want the test results
> to be dinged for that.

That is an entirely different issue.

Are you absolutely certain that the reason 8 byte loads take
as long as a 64-bit mis-aligned load isn't because the entire
test is limited by L1 cache fills?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)