lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 8 Feb 2023 12:45:10 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Rob Herring' <robh@...nel.org>, Evan Green <evan@...osinc.com>
CC:     Palmer Dabbelt <palmer@...osinc.com>,
        Conor Dooley <conor@...nel.org>,
        "vineetg@...osinc.com" <vineetg@...osinc.com>,
        "heiko@...ech.de" <heiko@...ech.de>,
        "slewis@...osinc.com" <slewis@...osinc.com>,
        Albert Ou <aou@...s.berkeley.edu>,
        Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
        Palmer Dabbelt <palmer@...belt.com>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        "devicetree@...r.kernel.org" <devicetree@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-riscv@...ts.infradead.org" <linux-riscv@...ts.infradead.org>
Subject: RE: [PATCH v2 4/6] dt-bindings: Add RISC-V misaligned access
 performance

From: Rob Herring
> Sent: 07 February 2023 17:06
> 
> On Mon, Feb 06, 2023 at 12:14:53PM -0800, Evan Green wrote:
> > From: Palmer Dabbelt <palmer@...osinc.com>
> >
> > This key allows device trees to specify the performance of misaligned
> > accesses to main memory regions from each CPU in the system.
> >
> > Signed-off-by: Palmer Dabbelt <palmer@...osinc.com>
> > Signed-off-by: Evan Green <evan@...osinc.com>
> > ---
> >
> > (no changes since v1)
> >
> >  Documentation/devicetree/bindings/riscv/cpus.yaml | 15 +++++++++++++++
> >  1 file changed, 15 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml
> b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > index c6720764e765..2c09bd6f2927 100644
> > --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> > +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > @@ -85,6 +85,21 @@ properties:
> >      $ref: "/schemas/types.yaml#/definitions/string"
> >      pattern: ^rv(?:64|32)imaf?d?q?c?b?v?k?h?(?:_[hsxz](?:[a-z])+)*$
> >
> > +  riscv,misaligned-access-performance:
> > +    description:
> > +      Identifies the performance of misaligned memory accesses to main memory
> > +      regions.  There are three flavors of unaligned access performance: "emulated"
> > +      means that misaligned accesses are emulated via software and thus
> > +      extremely slow, "slow" means that misaligned accesses are supported by
> > +      hardware but still slower that aligned accesses sequences, and "fast"
> > +      means that misaligned accesses are as fast or faster than the
> > +      cooresponding aligned accesses sequences.
> > +    $ref: "/schemas/types.yaml#/definitions/string"
> > +    enum:
> > +      - emulated
> > +      - slow
> > +      - fast
> 
> I don't think this belongs in DT. (I'm not sure about a userspace
> interface either.)
> 
> Can't this be tested and determined at runtime? Do misaligned accesses
> and compare the performance. We already do this for things like memcpy
> or crypto implementation selection.

There is also an long discussion about misaligned accesses
for loooongarch.

Basically if you want to run a common kernel (and userspace)
you have to default to compiling everything with -mno-stict-align
so that the compiler generates byte accesses for anything
marked 'packed' (etc).

Run-time tests can optimise some hot-spots.

In any case 'slow' is probably pointless - unless the accesses
take more than 1 or 2 extra cycles.

Oh, and you really never, ever want to emulate them.

Technically misaligned reads on (some) x86-64 cpu are slower
than aligned ones, but the difference is marginal.
I've measured two 64bit misaligned reads every clock.
But it is consistently slower by much less than one clock
per cache line.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ