linux-kernel - Re: [RFC PATCH 0/5] RISC-V: Add dynamic TSO support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <ZWVFe4SzeGMLD4wj@gmail.com>
Date:   Mon, 27 Nov 2023 20:42:19 -0500
From:   Guo Ren <guoren@...nel.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Christoph Muellner <christoph.muellner@...ll.eu>,
        linux-riscv@...ts.infradead.org, linux-kselftest@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
        Palmer Dabbelt <palmer@...belt.com>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Albert Ou <aou@...s.berkeley.edu>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Shuah Khan <shuah@...nel.org>,
        Jonathan Corbet <corbet@....net>,
        Anup Patel <apatel@...tanamicro.com>,
        Philipp Tomsich <philipp.tomsich@...ll.eu>,
        Andrew Jones <ajones@...tanamicro.com>,
        Daniel Henrique Barboza <dbarboza@...tanamicro.com>,
        Conor Dooley <conor.dooley@...rochip.com>,
        Björn Töpel <bjorn@...osinc.com>,
        Alan Stern <stern@...land.harvard.edu>,
        Andrea Parri <parri.andrea@...il.com>,
        Will Deacon <will@...nel.org>,
        Daniel Lustig <dlustig@...dia.com>
Subject: Re: [RFC PATCH 0/5] RISC-V: Add dynamic TSO support

On Mon, Nov 27, 2023 at 12:16:43PM +0100, Peter Zijlstra wrote:
> On Fri, Nov 24, 2023 at 09:51:53PM -0500, Guo Ren wrote:
> > On Fri, Nov 24, 2023 at 11:15:19AM +0100, Peter Zijlstra wrote:
> > > On Fri, Nov 24, 2023 at 08:21:37AM +0100, Christoph Muellner wrote:
> > > > From: Christoph Müllner <christoph.muellner@...ll.eu>
> > > > 
> > > > The upcoming RISC-V Ssdtso specification introduces a bit in the senvcfg
> > > > CSR to switch the memory consistency model at run-time from RVWMO to TSO
> > > > (and back). The active consistency model can therefore be switched on a
> > > > per-hart base and managed by the kernel on a per-process/thread base.
> > > 
> > > You guys, computers are hartless, nobody told ya?
> > > 
> > > > This patch implements basic Ssdtso support and adds a prctl API on top
> > > > so that user-space processes can switch to a stronger memory consistency
> > > > model (than the kernel was written for) at run-time.
> > > > 
> > > > I am not sure if other architectures support switching the memory
> > > > consistency model at run-time, but designing the prctl API in an
> > > > arch-independent way allows reusing it in the future.
> > > 
> > > IIRC some Sparc chips could do this, but I don't think anybody ever
> > > exposed this to userspace (or used it much).
> > > 
> > > IA64 had planned to do this, except they messed it up and did it the
> > > wrong way around (strong first and then relax it later), which lead to
> > > the discovery that all existing software broke (d'uh).
> > > 
> > > I think ARM64 approached this problem by adding the
> > > load-acquire/store-release instructions and for TSO based code,
> > > translate into those (eg. x86 -> arm64 transpilers).
> 
> > Keeping global TSO order is easier and faster than mixing
> > acquire/release and regular load/store. That means when ssdtso is
> > enabled, the transpiler's load-acquire/store-release becomes regular
> > load/store. Some micro-arch hardwares could speed up the performance.
> 
> Why is it faster? Because the release+acquire thing becomes RcSC instead
> of RcTSO? Surely that can be fixed with a weaker store-release variant
> ot something?
The "ld.acq + st.rel" could only be close to the ideal RCtso because
maintaining "ld.acq + st.rel + ld + st" is more complex in LSU than "ld
+ st" by global TSO.  So, that is why we want a global TSO flag to
simplify the micro-arch implementation, especially for some small
processors in the big-little system.

> 
> The problem I have with all of this is that you need to context switch
> this state and that you need to deal with exceptions, which must be
> written for the weak model but then end up running in the tso model --
> possibly slower than desired.
The s-mode TSO is useless for the riscv Linux kernel and this patch only
uses u-mode TSO. So, the exception handler and the whole kernel always
run in WMO.

Two years ago, we worried about stuff like io_uring, which means
io_uring userspace is in TSO, but the kernel side is in WMO. But it
still seems like no problem because every side has a different
implementation, but they all ensure their order. So, there should be no
problem between TSO & WMO io_uring communication. The only things we
need to prevent are:
1. Do not let the WMO code run in TSO mode, which is inefficient. (you mentioned)
2. Do not let the TSO code run in WMO mode, which is incorrect.

> If OTOH you only have a single model, everything becomes so much
> simpler. You just need to be able to express exactly what you want.
The ssdtso is no harm to the current WMO; it's just a tradeoff for
micro-arch implementation. You still could use "ld + st" are "ld.acq +
st.rl", but they are the same in the global tso state.

> 
> 
>