Message-ID: <CAJF2gTSELHo6mSUB9ODvrOkDOtbT85waC9O7T7DUoF3MVZEseQ@mail.gmail.com>
Date: Sat, 20 Sep 2025 13:59:56 +0800
From: Guo Ren <guoren@...nel.org>
To: Andrea Parri <parri.andrea@...il.com>
Cc: Xu Lu <luxu.kernel@...edance.com>, robh@...nel.org, krzk+dt@...nel.org,
conor+dt@...nel.org, paul.walmsley@...ive.com, palmer@...belt.com,
aou@...s.berkeley.edu, alex@...ti.fr, ajones@...tanamicro.com,
brs@...osinc.com, devicetree@...r.kernel.org, linux-riscv@...ts.infradead.org,
linux-kernel@...r.kernel.org, apw@...onical.com, joe@...ches.com
Subject: Re: [PATCH v2 0/4] riscv: Add Zalasr ISA extension support

On Fri, Sep 19, 2025 at 5:24 AM Andrea Parri <parri.andrea@...il.com> wrote:
>
> [merging replies]
>
> > > I prefer option c) at first; it has fewer modifications and less influence.
> > Another reason is that store-release-to-load-acquire would give out a
> > FENCE rw, rw according to RVWMO PPO 7th rule instead of FENCE.TSO, which
> > is stricter than the Linux requirement you've mentioned.
>
> I mean, if "fewer modification" and "not-a-full-fence" were the only
> arguments, we would probably just stick with the current scheme (b),
> right? What other arguments are available? Don't get me wrong: no a
> priori objection from my end; I was really just wondering about the
> various interests/rationales in the RISC-V kernel community. (It may
> surprise you, but some communities did consider that "UNLOCK+LOCK is
> not a full memory barrier" a disadvantage, because briefly "locking
> should provide strong ordering guarantees and be easy to reason about";
> in fact, not just "locking" if we consider x86 or arm64...)
At the microarchitectural level, ld.aq really is faster than
"ld + fence r, rw". I am not concerned about the performance of the
"UNLOCK+LOCK" scenario.
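For readers following along, the two acquire-load shapes being compared
can be sketched in portable C11 atomics. This is only an illustration,
not the kernel's implementation: the helper names are hypothetical, and
the instruction mapping in the comments (ld.aq for the annotated form,
ld + fence r, rw for the fenced form) is an assumption about how a
Zalasr-aware RISC-V toolchain might lower each one.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: acquire expressed as a single annotated load.
 * Assumption: with Zalasr available this may lower to "ld.aq". */
static inline uint64_t load_acquire_annotated(_Atomic uint64_t *p)
{
    return atomic_load_explicit(p, memory_order_acquire);
}

/* Hypothetical helper: relaxed load followed by an acquire fence.
 * Assumption: on RISC-V this typically lowers to "ld; fence r, rw". */
static inline uint64_t load_acquire_fenced(_Atomic uint64_t *p)
{
    uint64_t v = atomic_load_explicit(p, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    return v;
}

/* Single-threaded sanity check: both forms return the stored value. */
static inline void check_both_forms(void)
{
    _Atomic uint64_t x = 42;
    assert(load_acquire_annotated(&x) == 42);
    assert(load_acquire_fenced(&x) == 42);
}
```

Both helpers give acquire semantics to the caller; the difference under
discussion is only in which instructions implement it and how they pair
with the release side.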
>
>
> > > asm volatile(ALTERNATIVE("fence rw, w;\t\nsb %0, 0(%1)\t\n", \
> > > - SB_RL(%0, %1) "\t\nnop\t\n", \
> > > + SB_RL(%0, %1) "\t\n fence.tso;\t\n", \
> > > 0, RISCV_ISA_EXT_ZALASR, 1) \
> > > : : "r" (v), "r" (p) : "memory"); \
>
> nit: Why place the fence after the store? I imagine that FENCE.TSO
> could precede the store, this way, the store would actually not need
> to have that .RL annotation. More importantly,
Yes, fence.tso is stricter than fence rw, w: on top of that ordering
it also provides the fence r, r barrier.
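As a rough illustration of that comparison, each fence can be modeled as
the set of (earlier access, later access) pairs it orders, following the
predecessor/successor sets in the RISC-V FENCE definition. The bit-flag
encoding and all names below are hypothetical, chosen only for this
sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Ordering pairs (earlier access -> later access) as bit flags. */
enum {
    RR = 1 << 0, /* load  -> load  */
    RW = 1 << 1, /* load  -> store */
    WR = 1 << 2, /* store -> load  */
    WW = 1 << 3, /* store -> store */
};

/* Pairs ordered by each fence in this simplified model. */
static const unsigned FENCE_RW_W  = RW | WW;           /* fence rw, w  */
static const unsigned FENCE_R_RW  = RR | RW;           /* fence r, rw  */
static const unsigned FENCE_TSO   = RR | RW | WW;      /* fence.tso    */
static const unsigned FENCE_RW_RW = RR | RW | WR | WW; /* fence rw, rw */

/* True if fence a orders every pair that fence b orders. */
static inline bool at_least_as_strong(unsigned a, unsigned b)
{
    return (a & b) == b;
}

/* Pairs ordered by fence a but not by fence b. */
static inline unsigned extra_pairs(unsigned a, unsigned b)
{
    return a & ~b;
}

/* Checks the claims made in the thread against the model. */
static inline void check_fence_model(void)
{
    assert(at_least_as_strong(FENCE_TSO, FENCE_RW_W));
    assert(extra_pairs(FENCE_TSO, FENCE_RW_W) == RR);    /* the extra r, r */
    assert(!at_least_as_strong(FENCE_TSO, FENCE_RW_RW)); /* no store->load */
}
```

In this model fence.tso covers everything fence rw, w does plus
load-to-load ordering, while still leaving store-to-load unordered,
which is why it is weaker than a full fence rw, rw.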
>
> That's for (part of) smp_store_release(). Let me stress that my option
> (c) was meant to include similar changes for _every_ release (that is,
> cmpxchg_release(), atomic_inc_return_release(), and many others), even
> if most of such releases do not currently create "problematic pairs"
> with a corresponding acquire: the concern is that future changes in the
> RISC-V atomics implementation or in generic locking code will introduce
> pairs of the form FENCE RW,W + .AQ or .RL + FENCE R,RW without anyone
> noticing... In other words, I was suggesting that RISC-V _continues_
> to meet the ordering property under discussion "by design" rather than
> "by Andrea or whoever's code auditing/review" (assuming it's feasible,
> i.e. that it doesn't clash with other people's arguments?); options (a)
> and (b) were also "designed" following this same criterion.
>
> Andrea
--
Best Regards
Guo Ren