[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240323-28943722feb57a41fb0ff488@orel>
Date: Sat, 23 Mar 2024 10:35:38 +0100
From: Andrew Jones <ajones@...tanamicro.com>
To: Deepak Gupta <debug@...osinc.com>
Cc: Samuel Holland <samuel.holland@...ive.com>,
Conor Dooley <conor@...nel.org>, Palmer Dabbelt <palmer@...belt.com>,
linux-riscv@...ts.infradead.org, devicetree@...r.kernel.org,
Catalin Marinas <catalin.marinas@....com>, linux-kernel@...r.kernel.org, tech-j-ext@...ts.risc-v.org,
kasan-dev@...glegroups.com, Evgenii Stepanov <eugenis@...gle.com>,
Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>, Rob Herring <robh+dt@...nel.org>, Guo Ren <guoren@...nel.org>,
Heiko Stuebner <heiko@...ech.de>, Paul Walmsley <paul.walmsley@...ive.com>
Subject: Re: [RISC-V] [tech-j-ext] [RFC PATCH 5/9] riscv: Split per-CPU and
per-thread envcfg bits
On Fri, Mar 22, 2024 at 10:13:48AM -0700, Deepak Gupta wrote:
> On Thu, Mar 21, 2024 at 5:13 PM Samuel Holland
> <samuel.holland@...ive.com> wrote:
> >
> > On 2024-03-19 11:39 PM, Deepak Gupta wrote:
> > >>>> --- a/arch/riscv/include/asm/switch_to.h
> > >>>> +++ b/arch/riscv/include/asm/switch_to.h
> > >>>> @@ -69,6 +69,17 @@ static __always_inline bool has_fpu(void) { return false; }
> > >>>> #define __switch_to_fpu(__prev, __next) do { } while (0)
> > >>>> #endif
> > >>>>
> > >>>> +static inline void sync_envcfg(struct task_struct *task)
> > >>>> +{
> > >>>> + csr_write(CSR_ENVCFG, this_cpu_read(riscv_cpu_envcfg) | task->thread.envcfg);
> > >>>> +}
> > >>>> +
> > >>>> +static inline void __switch_to_envcfg(struct task_struct *next)
> > >>>> +{
> > >>>> + if (riscv_cpu_has_extension_unlikely(smp_processor_id(), RISCV_ISA_EXT_XLINUXENVCFG))
> > >>>
> > >>> I've seen `riscv_cpu_has_extension_unlikely` generating branchy code
> > >>> even if ALTERNATIVES was turned on.
> > >>> Can you check disasm on your end as well. IMHO, `entry.S` is a better
> > >>> place to pick up *envcfg.
> > >>
> > >> The branchiness is sort of expected, since that function is implemented by
> > >> switching on/off a branch instruction, so the alternate code is necessarily a
> > >> separate basic block. It's a tradeoff so we don't have to write assembly code
> > >> for every bit of code that depends on an extension. However, the cost should be
> > >> somewhat lowered since the branch is unconditional and so entirely predictable.
> > >>
> > >> If the branch turns out to be problematic for performance, then we could use
> > >> ALTERNATIVE directly in sync_envcfg() to NOP out the CSR write.
> > >
> > > Yeah I lean towards using alternatives directly.
> >
> > One thing to note here: we can't use alternatives directly if the behavior needs
> > to be different on different harts (i.e. a subset of harts implement the envcfg
> > CSR). I think we need some policy about which ISA extensions are allowed to be
> > asymmetric across harts, or else we add too much complexity.
>
> As I've responded on the same thread . We are adding too much
> complexity by assuming
> that heterogeneous ISA exists (which it doesn't today). And even if it
> exists, it wouldn't work.
> Nobody wants to spend a lot of time figuring out which harts have
> which ISA and which
> packages are compiled with which ISA. Most of the end users do `sudo
> apt get install blah blah`
> And then expect it to just work.
That will still work if the applications and libraries installed are
heterogeneous-platform aware, i.e. they do the figuring out which harts
have which extensions themselves. Applications/libraries should already
be probing for ISA extensions before using them. It's not a huge leap to
also check which harts support those extensions and then ensure affinity
is set appropriately.
> It doesn't work for other
> architectures and even when someone
> tried, they had to disable certain ISA features to make sure that all
> cores have the same ISA feature
> (search AVX12 Intel Alder Lake Disable).
The RISC-V software ecosystem is still being developed. We have an
opportunity to drop assumptions made by other architectures.
As I said in a different reply, it's reasonable for Linux to not add the
complexity until a use case comes along that Linux would like to support,
but I think it would be premature for Linux to put a stake in the sand.
So, how about we add code that confirms Zicboz is on all harts. If any
hart does not have it, then we complain loudly and disable it on all
the other harts. If it was just a hardware description bug, then it'll
get fixed. If there's actually a platform which doesn't have Zicboz
on all harts, then, when the issue is reported, we can decide to not
support it, support it with defconfig, or support it under a Kconfig
guard which must be enabled by the user.
Thanks,
drew
Powered by blists - more mailing lists