Message-ID: <mhng-9D9CB730-A22F-43E2-A012-D51EF3C1E027@palmerdabbelt-mac>
Date: Thu, 07 Aug 2025 10:26:29 -0700 (PDT)
From: Palmer Dabbelt <palmer@...belt.com>
To: Marc Zyngier <maz@...nel.org>
CC: Catalin Marinas <catalin.marinas@....com>, Mark Rutland <mark.rutland@....com>,
Will Deacon <will@...nel.org>, oliver.upton@...ux.dev, james.morse@....com, cohuck@...hat.com,
anshuman.khandual@....com, palmerdabbelt@...a.com, lpieralisi@...nel.org, kevin.brodsky@....com,
scott@...amperecomputing.com, broonie@...nel.org, james.clark@...aro.org, yeoreum.yun@....com,
joey.gouly@....com, huangxiaojia2@...wei.com, yebin10@...wei.com,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] arm64: Expose CPUECTLR{,2}_EL1 via sysfs
On Thu, 07 Aug 2025 01:08:26 PDT (-0700), Marc Zyngier wrote:
> On Wed, 06 Aug 2025 20:48:13 +0100,
> Palmer Dabbelt <palmer@...belt.com> wrote:
>>
>> From: Palmer Dabbelt <palmerdabbelt@...a.com>
>>
>> We've found that some of our workloads run faster when some of these
>> fields are set to non-default values on some of the systems we're trying
>> to run those workloads on. This allows us to set those values via
>> sysfs, so we can do workload/system-specific tuning.
>>
>> Signed-off-by: Palmer Dabbelt <palmerdabbelt@...a.com>
>> ---
>> I've only really smoke tested this, but I figured I'd send it along because I'm
>> not sure if this is even a sane thing to be doing -- these extended control
>> registers have some wacky stuff in them, so maybe they're not exposed to
>> userspace on purpose. IIUC firmware can gate these writes, though, so it
>> should be possible for vendors to forbid the really scary values.
>
> That's really wrong.
>
> For a start, these encodings fall into the IMPDEF range. They won't
> exist on non-ARM implementations.

OK, and that's because the manual page starts with "Provides additional
IMPLEMENTATION DEFINED configuration and control options for the
processor."? Sorry, I'm fairly new to reading the Arm specs -- I thought
only the meaning of the values varied between implementations, but I
probably just didn't read enough of them.

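For reference, a minimal sketch of what "falls into the IMPDEF range"
means as I understand the architecture: system register encodings with
op0 == 3 and CRn == 11 or 15 are reserved for IMPLEMENTATION DEFINED
registers. The struct and helper names below are hypothetical and only
for illustration; none of this is from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* The five fields that select an AArch64 system register in MRS/MSR. */
struct sysreg_enc {
	unsigned int op0, op1, crn, crm, op2;
};

/* Hypothetical helper: build an encoding, checking each field's width. */
static inline struct sysreg_enc sysreg_enc_make(unsigned int op0,
						unsigned int op1,
						unsigned int crn,
						unsigned int crm,
						unsigned int op2)
{
	assert(op0 < 4 && op1 < 8 && crn < 16 && crm < 16 && op2 < 8);
	return (struct sysreg_enc){ op0, op1, crn, crm, op2 };
}

/*
 * True if the encoding lies in the space reserved for IMPLEMENTATION
 * DEFINED registers: op0 == 3 with CRn == 11 or CRn == 15.
 */
static inline bool sysreg_is_impdef(struct sysreg_enc e)
{
	return e.op0 == 3 && (e.crn == 11 || e.crn == 15);
}
```

With this, an encoding such as S3_0_C15_C1_4 (which I believe is where
some Arm cores put CPUECTLR_EL1, but treat that as an assumption and
check the core's TRM) is classified as IMPDEF, while an architected
register like SCTLR_EL1 (S3_0_C1_C0_0) is not.
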
> Next, this will catch fire as a guest, either because the hypervisor
> will simply refuse to entertain letting it access registers that have
> no definition, or because the VM has been migrated from one
> implementation to another, and you have no idea what this is doing on the
> new target.

Ya, makes sense.

>> That said, we do see some performance improvements here on real workloads. So
>> we're hoping to roll some of this tuning work out more widely, but we also
>> don't want to adopt some internal interface. Thus it'd make our lives easier
>> if we could twiddle these bits in a standard way.
>
> Honestly, this is the sort of bring-up stuff that is better kept in
> your private sandbox, and doesn't really help in general.

So we're not doing bringup (or at least not what I'd call bringup)
here; the theory is that we get better performance on different
workloads with different tunings. It's all still a little early, but
if the data holds we'd want to set these based on which workload is
running (i.e., not just one static tuning for everything).

That said, part of the reason I sent this as-is is that I was sort of
expecting the answer to be "no" here. No big deal if that's the case,
we can figure out some other way to solve the problem. Happy to put
some time into making a more generic flavor of this, though...

> Thanks,
>
> M.