[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <5daac1da-9196-4330-b604-5e2073d43f19@linaro.org>
Date: Wed, 27 Nov 2024 10:08:07 +0000
From: James Clark <james.clark@...aro.org>
To: Oliver Upton <oliver.upton@...ux.dev>
Cc: suzuki.poulose@....com, coresight@...ts.linaro.org,
kvmarm@...ts.linux.dev, Marc Zyngier <maz@...nel.org>,
Joey Gouly <joey.gouly@....com>, Zenghui Yu <yuzenghui@...wei.com>,
Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
Mike Leach <mike.leach@...aro.org>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Mark Rutland <mark.rutland@....com>,
Anshuman Khandual <anshuman.khandual@....com>,
"Rob Herring (Arm)" <robh@...nel.org>, Shiqi Liu <shiqiliu@...t.edu.cn>,
Fuad Tabba <tabba@...gle.com>, James Morse <james.morse@....com>,
Mark Brown <broonie@...nel.org>, Raghavendra Rao Ananta
<rananta@...gle.com>, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v7 11/12] KVM: arm64: Swap TRFCR on guest switch
On 26/11/2024 4:23 pm, Oliver Upton wrote:
> On Thu, Nov 21, 2024 at 12:50:10PM +0000, James Clark wrote:
>>
>>
>> On 20/11/2024 5:31 pm, Oliver Upton wrote:
>>> On Tue, Nov 12, 2024 at 10:37:10AM +0000, James Clark wrote:
>>>> +void kvm_set_trfcr(u64 host_trfcr, u64 guest_trfcr)
>>>> +{
>>>> + if (kvm_arm_skip_trace_state())
>>>> + return;
>>>> +
>>>> + if (has_vhe())
>>>> + write_sysreg_s(guest_trfcr, SYS_TRFCR_EL12);
>>>> + else
>>>> + if (host_trfcr != guest_trfcr) {
>>>> + *host_data_ptr(host_debug_state.trfcr_el1) = guest_trfcr;
>>>
>>> Huh? That's going into host_debug_state, which is the dumping grounds
>>> for *host* context when entering a guest.
>>>
>>> Not sure why we'd stick a *guest* value in there...
>>>
>>
>> Only to save a 3rd storage place for trfcr when just the register and one
>> place is technically enough. But yes if it's more readable to have
>> guest_trfcr_el1 separately then that makes sense.
>
> Yeah, since this is all per-cpu data at this point rather than per-vCPU,
> it isn't the end of the world to use a few extra bytes.
>
>> That works, it would be nice to have it consistent and have it that way for
>> filtering, like kvm_set_guest_trace_filters(bool kernel, bool user). But I
>> suppose we can justify not doing it there because we're not really
>> interpreting the TRFCR value just writing it whole.
>
> Agreed, the biggest thing I'd want to see in the exported interfaces
> like this is to have enable/disable helpers to tell KVM when a driver
> wants KVM to start/stop managing a piece of state while in a guest.
>
> Then the hypervisor code can blindly save/restore some opaque values to
> whatever registers it needs to update.
>
>>> What if trace is disabled in the guest or in the host? Do we need to
>>> synchronize when transitioning from an enabled -> disabled state like we
>>> do today?
>>>
>>
>> By synchronize do you mean the tsb_csync()? I can only see it being
>> necessary for the TRBE case because then writing to the buffer is fatal.
>> Without TRBE the trace sinks still work and the boundary of when exactly
>> tracing is disabled in the kernel isn't critical.
>
> Ack, I had the blinders on that we cared only about TRBE here.
>
>>> I took a stab at this, completely untested of course && punts on
>>> protected mode. But this is _generally_ how I'd like to see everything
>>> fit together.
>>>
>>
>> Would you expect to see the protected mode stuff ignored if I sent another
>> version more like yours below? Or was that just skipped to keep the example
>> shorter?
>
> Skipped since I slapped this together in a hurry.
>
>> I think I'm a bit uncertain on that one because removing HAS_TRBE means you
>> can't check if TRBE is enabled or not in protected mode and it will go wrong
>> if it is.
>
> The protected mode hypervisor will need two bits of information.
> Detecting that the feature is present can be done in the kernel so long
> as the corresponding static key / cpucap is toggled before we drop
> privileges.
>
> Whether or not it is programmable + enabled is a decision that must be
> made by observing hardware state from the hypervisor before entering a
> guest.
>
> [...]
>
>>> +void kvm_enable_trbe(u64 guest_trfcr)
>>> +{
>>> + if (WARN_ON_ONCE(preemptible()))
>>> + return;
>>> +
>>> + if (has_vhe()) {
>>> + write_sysreg_s(guest_trfcr, SYS_TRFCR_EL12);
>>> + return;
>>> + }
>>> +
>>> + *host_data_ptr(guest_trfcr_el1) = guest_trfcr;
>>> + host_data_set_flag(HOST_TRBE_ENABLED);
>>
>> FWIW TRBE and TRF are separate features, so this wouldn't do the filtering
>> correctly if TRBE wasn't in use, but I can split it out into
>> separate kvm_enable_trbe(void) and kvm_set_guest_filters(u64 guest_trfcr).
>
> KVM manages the same piece of state (TRFCR_EL1) either way though right?
>
> The expectation I had is that KVM is informed any time a trace session
> (TRBE or otherwise) is enabled/disabled on a CPU, likely with a TRFCR_EL1
> of 0 if guest mode is excluded.
>
> The function names might need massaging, but I was hoping to have a
> single set of enable/disable knobs to cover all bases here.
>
I sent another version, it did come out much simpler and still does all
the same things as before.
I didn't manage to make a single enable/disable knob though. The thing
is the filtering is set on the source side of the driver and trbe is a
sink thing. I would have to couple them together and add knowledge of
the sink type to the source to make it work.
That would then open up the possibility for anyone adding a new source
to get the trbe bit wrong in the future. Having KVM override the filter
setting when trbe is in use seems a lot safer and easier to understand.
Powered by blists - more mailing lists