[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <75ce4c12-f0e3-32c4-604f-9745980022e0@arm.com>
Date: Wed, 2 Sep 2020 11:09:04 +0100
From: Steven Price <steven.price@....com>
To: Marc Zyngier <maz@...nel.org>
Cc: Keqian Zhu <zhukeqian1@...wei.com>, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.cs.columbia.edu,
kvm@...r.kernel.org, Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>,
James Morse <james.morse@....com>,
Suzuki K Poulose <suzuki.poulose@....com>,
wanghaibin.wang@...wei.com
Subject: Re: [RFC PATCH 0/5] KVM: arm64: Add pvtime LPT support
Hi Marc,
Sorry for the slow response, I've been on holiday.
On 22/08/2020 11:31, Marc Zyngier wrote:
> Hi Steven,
>
> On Wed, 19 Aug 2020 09:54:40 +0100,
> Steven Price <steven.price@....com> wrote:
>>
>> On 18/08/2020 15:41, Marc Zyngier wrote:
>>> On 2020-08-17 09:41, Keqian Zhu wrote:
[...]
>>>>
>>>> Things need concern:
>>>> 1. https://developer.arm.com/docs/den0057/a needs update.
>>>
>>> LPT was explicitly removed from the spec because it doesn't really
>>> solve the problem, specially for the firmware: EFI knows
>>> nothing about this, for example. How is it going to work?
>>> Also, nobody was ever able to explain how this would work for
>>> nested virt.
>>>
>>> ARMv8.4 and ARMv8.6 have the feature set that is required to solve
>>> this problem without adding more PV to the kernel.
>>
>> Hi Marc,
>>
>> These are good points, however we do still have the situation that
>> CPUs that don't have ARMv8.4/8.6 clearly cannot implement this. I
>> presume the use-case Keqian is looking at predates the necessary
>> support in the CPU - Keqian if you can provide more details on the
>> architecture(s) involved that would be helpful.
>
> My take on this is that it is a fictional use case. In my experience,
> migration happens across *identical* systems, and *any* difference
> visible to guests will cause things to go wrong. Errata management
> gets in the way, as usual (name *one* integration that isn't broken
> one way or another!).
Keqian appears to have a use case - but obviously I don't know the
details. I guess Keqian needs to convince you of that.
> Allowing migration across heterogeneous hosts requires a solution to
> the errata management problem, which everyone (including me) has
> decided to ignore so far (and I claim that not having a constant timer
> frequency exposed to guests is an architecture bug).
I agree - errata management needs to be solved before LPT. Between
restricted subsets of hosts this doesn't seem impossible, but I guess we
should stall LPT until a credible solution is proposed. I'm certainly
not proposing one at the moment.
>> Nested virt is indeed more of an issue - we did have some ideas around
>> using SDEI that never made it to the spec.
>
> SDEI? Sigh... Why would SDEI be useful for NV and not for !NV?
SDEI provides a way of injecting a synchronous exception on migration -
although that certainly isn't the only possible mechanism. For NV we
have the problem that a guest-guest may be running at the point of
migration. However it's not practical for the host hypervisor to provide
the necessary table directly to the guest-guest which means the
guest-hypervisor must update the tables before the guest-guest is
allowed to run on the new host. The only plausible route I could see for
this is injecting a synchronous exception into the guest (per VCPU) to
ensure any guest-guests running are exited at migration time.
!NV is easier because we don't have to worry about multiple levels of
para-virtualisation.
>> However I would argue that the most pragmatic approach would be to
>> not support the combination of nested virt and LPT. Hopefully that
>> can wait until the counter scaling support is available and not
>> require PV.
>
> And have yet another set of band aids that paper over the fact that we
> can't get a consistent story on virtualization? No, thank you.
>
> NV is (IMHO) much more important than LPT as it has a chance of
> getting used. LPT is just another tick box, and the fact that ARM is
> ready to ignore sideline a decent portion of the architecture is a
> clear sign that it hasn't been thought out.
Different people have different priorities. NV is definitely important
for many people. LPT may also be important if you've already got a bunch
of VMs running on machines and you want to be able to (gradually)
replace them with newer hosts which happen to have a different clock
frequency. Those VMs running now clearly aren't using NV.
However, I have to admit it's not me that has the use-case, so I'll
leave it for others who might actually know the specifics to explain the
details.
>> We are discussing (re-)releasing the spec with the LPT parts added. If
>> you have fundamental objections then please me know.
>
> I do, see above. I'm stating that the use case doesn't really exist
> given the state of the available HW and the fragmentation of the
> architecture, and that ignoring the most important innovation in the
> virtualization architecture since ARMv7 is at best short-sighted.
>
> Time scaling is just an instance of the errata management problem, and
> that is the issue that needs solving. Papering over part of the
> problem is not helping.
I fully agree - errata management is definitely the first step that
needs solving. This is why I abandoned LPT originally because I don't
have a generic solution and the testing I did involved really ugly hacks
just to make the migration possible.
For now I propose we (again) park LPT until some progress has been made
on errata management.
Thanks,
Steve
Powered by blists - more mailing lists