Message-ID: <4e2b87897485e38e251c447b9ad70eb6@kernel.org>
Date: Tue, 24 Nov 2020 08:26:55 +0000
From: Marc Zyngier <maz@...nel.org>
To: Shenming Lu <lushenming@...wei.com>
Cc: James Morse <james.morse@....com>,
Julien Thierry <julien.thierry.kdev@...il.com>,
Suzuki K Poulose <suzuki.poulose@....com>,
Eric Auger <eric.auger@...hat.com>,
linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.cs.columbia.edu,
kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
Christoffer Dall <christoffer.dall@....com>,
Alex Williamson <alex.williamson@...hat.com>,
Kirti Wankhede <kwankhede@...dia.com>,
Cornelia Huck <cohuck@...hat.com>, Neo Jia <cjia@...dia.com>,
wanghaibin.wang@...wei.com, yuzenghui@...wei.com
Subject: Re: [RFC PATCH v1 2/4] KVM: arm64: GICv4.1: Try to save hw pending state in save_pending_tables
On 2020-11-24 07:40, Shenming Lu wrote:
> On 2020/11/23 17:18, Marc Zyngier wrote:
>> On 2020-11-23 06:54, Shenming Lu wrote:
>>> After pausing all vCPUs and devices capable of interrupting, in order
>> ^^^^^^^^^^^^^^^^^
>> See my comment below about this.
>>
>>> to save the information of all interrupts, besides flushing the
>>> pending states in KVM's vgic, we also try to flush the states of
>>> VLPIs in the virtual pending tables into guest RAM, but we need
>>> to have GICv4.1 and safely unmap the vPEs first.
>>>
>>> Signed-off-by: Shenming Lu <lushenming@...wei.com>
>>> ---
>>>  arch/arm64/kvm/vgic/vgic-v3.c | 62 +++++++++++++++++++++++++++++++----
>>>  1 file changed, 56 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
>>> index 9cdf39a94a63..e1b3aa4b2b12 100644
>>> --- a/arch/arm64/kvm/vgic/vgic-v3.c
>>> +++ b/arch/arm64/kvm/vgic/vgic-v3.c
>>> @@ -1,6 +1,8 @@
>>> // SPDX-License-Identifier: GPL-2.0-only
>>>
>>> #include <linux/irqchip/arm-gic-v3.h>
>>> +#include <linux/irq.h>
>>> +#include <linux/irqdomain.h>
>>> #include <linux/kvm.h>
>>> #include <linux/kvm_host.h>
>>> #include <kvm/arm_vgic.h>
>>> @@ -356,6 +358,39 @@ int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, struct vgic_irq *irq)
>>>  	return 0;
>>>  }
>>>
>>> +/*
>>> + * With GICv4.1, we can get the VLPI's pending state after unmapping
>>> + * the vPE. The deactivation of the doorbell interrupt will trigger
>>> + * the unmapping of the associated vPE.
>>> + */
>>> +static void get_vlpi_state_pre(struct vgic_dist *dist)
>>> +{
>>> +	struct irq_desc *desc;
>>> +	int i;
>>> +
>>> +	if (!kvm_vgic_global_state.has_gicv4_1)
>>> +		return;
>>> +
>>> +	for (i = 0; i < dist->its_vm.nr_vpes; i++) {
>>> +		desc = irq_to_desc(dist->its_vm.vpes[i]->irq);
>>> +		irq_domain_deactivate_irq(irq_desc_get_irq_data(desc));
>>> +	}
>>> +}
>>> +
>>> +static void get_vlpi_state_post(struct vgic_dist *dist)
>>
>> nit: the naming feels a bit... odd. Pre/post what?
>
> My understanding is that the unmapping is a preparation for
> get_vlpi_state...
> Maybe just call it unmap/map_all_vpes?
Yes, much better.
[...]
>>> +	if (irq->hw) {
>>> +		WARN_RATELIMIT(irq_get_irqchip_state(irq->host_irq,
>>> +				IRQCHIP_STATE_PENDING, &is_pending),
>>> +			       "IRQ %d", irq->host_irq);
>>
>> Isn't this going to warn like mad on a GICv4.0 system where this,
>> by definition, will generate an error?
>
> As we have returned an error in save_its_tables if hw && !has_gicv4_1,
> we don't have to warn about this here?
Are you referring to the check in vgic_its_save_itt() that occurs in
patch 4? Fair enough, though I think the use of irq_get_irqchip_state()
isn't quite what we want, as per my comments on patch #1.
>>
>>> +	}
>>> +
>>> +	if (stored == is_pending)
>>>  		continue;
>>>
>>> -	if (irq->pending_latch)
>>> +	if (is_pending)
>>>  		val |= 1 << bit_nr;
>>>  	else
>>>  		val &= ~(1 << bit_nr);
>>>
>>>  	ret = kvm_write_guest_lock(kvm, ptr, &val, 1);
>>>  	if (ret)
>>> -		return ret;
>>> +		goto out;
>>>  }
>>> -	return 0;
>>> +
>>> +out:
>>> +	get_vlpi_state_post(dist);
>>
>> This bit worries me: you have unmapped the VPEs, so any interrupt
>> that has been generated during that phase is now forever lost (the
>> GIC doesn't have ownership of the pending tables).
>
> In my opinion, during this phase, the devices capable of interrupting
> should have already been paused (prevented from sending interrupts);
> for example, the VFIO migration protocol already implements this.
Is that a hard guarantee? Pausing devices *may* be possible for a
limited set of endpoints, but I'm not sure it is universally possible
to restart them and expect a consistent state (you have just dropped
a bunch of network packets on the floor...).
>> Do you really expect the VM to be restartable from that point?
>> I don't see how this is possible.
>>
>
> If the migration has encountered an error, the src VM might be
> restarted, so we have to map the vPEs back.
As I said above, I doubt it is universally possible to do so, but after
all, this probably isn't worse than restarting on the target...
M.
--
Jazz is not dead. It just smells funny...