Message-ID: <aN8S-8HJFLjq6i2M@linux.dev>
Date: Thu, 2 Oct 2025 17:04:11 -0700
From: Oliver Upton <oliver.upton@...ux.dev>
To: Volodymyr Babchuk <Volodymyr_Babchuk@...m.com>
Cc: "linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>,
"kvmarm@...ts.linux.dev" <kvmarm@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Marc Zyngier <maz@...nel.org>, Joey Gouly <joey.gouly@....com>,
Suzuki K Poulose <suzuki.poulose@....com>,
Zenghui Yu <yuzenghui@...wei.com>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>,
Wei-Lin Chang <r09922117@...e.ntu.edu.tw>,
Christoffer Dall <christoffer.dall@....com>,
Alok Tiwari <alok.a.tiwari@...cle.com>
Subject: Re: [PATCH] KVM: arm64: nv: do not inject L2-bound IRQs to L1 hypervisor
Hi Volodymyr,
On Thu, Oct 02, 2025 at 09:00:11PM +0000, Volodymyr Babchuk wrote:
> The difference between nested virtualization and the "baremetal" case is
> that a real GIC can track lots of active interrupts simultaneously, but
> the vGIC is limited to only 4..16 LRs.
There isn't an architectural limitation here. Nothing prevents a
virtualized GIC from representing more active IRQs than there are LRs in
hardware.
ICH_HCR_EL2.LRENPIE allows you to take a trap when an EOI is received
for an IRQ that exists outside of the list registers, which would allow
the deactivation of the SW view of that IRQ.
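Roughly, for illustration (completely untested; field names follow
irqchip/arm-gic-v3.h, and vgic_sw_deactivate_highest_active() is a
made-up name, not an existing helper):

#include <linux/bitfield.h>
#include <linux/irqchip/arm-gic-v3.h>
#include <linux/kvm_host.h>
#include <asm/sysreg.h>

/*
 * Hypothetical helper: clear the SW active bit of the highest-priority
 * active IRQ that currently has no list register.
 */
void vgic_sw_deactivate_highest_active(struct kvm_vcpu *vcpu);

static void vgic_fold_lrenp_eois(struct kvm_vcpu *vcpu)
{
	u64 hcr = read_sysreg_s(SYS_ICH_HCR_EL2);
	int eois = FIELD_GET(ICH_HCR_EOIcount_MASK, hcr);

	/* EOIcount ticks once per EOI for a non-resident IRQ. */
	while (eois--)
		vgic_sw_deactivate_highest_active(vcpu);

	/* SW state has caught up; reset the counter. */
	write_sysreg_s(hcr & ~ICH_HCR_EOIcount_MASK, SYS_ICH_HCR_EL2);
}

That bounds the number of active IRQs the guest can observe by the SW
state rather than by the number of LRs.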
As Marc suggested, the correct thing to do is adjust the sorting of IRQs
such that pending IRQs fill LRs before those in an active state.
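Something along these lines, perhaps (an untested sketch of the
ordering rule, loosely modelled on vgic_irq_cmp() in
arch/arm64/kvm/vgic/vgic.c, with locking and multi-SGI handling
elided):

#include <linux/container_of.h>
#include <linux/list_sort.h>
#include "vgic.h"

static int vgic_irq_cmp_pending_first(void *priv,
				      const struct list_head *a,
				      const struct list_head *b)
{
	struct vgic_irq *irqa = container_of(a, struct vgic_irq, ap_list);
	struct vgic_irq *irqb = container_of(b, struct vgic_irq, ap_list);
	bool penda = irqa->enabled && irq_is_pending(irqa);
	bool pendb = irqb->enabled && irq_is_pending(irqb);

	/*
	 * A pending IRQ must reach an LR to fire at all; active-only
	 * state can live outside the LRs, so it loses the tie.
	 */
	if (penda != pendb)
		return pendb - penda;

	/* Otherwise sort by priority, lower value first. */
	return irqa->priority - irqb->priority;
}

With an ordering like that, the flush path spills active-only IRQs
when the LRs run out instead of starving pending ones.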
> diff --git a/arch/arm64/kvm/vgic/vgic-v3-nested.c b/arch/arm64/kvm/vgic/vgic-v3-nested.c
> index 7f1259b49c505..bdd1fb78e3682 100644
> --- a/arch/arm64/kvm/vgic/vgic-v3-nested.c
> +++ b/arch/arm64/kvm/vgic/vgic-v3-nested.c
> @@ -286,9 +286,13 @@ void vgic_v3_sync_nested(struct kvm_vcpu *vcpu)
> if (WARN_ON(!irq)) /* Shouldn't happen as we check on load */
> continue;
>
> + irq->targets_l2 = true;
> +
> lr = __gic_v3_get_lr(lr_map_idx_to_shadow_idx(shadow_if, i));
> - if (!(lr & ICH_LR_STATE))
> + if (!(lr & ICH_LR_STATE)) {
> irq->active = false;
> + irq->targets_l2 = false;
> + }
>
> vgic_put_irq(vcpu->kvm, irq);
> }
> diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
> index 6dd5a10081e27..6f6759a74569a 100644
> --- a/arch/arm64/kvm/vgic/vgic.c
> +++ b/arch/arm64/kvm/vgic/vgic.c
> @@ -858,6 +858,17 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
> break;
> }
>
> + /*
> + * If we are switching to the L1 hypervisor, only populate
> + * LRs with IRQs that specifically target it and are not
> + * targeting its L2 guest.
> + */
> + if (vcpu_has_nv(vcpu) && !vgic_state_is_nested(vcpu) &&
> + irq->targets_l2) {
> + raw_spin_unlock(&irq->irq_lock);
> + continue;
> + }
> +
> if (likely(vgic_target_oracle(irq) == vcpu)) {
> vgic_populate_lr(vcpu, irq, count++);
This has some serious issues as it violates our architectural model of
the GIC.
The existence of an LR associating a vIRQ to a pIRQ only has relevance
when in a nested state, specifically when handling vIRQ deactivation.
Besides that, it is just a number...
For example, imagine the guest hypervisor associated an EL1 timer IRQ
with an LR and later tried to clear the active state and re-arm the
timer. Seems like we would miss the IRQ entirely.
I'd really like to see a solution similar to Marc's proposal, one that
addresses the fundamental problem of active IRQs overflowing the list
registers.
Thanks,
Oliver