Message-ID: <87frblxx3b.wl-maz@kernel.org>
Date: Tue, 14 Oct 2025 17:32:56 +0100
From: Marc Zyngier <maz@...nel.org>
To: Kunkun Jiang <jiangkunkun@...wei.com>
Cc: Oliver Upton <oliver.upton@...ux.dev>,
	Joey Gouly <joey.gouly@....com>,
	Suzuki K Poulose <suzuki.poulose@....com>,
	Zenghui Yu <yuzenghui@...wei.com>,
	Catalin Marinas <catalin.marinas@....com>,
	Will Deacon <will@...nel.org>,
	"moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)"
	<linux-arm-kernel@...ts.infradead.org>,
	"open list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" <kvmarm@...ts.linux.dev>,
	open list <linux-kernel@...r.kernel.org>,
	"wanghaibin.wang@...wei.com" <wanghaibin.wang@...wei.com>
Subject: Re: [Question] Received vtimer interrupt but ISTATUS is 0

On Tue, 14 Oct 2025 15:45:37 +0100,
Kunkun Jiang <jiangkunkun@...wei.com> wrote:
> 
> Hi all,
> 
> I'm having a very strange problem that can be simplified to a vtimer
> interrupt being received but ISTATUS is 0. Why does this happen?
> According to analysis, it may be that the timer condition is met and
> the interrupt is generated. Maybe some action (cancel timer?) is then
> done in the VM, ISTATUS becomes 0 and the hardware needs to clear the
> interrupt. But the clear command is sent too slowly, and the OS has
> already read ICC_IAR_EL1. So the hypervisor executes
> kvm_arch_timer_handler but ISTATUS is 0.

If what you describe is accurate, and the HW takes so long to retire
the timer interrupt that we cannot trust having taken an interrupt,
how long until we can trust that what we have is actually correct?

Given that it takes a full exit from the guest before we can handle
the interrupt, I am rather puzzled that you observe this sort of bad
behaviour on modern HW. You either have an insanely fast CPU with a
very slow GIC, or a very bizarre machine (a bit like a ThunderX -- not
a compliment).

How does it work when context-switching from a vcpu that has a pending
timer interrupt to one that doesn't? Do you also see spurious
interrupts?

> The code flow is as follows:
> kvm_arch_timer_handler
>     ->if (kvm_timer_should_fire)
>         ->the value of SYS_CNTV_CTL is 0b001(ISTATUS=0,IMASK=0,ENABLE=1)
>     ->return IRQ_HANDLED
> 
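For reference, the quoted register value can be decoded with a few
lines of C. This is only a sketch: the bit positions follow the
architectural CNTV_CTL_EL0 layout (ENABLE is bit 0, IMASK is bit 1,
ISTATUS is bit 2), but the macro names and the helper
timer_ctl_asserts_irq() are made up for illustration and are not the
kernel's own definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* CNTV_CTL_EL0 control bits (architectural layout, illustrative names). */
#define TIMER_CTL_ENABLE  (UINT64_C(1) << 0)
#define TIMER_CTL_IMASK   (UINT64_C(1) << 1)
#define TIMER_CTL_ISTATUS (UINT64_C(1) << 2)

/*
 * Hypothetical helper: true when the timer is enabled, its condition
 * is met (ISTATUS) and its output is not masked (IMASK).
 */
static inline bool timer_ctl_asserts_irq(uint64_t ctl)
{
	return (ctl & TIMER_CTL_ENABLE) &&
	       (ctl & TIMER_CTL_ISTATUS) &&
	       !(ctl & TIMER_CTL_IMASK);
}
```

With the reported value 0b001 (0x1) the helper returns false, which
matches the "interrupt taken, but nothing to inject" situation
described in the report.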
> Because ISTATUS is 0, kvm_timer_update_irq will not be executed to
> inject this interrupt into the VM. Since EOImode is 1 and the vtimer
> interrupt has the IRQD_FORWARDED_TO_VCPU flag, the hypervisor will not
> write ICC_DIR_EL1 to deactivate the interrupt. This interrupt remains
> in the active state, blocking subsequent interrupts from being
> processed. Fortunately, kvm_timer_vcpu_load will determine again
> whether an interrupt needs to be injected into the VM. But the delay
> will definitely increase.

Right, so you are at most a context switch away from your next
interrupt, just like in the !vcpu case. While not ideal, that's not
fatal.
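
To make the "stuck active" state concrete, here is a toy model of a
single interrupt under EOImode=1, where EOI only drops the priority and
a separate deactivation is needed before the same interrupt can be
acknowledged again. It is a deliberately simplified sketch, not GIC or
KVM code, and every name in it (toy_irq, toy_ack, ...) is hypothetical.

```c
#include <stdbool.h>

/* Simplified per-interrupt states, loosely following the GIC model. */
enum toy_irq_state {
	TOY_INACTIVE,
	TOY_PENDING,
	TOY_ACTIVE,
	TOY_ACTIVE_PENDING,
};

struct toy_irq {
	enum toy_irq_state state;
};

/* The source asserts its line: the interrupt gains a pending state. */
static inline void toy_assert_line(struct toy_irq *irq)
{
	if (irq->state == TOY_INACTIVE)
		irq->state = TOY_PENDING;
	else if (irq->state == TOY_ACTIVE)
		irq->state = TOY_ACTIVE_PENDING;
}

/* Acknowledge: only a pending (and not active) interrupt is delivered. */
static inline bool toy_ack(struct toy_irq *irq)
{
	if (irq->state != TOY_PENDING)
		return false;
	irq->state = TOY_ACTIVE;
	return true;
}

/* EOImode=1: EOI is a priority drop only, the state does not change. */
static inline void toy_eoi_priority_drop(struct toy_irq *irq)
{
	(void)irq;
}

/* Explicit deactivation removes the active state. */
static inline void toy_deactivate(struct toy_irq *irq)
{
	if (irq->state == TOY_ACTIVE)
		irq->state = TOY_INACTIVE;
	else if (irq->state == TOY_ACTIVE_PENDING)
		irq->state = TOY_PENDING;
}
```

In this model, an interrupt that was acknowledged and only
priority-dropped cannot be acknowledged a second time, even if its line
is asserted again, until toy_deactivate() runs. In the real system that
deactivation is normally left to the guest, or happens on the next
vcpu load as noted above.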

> 
> What I want to discuss is the solution to this problem. My solution is
> to add a deactivation action:
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index dbd74e4885e2..46baba531d51 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -228,8 +228,13 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
>         else
>                 ctx = map.direct_ptimer;
> 
> -       if (kvm_timer_should_fire(ctx))
> +       if (kvm_timer_should_fire(ctx)) {
>                 kvm_timer_update_irq(vcpu, true, ctx);
> +       } else {
> +               struct vgic_irq *irq;
> +               irq = vgic_get_vcpu_irq(vcpu, timer_irq(timer_ctx));
> +               gic_write_dir(irq->hwintid);
> +       }
> 
>         if (userspace_irqchip(vcpu->kvm) &&
>             !static_branch_unlikely(&has_gic_active_state))
> 
> If you have any new ideas or other solutions to this problem, please
> let me know.

That's not right.

For a start, this is GICv3 specific, and will break on everything
else. Also, why the round-trip via the vgic_irq when you already have
the interrupt number that has fired *as a parameter*?

Finally, this breaks with NV, as you could have switched between EL1
and EL2 timers, and since you cannot trust you are in the correct
interrupt context (interrupt firing out of context), you can't trust
irq->hwintid either, as the mappings will have changed.

Something like the patchlet below should do the trick, but I'm
definitely not happy about this sort of sorry hack.

	M.

diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index dbd74e4885e24..3db7c6bdffbc0 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -206,6 +206,13 @@ static void soft_timer_cancel(struct hrtimer *hrt)
 	hrtimer_cancel(hrt);
 }
 
+static void set_timer_irq_phys_active(struct arch_timer_context *ctx, bool active)
+{
+	int r;
+	r = irq_set_irqchip_state(ctx->host_timer_irq, IRQCHIP_STATE_ACTIVE, active);
+	WARN_ON(r);
+}
+
 static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
 {
 	struct kvm_vcpu *vcpu = *(struct kvm_vcpu **)dev_id;
@@ -230,6 +237,8 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
 
 	if (kvm_timer_should_fire(ctx))
 		kvm_timer_update_irq(vcpu, true, ctx);
+	else
+		set_timer_irq_phys_active(ctx, false);
 
 	if (userspace_irqchip(vcpu->kvm) &&
 	    !static_branch_unlikely(&has_gic_active_state))
@@ -659,13 +668,6 @@ static void timer_restore_state(struct arch_timer_context *ctx)
 	local_irq_restore(flags);
 }
 
-static inline void set_timer_irq_phys_active(struct arch_timer_context *ctx, bool active)
-{
-	int r;
-	r = irq_set_irqchip_state(ctx->host_timer_irq, IRQCHIP_STATE_ACTIVE, active);
-	WARN_ON(r);
-}
-
 static void kvm_timer_vcpu_load_gic(struct arch_timer_context *ctx)
 {
 	struct kvm_vcpu *vcpu = ctx->vcpu;
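
The control flow the patchlet aims for can be exercised outside the
kernel with a small mock. This is an illustration only: mock_timer_ctx
and mock_timer_handler() are hypothetical stand-ins, with plain
booleans replacing kvm_timer_should_fire(), kvm_timer_update_irq() and
the irq_set_irqchip_state() call.

```c
#include <stdbool.h>

struct mock_timer_ctx {
	bool should_fire;	/* stands in for kvm_timer_should_fire() */
	bool phys_active;	/* active state of the host timer interrupt */
	bool injected;		/* set when the interrupt is passed to the guest */
};

static inline void mock_timer_handler(struct mock_timer_ctx *ctx)
{
	/* The host has acknowledged the interrupt, so it is active. */
	ctx->phys_active = true;

	if (ctx->should_fire)
		ctx->injected = true;		/* the guest deactivates it later */
	else
		ctx->phys_active = false;	/* spurious: clear the active state */
}
```

In the spurious case the mock leaves the interrupt inactive rather
than waiting for the next vcpu load, which is the only behavioural
change the patchlet is meant to introduce.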

-- 
Jazz isn't dead. It just smells funny.
