Message-Id: <1456758285-25060-1-git-send-email-joro@8bytes.org>
Date:	Mon, 29 Feb 2016 16:04:42 +0100
From:	Joerg Roedel <joro@...tes.org>
To:	Paolo Bonzini <pbonzini@...hat.com>, Gleb Natapov <gleb@...nel.org>
Cc:	kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
	Joerg Roedel <joro@...tes.org>
Subject: [PATCH 0/3] KVM: Fix lost IRQ acks for RTC

Hi,

here is a small patch-set to fix a race condition which
happens when an RTC-IRQ is migrated to another VCPU while it
is being handled by the guest.

The RTC-EOI handling in KVM requires that all interrupt
messages sent to the VCPUs are acked before another RTC-IRQ
can be sent. When an EOI signal from the guest is lost, the
guest will never see an RTC interrupt again (until it
reboots).

This is easily reproducible with a Linux guest executing
this loop:

	$ while true;do time hwclock --show --test --debug;done

When the guest has multiple vcpus and the RTC-IRQ is
regularly migrated (e.g. by irqbalance), the race condition
will be hit after some time and the hwclock tool will fail
with:

	select() to /dev/rtc to wait for clock tick timed out...synchronization failed

The race condition happens because of the way the EOI
backtracking between local APIC and IOAPIC works in KVM. The
destination VCPU and vector are part of the IOAPIC state.
When the guest sends an EOI to the local APIC, the vector is
matched against the destinations stored in the IOAPIC and
ACKed there too if it matches.

The problem begins when a VCPU handles an RTC interrupt and
at the same time another VCPU migrates the RTC-IRQ away from
that VCPU. This updates the IOAPIC state in KVM to
the new destination, so that the EOI sent from the first
VCPU does not match anymore in the IOAPIC, hence losing the
RTC-EOI.
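
To make the sequence concrete, here is a minimal user-space
sketch of the race. It is a toy model and not the KVM code:
struct toy_ioapic and the toy_* helpers are hypothetical
names, and matching on both VCPU and vector is a
simplification of the real lookup.

```c
#include <stdbool.h>

/* Toy model, not KVM code: a single RTC redirection entry. */
struct toy_ioapic {
	int dest_vcpu;    /* destination currently stored in the entry */
	int vector;       /* vector currently stored in the entry */
	int pending_eoi;  /* RTC interrupts sent but not yet acked */
};

/* Send an RTC-IRQ; refused while an earlier one is still unacked. */
static bool toy_deliver_rtc(struct toy_ioapic *io)
{
	if (io->pending_eoi > 0)
		return false;
	io->pending_eoi = 1;
	return true;
}

/* Guest reprograms the entry, e.g. irqbalance moving the IRQ. */
static void toy_migrate(struct toy_ioapic *io, int new_vcpu, int new_vector)
{
	io->dest_vcpu = new_vcpu;
	io->vector = new_vector;
}

/* An EOI is matched against the *current* entry only. */
static void toy_eoi(struct toy_ioapic *io, int vcpu, int vector)
{
	if (io->dest_vcpu == vcpu && io->vector == vector)
		io->pending_eoi = 0;
}
```

In this model, calling toy_deliver_rtc(), then toy_migrate()
to another VCPU and vector, then toy_eoi() with the old VCPU
and vector leaves pending_eoi at 1, so every later
toy_deliver_rtc() is refused.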

This patch-set fixes the race-condition by adding explicit
back-tracking information for RTC-IRQs. The rtc_status
struct already holds a dest_map bitmap to store which VCPUs
received an RTC-IRQ. This is extended to also hold the
vector that was sent to this VCPU.

This information is then used to match EOI signals from the
guest to the RTC. This explicit back-tracking fixes the
issue.
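
The idea can be sketched in user-space C as follows. This is
only an illustration under simplifying assumptions, not the
code from the patches: TOY_MAX_VCPUS, the toy_* types and
the helper functions are hypothetical, and a bool array
stands in for the kernel bitmap.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_VCPUS 8  /* hypothetical limit for this sketch */

/* Per-VCPU back-tracking data recorded when the RTC-IRQ is sent. */
struct toy_dest_map {
	bool    sent[TOY_MAX_VCPUS];    /* VCPUs that received the RTC-IRQ */
	uint8_t vectors[TOY_MAX_VCPUS]; /* vector that was sent to each VCPU */
};

struct toy_rtc_status {
	int pending_eoi;
	struct toy_dest_map dest_map;
};

/* Record that an RTC-IRQ with this vector went to this VCPU. */
static void toy_rtc_sent(struct toy_rtc_status *s, int vcpu, uint8_t vector)
{
	if (vcpu < 0 || vcpu >= TOY_MAX_VCPUS)
		return;
	s->dest_map.sent[vcpu] = true;
	s->dest_map.vectors[vcpu] = vector;
	s->pending_eoi++;
}

/*
 * Match an EOI against what was recorded at send time, not against
 * the current IOAPIC entry. Returns true if the EOI acked an RTC-IRQ.
 */
static bool toy_rtc_eoi(struct toy_rtc_status *s, int vcpu, uint8_t vector)
{
	if (vcpu < 0 || vcpu >= TOY_MAX_VCPUS)
		return false;
	if (!s->dest_map.sent[vcpu] || s->dest_map.vectors[vcpu] != vector)
		return false;
	s->dest_map.sent[vcpu] = false;
	s->pending_eoi--;
	return true;
}
```

Because the match uses the recorded VCPU and vector, an EOI
from the original VCPU is still accepted in this model even
if the IRQ has been migrated in the meantime.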

Regards,

	Joerg

Joerg Roedel (3):
  kvm: x86: Convert ioapic->rtc_status.dest_map to a struct
  kvm: x86: Track irq vectors in ioapic->rtc_status.dest_map
  kvm: x86: Check dest_map->vector to match eoi signals for rtc

 arch/x86/kvm/ioapic.c   | 30 +++++++++++++++++++++---------
 arch/x86/kvm/ioapic.h   | 17 +++++++++++++++--
 arch/x86/kvm/irq_comm.c |  2 +-
 arch/x86/kvm/lapic.c    | 14 ++++++++------
 arch/x86/kvm/lapic.h    |  7 +++++--
 5 files changed, 50 insertions(+), 20 deletions(-)

-- 
1.9.1
