Message-ID: <20250228085115.105648-1-Neeraj.Upadhyay@amd.com>
Date: Fri, 28 Feb 2025 14:20:56 +0530
From: Neeraj Upadhyay <Neeraj.Upadhyay@....com>
To: <seanjc@...gle.com>, <pbonzini@...hat.com>, <kvm@...r.kernel.org>
CC: <linux-kernel@...r.kernel.org>, <bp@...en8.de>, <tglx@...utronix.de>,
	<mingo@...hat.com>, <dave.hansen@...ux.intel.com>, <Thomas.Lendacky@....com>,
	<nikunj@....com>, <Santosh.Shukla@....com>, <Vasant.Hegde@....com>,
	<Suravee.Suthikulpanit@....com>, <David.Kaplan@....com>, <x86@...nel.org>,
	<hpa@...or.com>, <peterz@...radead.org>, <huibo.wang@....com>,
	<naveen.rao@....com>, <binbin.wu@...ux.intel.com>, <isaku.yamahata@...el.com>
Subject: [RFC PATCH 00/19] AMD: Add Secure AVIC KVM Support

Introduction
------------

Secure AVIC is a new hardware feature in the AMD64 architecture that
allows SEV-SNP guests to prevent the hypervisor from generating
unexpected interrupts to a vCPU or otherwise violating architectural
assumptions around APIC behavior.

One of the significant differences from AVIC or emulated x2APIC is that
Secure AVIC uses a guest-owned and managed APIC backing page. It also
introduces additional fields in both the VMCB and the Secure AVIC backing
page to aid the guest in limiting which interrupt vectors can be injected
into the guest.

Guest APIC Backing Page
-----------------------
Each vCPU has a guest-allocated APIC backing page of size 4K, which
maintains APIC state for that vCPU. The x2APIC MSRs are mapped at
their corresponding x2APIC MMIO offset within the guest APIC backing
page. All x2APIC accesses by the guest or by Secure AVIC hardware
operate on this backing page. The backing page should be pinned, and
its NPT entry should always be mapped while the corresponding vCPU is
running.
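
As an illustration of the MSR-to-offset mapping described above, here
is a minimal sketch. It relies only on the architectural x2APIC layout
(MSRs start at 0x800 and each MSR index corresponds to one 16-byte
MMIO register slot). The function and macro names are hypothetical and
are not taken from this series.

```c
#include <stdint.h>

#define X2APIC_MSR_BASE  0x800u  /* first x2APIC MSR (architectural) */
#define X2APIC_MSR_LAST  0x8ffu  /* last MSR in the x2APIC range */

/*
 * Translate an x2APIC MSR index into the byte offset of the matching
 * register in the 4K backing page. MMIO registers are 16 bytes apart,
 * so MSR (0x800 + n) corresponds to offset (n << 4).
 * Returns -1 for an MSR outside the x2APIC range.
 *
 * Hypothetical helper, for illustration only.
 */
static inline int savic_msr_to_offset(uint32_t msr)
{
	if (msr < X2APIC_MSR_BASE || msr > X2APIC_MSR_LAST)
		return -1;
	return (int)((msr - X2APIC_MSR_BASE) << 4);
}
```

For example, the TPR MSR (0x808) maps to offset 0x80 and the SELF_IPI
MSR (0x83f) maps to offset 0x3f0.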

MSR Accesses
------------
Secure AVIC only supports x2APIC MSR accesses. xAPIC MMIO offset based
accesses are not supported.

Some MSR writes, such as ICR writes (with shorthand equal to self),
SELF_IPI, EOI and TPR writes, are accelerated by Secure AVIC
hardware. Other MSR writes generate a #VC exception
(VMEXIT_AVIC_NOACCEL or VMEXIT_AVIC_INCOMPLETE_IPI). The #VC
exception handler reads/writes the guest APIC backing page.
As the guest APIC backing page is accessible to the guest, the guest
can optimize APIC register accesses by directly reading/writing the
guest APIC backing page (instead of taking the #VC exception route).
APIC MSR reads are accelerated similar to AVIC, as described in
table "15-22. Guest vAPIC Register Access Behavior" of the APM.

In addition to the architected MSRs, the following new fields are
added to the guest APIC backing page. They can be modified directly by
the guest:

a. ALLOWED_IRR

ALLOWED_IRR vector indicates the interrupt vectors which the guest
allows the hypervisor to send. The combination of host-controlled
REQUESTED_IRR vectors (part of VMCB) and ALLOWED_IRR is used by
hardware to update the IRR vectors of the Guest APIC backing page.

#Offset        #bits        Description
204h           31:0         Guest allowed vectors 0-31
214h           31:0         Guest allowed vectors 32-63
...
274h           31:0         Guest allowed vectors 224-255

ALLOWED_IRR is meant to be used specifically for vectors that the
hypervisor emulates and is allowed to inject, such as IOAPIC/MSI
device interrupts.  Interrupt vectors used exclusively by the guest
itself (like IPI vectors) should not be allowed to be injected into
the guest for security reasons.
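
The offset table above implies a simple vector-to-register mapping:
one 32-bit ALLOWED_IRR register per 32 vectors, starting at 204h and
spaced 10h apart. A minimal sketch of how a guest might mark a vector
as allowed follows. The helper names are hypothetical, and a real
guest would likely need an atomic read-modify-write, since hardware
reads the page concurrently.

```c
#include <stdint.h>
#include <string.h>

#define SAVIC_ALLOWED_IRR  0x204u  /* first ALLOWED_IRR register (table above) */
#define APIC_REG_STRIDE    0x10u   /* spacing between consecutive registers */

/* Byte offset of the ALLOWED_IRR register that covers @vector. */
static inline unsigned int savic_allowed_irr_offset(uint8_t vector)
{
	return SAVIC_ALLOWED_IRR + (vector / 32u) * APIC_REG_STRIDE;
}

/* Bit mask of @vector within its 32-bit register. */
static inline uint32_t savic_vector_mask(uint8_t vector)
{
	return 1u << (vector % 32u);
}

/*
 * Mark @vector as injectable by the hypervisor in the backing page.
 * Hypothetical helper; non-atomic for simplicity.
 */
static void savic_allow_vector(uint8_t *page, uint8_t vector)
{
	unsigned int off = savic_allowed_irr_offset(vector);
	uint32_t reg;

	memcpy(&reg, page + off, sizeof(reg));
	reg |= savic_vector_mask(vector);
	memcpy(page + off, &reg, sizeof(reg));
}
```

With this layout, vector 0x31 lands in the register at 214h, bit 17.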

b. NMI Request
 
#Offset        #bits        Description
278h           0            Set by Guest to request Virtual NMI

Guest can set NMI_REQUEST to trigger APIC_ICR based NMIs.

APIC Registers
--------------

1. APIC ID

The APIC_ID value is set by KVM and, similar to x2APIC, it is equal to
the vcpu_id of the vCPU.

2. APIC LVR

APIC Version register is expected to be read from KVM's APIC state using
MSR_PROT rdmsr VMGEXIT and updated in guest APIC backing page.

3. APIC TPR

TPR writes are accelerated and not communicated to KVM. So, the
hypervisor does not have information about the TPR value of a vCPU.

4. APIC PPR

Current state of PPR is not visible to KVM.

5. APIC SPIV

Spurious Interrupt Vector register value is not communicated to KVM.

6. APIC IRR and ISR

IRR and ISR states are visible only to the guest. So, KVM cannot use
these registers to determine which interrupts are pending completion.

7. APIC TMR

Trigger Mode Register state is owned by guest and not visible to
KVM.

8. Timer registers - TMICT, TMCCT, TDCR

Timer registers are accessed using MSR_PROT VMGEXIT calls and not from
the guest APIC backing page.

9. LVT* registers

LVT registers state is accessed from KVM APIC state for the vCPU.

Idle halt Intercept
-------------------

As the hypervisor does not have access to the APIC IRR state of a
Secure AVIC guest, the idle halt intercept feature should always be
enabled for a Secure AVIC guest. Otherwise, any interrupts pending in
the APIC IRR during a halt vmexit would not be serviced and the vCPU
could remain stuck in halt forever. For idle halt intercept to work,
the APIC TPR value should not block the pending interrupts.
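
For reference, the TPR blocking condition mentioned above follows the
architectural APIC priority rule: a fixed interrupt is deliverable
only if its priority class (vector bits 7:4) is greater than the
processor priority class, which is never lower than TPR bits 7:4. A
minimal sketch of that check, with a hypothetical helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if a pending @vector cannot be delivered because of the
 * current @tpr value. Only the TPR contribution to the processor
 * priority is modelled; in-service interrupts are ignored.
 *
 * Hypothetical helper, for illustration only.
 */
static inline bool apic_tpr_blocks_vector(uint8_t tpr, uint8_t vector)
{
	return (vector >> 4) <= (tpr >> 4);
}
```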

LAPIC Timer Support
-------------------
The LAPIC timer is emulated by KVM. So, the APIC_LVTT, APIC_TMICT,
APIC_TDCR and APIC_TMCCT registers are not read from or written to the
guest APIC backing page; accesses to them are communicated to the
hypervisor using MSR_PROT VMGEXIT.

IPI Support
-----------
Only SELF_IPI is accelerated by Secure AVIC hardware. Other IPI
destination shorthands result in a VMEXIT_AVIC_INCOMPLETE_IPI #VC
exception. The expected guest handling for VMEXIT_AVIC_INCOMPLETE_IPI
is:

- For interrupts, update APIC_IRR in target vCPUs' guest APIC backing
  page.

- For NMIs, update NMI_REQUEST in target vCPUs' guest backing
  page.

- ICR based SMI, INIT, SIPI requests are not supported.

- After updating the target vCPU's guest APIC backing page, the source
  vCPU does an MSR_PROT VMGEXIT.

- KVM either wakes up the non-running target vCPU or sends an
  AVIC doorbell.
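
The guest-side steps above can be sketched roughly as follows. This is
a simplified model, not the guest implementation: the APIC_IRR base
offset (200h) is the architectural one, the NMI Request offset (278h)
is taken from the table earlier in this letter, and the MSR_PROT
VMGEXIT is replaced by a stub. All names are hypothetical, and a real
guest would need atomic bit operations on the target page.

```c
#include <stdbool.h>
#include <stdint.h>

#define APIC_PAGE_SIZE     4096u
#define APIC_IRR           0x200u  /* architectural IRR base offset */
#define APIC_REG_STRIDE    0x10u
#define SAVIC_NMI_REQUEST  0x278u  /* NMI Request field offset */

struct savic_page {
	uint32_t regs[APIC_PAGE_SIZE / sizeof(uint32_t)];
};

enum ipi_mode { IPI_FIXED, IPI_NMI, IPI_SMI, IPI_INIT, IPI_SIPI };

static unsigned int savic_notify_count;

/* Stub standing in for the MSR_PROT VMGEXIT that notifies KVM. */
static void savic_notify_hypervisor(unsigned int target_cpu)
{
	(void)target_cpu;
	savic_notify_count++;
}

/*
 * Model of the guest handling for VMEXIT_AVIC_INCOMPLETE_IPI for one
 * target vCPU. Returns false for unsupported delivery modes.
 */
static bool savic_send_ipi(struct savic_page *target, unsigned int target_cpu,
			   enum ipi_mode mode, uint8_t vector)
{
	unsigned int off;

	switch (mode) {
	case IPI_FIXED:
		/* Set the vector's bit in the target's APIC_IRR. */
		off = APIC_IRR + (vector / 32u) * APIC_REG_STRIDE;
		target->regs[off / 4u] |= 1u << (vector % 32u);
		break;
	case IPI_NMI:
		/* Request a virtual NMI on the target. */
		target->regs[SAVIC_NMI_REQUEST / 4u] |= 1u;
		break;
	default:
		/* ICR based SMI, INIT and SIPI are not supported. */
		return false;
	}

	/* Let KVM wake the target vCPU or ring the doorbell. */
	savic_notify_hypervisor(target_cpu);
	return true;
}
```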

Exceptions Injection
--------------------

Event injection is not supported for guests with Secure AVIC enabled
in SEV_FEATURES. So, KVM cannot inject exceptions into Secure AVIC
guests. Hardware takes care of reinjecting an interrupted exception
(for example, one interrupted by an NPF) raised in the guest on the
next VMRUN. The #VC exception is not reinjected. KVM clears all
exception intercepts for Secure AVIC guests.

Interrupt Injection
-------------------

IOAPIC and MSI based device interrupts can be injected by KVM. The
interrupt flow for this is:

- IOAPIC/MSI interrupts are updated in KVM's APIC_IRR state via
  kvm_irq_delivery_to_apic().
- In the ->inject_irq() callback, all interrupts which are set in KVM's
  APIC_IRR are copied to the RequestedIRR VMCB field and the UpdateIRR
  bit is set.
- VMENTER moves the current value of RequestedIRR to APIC_IRR in
  guest APIC backing page and clears UpdateIRR.

Given that hardware clearing of RequestedIRR and UpdateIRR can race
with software writes to these fields, the above interrupt injection
flow ensures that all RequestedIRR and UpdateIRR writes are done
from the same CPU on which the vCPU runs.
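
The injection flow can be modelled in a few lines. The sketch below is
a software model only, with hypothetical structure and function names.
It assumes that hardware combines RequestedIRR with the guest's
ALLOWED_IRR using a bitwise AND and that RequestedIRR is cleared
together with UpdateIRR; this letter does not spell out the exact
combination, so treat both as assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define IRR_WORDS 8  /* 256 vectors / 32 bits per register */

/* Host-controlled fields (part of the VMCB). */
struct vmcb_model {
	uint32_t requested_irr[IRR_WORDS];
	bool update_irr;
};

/* Guest-owned fields (part of the guest APIC backing page). */
struct guest_model {
	uint32_t irr[IRR_WORDS];
	uint32_t allowed_irr[IRR_WORDS];
};

/* Host side: publish vectors pending in KVM's APIC_IRR state. */
static void host_request_irqs(struct vmcb_model *vmcb,
			      const uint32_t kvm_irr[IRR_WORDS])
{
	bool any = false;

	for (int i = 0; i < IRR_WORDS; i++) {
		vmcb->requested_irr[i] |= kvm_irr[i];
		any |= kvm_irr[i] != 0;
	}
	if (any)
		vmcb->update_irr = true;
}

/* Model of what hardware does on VMENTER (AND filter is assumed). */
static void model_vmenter(struct vmcb_model *vmcb, struct guest_model *g)
{
	if (!vmcb->update_irr)
		return;

	for (int i = 0; i < IRR_WORDS; i++) {
		g->irr[i] |= vmcb->requested_irr[i] & g->allowed_irr[i];
		vmcb->requested_irr[i] = 0;
	}
	vmcb->update_irr = false;
}
```

In this model, a vector that the guest has not set in ALLOWED_IRR never
reaches the guest's IRR, even if the host requests it.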

As interrupt delivery to vCPU is managed by hardware, interrupt window
is not applicable for Secure AVIC guests and interrupts are always
allowed to be injected.

PIC interrupts
--------------

Legacy PIC interrupts cannot be injected as they require event_inj or
VINTR injection support. Neither of these is available for a Secure
AVIC guest.

PIT
---

PIT Reinject mode is not supported as it requires IRQ ack notification
on EOI. As EOI is accelerated for edge interrupts, IRQ ack notification
is not called for those interrupts.

NMI Injection
-------------

NMI injection requires ALLOWED_NMI to be set in the Secure AVIC control
MSR by the guest. Only VNMI injection is allowed.

Open Points
-----------

- RTC_GSI requires pending EOI information to detect coalesced
  interrupts. As RTC_GSI is edge triggered, Secure AVIC does not
  forward EOI write to KVM for this interrupt. In addition, APIC_IRR
  and APIC_ISR states are not visible to KVM and are part of guest
  APIC backing page. Approach taken in this series is to disable
  checking of coalesced RTC_GSI interrupts for Secure AVIC, which
  could impact userspace.

- EOI handling for level interrupts uses KVM's unused APIC_ISR regs
  for tracking pending level interrupts. KVM uses its APIC_TMR state
  to determine level-triggered interrupts. As KVM's APIC_TMR is
  updated from IOAPIC redirect tables, the TMR information should be
  accurate and match guest APIC state. This can be cleaned up later
  to stop using KVM's APIC_ISR state and to maintain this tracking
  within the SEV code instead.

- Spurious Interrupt Vector Register writes are not visible to KVM.
  So, KVM cannot determine if the SW enabled bit is set.

- As exceptions cannot be injected by KVM, a more detailed examination
  of which intercepts need to be allowed for Secure AVIC guests is
  required.

- As KVM does not have access to the guest's APIC_IRR and APIC_ISR
  states, kvm_apic_pending_eoi() does not return correct information.

- External interrupts (PIC) are not supported. This breaks KVM's PIC
  emulation.

- PIT reinject mode is not supported.

- Current code uses KVM's vCPU APIC_IRR for interrupts which
  need to be injected to guest. Another approach could be to
  maintain pending interrupts within sev code and inject using
  flow similar to posted interrupts. 

This series is based on commit f7bafceba76e ("KVM: remove
kvm_arch_post_init_vm") from the following tree and branch:

  git.kernel.org/pub/scm/virt/kvm/kvm.git next

Git tree is available at:

  https://github.com/AMDESE/linux-kvm/tree/savic-host-latest

QEMU tree is available at:
  https://github.com/AMDESE/qemu/tree/secure-avic

Guest Secure AVIC support is available at:

  https://lore.kernel.org/lkml/20250226090525.231882-1-Neeraj.Upadhyay@amd.com/

This series depends on the following patch series:

1. Idle Halt Intercept

https://lore.kernel.org/all/20250128124812.7324-1-manali.shukla@amd.com/

2. ALLOWED_SEV_FEATURES support

https://lore.kernel.org/kvm/20250207233410.130813-1-kim.phillips@amd.com/


Kishon Vijay Abraham I (5):
  KVM: SEV: Do not intercept SECURE_AVIC_CONTROL MSR
  KVM: SVM: Secure AVIC: Do not inject "Exceptions" for Secure AVIC
  KVM: SVM/SEV: Secure AVIC: Set VGIF in VMSA area
  KVM: SVM/SEV: Secure AVIC: Enable NMI support
  KVM: x86: Secure AVIC: Indicate APIC is enabled by guest SW _always_

Neeraj Upadhyay (12):
  KVM: x86: Convert guest_apic_protected bool to an enum type
  x86/cpufeatures: Add Secure AVIC CPU Feature
  KVM: SVM: Add support for Secure AVIC capability in KVM
  KVM: SVM: Initialize apic protected state for SAVIC guests
  KVM: SVM/SEV/X86: Secure AVIC: Add support to inject interrupts
  KVM: SVM/SEV/X86: Secure AVIC: Add hypervisor side IPI Delivery
    Support
  KVM: SVM/SEV: Do not intercept exceptions for Secure AVIC guest
  KVM: SVM/SEV: Add SVM_VMGEXIT_SECURE_AVIC GHCB protocol event handling
  KVM: x86: Secure AVIC: Add IOAPIC EOI support for level interrupts
  KVM: x86/ioapic: Disable RTC_GSI EOI tracking for protected APIC
  X86: SVM: Check injected vectors before waiting for timer expiry
  KVM: SVM/SEV: Allow creating VMs with Secure AVIC enabled

Sean Christopherson (2):
  KVM: TDX: Add support for find pending IRQ in a protected local APIC
  KVM: x86: Assume timer IRQ was injected if APIC state is protected

 arch/x86/include/asm/cpufeatures.h |   1 +
 arch/x86/include/asm/kvm-x86-ops.h |   1 +
 arch/x86/include/asm/kvm_host.h    |   1 +
 arch/x86/include/asm/msr-index.h   |   2 +
 arch/x86/include/asm/svm.h         |   9 +-
 arch/x86/include/uapi/asm/svm.h    |   3 +
 arch/x86/kvm/ioapic.c              |   8 +-
 arch/x86/kvm/irq.c                 |   6 +
 arch/x86/kvm/lapic.c               |  23 +-
 arch/x86/kvm/lapic.h               |  16 ++
 arch/x86/kvm/svm/sev.c             | 371 +++++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.c             |  79 ++++--
 arch/x86/kvm/svm/svm.h             |  17 +-
 arch/x86/kvm/x86.c                 |  12 +-
 14 files changed, 518 insertions(+), 31 deletions(-)


base-commit: f7bafceba76e9ab475b413578c1757ee18c3e44b
-- 
2.34.1

