linux-kernel - Re: [PATCH v5 1/2] kvm: sev: Add SEV-SNP guest request throttling

Message-ID: <aCZtdN0LhkRqm1Vn@google.com>
Date: Thu, 15 May 2025 15:40:52 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Dionna Glaze <dionnaglaze@...gle.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	linux-coco@...ts.linux.dev, Thomas Lendacky <Thomas.Lendacky@....com>, 
	Paolo Bonzini <pbonzini@...hat.com>, Joerg Roedel <jroedel@...e.de>, Peter Gonda <pgonda@...gle.com>, 
	Borislav Petkov <bp@...en8.de>
Subject: Re: [PATCH v5 1/2] kvm: sev: Add SEV-SNP guest request throttling

On Thu, May 15, 2025, Dionna Glaze wrote:
> The AMD-SP is a precious resource that doesn't have a scheduler other
> than a mutex lock queue. To avoid customers from causing a DoS, a
> mem_enc_ioctl command for rate limiting guest requests is added.
> 
> Recommended values are {.interval_ms = 1000, .burst = 1} or
> {.interval_ms = 2000, .burst = 2} to average 1 request every second.
> You may need to allow 2 requests back to back to allow for the guest
> to query the certificate length in an extended guest request without
> a pause. The 1 second average is our target for quality of service
> since empirical tests show that 64 VMs can concurrently request an
> attestation report with a maximum latency of 1 second. We don't

Who is we?

> anticipate more concurrency than that for a seldom used request for
> a majority well-behaved set of VMs. The majority point is decided as
> >64 VMs given the assumed 128 VM count for "extreme load".
> 
> Cc: Thomas Lendacky <Thomas.Lendacky@....com>
> Cc: Paolo Bonzini <pbonzini@...hat.com>
> Cc: Joerg Roedel <jroedel@...e.de>
> Cc: Peter Gonda <pgonda@...gle.com>
> Cc: Borislav Petkov <bp@...en8.de>
> Cc: Sean Christopherson <seanjc@...gle.com>
> 
> Signed-off-by: Dionna Glaze <dionnaglaze@...gle.com>
> ---
>  .../virt/kvm/x86/amd-memory-encryption.rst    | 23 +++++++++++++
>  arch/x86/include/uapi/asm/kvm.h               |  7 ++++
>  arch/x86/kvm/svm/sev.c                        | 33 +++++++++++++++++++
>  arch/x86/kvm/svm/svm.h                        |  3 ++
>  4 files changed, 66 insertions(+)
> 
> diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> index 1ddb6a86ce7f..1b5b4fc35aac 100644
> --- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> @@ -572,6 +572,29 @@ Returns: 0 on success, -negative on error
>  See SNP_LAUNCH_FINISH in the SEV-SNP specification [snp-fw-abi]_ for further
>  details on the input parameters in ``struct kvm_sev_snp_launch_finish``.
>  
> +21. KVM_SEV_SNP_SET_REQUEST_THROTTLE_RATE
> +-----------------------------------------
> +
> +The KVM_SEV_SNP_SET_REQUEST_THROTTLE_RATE command is used to set a per-VM rate
> +limit on responding to requests for AMD-SP to process a guest request.
> +The AMD-SP is a global resource with limited capacity, so to avoid noisy
> +neighbor effects, the host may set a request rate for guests.
> +
> +Parameters (in): struct kvm_sev_snp_set_request_throttle_rate
> +
> +Returns: 0 on success, -negative on error
> +
> +::
> +
> +	struct kvm_sev_snp_set_request_throttle_rate {
> +		__u32 interval_ms;
> +		__u32 burst;
> +	};
> +
> +The interval will be translated into jiffies, so if it after transformation

I assume this is a limitation of the __ratelimit() interface?

> +the interval is 0, the command will return ``-EINVAL``. The ``burst`` value
> +must be greater than 0.
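
For reference while discussing the interval/burst semantics, here is a minimal
userspace sketch of a fixed-window limiter with the two validation rules quoted
above (zero interval and zero burst are rejected).  It is only a simplified
model under stated assumptions, not the kernel's __ratelimit() implementation:
the names window_limiter, limiter_init() and limiter_allow() are hypothetical,
and time is passed in explicitly as an abstract tick count instead of jiffies.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed-window limiter; a model, not the kernel's code. */
struct window_limiter {
	uint64_t interval;	/* window length in ticks, must be non-zero */
	uint32_t burst;		/* requests admitted per window, must be non-zero */
	uint64_t begin;		/* tick at which the current window started */
	uint32_t used;		/* requests admitted in the current window */
	bool started;		/* false until the first request arrives */
};

/* Returns 0 on success, -1 for the cases the documentation maps to -EINVAL. */
static int limiter_init(struct window_limiter *wl, uint64_t interval,
			uint32_t burst)
{
	if (interval == 0 || burst == 0)
		return -1;

	wl->interval = interval;
	wl->burst = burst;
	wl->begin = 0;
	wl->used = 0;
	wl->started = false;
	return 0;
}

/* Returns true if a request arriving at tick 'now' is admitted. */
static bool limiter_allow(struct window_limiter *wl, uint64_t now)
{
	/* Open a fresh window on first use or once the old one has elapsed. */
	if (!wl->started || now - wl->begin >= wl->interval) {
		wl->begin = now;
		wl->used = 0;
		wl->started = true;
	}

	if (wl->used < wl->burst) {
		wl->used++;
		return true;
	}
	return false;
}
```

With interval = 2000 and burst = 2, two back-to-back requests are admitted and
a third inside the same window is refused, which matches the "query the
certificate length, then retry" pattern described in the changelog.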

Ugh, whose terrible idea was a per-VM capability?  Oh, mine[*].  *sigh*

Looking at this again, a per-VM capability doesn't change anything.  In fact,
it's far, far worse.  At least with a module param there's guaranteed to be some
amount of ratelimiting.  Relying on the VMM to opt in to ratelimiting its VM if
userspace is compromised is completely nonsensical.

Unless someone has a better idea, let's just go with a module param.  

[*] https://lore.kernel.org/all/Y8rEFpbMV58yJIKy@google.com

> @@ -4015,6 +4042,12 @@ static int snp_handle_guest_req(struct vcpu_svm *svm, gpa_t req_gpa, gpa_t resp_
>  
>  	mutex_lock(&sev->guest_req_mutex);
>  
> +	if (!__ratelimit(&sev->snp_guest_msg_rs)) {
> +		svm_vmgexit_no_action(svm, SNP_GUEST_ERR(SNP_GUEST_VMM_ERR_BUSY, 0));
> +		ret = 1;
> +		goto out_unlock;

Can you (or anyone) explain what a well-behaved guest will do in response to
BUSY?  And/or explain why KVM injecting an error into the guest is better than
exiting to userspace.