Message-ID: <195a9a2d-d758-e70f-335f-c394b0c587ad@amd.com>
Date: Thu, 20 Oct 2022 09:02:06 -0500
From: Tom Lendacky <thomas.lendacky@....com>
To: Peter Gonda <pgonda@...gle.com>
Cc: Dionna Amalie Glaze <dionnaglaze@...gle.com>,
Borislav Petkov <bp@...e.de>,
Michael Roth <michael.roth@....com>,
Haowen Bai <baihaowen@...zu.com>,
Yang Yingliang <yangyingliang@...wei.com>,
Marc Orr <marcorr@...gle.com>,
David Rientjes <rientjes@...gle.com>,
Ashish Kalra <Ashish.Kalra@....com>,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [PATCH] virt: Prevent AES-GCM IV reuse in SNP guest driver
On 10/19/22 16:47, Peter Gonda wrote:
> On Wed, Oct 19, 2022 at 2:58 PM Tom Lendacky <thomas.lendacky@....com> wrote:
>> On 10/19/22 15:39, Peter Gonda wrote:
>>> On Wed, Oct 19, 2022 at 1:56 PM Tom Lendacky <thomas.lendacky@....com> wrote:
>>>> On 10/19/22 14:17, Dionna Amalie Glaze wrote:
>>>>> On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@....com> wrote:
>>>>>> On 10/19/22 12:40, Peter Gonda wrote:
>>>>>>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@....com> wrote:
>>>>>>>> On 10/19/22 10:03, Peter Gonda wrote:
>>>>>>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
>>>>>>>>> communicate securely with each other. The IV to this scheme is a
>>>>>>>>> sequence number that both the ASP and the guest track. Currently this
>>>>>>>>> sequence number in a guest request must exactly match the sequence
>>>>>>>>> number tracked by the ASP. This means that if the guest sees an error
>>>>>>>>> from the host during a request it can only retry that exact request or
>>>>>>>>> disable the VMPCK to prevent IV reuse. AES-GCM cannot tolerate IV
>>>>>>>>> reuse; see:
>>>>>>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
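A rough sketch of the scheme the patch description above refers to (names are
illustrative only, not the driver's actual code): because the sequence number
doubles as the AES-GCM IV, it may only advance once a response has been
received and authenticated; otherwise the only safe options are the two
mentioned, retry the identical request or disable the VMPCK.

static int snp_send_guest_request(struct snp_guest_dev *snp_dev,
                                  struct snp_guest_request *req)
{
        int rc;

        /* The encrypted request must be byte-for-byte identical on any retry. */
        rc = issue_guest_request(snp_dev, req);
        if (rc) {
                /*
                 * No authenticated response, so the sequence number (IV) is
                 * not advanced.  If the exact same request cannot be retried,
                 * the VMPCK must be disabled so this IV is never reused.
                 */
                if (!can_retry_same_request(rc))
                        snp_disable_vmpck(snp_dev);
                return rc;
        }

        /* Request and response each consumed one sequence number. */
        snp_dev->msg_seqno += 2;
        return 0;
}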
>>> OK, so the guest retries with the same request when it gets an
>>> SNP_GUEST_REQ_INVALID_LEN error. It expands its internal buffer to
>>
>> It would just use the pre-allocated snp_dev->certs_data buffer with npages
>> set to the full size of that buffer.
>
> Actually, we allocate that buffer with size SEV_FW_BLOB_MAX_SIZE. Maybe
> we want to just allocate this buffer at a size we think is sufficient and
> never increase the allocation?
>
> I see the size of
> https://developer.amd.com/wp-content/resources/ask_ark_milan.cert is
> 3200 bytes. Assuming the VCEK cert is the same size (which it should
> be, since this .cert is 2 certificates), 16K seems to leave enough room
> even for some vendor certificates?
I think just using the 16K buffer (4 pages) as it is allocated today is
ok. If we get a SNP_GUEST_REQ_INVALID_LEN error asking for more than 4
pages, then we won't ever be able to pull the certs given how the driver
is coded today. In that case, disabling the VMPCK is in order.
A separate patch could be submitted later to improve this overall aspect
of the certs buffer if needed.
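In code terms, the behavior being settled on here is roughly the following
(names are approximations of the driver's, not a literal patch, and it is
assumed the hypervisor reports the required page count back on failure):

static int get_ext_report_certs(struct snp_guest_dev *snp_dev,
                                struct snp_guest_request *req)
{
        /* certs_data is pre-allocated at SEV_FW_BLOB_MAX_SIZE (16K, 4 pages). */
        unsigned int npages = SEV_FW_BLOB_MAX_SIZE >> PAGE_SHIFT;
        int rc;

        rc = issue_ext_guest_request(snp_dev, req, snp_dev->certs_data, &npages);
        if (rc == SNP_GUEST_REQ_INVALID_LEN) {
                /*
                 * We already offered the full 4-page buffer, so the
                 * hypervisor wants more certificate space than the driver
                 * as coded today can provide.  The only safe option left
                 * is to disable the VMPCK.
                 */
                snp_disable_vmpck(snp_dev);
                return -EIO;
        }

        return rc;
}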
Thanks,
Tom
>
>>
>>> hold the certificates. When it finally gets a successful request w/
>>> certs, do we want to return the attestation bits to userspace but
>>> leave out the certificate data, or just error out of the ioctl
>>> completely?
>>
>> We need to be able to return the attestation bits that came back with the
>> extra certs. So just error out of the ioctl with the length error and let
>> user-space retry with the recommended number of pages.
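For reference, the user-space side of that retry could look something like the
following. Field names follow my reading of the sev-guest uapi header, and it
assumes the driver reports the recommended certificate size back in certs_len
when it fails the call with a length error; treat the details as illustrative.

#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/sev-guest.h>

int fetch_ext_report(int fd, struct snp_ext_report_req *req,
                     struct snp_report_resp *resp)
{
        struct snp_guest_request_ioctl arg = {
                .msg_version = 1,
                .req_data    = (uint64_t)(uintptr_t)req,
                .resp_data   = (uint64_t)(uintptr_t)resp,
        };

        if (ioctl(fd, SNP_GET_EXT_REPORT, &arg) == 0)
                return 0;

        /*
         * Assume the driver failed the call with a length error and wrote
         * the recommended certificate-buffer size into req->certs_len.
         * Grow the buffer and retry once with the same report request.
         */
        void *certs = realloc((void *)(uintptr_t)req->certs_address,
                              req->certs_len);
        if (!certs)
                return -1;
        req->certs_address = (uint64_t)(uintptr_t)certs;

        return ioctl(fd, SNP_GET_EXT_REPORT, &arg);
}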
>
> That sounded simpler to me. Will do.
>
>>
>>>
>>> I can do that in this series.
>>
>> Thanks!
>>
>>>
>>>>
>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> For the rate-limiting patch series [1], the rate-limiting will have to be
>>>>>>>> performed within the kernel, while the mutex is held, and then retry the
>>>>>>>> exact request again. Otherwise, that error will require disabling the
>>>>>>>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
>>>>>>>>
>>>>>>>> Thoughts?
>>>>>>>>
>>>>>>>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/
>>>>>>>
>>>>>>> Yes, I think if the host rate limits the guest, the guest kernel should
>>>>>>> retry the exact message. Which mutex are you referring to?
>>>>>>
>>>>>> Or the host waits and then submits the request and the guest kernel
>>>>>> doesn't have to do anything. The mutex I'm referring to is the
>>>>>> snp_cmd_mutex that is taken in snp_guest_ioctl().
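A sketch of that in-kernel option (the throttling error code here is
hypothetical; the rate-limiting series may name it differently): the driver
would wait and resend the identical encrypted message while snp_cmd_mutex is
still held, so the sequence number is never paired with different data.

        mutex_lock(&snp_cmd_mutex);
        do {
                /* Resend the exact same ciphertext; the IV must not change. */
                rc = issue_guest_request(snp_dev, req);
                if (rc == SNP_GUEST_REQ_RATE_LIMITED)   /* hypothetical error code */
                        msleep(backoff_ms);             /* wait out the host's limit */
        } while (rc == SNP_GUEST_REQ_RATE_LIMITED);
        mutex_unlock(&snp_cmd_mutex);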
>>>>>
>>>>> I think that either the host kernel or guest kernel waiting can lead
>>>>> to unacceptable delays.
>>>>> I would recommend that we add a zero argument ioctl to /dev/sev-guest
>>>>> specifically for retrying the last request.
>>>>>
>>>>> We can know what the last request is due to the snp_cmd_mutex serialization.
>>>>> The driver will just keep a scratch buffer for this. Any other request
>>>>> that comes in without resolving the retry will get an -EBUSY error
>>>>> code.
>>>>
>>>> And the first caller will have received an -EAGAIN in order to
>>>> differentiate between the two situations?
>>>>
>>>>>
>>>>> Calling the retry ioctl without a pending command will result in -EINVAL.
>>>>>
>>>>> Let me know what you think.
>>>>
>>>> I think that sounds reasonable, but there are some catches. You will need
>>>> to ensure that the caller that is supposed to retry does actually retry
>>>> and that a caller that does retry is the same caller that was told to retry.
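Putting Dionna's proposal and the -EAGAIN idea together, the ioctl flow being
discussed would look roughly like this (entirely hypothetical names, and it
does not yet address the point above about making sure the retry comes from
the same caller that was told to retry):

static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
{
        struct snp_guest_dev *snp_dev = file->private_data;    /* illustrative lookup */
        long ret;

        mutex_lock(&snp_cmd_mutex);

        if (ioctl == SNP_GUEST_RETRY) {                 /* proposed zero-argument ioctl */
                if (!snp_dev->retry_pending) {
                        ret = -EINVAL;                  /* nothing to retry */
                } else {
                        /* Resend the saved request from the scratch buffer, unchanged. */
                        ret = resend_scratch_request(snp_dev);
                        if (ret != -EAGAIN)
                                snp_dev->retry_pending = false;
                }
        } else if (snp_dev->retry_pending) {
                ret = -EBUSY;                           /* pending retry must be resolved first */
        } else {
                ret = handle_guest_request(snp_dev, ioctl, arg);
                if (ret == -EAGAIN)
                        snp_dev->retry_pending = true;  /* caller was told to retry */
        }

        mutex_unlock(&snp_cmd_mutex);
        return ret;
}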
>>>
>>> What's the issue with the guest driver taking some time?
>>>
>>> This sounds complex because there may be many users of the driver. How
>>> do multiple users coordinate when they need to use the retry ioctl?
>>>
>>>>
>>>> Thanks,
>>>> Tom
>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Tom
>>>>>
>>>>>
>>>>>