linux-kernel - Re: [PATCH] virt: Prevent AES-GCM IV reuse in SNP guest driver

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <10497679-8323-b830-56dc-4a6dc2581a27@amd.com>
Date:   Wed, 19 Oct 2022 16:05:11 -0500
From:   Tom Lendacky <thomas.lendacky@....com>
To:     Dionna Amalie Glaze <dionnaglaze@...gle.com>
Cc:     Peter Gonda <pgonda@...gle.com>, Borislav Petkov <bp@...e.de>,
        Michael Roth <michael.roth@....com>,
        Haowen Bai <baihaowen@...zu.com>,
        Yang Yingliang <yangyingliang@...wei.com>,
        Marc Orr <marcorr@...gle.com>,
        David Rientjes <rientjes@...gle.com>,
        Ashish Kalra <Ashish.Kalra@....com>,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [PATCH] virt: Prevent AES-GCM IV reuse in SNP guest driver

On 10/19/22 15:40, Dionna Amalie Glaze wrote:
> On Wed, Oct 19, 2022 at 12:56 PM Tom Lendacky <thomas.lendacky@....com> wrote:
>>
>> On 10/19/22 14:17, Dionna Amalie Glaze wrote:
>>> On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <thomas.lendacky@....com> wrote:
>>>>
>>>> On 10/19/22 12:40, Peter Gonda wrote:
>>>>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <thomas.lendacky@....com> wrote:
>>>>>>
>>>>>> On 10/19/22 10:03, Peter Gonda wrote:
>>>>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
>>>>>>> communicate securely with each other. The IV to this scheme is a
>>>>>>> sequence number that both the ASP and the guest track. Currently this
>>>>>>> sequence number in a guest request must exactly match the sequence
>>>>>>> number tracked by the ASP. This means that if the guest sees an error
>>>>>>> from the host during a request it can only retry that exact request or
>>>>>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
>>>>>>> reuse see:
>>>>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
>>>>>>
>>>>
>>>> I think I've wrapped my head around this now. Any non-zero return code
>>>> from the hypervisor for an SNP Guest Request is either a hypervisor error
>>>> or an sev-guest driver error, and so the VMPCK should be disabled. The
>>>> sev-guest driver is really doing everything (message headers, performing
>>>> the encryption, etc.) and is only using userspace data that will be part
>>>> of the response message and can't result in a non-zero hypervisor return code.
>>>>
>>>> For the SNP Extended Guest Request, we only need to special case a return
>>>> code of SNP_GUEST_REQ_INVALID_LEN. See below for my responses on that.
>>>>
>>>>
>>>>>> I wonder if we can at least still support the extended report length query
>>>>>> by having the kernel allocate the required pages when the error is
>>>>>> SNP_GUEST_REQ_INVALID_LEN and retry the exact request again. If there are
>>>>>> no errors on the second request, the sequence numbers can be safely
>>>>>> updated, but the kernel returns the original error (which will provide the
>>>>>> caller with the number of pages required).
>>>>>
>>>>> I think we can but I thought fixing the security bug could come first,
>>>>> then the usability fix after. Dionna was planning on working on that
>>>>> fix.
>>>>>
>>>>> In that flow how does userspace get the data? Its called the ioctl
>>>>> with not enough output buffer space. What if the userspace calls the
>>>>> ioctl with no buffers space allocated, so its trying to query the
>>>>> length. We just send the host the request without any encrypted data.
>>>>
>>>> In the case of SNP_GUEST_REQ_INVALID_LEN, userspace wouldn't get the data
>>>> if it hasn't supplied enough buffer space. But, the sev-guest driver can
>>>> supply enough buffer space and invoke the SNP Extended Guest Request again
>>>> in order to successfully complete the call and update the sequence
>>>> numbers. The sev-guest driver would just discard the data in this case,
>>>> but pass back the original "not enough buffer space" error to the caller,
>>>> who could now allocate space and retry. This then allows the sequence
>>>> numbers to be bumped properly.
>>>>
>>>
>>> The way I thought to solve this was to make certificate length
>>> querying a part of the specified protocol.
>>>
>>> The first ext_guest_request command /must/ query the certificate
>>> buffer length with req.certs_len == 0.
>>
>> This becomes an incompatible change to the GHCB specification.
>>
>>> By making this part of the protocol, the sev-guest driver can check if
>>> the certificate length has been requested before.
>>> If so, emulate the host's VMM error code for invalid length without
>>> sending an encrypted message.
>>
>> On the hypervisor side, the certificate blob can be replaced at any time
>> with a new blob that is larger. So you may still have to handle the case
>> where you get a SNP_GUEST_REQ_INVALID_LEN even if you previously asked before.
>>
>>> If not, then send an all zeroes request buffer with the req.certs_len
>>> = 0 values to the VMM.
>>>
>>> The VMM will respond with the size if indeed the expected_pages are >
>>> 0. In the case that the host has not set the certificate buffer yet,
>>> then the host will inspect the header of the request page for a zero
>>> sequence number. If so, then we know that we don't have a valid
>>> request. We treat this also as the INVALID_LEN case but still return
>>> the size of 0. The driver will have the expected pages value stored as
>>> 0 at this point, so subsequent calls will not have this behavior.
>>>
>>> The way /dev/sev-guest user code has been written, I don't think this
>>> will break any existing software package.
> 
>>
>> I think having the sev-guest driver re-issue the request with the internal
>> buffer when it receives SNP_GUEST_REQ_INVALID_LEN is the better way to go.
> 
> I take it you mean in the case that the host's certs_len == 0?

Not sure what you mean. The sev-guest driver has an internal buffer for 
receiving the certs, snp_dev->certs_data, and it would use that whenever 
it receives an SNP_GUEST_REQ_INVALID_LEN return code.

> 
>> You could still cache the size request and always return that to
>> user-space when a request is received with a 0 length. The user-space
>> program must be able to handle receiving multiple
>> SNP_GUEST_REQ_INVALID_LEN in succession anyway, because of the fact that
>> the hypervisor can be updating the certs asynchronously. And if you get a
>> request that is not 0 length, then you issue it as such and re-use the
>> logic of the first 0 length request that was received if you get an
>> SNP_GUEST_REQ_INVALID_LEN with the user-space supplied value.
>>
> 
> A request that gets SNP_GUEST_REQ_INVALID_LEN when the guest expects
> that it is providing a sufficiently sized certificate buffer means
> that the guest has encrypted its report request.

Correct.

> We then have a harder problem than throttling because not only do we
> have to reissue the same request, it must be with different
> certificate arguments provided from user space.

Correct. But before returning the error to userspace, the sev-guest driver 
will issue the request again with its internal buffer so that the sequence 
numbers are updated and a new request can be issued.

> 
>> Peter, is this something you could change the patch to do?
>>
>>>
>>>>>
>>>>>>
>>>>>> For the rate-limiting patch series [1], the rate-limiting will have to be
>>>>>> performed within the kernel, while the mutex is held, and then retry the
>>>>>> exact request again. Otherwise, that error will require disabling the
>>>>>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>> [1] https://lore.kernel.org/lkml/20221013160040.2858732-1-dionnaglaze@google.com/
>>>>>
>>>>> Yes I think if the host rate limits the guest. The guest kernel should
>>>>> retry the exact message. Which mutex are you referring too?
>>>>
>>>> Or the host waits and then submits the request and the guest kernel
>>>> doesn't have to do anything. The mutex I'm referring to is the
>>>> snp_cmd_mutex that is taken in snp_guest_ioctl().
>>>
>>> I think that either the host kernel or guest kernel waiting can lead
>>> to unacceptable delays.
>>> I would recommend that we add a zero argument ioctl to /dev/sev-guest
>>> specifically for retrying the last request.
>>>
>>> We can know what the last request is due to the sev_cmd_mutex serialization.
>>> The driver will just keep a scratch buffer for this. Any other request
>>> that comes in without resolving the retry will get an -EBUSY error
>>> code.
>>
>> And the first caller will have received an -EAGAIN in order to
>> differentiate between the two situations?
> 
> Yes, the throttled caller gets -EAGAIN, and other ioctls other than
> retry after that get -EBUSY.
> 
>>
>> I think that sounds reasonable, but there are some catches. You will need
>> to ensure that the caller that is supposed to retry does actually retry
>> and that a caller that does retry is the same caller that was told to retry.
>>
> 
> I think that constitutes a change to task_struct, the way that there's
> a buffer for interrupted system calls.
> That seems a bit much. Do we have to model for protocol-breaking user
> tasks that have access to /dev/sev-guest?
> The caller that gets -EAGAIN knows to retry. There's no reason for
> other tasks to retry due to command serialization and the -EBUSY
> behavior.

Maybe for well-behaving user-space applications, but that's not 
guaranteed. I agree with Peter and think waiting in the sev-guest driver 
is the simplest and fairest thing to do in the case of throttling.

Thanks,
Tom

> 
>