Date:   Wed, 11 Jan 2023 08:32:32 -0600
From:   Tom Lendacky <thomas.lendacky@....com>
To:     Dov Murik <dovmurik@...ux.ibm.com>,
        Dionna Amalie Glaze <dionnaglaze@...gle.com>
Cc:     Michael Roth <michael.roth@....com>, kvm@...r.kernel.org,
        linux-coco@...ts.linux.dev, linux-mm@...ck.org,
        linux-crypto@...r.kernel.org, x86@...nel.org,
        linux-kernel@...r.kernel.org, tglx@...utronix.de, mingo@...hat.com,
        jroedel@...e.de, hpa@...or.com, ardb@...nel.org,
        pbonzini@...hat.com, seanjc@...gle.com, vkuznets@...hat.com,
        wanpengli@...cent.com, jmattson@...gle.com, luto@...nel.org,
        dave.hansen@...ux.intel.com, slp@...hat.com, pgonda@...gle.com,
        peterz@...radead.org, srinivas.pandruvada@...ux.intel.com,
        rientjes@...gle.com, tobin@....com, bp@...en8.de, vbabka@...e.cz,
        kirill@...temov.name, ak@...ux.intel.com, tony.luck@...el.com,
        marcorr@...gle.com, sathyanarayanan.kuppuswamy@...ux.intel.com,
        alpergun@...gle.com, dgilbert@...hat.com, jarkko@...nel.org,
        ashish.kalra@....com, harald@...fian.com
Subject: Re: [PATCH RFC v7 62/64] x86/sev: Add KVM commands for instance certs

On 1/11/23 00:00, Dov Murik wrote:
> 
> 
> On 10/01/2023 17:10, Tom Lendacky wrote:
>> On 1/10/23 01:10, Dov Murik wrote:
>>> Hi Tom,
>>>
>>> On 10/01/2023 0:27, Tom Lendacky wrote:
>>>> On 1/9/23 10:55, Dionna Amalie Glaze wrote:
>>>>>>> +
>>>>>>> +static int snp_set_instance_certs(struct kvm *kvm, struct kvm_sev_cmd *argp)
>>>>>>> +{
>>>>>> [...]
>>>>>>
>>>>>> Here we set the length to the page-aligned value, but we copy only
>>>>>> params.cert_len bytes.  If there are two subsequent
>>>>>> snp_set_instance_certs() calls where the second one has a shorter
>>>>>> length, we might "keep" some leftover bytes from the first call.
>>>>>>
>>>>>> Consider:
>>>>>> 1. snp_set_instance_certs(certs_addr pointing to "AAA...",
>>>>>> certs_len=8192)
>>>>>> 2. snp_set_instance_certs(certs_addr pointing to "BBB...",
>>>>>> certs_len=4097)
>>>>>>
>>>>>> If I understand correctly, on the second call we'll copy 4097 "BBB..."
>>>>>> bytes into the to_certs buffer, but the length will be
>>>>>> (4097 + PAGE_SIZE - 1) & PAGE_MASK, which is 8192.
>>>>>>
>>>>>> Later when fetching the certs (for the extended report or in
>>>>>> snp_get_instance_certs()) the user will get a buffer of 8192 bytes
>>>>>> filled with 4097 BBBs and 4095 leftover AAAs.
>>>>>>
>>>>>> Maybe zero sev->snp_certs_data entirely before writing to it?
>>>>>>
>>>>>
>>>>> Yes, I agree it should be zeroed, at least if the previous length is
>>>>> greater than the new length. Good catch.
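To make the fix concrete, here is a minimal sketch of the zeroing
approach agreed on above, reusing the names from the quoted patch
(sev->snp_certs_data, sev->snp_certs_len, params.cert_len);
params.certs_uaddr is an assumed name for the userspace pointer, and
the surrounding allocation and validation are omitted:

	/*
	 * Sketch only: clear the whole previously published buffer
	 * before copying in the new blob, so that a shorter update
	 * cannot leak trailing bytes from an earlier, longer one.
	 */
	memset(sev->snp_certs_data, 0, sev->snp_certs_len);
	if (copy_from_user(sev->snp_certs_data,
			   (void __user *)params.certs_uaddr,
			   params.cert_len))
		return -EFAULT;
	sev->snp_certs_len = PAGE_ALIGN(params.cert_len);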
>>>>>
>>>>>
>>>>>> Related question (not only for this patch) regarding snp_certs_data
>>>>>> (host or per-instance): why is its size page-aligned at all? Why is it
>>>>>> limited to 16KB or 20KB? If I understand correctly, for SNP, this
>>>>>> buffer is never sent to the PSP.
>>>>>>
>>>>>
>>>>> The buffer is meant to be copied into the guest driver following the
>>>>> GHCB extended guest request protocol. The data to copy back is
>>>>> expected to be in 4K page granularity.
>>>>
>>>> I don't think the data has to be in 4K page granularity. Why do you
>>>> think it does?
>>>>
>>>
>>> I looked at AMD publication 56421 SEV-ES Guest-Hypervisor Communication
>>> Block Standardization (July 2022), page 37.  The table says:
>>>
>>> --------------
>>>
>>> NAE Event: SNP Extended Guest Request
>>>
>>> Notes:
>>>
>>> RAX will have the guest physical address of the page(s) to hold returned
>>> data
>>>
>>> RBX
>>> State to Hypervisor: will contain the number of guest contiguous
>>> pages supplied to hold returned data
>>> State from Hypervisor: on error will contain the number of guest
>>> contiguous pages required to hold the data to be returned
>>>
>>> ...
>>>
>>> The request page, response page and data page(s) must be assigned to the
>>> hypervisor (shared).
>>>
>>> --------------
>>>
>>>
>>> According to this spec, it looks like the sizes are communicated as
>>> number of pages in RBX.  So the data should start at a 4KB alignment
>>> (this is verified in snp_handle_ext_guest_request()) and its length
>>> should be 4KB-aligned, as Dionna noted.
>>
>> That only indicates how many pages are required to hold the data, but
>> the hypervisor only has to copy however much data is present. If the
>> data is 20 bytes, then you only have to copy 20 bytes. If the user
>> supplied 0 for the number of pages, then the code returns 1 in RBX to
>> indicate that one page is required to hold the 20 bytes.
>>
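To make the protocol concrete, the negotiation described above amounts
to roughly the following on the hypervisor side. This is a sketch
only: data_npages stands for the page count the guest passed in RBX,
and the VMGEXIT plumbing around it is assumed:

	unsigned long npages_needed =
		DIV_ROUND_UP(sev->snp_certs_len, PAGE_SIZE);

	if (data_npages < npages_needed) {
		/*
		 * Report the required page count back in RBX and fail,
		 * so the guest can retry with a larger shared buffer.
		 */
		vcpu->arch.regs[VCPU_REGS_RBX] = npages_needed;
		return SNP_GUEST_REQ_INVALID_LEN;
	}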
> 
> 
> Maybe it should only copy 20 bytes, but the current implementation copies
> whole 4KB pages:
> 
> 
>          if (sev->snp_certs_len)
>                  data_npages = sev->snp_certs_len >> PAGE_SHIFT;
>          ...
>          ...
>          /* Copy the certificate blob in the guest memory */
>          if (data_npages &&
>              kvm_write_guest(kvm, data_gpa, sev->snp_certs_data, data_npages << PAGE_SHIFT))
>                  rc = SEV_RET_INVALID_ADDRESS;
> 
> 
> (elsewhere we ensure that sev->snp_certs_len is page-aligned, so the assignment
> to data_npages is in fact correct even though it looks off-by-one; as an aside,
> maybe it's better to use the DIV_ROUND_UP macro wherever we calculate the
> number of pages needed.)

Hmmm... yeah, I'm not sure why it was implemented that way. I guess it can
always be changed later if desired.
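
As a rough illustration, combining the two suggestions above (use
DIV_ROUND_UP for the page count, and copy only the bytes actually
present) would turn the quoted snippet into something like this
sketch, reusing the quoted variable names:

	data_npages = DIV_ROUND_UP(sev->snp_certs_len, PAGE_SIZE);
	...
	/* Copy only the bytes present, not whole pages */
	if (sev->snp_certs_len &&
	    kvm_write_guest(kvm, data_gpa, sev->snp_certs_data,
			    sev->snp_certs_len))
		rc = SEV_RET_INVALID_ADDRESS;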

> 
> Also -- how does the guest know they got only 20 bytes and not 4096? Do they have
> to read all the 'struct cert_table' entries at the beginning of the received data?

Yes, they should walk the cert table entries.
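
For reference, the GHCB spec lays the blob out as a table of
GUID/offset/length entries terminated by an all-zero entry, so
recovering the real payload size looks roughly like this (a sketch:
the entry layout follows the spec, but the helper name and its use of
kernel helpers are illustrative):

	struct cert_table_entry {
		u8  guid[16];	/* all zeroes marks the end of the table */
		u32 offset;	/* byte offset of this cert in the blob */
		u32 length;	/* length of this cert in bytes */
	} __packed;

	static size_t certs_payload_size(const u8 *blob)
	{
		const struct cert_table_entry *e = (const void *)blob;
		static const u8 zero_guid[16];
		size_t end = 0;

		for (; memcmp(e->guid, zero_guid, sizeof(zero_guid)); e++)
			end = max_t(size_t, end, (size_t)e->offset + e->length);

		return end;	/* real data; anything beyond is padding */
	}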

Thanks,
Tom

> 
> -Dov
> 
> 
>>>
>>> I see no reason (in the spec and in the kernel code) for the data length
>>> to be limited to 16KB (SEV_FW_BLOB_MAX_SIZE) but I might be missing some
>>> flow because Dionna ran into this limit.
>>
>> Correct, there is no limit. I believe that SEV_FW_BLOB_MAX_SIZE is a way
>> to keep the memory usage controlled because data is coming from
>> userspace and it isn't expected that the data would be larger than that.
>>
>> I'm not sure if that was there from the start or added as a result of a
>> review comment. Not sure what the best approach is.
>>
>> Thanks,
>> Tom
>>
>>>
>>>
>>> -Dov
>>>
>>>
>>>
>>>> Thanks,
>>>> Tom
>>>>
>>>>>
>>>>>> [...]
>>>>>>>
>>>>>>> -#define SEV_FW_BLOB_MAX_SIZE 0x4000  /* 16KB */
>>>>>>> +#define SEV_FW_BLOB_MAX_SIZE 0x5000  /* 20KB */
>>>>>>>
>>>>>>
>>>>>> This has effects in drivers/crypto/ccp/sev-dev.c (for example in
>>>>>> alloc_snp_host_map).  Is that OK?
>>>>>>
>>>>>
>>>>> No, this was a mistake of mine because I was using a bloated data
>>>>> encoding that needed 5 pages for the GUID table plus 4 small
>>>>> certificates. I've since fixed that in our user space code.
>>>>> We shouldn't change this size and instead wait for a better size
>>>>> negotiation protocol between the guest and host to avoid this awkward
>>>>> hard-coding.
>>>>>
>>>>>
