linux-kernel - Re: [PATCH v4 6/6] virt: tdx-guest: Add Quote generation support using TSM

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <651cb7aa2ea48_ae7e7294a7@dwillia2-xfh.jf.intel.com.notmuch>
Date:   Tue, 3 Oct 2023 17:54:02 -0700
From:   Dan Williams <dan.j.williams@...el.com>
To:     Peter Gonda <pgonda@...gle.com>,
        Kuppuswamy Sathyanarayanan 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>
CC:     Dan Williams <dan.j.williams@...el.com>,
        <linux-coco@...ts.linux.dev>, Erdem Aktas <erdemaktas@...gle.com>,
        <peterz@...radead.org>, <linux-kernel@...r.kernel.org>,
        <x86@...nel.org>, <dave.hansen@...ux.intel.com>
Subject: Re: [PATCH v4 6/6] virt: tdx-guest: Add Quote generation support
 using TSM_REPORTS

Peter Gonda wrote:
> On Tue, Oct 3, 2023 at 1:29 PM Kuppuswamy Sathyanarayanan
> <sathyanarayanan.kuppuswamy@...ux.intel.com> wrote:
> >
> > Hi,
> >
> > On 10/3/2023 11:37 AM, Peter Gonda wrote:
> > > On Fri, Sep 29, 2023 at 11:26 AM Peter Gonda <pgonda@...gle.com> wrote:
> > >>
> > >> On Thu, Sep 28, 2023 at 4:49 PM Dan Williams <dan.j.williams@...el.com> wrote:
> > >>>
> > >>> Peter Gonda wrote:
> > >>>> On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <dan.j.williams@...el.com> wrote:
> > >>>>>
> > >>>>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
> > >>>>>
> > >>>>> In TDX guest, the attestation process is used to verify the TDX guest
> > >>>>> trustworthiness to other entities before provisioning secrets to the
> > >>>>> guest. The first step in the attestation process is TDREPORT
> > >>>>> generation, which involves getting the guest measurement data in the
> > >>>>> format of TDREPORT, which is further used to validate the authenticity
> > >>>>> of the TDX guest. TDREPORT by design is integrity-protected and can
> > >>>>> only be verified on the local machine.
> > >>>>>
> > >>>>> To support remote verification of the TDREPORT in a SGX-based
> > >>>>> attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
> > >>>>> (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
> > >>>>> only run outside of the TDX guest (i.e. in a host process or in a
> > >>>>> normal VM) and guest can use communication channels like vsock or
> > >>>>> TCP/IP to send the TDREPORT to the QE. But for security concerns, the
> > >>>>> TDX guest may not support these communication channels. To handle such
> > >>>>> cases, TDX defines a GetQuote hypercall which can be used by the guest
> > >>>>> to request the host VMM to communicate with the SGX QE. More details
> > >>>>> about GetQuote hypercall can be found in TDX Guest-Host Communication
> > >>>>> Interface (GHCI) for Intel TDX 1.0, section titled
> > >>>>> "TDG.VP.VMCALL<GetQuote>".
> > >>>>>
> > >>>>> Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
> > >>>>> Computing Guest platforms to get the measurement data via ConfigFS.
> > >>>>> Extend the TSM framework and add support to allow an attestation agent
> > >>>>> to get the TDX Quote data (included usage example below).
> > >>>>>
> > >>>>>   report=/sys/kernel/config/tsm/report/report0
> > >>>>>   mkdir $report
> > >>>>>   dd if=/dev/urandom bs=64 count=1 > $report/inblob
> > >>>>>   hexdump -C $report/outblob
> > >>>>>   rmdir $report
> > >>>>>
> > >>>>> GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> > >>>>> with TDREPORT data as input, which is further used by the VMM to copy
> > >>>>> the TD Quote result after successful Quote generation. To create the
> > >>>>> shared buffer, allocate a large enough memory and mark it shared using
> > >>>>> set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
> > >>>>> for GetQuote requests in the TDX TSM handler.
> > >>>>>
> > >>>>> Although this method reserves a fixed chunk of memory for GetQuote
> > >>>>> requests, such one time allocation can help avoid memory fragmentation
> > >>>>> related allocation failures later in the uptime of the guest.
> > >>>>>
> > >>>>> Since the Quote generation process is not time-critical or frequently
> > >>>>> used, the current version uses a polling model for Quote requests and
> > >>>>> it also does not support parallel GetQuote requests.
> > >>>>>
> > >>>>> Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
> > >>>>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
> > >>>>> Reviewed-by: Erdem Aktas <erdemaktas@...gle.com>
> > >>>>> Signed-off-by: Dan Williams <dan.j.williams@...el.com>
> > >>>>
> > >>>> Hey Dan,
> > >>>>
> > >>>> I tried running your test commands on an SNP enabled guest. To build
> > >>>> the kernel I just checked out linus/master and applied your series. I
> > >>>> haven't done any debugging yet, so I will update with what I find.
> > >>>>
> > >>>> root@...ntu2004:~#   hexdump -C $report/outblob
> > >>>> [  219.871875] ------------[ cut here ]------------
> > >>>> [  219.876642] kernel BUG at include/linux/scatterlist.h:187!
> > >>>
> > >>> Ok, it does not like virtual address of one of the buffers, but my
> > >>> changes "should" not have affected that as get_ext_report() internally
> > >>> uses snp_dev->certs_data and snp_dev->response for bounce buffering the
> > >>> actual request / response memory. First test I want to try once I can
> > >>> get on an SNP system is compare this to the ioctl path just make sure
> > >>> that succeeds.
> > >>
> > >
> > > I think there may be an issue with CONFIG_DEBUG_SG. That was the
> > > warning we were getting in my above stack trace:
> > >
> > >> [  219.876642] kernel BUG at include/linux/scatterlist.h:187!
> > >
> > > This was for this line in enc_dec_message():
> > >
> > >         sg_set_buf(&src[1], src_buf, hdr->msg_sz);
> > >
> > > I am not sure why in sg_set_buf() virt_addr_valid() returns false for
> > > the address given in the sev_report_new() which is from the variable
> > > 'ext_req' which is stack allocated?
> > >
> > > static inline void sg_set_buf(struct scatterlist *sg, const void *buf,
> > >      unsigned int buflen)
> > > {
> > > #ifdef CONFIG_DEBUG_SG
> > >     BUG_ON(!virt_addr_valid(buf));
> > > #endif
> > >     sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
> > > }
> > >
> > > When I disable CONFIG_DEBUG_SG in my config. Your patch seems to work,
> > > well at least it doesn't crash the guest. I haven't checked if the
> > > report is valid yet.
> > >
> >
> > Dan, do you think it is related to not allocating direct mapped memory (using
> > kvalloc)?
> 
> But I think the issue is the stack allocated variable 'ext_req' here:
> 
> sev_report_new()
> +       void *buf __free(kvfree) = kvzalloc(size, GFP_KERNEL);
> +       if (!buf)
> +               return -ENOMEM;
> +
> +       guard(mutex)(&snp_cmd_mutex);
> +       certs_address = buf + report_size;
> +       struct snp_ext_report_req ext_req = {
> +               .data = { .vmpl = desc->privlevel },
> +               .certs_address = (__u64)certs_address,
> +               .certs_len = ext_size,
> +       };
> +       memcpy(&ext_req.data.user_data, desc->inblob, desc->inblob_len);

If the failure is coming from:

	sg_set_buf(&src[1], src_buf, hdr->msg_sz);

...then that is always coming from the stack as get_ext_report()
internally copies either from the user ioctl() address or the kernel
stack into the local stack copy in both cases:

get_ext_report(...)
	...
	struct snp_ext_report_req req;
	...
        if (copy_from_sockptr(&req, io->req_data, sizeof(req)))
                return -EFAULT;
	...
        ret = handle_guest_request(..., &req.data, ...);

...where that "&req.data" always becomes the @src_buf argument to
enc_dec_message(). So while I do understand why sg_set_buf() is
complaining, I don't understand why it is not *always* complaining,
regardless of configfs-tsm or ioctl() with CONFIG_DEBUG_SG=y builds.

I will be able to dig deeper once I can test on hardware, but I am
thinking that the entire scheme to pass the source buffer on the kernel
stack is broken and is only happening to work because there are no
crypto-accelerators attached that require that the virtual addresses be
virt_addr_valid() for a later dma_map_sg() event.

...or my eyes are overlooking how the ioctl() path is succeeding.