[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YdV1BpMiAUGrwASv@zn.tnic>
Date:   Wed, 5 Jan 2022 11:37:58 +0100
From:   Borislav Petkov <bp@...en8.de>
To:     "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc:     tglx@...utronix.de, mingo@...hat.com, dave.hansen@...el.com,
        luto@...nel.org, peterz@...radead.org,
        sathyanarayanan.kuppuswamy@...ux.intel.com, aarcange@...hat.com,
        ak@...ux.intel.com, dan.j.williams@...el.com, david@...hat.com,
        hpa@...or.com, jgross@...e.com, jmattson@...gle.com,
        joro@...tes.org, jpoimboe@...hat.com, knsathya@...nel.org,
        pbonzini@...hat.com, sdeep@...are.com, seanjc@...gle.com,
        tony.luck@...el.com, vkuznets@...hat.com, wanpengli@...cent.com,
        x86@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 08/26] x86/tdx: Handle in-kernel MMIO
On Tue, Dec 14, 2021 at 06:02:46PM +0300, Kirill A. Shutemov wrote:
> In non-TDX VMs, MMIO is implemented by providing the guest a mapping
> which will cause a VMEXIT on access and then the VMM emulating the
> instruction that caused the VMEXIT. That's not possible in TDX guests
> because it requires exposing guest register and memory state to
> potentially malicious VMM.
What does that mean exactly? Aren't TDX registers encrypted just like
SEV-ES ones? If so, they can't really be exposed...
> In TDX the MMIO regions are instead configured to trigger a #VE
> exception in the guest. The guest #VE handler then emulates the MMIO
> instruction inside the guest and converts them into a controlled
s/them/it/
> hypercall to the host.
> 
> MMIO addresses can be used with any CPU instruction that accesses the
s/the //
> memory. This patch, however, covers only MMIO accesses done via io.h
"Here are covered only the MMIO accesses ... "
> helpers, such as 'readl()' or 'writeq()'.
> 
> MMIO access via other means (like structure overlays) may result in
> MMIO_DECODE_FAILED and an oops.
Why? They won't cause a EXIT_REASON_EPT_VIOLATION #VE or?
> AMD SEV has the same limitations to MMIO handling.
See, the other guy is no better here. :-P
> === Potential alternative approaches ===
> 
> == Paravirtualizing all MMIO ==
> 
> An alternative to letting MMIO induce a #VE exception is to avoid
> the #VE in the first place. Similar to the port I/O case, it is
> theoretically possible to paravirtualize MMIO accesses.
> 
> Like the exception-based approach offered by this patch, a fully
"... offered here, a fully ..."
> paravirtualized approach would be limited to MMIO users that leverage
> common infrastructure like the io.h macros.
> 
> However, any paravirtual approach would be patching approximately
> 120k call sites. With a conservative overhead estimation of 5 bytes per
> call site (CALL instruction), it leads to bloating code by 600k.
> 
> Many drivers will never be used in the TDX environment and the bloat
> cannot be justified.
I like the conservative approach here.
 
> == Patching TDX drivers ==
> 
> Rather than touching the entire kernel, it might also be possible to
> just go after drivers that use MMIO in TDX guests.  Right now, that's
> limited only to virtio and some x86-specific drivers.
> 
> All virtio MMIO appears to be done through a single function, which
> makes virtio eminently easy to patch. Future patches will implement this
> idea,
"This will be implemented in the future, ... "
> +static int tdx_handle_mmio(struct pt_regs *regs, struct ve_info *ve)
> +{
> +	char buffer[MAX_INSN_SIZE];
> +	unsigned long *reg, val = 0;
> +	struct insn insn = {};
> +	enum mmio_type mmio;
> +	int size;
> +	u8 sign_byte;
> +	bool err;
> +
> +	if (copy_from_kernel_nofault(buffer, (void *)regs->ip, MAX_INSN_SIZE))
> +		return -EFAULT;
> +
> +	insn_init(&insn, buffer, MAX_INSN_SIZE, 1);
> +	insn_get_length(&insn);
There is insn_decode() - see how it is used and use it here pls.
> +	case MMIO_READ_SIGN_EXTEND:
> +		err = tdx_mmio_read(size, ve->gpa, &val);
> +		if (err)
> +			break;
> +
> +		if (size == 1)
> +			sign_byte = (val & 0x80) ? 0xff : 0x00;
> +		else
> +			sign_byte = (val & 0x8000) ? 0xff : 0x00;
> +
> +		/* Sign extend based on operand size */
> +		memset(reg, sign_byte, insn.opnd_bytes);
> +		memcpy(reg, &val, size);
> +		break;
You can simplify this a bit:
        case MMIO_READ_SIGN_EXTEND: {
                u8 sign_byte = 0, msb = 7;
                err = tdx_mmio_read(size, ve->gpa, &val);
                if (err)
                        break;
                if (size > 1)
                        msb = 15;
                if (val & BIT(msb))
                        sign_byte = -1;
                /* Sign extend based on operand size */
                memset(reg, sign_byte, insn.opnd_bytes);
                memcpy(reg, &val, size);
                break;
        }
> +	case MMIO_MOVS:
> +	case MMIO_DECODE_FAILED:
> +		return -EFAULT;
> +	}
> +
> +	if (err)
> +		return -EFAULT;
<---- newline here.
> +	return insn.length;
> +}
> +
-- 
Regards/Gruss,
    Boris.
https://people.kernel.org/tglx/notes-about-netiquette
Powered by blists - more mailing lists
 
