[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPcyv4iOJJjghTPTLCkvT-Y_SJOhCbfm66m_NO5Ue+eVr_0NZA@mail.gmail.com>
Date: Mon, 7 Jun 2021 10:17:14 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: "Kuppuswamy, Sathyanarayanan"
<sathyanarayanan.kuppuswamy@...ux.intel.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Andy Lutomirski <luto@...nel.org>,
Dave Hansen <dave.hansen@...el.com>,
Tony Luck <tony.luck@...el.com>,
Andi Kleen <ak@...ux.intel.com>,
Kirill Shutemov <kirill.shutemov@...ux.intel.com>,
Kuppuswamy Sathyanarayanan <knsathya@...nel.org>,
Raj Ashok <ashok.raj@...el.com>,
Sean Christopherson <seanjc@...gle.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC v2-fix-v1 3/3] x86/tdx: Handle port I/O
On Mon, Jun 7, 2021 at 9:25 AM Kuppuswamy, Sathyanarayanan
<sathyanarayanan.kuppuswamy@...ux.intel.com> wrote:
>
>
>
> On 6/5/21 2:08 PM, Dan Williams wrote:
> > On Sat, Jun 5, 2021 at 1:08 PM Kuppuswamy, Sathyanarayanan
> > <sathyanarayanan.kuppuswamy@...ux.intel.com> wrote:
> >>
> >>
> >>
> >> On 6/5/21 11:52 AM, Dan Williams wrote:
> >>> On Wed, May 26, 2021 at 9:24 PM Kuppuswamy Sathyanarayanan
> >>> <sathyanarayanan.kuppuswamy@...ux.intel.com> wrote:
> >>>>
> >>>> From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
> >>>>
> >>>> TDX hypervisors cannot emulate instructions directly. This
> >>>> includes port IO which is normally emulated in the hypervisor.
> >>>> All port IO instructions inside TDX trigger the #VE exception
> >>>> in the guest and would be normally emulated there.
> >>>>
> >>>> For the really early code in the decompressor, #VE cannot be
> >>>> used because the IDT needed for handling the exception is not
> >>>> set-up, and some other infrastructure needed by the handler
> >>>> is missing. So to support port IO in decompressor code, add
> >>>> support for paravirt based I/O port virtualization.
> >>>>
> >>>> Also string I/O is not supported in TDX guest. So, unroll the
> >>>> string I/O operation into a loop operating on one element at
> >>>> a time. This method is similar to AMD SEV, so just extend the
> >>>> support for TDX guest platform.
> >>>
> >>> Given early port IO is broken out in its own previous I think it makes
> >>> sense to break out the decompressor port IO enabling from final
> >>> runtime port IO support.
> >>
> >> Patch titled "x86/tdx: Handle early IO operations" mainly adds
> >> IO #VE support in early exception handler. Decompression code IO
> >> support does not have dependency on it. You still think it is
> >> better to move it that patch?
> >>
> >
> > No, I was suggesting three patches instead of 2:
>
> Ok. I will move it to separate patch.
>
> >
>
> snip
>
> >>
> >> Currently decompression code cannot use #VE based IO emulation. It does
> >> not know how to handle #VE exceptions. Also, It is much easier to replace
> >> IO calls with TDX hypercalls in decompression code when compared with
> >> teaching how to handle #VE exceptions in decompression code.
> >
> > Ok, but that does not answer the background behind the decision to use
> > emulation rather than direct replacement of port IO instructions in
> > the final kernel runtime image.
>
> The reason for using #VE based emulation is,
>
> 1. It does not require changes to all usages of emulated instructions in
> all the drivers.
> 2. Directly replacing instructions with TDX hypercalls will lead to bloated
> image. So we cannot universally adapt this approach.
>
> Reason for not adapting to use #VE approach for decompression code is,
>
> 1. #VE handler support does not exist for de-compressor code.
> 2. Adding such support is more complex and in-efficient (just for
> single use case of handling IO instructions).
>
> So we have replaced IO instructions with TDX hypercalls in decompression code.
>
> Did it answer your query?
Yes, all but the concern of printk recursion.
>
> >
> > This patch mixes those 2 concerns and I think it deserves to be broken
> > out and explained.
> >
>
>
>
> >>>> +/*
> >>>> + * Helper function used for making hypercall for "out"
> >>>> + * instruction. It will be called from __out IO
> >>>> + * macro (in tdx.h).
> >>>> + */
> >>>> +void tdg_out(int size, int port, unsigned int value)
> >>>> +{
> >>>> + __tdx_hypercall(EXIT_REASON_IO_INSTRUCTION, size, 1,
> >>>> + port, value, NULL);
> >>>> +}
> >>>> +
> >>>> +/*
> >>>> + * Helper function used for making hypercall for "in"
> >>>> + * instruction. It will be called from __in IO macro
> >>>> + * (in tdx.h). If IO is failed, it will return all 1s.
> >>>> + */
> >>>> +unsigned int tdg_in(int size, int port)
> >>>> +{
> >>>> + struct tdx_hypercall_output out = {0};
> >>>> + int err;
> >>>> +
> >>>> + err = __tdx_hypercall(EXIT_REASON_IO_INSTRUCTION, size, 0,
> >>>> + port, 0, &out);
> >>>> +
> >>>> + return err ? UINT_MAX : out.r11;
> >>>> +}
> >>>
> >>> The previous patch open coded tdg_{in,out} and this one provides
> >>> helpers. I think at a minimum they should be consistent and pick one
> >>> style.
> >>
> >> As I have mentioned above, early IO #VE handler is a special case. we
> >> don't want to complicate its code path with debug or tracing support.
> >> So it is not a good comparison target.
> >
> > This patch and the last do the same thing in 2 different ways. One of
> > them should match the other even if the helpers are not directly
> > reused.
>
> I am not sure whether I understand your question. But if the question is
> about different implementation, the difference is due to where it is
> being called.
>
> In early IO support patch, tdx_early_io() is called from #VE handler.
> So there is extra buffer code to extract exception information from
> struct ve_info.
>
> In this case, the caller is __in/__out macros. So there is no need
> for above mentioned buffer code.
>
> Actual hypercall usage is similar in both cases.
It's similar when it can be identical. It simply looks like 2 authors
wrote 2 different code paths which is the case. Just huddle and code
it one way, I don't much care which.
>
> >
> >> In this case, the reason for adding helper function is to make it easier
> >> for calling it from tdx.h.
> >>>
> >>>> diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
> >>>> index ef7a686a55a9..daa75c8eef5d 100644
> >>>> --- a/arch/x86/include/asm/io.h
> >>>> +++ b/arch/x86/include/asm/io.h
> >>>> @@ -40,6 +40,7 @@
> >>>>
> >>
> >> snip
> >>
> >>>> +
> >>>> +/* Helper function for converting {b,w,l} to byte size */
> >>>> +static inline int tdx_get_iosize(char *str)
> >>>> +{
> >>>> + if (str[0] == 'w')
> >>>> + return 2;
> >>>> + else if (str[0] == 'l')
> >>>> + return 4;
> >>>> +
> >>>> + return 1;
> >>>> +}
> >>>
> >>> This seems like an unnecessary novelty. The BUILDIO() macro in
> >>> arch/x86/include/asm/io.h takes a type argument, why can't the size be
> >>> explicitly specified rather than inferred from string parsing?
> >>
> >> I don't want to make changes to generic macros in io.h if it can be
> >> avoided. It follows similar argument/type in all arch/* code. Also, it
> >> is easier to handle TDX as a special case here.
> >>
> >
> > What changes are you talking about to the generic macros? The BUILDIO
> > macro passes in a size parameter explicitly rather than inferring the
> > size from the string name of an argument. BUILDIO does not need to
> > change, it's backend just needs to do the right thing in the TDX case.
> >
> > Otherwise, "I don't want to" is not a sufficient justification for
> > avoiding needlessly new design patterns.
>
> I hope this is what you meant?
>
> --- a/arch/x86/include/asm/io.h
> +++ b/arch/x86/include/asm/io.h
> @@ -273,25 +273,25 @@ static inline bool sev_key_active(void) { return false; }
> #endif /* CONFIG_AMD_MEM_ENCRYPT */
>
> #ifndef __out
> -#define __out(bwl, bw) \
> +#define __out(bwl, bw, sz) \
> asm volatile("out" #bwl " %" #bw "0, %w1" : : "a"(value), "Nd"(port))
> #endif
>
> #ifndef __in
> -#define __in(bwl, bw) \
> +#define __in(bwl, bw, sz) \
> asm volatile("in" #bwl " %w1, %" #bw "0" : "=a"(value) : "Nd"(port))
> #endif
>
> #define BUILDIO(bwl, bw, type) \
> static inline void out##bwl(unsigned type value, int port) \
> { \
> - __out(bwl, bw); \
> + __out(bwl, bw, sizeof(type)); \
> } \
> \
> static inline unsigned type in##bwl(int port) \
> { \
> unsigned type value; \
> - __in(bwl, bw); \
> + __in(bwl, bw, sizeof(type)); \
> return value; \
> }
Looks good.
Powered by blists - more mailing lists