linux-kernel - Re: [RFC v2 14/32] x86/tdx: Handle port I/O

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPcyv4iuRdXooQvCzEWd9babzPij4nXpM-z5fai9+SGaeFYswQ@mail.gmail.com>
Date:   Mon, 10 May 2021 16:34:34 -0700
From:   Dan Williams <dan.j.williams@...el.com>
To:     Andi Kleen <ak@...ux.intel.com>
Cc:     Kuppuswamy Sathyanarayanan 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Tony Luck <tony.luck@...el.com>,
        Kirill Shutemov <kirill.shutemov@...ux.intel.com>,
        Kuppuswamy Sathyanarayanan <knsathya@...nel.org>,
        Raj Ashok <ashok.raj@...el.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC v2 14/32] x86/tdx: Handle port I/O

On Mon, May 10, 2021 at 4:08 PM Andi Kleen <ak@...ux.intel.com> wrote:
>
>
> On 5/10/2021 2:57 PM, Dan Williams wrote:
> >
> > There is a mix of direct-TDVMCALL usage and handling #VE when and why
> > is either approached used?
>
> For the really early code in the decompressor or the main kernel we
> can't use #VE because the IDT needed for handling the exception is not
> set up, and some other infrastructure needed by the handler is missing.
> The early code needs to do port IO to be able to write the early serial
> console. To keep it all common it ended up that all port IO is paravirt.
> Actually for most the main kernel port IO calls we could just use #VE
> and it would result in smaller binaries, but then we would need to
> annotate all early portio with some special name. That's why port IO is
> all TDCALL.

Thanks Andi. Sathya, please include the above in the next posting.

>
> For some others the only thing that really has to be #VE is MMIO because
> we don't want to annotate every MMIO read*/write* with an alternative
> (which would result in incredible binary bloat) For the others they have
> mostly become now direct calls.
>
>
> >
> >> Decompression code uses port IO for earlyprintk. We must use
> >> paravirt calls there too if we want to allow earlyprintk.
> > What is the tradeoff between teaching the decompression code to handle
> > #VE (the implied assumption) vs teaching it to avoid #VE with direct
> > TDVMCALLs (the chosen direction)?
>
> The decompression code only really needs it to output something. But you
> couldn't debug anything until #VE is set up. Also the decompression code
> has a very basic environment that doesn't supply most kernel services,
> and the #VE handler is relatively complicated. It would probably need to
> be duplicated and the instruction decoder be ported to work in this
> environment. It would be all a lot of work, just to make the debug
> output work.
>
> >
> >> Co-developed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
> >> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
> >> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> >> Reviewed-by: Andi Kleen <ak@...ux.intel.com>
> >> ---
> >>   arch/x86/boot/compressed/Makefile |   1 +
> >>   arch/x86/boot/compressed/tdcall.S |   9 ++
> >>   arch/x86/include/asm/io.h         |   5 +-
> >>   arch/x86/include/asm/tdx.h        |  46 ++++++++-
> >>   arch/x86/kernel/tdcall.S          | 154 ++++++++++++++++++++++++++++++
> > Why is this named "tdcall" when it is implementing tdvmcalls? I must
> > say those names don't really help me understand what they do. Can we
> > have Linux names that don't mandate keeping the spec terminology in my
> > brain's translation cache?
>
> The instruction is called TDCALL. It's always the same instruction
>
> TDVMCALL is the variant when the host processes it (as opposed to the
> TDX module), but it's just a different name space in the call number.
>
>

Ok.

>              \
>
> > Is there a unified Linux name these can be given to stop the
> > proliferation of poor vendor names for similar concepts?
>
> We could use protected_guest()

Looks good.

>
>
> >
> > Does it also not know how to handle #VE to keep it aligned with the
> > runtime code?
>
>
> Not sure I understand the question, but the decompression code supports
> neither alternatives nor #VE. It's a very limited environment.

Yes, that addresses the question.

>
> >
> > Outside the boot decompression code isn't this branch of the "ifdef
> > BOOT_COMPRESSED_MISC_H"  handled by #VE? I also don't see any usage of
> > __{in,out}() in this patch.
>
> I thought it was all alternative after decompression, so the #VE code
> shouldn't be called. We still have it for some reason though.

Right, I'm struggling to understand where these spurious in/out
instructions are coming from that are not replaced by the
alternative's code? Shouldn't those be dropped on the floor and warned
about rather than handled? I.e. shouldn't port-io instruction escapes
that would cause #VE be precluded at build-time?

>
>
> >
> > Perhaps "PAYLOAD_SIZE" since it is used for both input and output?
> >
> > If the ABI does not include the size of the payload then how would
> > code detect if even 80 bytes was violated in the future?
>
>
> The payload in memory is just a Linux concept. At the TDCALL level it's
> only registers.
>

If it's only a Linux concept why does this code need to "prepare for
the future"?


> >
> > 5
> > Surely there's an existing macro for this pattern? Would
> > PUSH_AND_CLEAR_REGS + POP_REGS be suitable? Besides code sharing it
> > would eliminate clearing of %r8.
>
>
> There used to be SAVE_ALL/SAVE_REGS, but they have been all removed in
> some past refactorings.

Not a huge deal, but at a minimum it seems a generic construct that
deserves to be declared centrally rather than tdx-guest-port-io local.