linux-kernel - Re: [PATCH v22 16/24] x86/vdso: Add support for exception fixup in vDSO functions

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20191002231804.GA14315@linux.intel.com>
Date:   Thu, 3 Oct 2019 02:18:04 +0300
From:   Jarkko Sakkinen <jarkko.sakkinen@...ux.intel.com>
To:     linux-kernel@...r.kernel.org, x86@...nel.org,
        linux-sgx@...r.kernel.org
Cc:     akpm@...ux-foundation.org, dave.hansen@...el.com,
        sean.j.christopherson@...el.com, nhorman@...hat.com,
        npmccallum@...hat.com, serge.ayoun@...el.com,
        shay.katz-zamir@...el.com, haitao.huang@...el.com,
        andriy.shevchenko@...ux.intel.com, tglx@...utronix.de,
        kai.svahn@...el.com, bp@...en8.de, josh@...htriplett.org,
        luto@...nel.org, kai.huang@...el.com, rientjes@...gle.com,
        cedric.xing@...el.com, Andy Lutomirski <luto@...capital.net>,
        Dave Hansen <dave.hansen@...ux.intel.com>
Subject: Re: [PATCH v22 16/24] x86/vdso: Add support for exception fixup in
 vDSO functions

On Tue, Sep 03, 2019 at 05:26:47PM +0300, Jarkko Sakkinen wrote:
> From: Sean Christopherson <sean.j.christopherson@...el.com>
> 
> The basic concept and implementation is very similar to the kernel's
> exception fixup mechanism.  The key differences are that the kernel
> handler is hardcoded and the fixup entry addresses are relative to
> the overall table as opposed to individual entries.

The commit message is missing description of what the commit does.
Please explain "what" before "why". Now "what" is completely lacking.
This paragraph starts as if there was an invisible paragraph before it.

You should start by explaining briefly about this:

1. A brief description of what vdso2c is.
2. A brief description of what changes you do to vdso2.
3. A brief description of what kernel change you do.
4. A brief description of the flow how the expection gets delivered
   to the user space.

All of this is completely missing.

> Hardcoding the kernel handler avoids the need to figure out how to
> get userspace code to point at a kernel function.  Given that the
> expected usage is to propagate information to userspace, dumping all
> fault information into registers is likely the desired behavior for
> the vast majority of yet-to-be-created functions.  Use registers
> DI, SI and DX to communicate fault information, which follows Linux's
> ABI for register consumption and hopefully avoids conflict with
> hardware features that might leverage the fixup capabilities, e.g.
> register usage for SGX instructions was at least partially designed
> with calling conventions in mind.

No description of what is stored in DI, SI and DX. Also there is two
space bars after *every* sentence. Your text editor is totally broken
somehow. Also DB/BP exception is not described.

> Making fixup addresses relative to the overall table allows the table
> to be stripped from the final vDSO image (it's a kernel construct)
> without complicating the offset logic, e.g. entry-relative addressing
> would also need to account for the table's location relative to the
> image.
> 
> Regarding stripping the table, modify vdso2c to extract the table from
> the raw, a.k.a. unstripped, data and dump it as a standalone byte array
> in the resulting .c file.  The original base of the table, its length
> and a pointer to the byte array are captured in struct vdso_image.
> Alternatively, the table could be dumped directly into the struct,
> but because the number of entries can vary per image, that would
> require either hardcoding a max sized table into the struct definition
> or defining the table as a flexible length array.  The flexible length
> array approach has zero benefits, e.g. the base/size are still needed,
> and prevents reusing the extraction code, while hardcoding the max size
> adds ongoing maintenance just to avoid exporting the explicit size.
> 
> The immediate use case is for Intel Software Guard Extensions (SGX).
> SGX introduces a new CPL3-only "enclave" mode that runs as a sort of
> black box shared object that is hosted by an untrusted "normal" CPl3
> process.
> 
> Entering an enclave can only be done through SGX-specific instructions,
> EENTER and ERESUME, and is a non-trivial process.  Because of the
> complexity of transitioning to/from an enclave, the vast majority of
> enclaves are expected to utilize a library to handle the actual
> transitions.  This is roughly analogous to how e.g. libc implementations
> are used by most applications.
> 
> Another crucial characteristic of SGX enclaves is that they can generate
> exceptions as part of their normal (at least as "normal" as SGX can be)
> operation that need to be handled *in* the enclave and/or are unique
> to SGX.
> 
> And because they are essentially fancy shared objects, a process can
> host any number of enclaves, each of which can execute multiple threads
> simultaneously.
> 
> Putting everything together, userspace enclaves will utilize a library
> that must be prepared to handle any and (almost) all exceptions any time
> at least one thread may be executing in an enclave.  Leveraging signals
> to handle the enclave exceptions is unpleasant, to put it mildly, e.g.
> the SGX library must constantly (un)register its signal handler based
> on whether or not at least one thread is executing in an enclave, and
> filter and forward exceptions that aren't related to its enclaves.  This
> becomes particularly nasty when using multiple levels of libraries that
> register signal handlers, e.g. running an enclave via cgo inside of the
> Go runtime.
> 
> Enabling exception fixup in vDSO allows the kernel to provide a vDSO
> function that wraps the low-level transitions to/from the enclave, i.e.
> the EENTER and ERESUME instructions.  The vDSO function can intercept
> exceptions that would otherwise generate a signal and return the fault
> information directly to its caller, thus avoiding the need to juggle
> signal handlers.
> 
> Note that unlike the kernel's _ASM_EXTABLE_HANDLE implementation, the
> 'C' version of _ASM_VDSO_EXTABLE_HANDLE doesn't use a pre-compiled
> assembly macro.  Duplicating four lines of code is simpler than adding
> the necessary infrastructure to generate pre-compiled assembly and the
> intended benefit of massaging GCC's inlining algorithm is unlikely to
> realized in the vDSO any time soon, if ever.

Rest of the story is a mess with bits of pieces of "what" and "why"
and mixed together. You could probably make the whole like 50% smaller
with a better organization.

I never understood anything of this commit message. Only by looking
at the code change and completely ignoring the commit message I could
understand what the heck is going on. The commit message in the current
for makes me understand *less* the code change.

It would be even better to delete it completely than have it in the
current form. I would suggest that you do that and concentrate writing
steps 1-4 that I described above.

/Jarkko