Message-ID: <CABV8kRzDk7a6QP7mba0xq7da+H-ks=Py29ZtMP1uyDp9mUvutg@mail.gmail.com>
Date:   Mon, 18 Jun 2018 14:16:00 -0400
From:   Keno Fischer <keno@...iacomputing.com>
To:     Dave Hansen <dave.hansen@...ux.intel.com>
Cc:     Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, x86@...nel.org,
        "H. Peter Anvin" <hpa@...or.com>, Borislav Petkov <bp@...e.de>,
        Andi Kleen <andi@...stfloor.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Radim Krčmář <rkrcmar@...hat.com>,
        Kyle Huey <khuey@...ehuey.com>,
        "Robert O'Callahan" <robert@...llahan.org>
Subject: Re: [RFC PATCH] x86/arch_prctl: Add ARCH_SET_XCR0 to mask XCR0 per-thread

> So, to be useful, this interface needs to be called before an
> application can run XGETBV or XSAVE for the first time and caches a
> "bad" value.  I think that means that it might not be feasible to use
> outside of cases where you ptrace() something and inject things before
> it has a chance to run any real instructions.
>
> Fundamentally, I think that makes _this_ interface pretty useless in
> practice.  The only practical option is to have a _future_ XCR0 value
> set by the prctl() and then have it get made active by the kernel at
> execve().

Fair enough, but I don't see this as really fundamentally different
from the cpuid masking use case, which has the same problem and
the same interface. I'm also not convinced that there is *no* use case
where one may want to turn on certain XCR0 features while the process
is running and then turn them off again. To give a concrete example in
this context, it can be useful to write a small program into the memory space
of the replayed program and use it to analyze the memory state of the
program (e.g. to checksum the memory - or in our case to perform a
GC state validation). Such implants may want to use the AVX512
registers for performance, so it would be nice if that was possible.
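
As a rough illustration of the masking arithmetic involved (not code from
the RFC patch), the sketch below checks whether a requested XCR0 value is
architecturally legal on given hardware, and computes the wider value an
injected helper might temporarily run with. The bit positions follow the
Intel SDM; the helper names xcr0_is_valid() and xcr0_for_implant() are
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* XCR0 state-component bits (Intel SDM). */
#define XCR0_X87        (1ULL << 0)
#define XCR0_SSE        (1ULL << 1)
#define XCR0_AVX        (1ULL << 2)
#define XCR0_BNDREGS    (1ULL << 3)
#define XCR0_BNDCSR     (1ULL << 4)
#define XCR0_OPMASK     (1ULL << 5)
#define XCR0_ZMM_HI256  (1ULL << 6)
#define XCR0_HI16_ZMM   (1ULL << 7)

#define XCR0_MPX        (XCR0_BNDREGS | XCR0_BNDCSR)
#define XCR0_AVX512     (XCR0_OPMASK | XCR0_ZMM_HI256 | XCR0_HI16_ZMM)
#define XCR0_AVX512_ALL (XCR0_X87 | XCR0_SSE | XCR0_AVX | XCR0_AVX512)

/*
 * Hypothetical helper: would XSETBV accept `want` on hardware whose
 * full XCR0 is `hw`?  Mirrors the architectural consistency rules.
 */
static bool xcr0_is_valid(uint64_t hw, uint64_t want)
{
    if (want & ~hw)
        return false;               /* cannot enable unsupported state */
    if (!(want & XCR0_X87))
        return false;               /* bit 0 must always be set */
    if ((want & XCR0_AVX) && !(want & XCR0_SSE))
        return false;               /* AVX requires SSE */
    if ((want & XCR0_MPX) && (want & XCR0_MPX) != XCR0_MPX)
        return false;               /* MPX bits travel together */
    if (want & XCR0_AVX512) {
        if ((want & XCR0_AVX512) != XCR0_AVX512)
            return false;           /* AVX-512 bits travel together */
        if (!(want & XCR0_AVX))
            return false;           /* AVX-512 requires AVX */
    }
    return true;
}

/*
 * Hypothetical helper: the XCR0 an injected analysis routine could run
 * with, i.e. the replayed value plus AVX-512 state if the hardware has
 * it.  The caller would restore `replayed` afterwards.
 */
static uint64_t xcr0_for_implant(uint64_t hw, uint64_t replayed)
{
    assert(xcr0_is_valid(hw, replayed));
    if ((hw & XCR0_AVX512_ALL) == XCR0_AVX512_ALL)
        return replayed | XCR0_AVX512_ALL;
    return replayed;
}
```

For example, with a trace recorded at XCR0 = 0x7 (x87, SSE, AVX) and
replayed on hardware reporting 0xe7, the implant value would be 0xe7 and
the replayed program would go back to 0x7 once the implant returns.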

> IMNHO, if you haven't guessed yet, I think this whole exercise is a dead
> end.  Just boot an identical XCR0 VM on your new hardware and do replay
> there.  Done.

I had a hunch ;). However, there are a couple of considerations that
make me still want this in the kernel proper:
1. The recording side application of this feature - getting our users
    to do everything in a VM to send us a recording is not easy. Part
    of the appeal of rr over VM-based record/replay techniques
    is that it "just works" on basically any Linux host.
2. Starting a VM generally requires root permissions, which may
    not be available.
3. And probably the biggest from my perspective is performance. rr
    needs to do a lot of twiddling with the performance counters, which
    I've seen have significant performance overhead in a virtualized
    environment. There's of course also a per-VM resource consumption,
    but presumably we could keep one VM per XCR0 value and replay
    multiple traces per VM.