linux-kernel - Re: [PATCH 3/3] x86/efi: Use efi_switch_mm() rather than manually twiddling with cr3

Message-ID: <CALCETrVhLmntPArQiuOcQeNf9Y2kDxa+mUY=v1P8rVOkeCZn4Q@mail.gmail.com>
Date:   Thu, 17 Aug 2017 08:52:38 -0700
From:   Andy Lutomirski <luto@...nel.org>
To:     Will Deacon <will.deacon@....com>
Cc:     Mark Rutland <mark.rutland@....com>,
        Andy Lutomirski <luto@...nel.org>,
        Matt Fleming <matt@...eblueprint.co.uk>,
        Ard Biesheuvel <ard.biesheuvel@...aro.org>,
        Sai Praneeth Prakhya <sai.praneeth.prakhya@...el.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "linux-efi@...r.kernel.org" <linux-efi@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        joeyli <jlee@...e.com>, Borislav Petkov <bp@...en8.de>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        "Neri, Ricardo" <ricardo.neri@...el.com>,
        "Ravi V. Shankar" <ravi.v.shankar@...el.com>
Subject: Re: [PATCH 3/3] x86/efi: Use efi_switch_mm() rather than manually
 twiddling with cr3

On Thu, Aug 17, 2017 at 3:35 AM, Will Deacon <will.deacon@....com> wrote:
> On Tue, Aug 15, 2017 at 11:35:41PM +0100, Mark Rutland wrote:
>> On Wed, Aug 16, 2017 at 09:14:41AM -0700, Andy Lutomirski wrote:
>> > On Wed, Aug 16, 2017 at 5:57 AM, Matt Fleming <matt@...eblueprint.co.uk> wrote:
>> > > On Wed, 16 Aug, at 12:03:22PM, Mark Rutland wrote:
>> > >>
>> > >> I'd expect we'd abort at a higher level, not taking any sample. i.e.
>> > >> we'd have the core overflow handler check in_funny_mm(), and if so, skip
>> > >> the sample, as with the skid case.
>> > >
>> > > FYI, this is my preferred solution for x86 too.
>> >
>> > One option for the "funny mm" flag would be literally the condition
>> > current->mm != current->active_mm.  I *think* this gets all the cases
>> > right as long as efi_switch_mm is careful with its ordering and that
>> > the arch switch_mm() code can handle the resulting ordering.  (x86's
>> > can now, I think, or at least will be able to in 4.14 -- not sure
>> > about other arches).
>>
>> For arm64 we'd have to rework things a bit to get the ordering right
>> (especially when we flip to/from the idmap), but otherwise this sounds sane to
>> me.
>>
>> > That being said, there's a totally different solution: run EFI
>> > callbacks in a kernel thread.  This has other benefits: we could run
>> > those callbacks in user mode some day, and doing *that* in a user
>> > thread seems like a mistake.
>>
>> I think that wouldn't work for CPU-bound perf events (which are not
>> ctx-switched with the task).
>>
>> It might be desirable to do that anyway, though.
>
> I'm still concerned that we're treating perf specially here -- are we
> absolutely sure that nobody else is going to attempt user accesses off the
> back of an interrupt?

Reasonably sure?  If nothing else, an interrupt taken while mmap_sem
is held for write that tries to access user memory is asking for
serious trouble.  There are still a few callers of pagefault_disable()
and copy...inatomic(), though.

> If not, then I'd much prefer a solution that catches
> anybody doing that with the EFI page table installed, rather than trying
> to play whack-a-mole like this.

Using a kernel thread solves the problem for real.  Anything that
blindly accesses user memory in kernel thread context is terminally
broken no matter what.
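
For illustration only, here is a minimal user-space sketch of the "funny mm"
condition quoted above (current->mm != current->active_mm) and of why a
kernel thread trips it as well.  The two structs are cut-down stand-ins for
the kernel's types, and in_funny_mm() and may_sample_user() are hypothetical
helpers named after the discussion, not existing kernel functions.  It
assumes the usual convention that a kernel thread has mm == NULL and borrows
an active_mm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's struct mm_struct (assumption: only identity matters here). */
struct mm_struct {
	int id;
};

/* Stand-in for the two task_struct fields the check looks at. */
struct task_struct {
	struct mm_struct *mm;        /* address space the task owns; NULL for a kernel thread */
	struct mm_struct *active_mm; /* address space whose page tables are currently in use */
};

/*
 * Hypothetical helper: the condition as worded in the thread.
 * True while a different mm (e.g. the EFI one) is installed, and
 * also true for any kernel thread, since its mm is NULL.
 */
static bool in_funny_mm(const struct task_struct *tsk)
{
	assert(tsk != NULL);
	return tsk->mm != tsk->active_mm;
}

/*
 * Hypothetical gate for an overflow handler: skip the user-space
 * part of a sample whenever the loaded page tables are not the
 * task's own.
 */
static bool may_sample_user(const struct task_struct *tsk)
{
	return !in_funny_mm(tsk);
}
```

With an ordinary task (mm == active_mm) the gate allows sampling; once
active_mm points at a different mm, or mm is NULL as in a kernel thread, the
gate refuses.  Whether the real ordering in efi_switch_mm() and the arch
switch_mm() makes this condition reliable is exactly the open question
above, and this sketch does not model it.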

>
> Will