linux-kernel - Re: [PATCH 3/3] x86/efi: Use efi_switch_mm() rather than manually twiddling with cr3

Message-ID: <1503528742.30475.17.camel@intel.com>
Date:   Wed, 23 Aug 2017 15:52:22 -0700
From:   Sai Praneeth Prakhya <sai.praneeth.prakhya@...el.com>
To:     Andy Lutomirski <luto@...capital.net>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Andy Lutomirski <luto@...nel.org>,
        Will Deacon <will.deacon@....com>,
        Mark Rutland <mark.rutland@....com>,
        Matt Fleming <matt@...eblueprint.co.uk>,
        Ard Biesheuvel <ard.biesheuvel@...aro.org>,
        "linux-efi@...r.kernel.org" <linux-efi@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        joeyli <jlee@...e.com>, Borislav Petkov <bp@...en8.de>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        "Neri, Ricardo" <ricardo.neri@...el.com>,
        "Shankar, Ravi V" <ravi.v.shankar@...el.com>,
        "Luck, Tony" <tony.luck@...el.com>
Subject: Re: [PATCH 3/3] x86/efi: Use efi_switch_mm() rather than manually
 twiddling with cr3

On Mon, 2017-08-21 at 08:23 -0700, Andy Lutomirski wrote:
> 
> > On Aug 21, 2017, at 7:08 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> > 
> >> On Mon, Aug 21, 2017 at 06:56:01AM -0700, Andy Lutomirski wrote:
> >> 
> >> 
> >>> On Aug 21, 2017, at 3:33 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> > 
> >>>> 
> >>>> Using a kernel thread solves the problem for real.  Anything that
> >>>> blindly accesses user memory in kernel thread context is terminally
> >>>> broken no matter what.
> >>> 
> >>> So perf-callchain doesn't do it 'blindly', it wants either:
> >>> 
> >>> - user_mode(regs) true, or
> >>> - task_pt_regs() set.
> >>> 
> >>> However I'm thinking that if the kernel thread has ->mm == &efi_mm, the
> >>> EFI code running could very well have user_mode(regs) being true.
> >>> 
> >>> intel_pmu_pebs_fixup() OTOH 'blindly' assumes that the LBR addresses are
> >>> accessible. It bails on error though. So while it's careful, it does
> >>> attempt to access the 'user' mapping directly. Which should also trigger
> >>> with the EFI code.
> >>> 
> >>> And I'm not seeing anything particularly broken with either. The PEBS
> >>> fixup relies on the CPU having just executed the code, and if it could
> >>> fetch and execute the code, why shouldn't it be able to fetch and read?
> >> 
> >> There are two ways this could be a problem.  One is that unprivileged
> >> user apps shouldn't be able to read from EFI memory.
> > 
> > Ah, but only root can create per-cpu events or attach events to kernel
> > threads (with sensible paranoia levels).
> 
> But this may not need to be percpu.  If a non-root user can trigger, say, an EFI variable read in their own thread context, boom.
> 
+ Tony

Hi Andy,

I am trying to reproduce the issue that we are discussing, so I tried
the following experiment. A user process continuously reads an EFI
variable with

  cat /sys/firmware/efi/efivars/Boot0000-8be4df61-93ca-11d2-aa0d-00e098032b8c

for a specified time (e.g. 100 seconds), and simultaneously I run
"perf top" as root (which I suppose should trigger NMIs). I see that
everything is fine: no lockups, no kernel crash, and no warnings or
errors in dmesg.
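The reader side of this experiment can be sketched roughly as below.
This is only an illustration in Python, not the script that was actually
run: the function name read_repeatedly() and its (reads, errors) return
value are hypothetical, and the file is re-opened on every pass, much as
a shell loop around cat would do.

```python
import time

# Path of the EFI variable named in the experiment above.
EFI_VAR = ("/sys/firmware/efi/efivars/"
           "Boot0000-8be4df61-93ca-11d2-aa0d-00e098032b8c")


def read_repeatedly(path, duration_s):
    """Open and read `path` in a loop until `duration_s` seconds elapse.

    Returns a (reads, errors) tuple: how many reads completed and how
    many attempts failed with an OSError (missing file, permissions...).
    """
    deadline = time.monotonic() + duration_s
    reads = 0
    errors = 0
    while time.monotonic() < deadline:
        try:
            # Re-open each time so every pass issues a fresh read of
            # the file, like running "cat" again.
            with open(path, "rb") as f:
                f.read()
            reads += 1
        except OSError:
            errors += 1
    return reads, errors
```

Calling read_repeatedly(EFI_VAR, 100.0) in one terminal while "perf top"
runs as root in another would approximate the test described above.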

I see that perf top reports that 50% of the time is spent in an EFI
function (probably efi_get_variable()):

Overhead   Shared Object   Symbol
50%        [unknown]       [k] 0xfffffffeea967416

50% is the maximum; on average it's 35%.

I have tested this on two kernels, v4.12 and v3.19. My machine has 8
cores, and to stress test further I offlined all CPUs except cpu0.
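One way to take every CPU except cpu0 offline is through the sysfs CPU
hotplug files. The sketch below assumes the usual
/sys/devices/system/cpu/cpuN/online interface and needs root when it
actually writes; the helper name offline_all_but_cpu0() and its dry_run
parameter are hypothetical, and this is not necessarily how the CPUs
were offlined for the test above.

```python
import glob
import os
import re


def offline_all_but_cpu0(sysfs_cpu_dir="/sys/devices/system/cpu",
                         dry_run=True):
    """Write "0" to cpuN/online for every N > 0.

    Returns the list of control files that were (or, with dry_run=True,
    would be) written. CPUs without an "online" file are skipped.
    """
    targets = []
    for path in sorted(glob.glob(os.path.join(sysfs_cpu_dir, "cpu[0-9]*"))):
        match = re.fullmatch(r"cpu(\d+)", os.path.basename(path))
        if match is None or int(match.group(1)) == 0:
            continue  # not a cpuN directory, or cpu0 which stays online
        control = os.path.join(path, "online")
        if not os.path.exists(control):
            continue  # this CPU exposes no hotplug control file
        targets.append(control)
        if not dry_run:
            with open(control, "w") as f:
                f.write("0")
    return targets
```

With the default dry_run=True the function only reports which files it
would touch, which makes it safe to try before running it for real.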

Could you please let me know a way to reproduce the issue that we are
discussing here?
I think the issue we are concerned about is this: when the kernel is in
EFI context and an NMI happens, and the NMI handler tries to access user
space, boom! We don't have user space in EFI context. Am I right in
understanding the issue, or is it something else?

Regards,
Sai