linux-kernel - Re: [RFC v2 00/27] Kernel Address Space Isolation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <bfd62213-c7c0-4a90-b377-0de7d9557c4c@oracle.com>
Date:   Fri, 12 Jul 2019 16:06:34 +0200
From:   Alexandre Chartre <alexandre.chartre@...cle.com>
To:     Dave Hansen <dave.hansen@...el.com>, pbonzini@...hat.com,
        rkrcmar@...hat.com, tglx@...utronix.de, mingo@...hat.com,
        bp@...en8.de, hpa@...or.com, dave.hansen@...ux.intel.com,
        luto@...nel.org, peterz@...radead.org, kvm@...r.kernel.org,
        x86@...nel.org, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Cc:     konrad.wilk@...cle.com, jan.setjeeilers@...cle.com,
        liran.alon@...cle.com, jwadams@...gle.com, graf@...zon.de,
        rppt@...ux.vnet.ibm.com
Subject: Re: [RFC v2 00/27] Kernel Address Space Isolation


On 7/12/19 3:51 PM, Dave Hansen wrote:
> On 7/12/19 1:09 AM, Alexandre Chartre wrote:
>> On 7/12/19 12:38 AM, Dave Hansen wrote:
>>> I don't see the per-cpu areas in here.  But, the ASI macros in
>>> entry_64.S (and asi_start_abort()) use per-cpu data.
>>
>> We don't map all per-cpu areas, but only the per-cpu variables we need. ASI
>> code uses the per-cpu cpu_asi_session variable which is mapped when an ASI
>> is created (see patch 15/26):
> 
> No fair!  I had per-cpu variables just for PTI at some point and had to
> give them up! ;)
> 
>> +    /*
>> +     * Map the percpu ASI sessions. This is used by interrupt handlers
>> +     * to figure out if we have entered isolation and switch back to
>> +     * the kernel address space.
>> +     */
>> +    err = ASI_MAP_CPUVAR(asi, cpu_asi_session);
>> +    if (err)
>> +        return err;
>>
>>
>>> Also, this stuff seems to do naughty stuff (calling C code, touching
>>> per-cpu data) before the PTI CR3 writes have been done.  But, I don't
>>> see anything excluding PTI and this code from coexisting.
>>
>> My understanding is that PTI CR3 writes only happens when switching to/from
>> userland. While ASI enter/exit/abort happens while we are already in the
>> kernel,
>> so asi_start_abort() is not called when coming from userland and so not
>> interacting with PTI.
> 
> OK, that makes sense.  You only need to call C code when interrupted
> from something in the kernel (deeper than the entry code), and those
> were already running kernel C code anyway.
> 

Exactly.

> If this continues to live in the entry code, I think you have a good
> clue where to start commenting.

Yeah, lot of writing to do... :-)
  
> BTW, the PTI CR3 writes are not *strictly* about the interrupt coming
> from user vs. kernel.  It's tricky because there's a window both in the
> entry and exit code where you are in the kernel but have a userspace CR3
> value.  You end up needing a CR3 write when you have a userspace CR3
> value when the interrupt occurred, not only when you interrupt userspace
> itself.
> 

Right. ASI is simpler because it comes from the kernel and return to the
kernel. There's just a small window (on entry) where we have the ASI CR3
but we quickly switch to the full kernel CR3.

alex.