linux-kernel - Re: [RFC][PATCH v2 00/21] x86/pti: Defer CR3 switch to C code


Message-ID: <890f6b7e-a268-2257-edcb-5eacc7db3d8e@oracle.com>
Date:   Tue, 17 Nov 2020 19:12:07 +0100
From:   Alexandre Chartre <alexandre.chartre@...cle.com>
To:     Borislav Petkov <bp@...en8.de>
Cc:     tglx@...utronix.de, mingo@...hat.com, hpa@...or.com,
        x86@...nel.org, dave.hansen@...ux.intel.com, luto@...nel.org,
        peterz@...radead.org, linux-kernel@...r.kernel.org,
        thomas.lendacky@....com, jroedel@...e.de, konrad.wilk@...cle.com,
        jan.setjeeilers@...cle.com, junaids@...gle.com, oweisse@...gle.com,
        rppt@...ux.vnet.ibm.com, graf@...zon.de, mgross@...ux.intel.com,
        kuzuno@...il.com
Subject: Re: [RFC][PATCH v2 00/21] x86/pti: Defer CR3 switch to C code

On 11/17/20 5:55 PM, Borislav Petkov wrote:
> On Tue, Nov 17, 2020 at 08:56:23AM +0100, Alexandre Chartre wrote:
>> The main goal of ASI is to provide KVM address space isolation to
>> mitigate guest-to-host speculative attacks like L1TF or MDS.
> 
> Because the current L1TF and MDS mitigations are lacking or why?
> 

Yes. L1TF and MDS allow some attacks between sibling CPU threads which are not
mitigated at the moment. In particular, a guest VM can attack another guest VM
or the host kernel running on a sibling CPU thread. Core Scheduling will
mitigate the guest-to-guest attack but not the guest-to-host attack. Address
Space Isolation provides a mitigation for the guest-to-host attack.

>> The current ASI proposal is plugged into the CR3 switch assembly macro,
>> which makes the code brittle and complex. (see [1])
>>
>> I am also expecting this might help with some other ideas like having
>> syscall (or interrupt handler) which can run without switching the
>> page-table.
> 
> I still fail to see why we need all that. I read, "this does this and
> that" but I don't read "the current problem is this" and "this is our
> suggested solution for it".
> 
> So what is the issue which needs addressing in the current kernel which
> is going to justify adding all that code?

The main issue this is trying to address is that the CR3 switch is currently
done in assembly code from very restrictive contexts: the CR3 switch is often
done when only one or two registers are available for use, and sometimes no
stack is available. For example, the syscall entry switches CR3 with a single
register available (%rsp) and no stack.

Because of this, it is fairly tricky to expand the logic for switching CR3.
This is a problem that we have faced while implementing Address Space Isolation
(ASI), where we need extra logic to drive the page-table switch. We have
successfully implemented ASI with the current CR3 switching assembly code, but
this requires complex assembly constructs. Hence this proposal to defer CR3
switching to C code so that it can be extended more easily.
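
To illustrate the point, here is a minimal user-space sketch, not code from
the patch series. It assumes the user and kernel page tables differ only in
two CR3 bits (a page-table bit and a PCID bit). The SKETCH_* names, the
sim_* helpers and struct sketch_entry_state are all hypothetical, and the CR3
register is simulated with a plain variable. Once the switch is a C function
with a stack and free registers, extra decision logic is just another branch:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: user and kernel page tables differ only in these CR3 bits. */
#define SKETCH_USER_PGTABLE_BIT 12 /* selects the user page table */
#define SKETCH_USER_PCID_BIT    11 /* selects the user PCID */
#define SKETCH_USER_MASK \
	((UINT64_C(1) << SKETCH_USER_PGTABLE_BIT) | \
	 (UINT64_C(1) << SKETCH_USER_PCID_BIT))

/* Stand-in for the hardware register; a real kernel reads/writes CR3. */
static uint64_t sim_cr3;

static uint64_t sim_read_cr3(void) { return sim_cr3; }
static void sim_write_cr3(uint64_t val) { sim_cr3 = val; }

/* Hypothetical per-entry state, kept on the stack by the C entry code. */
struct sketch_entry_state {
	uint64_t saved_cr3; /* CR3 value found on entry */
};

/* On entry: remember the current CR3, then switch to the kernel page table. */
static void sketch_enter_kernel(struct sketch_entry_state *st)
{
	uint64_t cr3 = sim_read_cr3();

	st->saved_cr3 = cr3;
	/* Extra logic (e.g. an ASI-style check) would be one more branch here. */
	if (cr3 & SKETCH_USER_MASK)
		sim_write_cr3(cr3 & ~SKETCH_USER_MASK);
}

/* On exit: restore whatever page table was active on entry. */
static void sketch_exit_kernel(const struct sketch_entry_state *st)
{
	/* We must be on the kernel page table while running kernel code. */
	assert((sim_read_cr3() & SKETCH_USER_MASK) == 0);
	if (st->saved_cr3 != sim_read_cr3())
		sim_write_cr3(st->saved_cr3);
}
```

Doing the same with a single scratch register and no stack, as the assembly
entry code has to, leaves no room for the saved state or the extra branch.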

Hopefully this can also help make the assembly entry code less complex,
and benefit other projects.

>> PTI has a measured overhead of roughly 5% for most workloads, but it can
>> be much higher in some cases.
> 
> "it can be"? Where? Actual use case?

Some benchmarks are available, in particular from Phoronix:

https://www.phoronix.com/scan.php?page=article&item=linux-more-x86pti
https://www.phoronix.com/scan.php?page=news_item&px=x86-PTI-Initial-Gaming-Tests
https://www.phoronix.com/scan.php?page=article&item=linux-kpti-kvm
https://medium.com/@loganaden/linux-kpti-performance-hit-on-real-workloads-8da185482df3

>> The latest ASI RFC (RFC v4) is here [1]. This RFC has ASI plugged
>> directly into the CR3 switch assembly macro. We are working on a new
>> implementation, based on these changes, which avoids having to deal with
>> assembly code and makes the implementation more robust.
> 
> This still doesn't answer my questions. I read a lot of "could be used
> for" formulations but I still don't know why we need that. So what is
> the problem that the kernel currently has which you're trying to address
> with this?
> 

Hopefully this is clearer with the answer I provided above.

Thanks,

alex.