Message-ID: <20190828160114.GE17205@worktop.programming.kicks-ass.net>
Date:   Wed, 28 Aug 2019 18:01:14 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Phil Auld <pauld@...hat.com>
Cc:     Matthew Garrett <mjg59@...f.ucam.org>,
        Vineeth Remanan Pillai <vpillai@...italocean.com>,
        Nishanth Aravamudan <naravamudan@...italocean.com>,
        Julien Desfossez <jdesfossez@...italocean.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>, mingo@...nel.org,
        tglx@...utronix.de, pjt@...gle.com, torvalds@...ux-foundation.org,
        linux-kernel@...r.kernel.org, subhra.mazumdar@...cle.com,
        fweisbec@...il.com, keescook@...omium.org, kerrnel@...gle.com,
        Aaron Lu <aaron.lwe@...il.com>,
        Aubrey Li <aubrey.intel@...il.com>,
        Valentin Schneider <valentin.schneider@....com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [RFC PATCH v3 00/16] Core scheduling v3

On Wed, Aug 28, 2019 at 11:30:34AM -0400, Phil Auld wrote:
> On Tue, Aug 27, 2019 at 11:50:35PM +0200 Peter Zijlstra wrote:

> > And given MDS, I'm still not entirely convinced it all makes sense. If
> > it were just L1TF, then yes, but now...
> 
> I was thinking MDS is really the reason for this. L1TF has mitigations but
> the only current mitigation for MDS for smt is ... nosmt. 

L1TF has no known mitigation that is SMT safe. The moment you have
something in your L1, the other sibling can read it using L1TF.

The nice thing about L1TF is that only (malicious) guests can exploit
it, and therefore the synchronization context is VMM. And it so happens
that VMEXITs are 'rare' (and already expensive and thus lots of effort
has already gone into avoiding them).

If you don't use VMs, you're good and SMT is not a problem.

If you do use VMs (and do/can not trust them), _then_ you need
core-scheduling; and in that case, the implementation under discussion
misses things like synchronization on VMEXITs due to interrupts and
things like that.

But under the assumption that VMs don't generate high scheduling rates,
it can work.

> The current core scheduler implementation, I believe, still has (theoretical?) 
> holes involving interrupts, once/if those are closed it may be even less 
> attractive.

No; so MDS leaks anything the other sibling (currently) does, and this makes
_any_ privilege boundary a synchronization context.

Worse still, the exploit doesn't require a VM at all, any other task can
get to it.

That means you get to sync the siblings on lovely things like system
call entry and exit, along with VMM and anything else that one would
consider a privilege boundary. Now, system calls are not rare; they
are really quite common in fact. Trying to sync up siblings at the rate
of system calls is utter madness.
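
To make the contrast between the two cases concrete, here is a minimal
sketch in Python. It only restates the argument above; the names (Vuln,
Boundary, needs_sibling_sync) are hypothetical and come from neither the
kernel nor the patch set under discussion.

```python
from enum import Enum, auto


class Vuln(Enum):
    L1TF = auto()
    MDS = auto()


class Boundary(Enum):
    VM_TRANSITION = auto()  # VM entry/exit (the VMM context)
    SYSCALL = auto()        # system call entry/exit


def needs_sibling_sync(vuln: Vuln, boundary: Boundary) -> bool:
    """Return True if SMT siblings would have to be synchronized
    when crossing `boundary`, under the threat model for `vuln`
    as described in this mail (an illustrative model only)."""
    if vuln is Vuln.L1TF:
        # Only (malicious) guests can exploit L1TF, so only the
        # VMM boundary is a synchronization context.
        return boundary is Boundary.VM_TRANSITION
    # MDS: any task can exploit it, so every privilege boundary
    # becomes a synchronization context.
    return True


if __name__ == "__main__":
    for vuln in Vuln:
        for boundary in Boundary:
            print(vuln.name, boundary.name, needs_sibling_sync(vuln, boundary))
```

The only cell that differs between the two rows is the system call one,
and that is exactly the cell that is hit at a high rate.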

So under MDS, SMT is completely hosed. If you use VMs exclusively, then
it _might_ work because a 'pure' host doesn't schedule that often
(maybe, same assumption as for L1TF).
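
The rate argument can be put as back-of-envelope arithmetic. The sketch
below is a deliberately crude model, assuming each synchronization stalls
the core for a fixed cost; the function name and the numbers in the usage
example are made up for illustration and are not measurements.

```python
def sync_overhead_fraction(events_per_sec: float, sync_cost_ns: float) -> float:
    """Fraction of each second spent synchronizing siblings, assuming
    every boundary crossing costs a fixed `sync_cost_ns` nanoseconds.
    Capped at 1.0, since a core cannot spend more than all of its time."""
    return min(1.0, events_per_sec * sync_cost_ns / 1e9)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show how the rate dominates.
    cost_ns = 1_000
    for label, rate in (("rare boundary", 1_000), ("frequent boundary", 500_000)):
        print(label, sync_overhead_fraction(rate, cost_ns))
```

With a fixed per-sync cost the overhead scales linearly with the crossing
rate, which is why a boundary that is crossed rarely can be tolerable
while one crossed at system call rates is not.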

Now, there have been proposals of moving the privilege boundary further
into the kernel. Just like PTI exposes the entry stack and code to
Meltdown, the thinking is, let's expose more. By moving the priv boundary
the hope is that we can do lots of common system calls without having to
sync up -- lots of details are 'pending'.
