[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190829143050.GA7262@pauld.bos.csb>
Date: Thu, 29 Aug 2019 10:30:51 -0400
From: Phil Auld <pauld@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Matthew Garrett <mjg59@...f.ucam.org>,
Vineeth Remanan Pillai <vpillai@...italocean.com>,
Nishanth Aravamudan <naravamudan@...italocean.com>,
Julien Desfossez <jdesfossez@...italocean.com>,
Tim Chen <tim.c.chen@...ux.intel.com>, mingo@...nel.org,
tglx@...utronix.de, pjt@...gle.com, torvalds@...ux-foundation.org,
linux-kernel@...r.kernel.org, subhra.mazumdar@...cle.com,
fweisbec@...il.com, keescook@...omium.org, kerrnel@...gle.com,
Aaron Lu <aaron.lwe@...il.com>,
Aubrey Li <aubrey.intel@...il.com>,
Valentin Schneider <valentin.schneider@....com>,
Mel Gorman <mgorman@...hsingularity.net>,
Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [RFC PATCH v3 00/16] Core scheduling v3
On Wed, Aug 28, 2019 at 06:01:14PM +0200 Peter Zijlstra wrote:
> On Wed, Aug 28, 2019 at 11:30:34AM -0400, Phil Auld wrote:
> > On Tue, Aug 27, 2019 at 11:50:35PM +0200 Peter Zijlstra wrote:
>
> > > And given MDS, I'm still not entirely convinced it all makes sense. If
> > > it were just L1TF, then yes, but now...
> >
> > I was thinking MDS is really the reason for this. L1TF has mitigations but
> > the only current mitigation for MDS for smt is ... nosmt.
>
> L1TF has no known mitigation that is SMT safe. The moment you have
> something in your L1, the other sibling can read it using L1TF.
>
> The nice thing about L1TF is that only (malicious) guests can exploit
> it, and therefore the synchronizatin context is VMM. And it so happens
> that VMEXITs are 'rare' (and already expensive and thus lots of effort
> has already gone into avoiding them).
>
> If you don't use VMs, you're good and SMT is not a problem.
>
> If you do use VMs (and do/can not trust them), _then_ you need
> core-scheduling; and in that case, the implementation under discussion
> misses things like synchronization on VMEXITs due to interrupts and
> things like that.
>
> But under the assumption that VMs don't generate high scheduling rates,
> it can work.
>
> > The current core scheduler implementation, I believe, still has (theoretical?)
> > holes involving interrupts, once/if those are closed it may be even less
> > attractive.
>
> No; so MDS leaks anything the other sibling (currently) does, this makes
> _any_ privilidge boundary a synchronization context.
>
> Worse still, the exploit doesn't require a VM at all, any other task can
> get to it.
>
> That means you get to sync the siblings on lovely things like system
> call entry and exit, along with VMM and anything else that one would
> consider a privilidge boundary. Now, system calls are not rare, they
> are really quite common in fact. Trying to sync up siblings at the rate
> of system calls is utter madness.
>
> So under MDS, SMT is completely hosed. If you use VMs exclusively, then
> it _might_ work because a 'pure' host doesn't schedule that often
> (maybe, same assumption as for L1TF).
>
> Now, there have been proposals of moving the privilidge boundary further
> into the kernel. Just like PTI exposes the entry stack and code to
> Meltdown, the thinking is, lets expose more. By moving the priv boundary
> the hope is that we can do lots of common system calls without having to
> sync up -- lots of details are 'pending'.
Thanks for clarifying. My understanding is (somewhat) less fuzzy now. :)
I think, though, that you were basically agreeing with me that the current
core scheduler does not close the holes, or am I reading that wrong.
Cheers,
Phil
--
Powered by blists - more mailing lists