Message-ID: <20171114163159.GD3165@worktop.lehotels.local>
Date: Tue, 14 Nov 2017 17:31:59 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Andy Lutomirski <luto@...nel.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Avi Kivity <avi@...lladb.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
linux-api <linux-api@...r.kernel.org>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Boqun Feng <boqun.feng@...il.com>,
Andrew Hunter <ahh@...gle.com>,
maged michael <maged.michael@...il.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Michael Ellerman <mpe@...erman.id.au>,
Dave Watson <davejwatson@...com>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>,
Andrea Parri <parri.andrea@...il.com>,
"Russell King, ARM Linux" <linux@...linux.org.uk>,
Greg Hackmann <ghackmann@...gle.com>,
Will Deacon <will.deacon@....com>,
David Sehr <sehr@...gle.com>, x86 <x86@...nel.org>
Subject: Re: [RFC PATCH 0/2] x86: Fix missing core serialization on migration
On Tue, Nov 14, 2017 at 08:16:09AM -0800, Andy Lutomirski wrote:
> What guarantees that there's an IPI? Do we never do a syscall, get
> migrated during syscall processing (due to cond_resched(), for
> example), and land on another CPU that just happened to already be
> scheduling?
Possible, the other CPU could've pulled the task because it went idle.
No IPIs involved in that scenario.
And if it was running a different thread of the same process prior to
that, we'll also not do switch_mm().
So yes, it is possible to construct a migration scenario without core
serializing instructions (of the CPUID/MOV-CR kind, not the LOCK prefix
kind).
Note that that still requires a multi-threaded process.
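To make the two conditions above explicit, here is a toy model in Python. It is only a sketch of the reasoning, not kernel code: the `Migration` class, its fields and `core_serializing_on_dest()` are names invented for illustration, and the mm values are arbitrary integers. It assumes that an interrupt landing on the destination CPU, or a switch_mm() on it, each imply a core serializing instruction there.

```python
from dataclasses import dataclass


@dataclass
class Migration:
    """Hypothetical description of one task migration (not a kernel structure)."""
    ipi_lands_on_dest: bool  # does an interrupt hit the CPU that will run the task?
    dest_prev_mm: int        # made-up id of the mm the destination CPU last used
    task_mm: int             # made-up id of the migrated task's mm


def core_serializing_on_dest(m: Migration) -> bool:
    # switch_mm() is skipped when the destination already runs with the task's mm,
    # e.g. because it was running another thread of the same process.
    switch_mm_runs = m.dest_prev_mm != m.task_mm
    # Assumption of this model: an interrupt on the destination CPU and
    # switch_mm() each imply a core serializing instruction there.
    return m.ipi_lands_on_dest or switch_mm_runs


# An idle destination pulls the task itself; it last ran a sibling thread.
pulled_by_sibling_cpu = Migration(ipi_lands_on_dest=False, dest_prev_mm=1, task_mm=1)
# Same pull, but the destination last ran an unrelated process.
pulled_by_other_cpu = Migration(ipi_lands_on_dest=False, dest_prev_mm=2, task_mm=1)
```

Under these assumptions only the first case, where the destination pulls the task and already has the same mm, ends up with no core serializing instruction on the destination CPU.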
There is another scenario, where the NOHZ load-balancer moves the task
such that the NOHZ load-balancing CPU is a 3rd CPU. In that case there
is an interrupt (to effect the load-balancing), but it will not land on
the CPU that's going to run the task.
This could happen for a single-threaded task, since I suppose the NOHZ
idle CPU that's going to be the victim could have run our task last and
still lazily have the mm.
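The three-CPU case can be sketched the same way. Again this is a toy model with invented names (`serializing_events_on_dest()`, the CPU numbers and the mm ids are all hypothetical), assuming the load-balancing interrupt lands only on the balancing CPU and that the destination skips switch_mm() when its lazily held mm matches the task's mm.

```python
def serializing_events_on_dest(balancer_cpu, dest_cpu, lazy_mm, task_mm):
    """List the events that would serialize dest_cpu in this toy model.

    lazy_mm maps a CPU number to the (made-up) id of the mm it still holds
    lazily. Nothing here corresponds to a real kernel interface.
    """
    events = []
    # The load-balancing interrupt lands on the balancing CPU only.
    if balancer_cpu == dest_cpu:
        events.append("interrupt")
    # switch_mm() is skipped if the destination still lazily has the task's mm.
    if lazy_mm.get(dest_cpu) != task_mm:
        events.append("switch_mm")
    return events


# CPU 2 balances on behalf of idle CPU 1, which last ran our single-threaded
# task (mm id 42) and still lazily holds its mm.
nohz_case = serializing_events_on_dest(
    balancer_cpu=2, dest_cpu=1, lazy_mm={1: 42}, task_mm=42
)
```

In this model `nohz_case` comes out empty: the interrupt hits CPU 2, and CPU 1 has no reason to switch mm before running the task.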
Very tricky to make work, not to mention that I suspect actually going
idle will kill a whole bunch of state real quick.