Date:	Wed, 12 Nov 2014 00:09:26 +0100
From:	Borislav Petkov <bp@...en8.de>
To:	Andy Lutomirski <luto@...capital.net>
Cc:	X86 ML <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Tony Luck <tony.luck@...el.com>,
	Andi Kleen <andi@...stfloor.org>
Subject: Re: [RFC PATCH] x86, entry: Switch stacks on a paranoid entry from
 userspace

On Tue, Nov 11, 2014 at 02:40:12PM -0800, Andy Lutomirski wrote:
> I wonder what the IRET is for.  There had better not be another magic
> IRET unmask thing.  I'm guessing that the actual semantics are that
> nothing whatsoever can mask #MC, but that a second #MC when MCIP is
> still set is a shutdown condition.

Hmmm, both manuals are unclear as to what exactly reenables #MC. So
forget about IRET and look at this: "When the processor receives a
machine check when MCIP is set, it automatically enters the shutdown
state." So this really reads like a second #MC while the first is
happening would shut down the system - regardless of whether I'm still
in #MC context or not, running the first #MC handler.

I guess I needz me some hw people to actually confirm.
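(To make the semantics quoted above concrete, here is a toy userspace
model of what the SDM paragraph seems to say - a second #MC while MCIP
is still set is a shutdown condition. All names here are made up for
illustration; this is a sketch of the reading above, not real MCA code.)

```c
#include <stdbool.h>

enum mc_result { MC_HANDLED, MC_SHUTDOWN };

struct mc_state {
	bool mcip;	/* models the MCIP bit in IA32_MCG_STATUS */
};

/*
 * Deliver a machine check. If MCIP is already set - i.e., we are still
 * inside the first #MC - the CPU would enter the shutdown state.
 * Otherwise the handler runs and clears MCIP at the end, which is
 * what re-arms #MC delivery.
 */
enum mc_result machine_check(struct mc_state *s)
{
	if (s->mcip)
		return MC_SHUTDOWN;	/* nested #MC: shutdown condition */

	s->mcip = true;			/* hardware sets MCIP on delivery */
	/* ... read out the MCA MSRs, do the minimum work ... */
	s->mcip = false;		/* handler clears MCIP, reenabling #MC */

	return MC_HANDLED;
}
```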

> Define "atomic".
> 
> You're still running with irqs off and MCIP set.  At some point,

Yes, I need to be atomic wrt another #MC so that I can read out the
MCA MSRs in time and undisturbed.

> you're presumably done with all of the machine check registers, and
> you can clear MCIP.  Now, if current == victim, you can enable irqs
> and do whatever you want.

This is the key: if I enable irqs and the process gets scheduled on
another CPU, I lose. So I have to be able to say: before you run this
task on any CPU, kill it.
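(The "before you run this task on any CPU, kill it" idea could look
roughly like the following toy sketch - a pending-kill flag set from
the #MC handler and checked before the task is ever run again. The
names are invented for illustration; the real kernel would use
something like a TIF work flag checked on the way back to userspace.)

```c
#include <stdbool.h>

/*
 * Toy task with a pending-kill flag, standing in for a TIF_-style
 * flag the scheduler/exit path would check before resuming the task.
 */
struct task {
	bool kill_pending;
	bool running;
	bool dead;
};

/* Called from the #MC handler: don't kill synchronously, just mark. */
void mark_for_kill(struct task *t)
{
	t->kill_pending = true;
}

/*
 * Called before letting the task run on any CPU: if a kill is
 * pending, the task never executes again.
 */
bool try_schedule(struct task *t)
{
	if (t->kill_pending) {
		t->dead = true;
		return false;
	}
	t->running = true;
	return true;
}
```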

> In my mind, the benefit is that you don't need to think about how to
> save your information and arrange to get called back the next time
> that the victim task is a non-atomic context, since you *are* the
> victim task and you're running in normal irqs-disabled kernel mode.
> 
> In contrast, with the current entry code, if you enable IRQs or so
> anything that could sleep, you're on the wrong stack, so you'll crash.
> That means that taking mutexes, even after clearing MCIP, is
> impossible.

Hmm, it is late here and I need to think about this with a clear head
again, but I think I can see the benefit of this to a certain extent.
However(!), I need to be able to run undisturbed and do the minimum work
in the #MC handler before I reenable MCEs.

But Tony also has a valid point: what is going to happen if I get
another MCE while doing the memory_failure() dance? I guess if
memory_failure() takes proper locks, the second #MC will get to wait
until the first is done. But who knows in reality ...

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
