linux-kernel - Re: [PATCH v2 4/5] x86/mce: Simplify flow when handling recoverable memory errors


Date:	Tue, 11 Nov 2014 17:13:09 +0100
From:	Borislav Petkov <bp@...en8.de>
To:	Andy Lutomirski <luto@...capital.net>
Cc:	Chen Gong <gong.chen@...ux.intel.com>, X86 ML <x86@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Tony Luck <tony.luck@...el.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 4/5] x86/mce: Simplify flow when handling recoverable
 memory errors

On Tue, Nov 11, 2014 at 07:42:48AM -0800, Andy Lutomirski wrote:
> The last time I looked at the MCE code, I got a bit lost in the
> control flow.  Is there ever a userspace-killing MCE that's delivered
> from kernel mode?

Yep. While a userspace process is executing, an #MC can be raised
which reports an error for which action is required; see all the
MCE_AR_SEVERITY errors in
arch/x86/kernel/cpu/mcheck/mce-severity.c.

It happened within the context of current, so we run the #MC handler,
which decides that the process needs to be killed in order to contain
the error. So after we exit the handler, and before the process can be
scheduled in again on any core, we want to actually kill it and poison
all its memory.

> By that, I mean that I think that all userspace-killing MCEs go have
> user_mode_vm(regs) and go through paranoid_exit.

Yes.

> If so, why do you need to jump through hoops at all?  You can't call
> do_exit, but it should be completely safe to force a fatal signal and
> let the scheduler and signal code take care of killing the process,
> right?  For that matter, you should also be able to poke at vm
> structures, etc.

Well, we do that already. memory-failure.c does kill the processes when
it decides to.

The only question is whether adding two new members to task_struct is
ok. It is nicely convenient and it all falls into place.

In the #MC handler we do:

 		if (worst == MCE_AR_SEVERITY) {
 			/* schedule action before return to userland */
+			current->paddr = m.addr;
+			current->restartable = !!(m.mcgstatus & MCG_STATUS_RIPV);
 			set_thread_flag(TIF_MCE_NOTIFY);
 		}
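
To make the handler side concrete, here is a minimal userspace sketch,
not kernel code. struct mock_task, struct mock_mce, the mock_severity
enum and mce_stash_for_return() are hypothetical names made up for
illustration; only MCG_STATUS_RIPV being bit 0 of MCG_STATUS is taken
from the x86 definition. The mce_notify field merely stands in for
TIF_MCE_NOTIFY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* RIPV is bit 0 of MCG_STATUS: the interrupted context can be restarted. */
#define MCG_STATUS_RIPV (1ULL << 0)

/* Placeholder severity values, for this sketch only. */
enum mock_severity { MOCK_NO_SEVERITY, MOCK_AR_SEVERITY };

/* Hypothetical, trimmed-down stand-in for struct mce. */
struct mock_mce {
	uint64_t addr;
	uint64_t mcgstatus;
};

/* Hypothetical stand-in for the two proposed task_struct members. */
struct mock_task {
	uint64_t paddr;
	bool restartable;
	bool mce_notify;	/* stands in for TIF_MCE_NOTIFY */
};

/* Record what the return-to-user path will need; do nothing else here. */
static void mce_stash_for_return(struct mock_task *t,
				 const struct mock_mce *m,
				 enum mock_severity worst)
{
	assert(t != NULL && m != NULL);

	if (worst != MOCK_AR_SEVERITY)
		return;

	t->paddr = m->addr;
	/* !! collapses the masked value to 0 or 1 regardless of bit position. */
	t->restartable = !!(m->mcgstatus & MCG_STATUS_RIPV);
	t->mce_notify = true;
}
```

The point of the split is that the handler only records state and
sets a flag; anything that may sleep or take locks is left for later.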

and then before we return to userspace we do:

+	if (!current->restartable)
 		flags |= MF_MUST_KILL;
 	if (memory_failure(pfn, MCE_VECTOR, flags) < 0) {

and the MF_MUST_KILL makes sure memory_failure() does a force_sig().
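
The return-to-userspace side can be sketched the same way, again as a
userspace mock and not the actual kernel code. Everything prefixed
mock_ or MOCK_ is hypothetical: the flag values and the 4K page shift
are placeholders, starting from an "action required" flag is an
assumption, and recording_memory_failure() only records its arguments
where the real memory_failure() would do the work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values, for this sketch only. */
#define MOCK_PAGE_SHIFT		12
#define MOCK_MF_ACTION_REQUIRED	(1 << 1)
#define MOCK_MF_MUST_KILL	(1 << 2)

/* Hypothetical stand-in for the two proposed task_struct members. */
struct mock_task {
	uint64_t paddr;
	bool restartable;
	bool mce_notify;	/* stands in for TIF_MCE_NOTIFY */
};

typedef int (*mock_memory_failure_fn)(uint64_t pfn, int flags);

/* Recording mock: remembers what it was called with and reports success. */
static uint64_t last_pfn;
static int last_flags;

static int recording_memory_failure(uint64_t pfn, int flags)
{
	last_pfn = pfn;
	last_flags = flags;
	return 0;
}

/* Returns true if nothing was pending or if the mock reported recovery. */
static bool mce_notify_sketch(struct mock_task *t, mock_memory_failure_fn mf)
{
	int flags = MOCK_MF_ACTION_REQUIRED;

	assert(t != NULL && mf != NULL);

	if (!t->mce_notify)
		return true;
	t->mce_notify = false;

	/* A context that cannot be restarted must not run again. */
	if (!t->restartable)
		flags |= MOCK_MF_MUST_KILL;

	return mf(t->paddr >> MOCK_PAGE_SHIFT, flags) >= 0;
}
```

Calling mce_notify_sketch(&task, recording_memory_failure) with
restartable == false hands the mock both flags; with restartable ==
true only the "action required" flag is passed.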

So I think this is ok. My only concern is that people might object to
the two new members in task_struct, but it looks clean to me this way.
IMHO at least.

> Or is there a meaningful case where mce_notify_process needs to help
> with recovery but the original MCE happened with !user_mode_vm(regs)?

Well, for the !user_mode_vm(regs) case we panic anyway.

Thanks Andy.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.