linux-kernel - Re: [RFC PATCH] x86, entry: Switch stacks on a paranoid entry from userspace

Message-ID: <CALCETrU8uyc4nu-5MBXQR0PY0rexUhPkuPRLK+gWt0gEWMDhTA@mail.gmail.com>
Date:	Wed, 12 Nov 2014 16:02:37 -0800
From:	Andy Lutomirski <luto@...capital.net>
To:	"Luck, Tony" <tony.luck@...el.com>
Cc:	Oleg Nesterov <oleg@...hat.com>, Borislav Petkov <bp@...en8.de>,
	X86 ML <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Andi Kleen <andi@...stfloor.org>
Subject: Re: [RFC PATCH] x86, entry: Switch stacks on a paranoid entry from userspace

On Wed, Nov 12, 2014 at 3:41 PM, Luck, Tony <tony.luck@...el.com> wrote:
>> v2 coming soon with these changes and some additional comment cleanups.
>

v2's not going to make a difference unless you're using uprobes at the
same time.

> So v1 + do_machine_check change is not surviving some real testing.  I'm injecting and
> consuming errors sequentially with a small delay in between - so no fancy corner cases with
> multiple errors being processed ... we get all the way done with one error before we start
> the next.  Test only survives about 400ish recoveries before Linux dies complaining:
>     "Timeout synchronizing machine check over CPUs".
> This probably means that some cpu wandered into the weeds and never showed up in the
> handler.

In the interest of my sanity, can you add something like
BUG_ON(!user_mode_vm(regs)) or the mce_panic equivalent before calling
memory_failure?
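
To make the request concrete, here is a rough user-space sketch of the kind of
check I mean. It is not the actual do_machine_check code: struct pt_regs is cut
down to the one field needed, user_mode_vm() is a simplified 64-bit style
version (nonzero RPL in the saved CS), BUG_ON() is a stand-in that aborts, and
memory_failure_stub() and mce_recover() are hypothetical names that only mark
where the real memory_failure call would sit.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-in for the kernel's struct pt_regs (assumption). */
struct pt_regs {
	unsigned long cs;	/* saved code segment selector */
};

/*
 * Simplified 64-bit style check (assumption): a nonzero requested
 * privilege level in the saved CS means the exception interrupted
 * user mode.
 */
static bool user_mode_vm(const struct pt_regs *regs)
{
	return (regs->cs & 3) != 0;
}

/* User-space stand-in for the kernel's BUG_ON(): report and abort. */
#define BUG_ON(cond)						\
	do {							\
		if (cond) {					\
			fprintf(stderr, "BUG: %s\n", #cond);	\
			abort();				\
		}						\
	} while (0)

/* Counts how often the recovery path was reached. */
static int memory_failure_calls;

/* Hypothetical stub; the real memory_failure() takes more arguments. */
static void memory_failure_stub(unsigned long pfn)
{
	(void)pfn;
	memory_failure_calls++;
}

/*
 * Hypothetical tail of the machine check handler: refuse to attempt
 * recovery unless the error was taken from user mode.
 */
static void mce_recover(const struct pt_regs *regs, unsigned long pfn)
{
	BUG_ON(!user_mode_vm(regs));
	memory_failure_stub(pfn);
}
```

With a check like this in place, a recovery attempt for an error taken in
kernel mode dies loudly at the check instead of silently going down the
recovery path.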

What happens if there's a shared bank but the actual offender has a
higher order than the cpu that finds the error?

Is this something I can try under KVM?

--Andy