Message-ID: <CALCETrU8uyc4nu-5MBXQR0PY0rexUhPkuPRLK+gWt0gEWMDhTA@mail.gmail.com>
Date: Wed, 12 Nov 2014 16:02:37 -0800
From: Andy Lutomirski <luto@...capital.net>
To: "Luck, Tony" <tony.luck@...el.com>
Cc: Oleg Nesterov <oleg@...hat.com>, Borislav Petkov <bp@...en8.de>,
X86 ML <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Andi Kleen <andi@...stfloor.org>
Subject: Re: [RFC PATCH] x86, entry: Switch stacks on a paranoid entry from userspace

On Wed, Nov 12, 2014 at 3:41 PM, Luck, Tony <tony.luck@...el.com> wrote:
>> v2 coming soon with these changes and some additional comment cleanups.
>

v2's not going to make a difference unless you're using uprobes at the
same time.

> So v1 + do_machine_check change is not surviving some real testing. I'm injecting and
> consuming errors sequentially with a small delay in between - so no fancy corner cases with
> multiple errors being processed ... we get all the way done with one error before we start
> the next. Test only survives about 400ish recoveries before Linux dies complaining:
> "Timeout synchronizing machine check over CPUs".
> This probably means that some cpu wandered into the weeds and never showed up in the
> handler.

In the interest of my sanity, can you add something like
BUG_ON(!user_mode_vm(regs)) or the mce_panic equivalent before calling
memory_failure?

What happens if there's a shared bank but the actual offender has a
higher order than the cpu that finds the error?

Is this something I can try under KVM?

--Andy