Message-ID: <CALCETrV+mchcQ5SgBO8+xLvw9rrHm4yvq+wZtQsD1EMMGzAMEw@mail.gmail.com>
Date:	Mon, 5 Jan 2015 17:01:46 -0800
From:	Andy Lutomirski <luto@...capital.net>
To:	"Luck, Tony" <tony.luck@...el.com>
Cc:	Borislav Petkov <bp@...en8.de>,
	Paul McKenney <paulmck@...ux.vnet.ibm.com>,
	X86 ML <x86@...nel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Andi Kleen <andi@...stfloor.org>,
	Josh Triplett <josh@...htriplett.org>,
	Frédéric Weisbecker <fweisbec@...il.com>
Subject: Re: [PATCH] x86, mce: Get rid of TIF_MCE_NOTIFY and associated mce tricks

On Mon, Jan 5, 2015 at 4:44 PM, Luck, Tony <tony.luck@...el.com> wrote:
> We now switch to the kernel stack when a machine check interrupts
> during user mode.  This means that we can perform recovery actions
> in the tail of do_machine_check()
>
> Signed-off-by: Tony Luck <tony.luck@...el.com>
>
> ---
> On top of Andy's x86/paranoid branch
> Andy: Should I really move that:
>         pr_err("Uncorrected hardware memory error ...
> inside the ist_begin_non_atomic() section?
>

I think I like it as is.

[...]

> @@ -1220,6 +1177,26 @@ void do_machine_check(struct pt_regs *regs, long error_code)
>         mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
>  out:
>         sync_core();
> +
> +       if (recover_paddr == ~0ull)
> +               goto done;
> +
> +       pr_err("Uncorrected hardware memory error in user-access at %llx",
> +                recover_paddr);

printk is safe from IRQ context, so this should be okay unless we've
totally screwed up.  And, if we totally screwed up, seeing this before
the BUGs in ist_begin_non_atomic would be nice.

> +       /*
> +        * We must call memory_failure() here even if the current process is
> +        * doomed. We still need to mark the page as poisoned and alert any
> +        * other users of the page.
> +        */
> +       ist_begin_non_atomic(regs);
> +       local_irq_enable();
> +       if (memory_failure(recover_paddr >> PAGE_SHIFT, MCE_VECTOR, flags) < 0) {
> +               pr_err("Memory error not recovered");
> +               force_sig(SIGBUS, current);
> +       }
> +       local_irq_disable();
> +       ist_end_non_atomic();
> +done:
>         ist_exit(regs, prev_state);
>  }
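
The ordering in the hunk above can be modelled in plain userspace C. This is only a rough sketch under stated assumptions, not the kernel code: `struct model_state`, `model_memory_failure()` and `model_mce_tail()` are hypothetical names invented for illustration, and the boolean flags stand in for the real `ist_begin_non_atomic()` / `local_irq_enable()` state. It shows the three points discussed here: the early exit when there is nothing to recover, the message printed before entering the non-atomic section, and the SIGBUS fallback when recovery fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_state {
	bool irqs_enabled;   /* stands in for local_irq_enable()/disable() */
	bool non_atomic;     /* stands in for ist_begin/end_non_atomic()   */
	bool page_poisoned;  /* set once the page has been handled         */
	bool sigbus_sent;    /* set when recovery fails                    */
};

#define NO_RECOVER_PADDR (~0ull)

/* Models a call that may sleep: only legal non-atomic with IRQs on. */
static int model_memory_failure(struct model_state *s, bool can_recover)
{
	assert(s->non_atomic && s->irqs_enabled);
	s->page_poisoned = true;          /* page is marked even if doomed */
	return can_recover ? 0 : -1;
}

/* Models the tail of the handler, after the "out:" label. */
static void model_mce_tail(struct model_state *s,
			   unsigned long long recover_paddr,
			   bool can_recover)
{
	if (recover_paddr == NO_RECOVER_PADDR)
		return;                       /* "goto done" in the patch */

	/* Printed first, so it is visible even if the next step blows up. */
	fprintf(stderr,
		"Uncorrected hardware memory error in user-access at %llx\n",
		recover_paddr);

	s->non_atomic = true;
	s->irqs_enabled = true;
	if (model_memory_failure(s, can_recover) < 0) {
		fprintf(stderr, "Memory error not recovered\n");
		s->sigbus_sent = true;        /* force_sig(SIGBUS, current) */
	}
	s->irqs_enabled = false;
	s->non_atomic = false;
}
```

Calling `model_mce_tail()` with `~0ull` leaves the state untouched, while any other address always marks the page and only sets `sigbus_sent` when the modelled recovery fails.
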

For the context-related bits:

Reviewed-by: Andy Lutomirski <luto@...capital.net>

Should I stick this in my -next branch so it can stew?

--Andy


-- 
Andy Lutomirski
AMA Capital Management, LLC