Date:	Thu, 25 Feb 2016 14:11:11 -0800
From:	Andy Lutomirski <luto@...capital.net>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Tony Luck <tony.luck@...el.com>,
	Borislav Petkov <bp@...en8.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Tony Luck <tony.luck@...il.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Ingo Molnar <mingo@...nel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH v13] x86, mce: Add memcpy_trap()

On Feb 25, 2016 12:39 PM, "Linus Torvalds"
<torvalds@...ux-foundation.org> wrote:
>
> But doing things like
>
> +       if (r.trap_nr == X86_TRAP_MC) {
> +               volatile void *fault_addr = (volatile void *)from + n - r.bytes_left;
> +               phys_addr_t p = virt_to_phys(fault_addr);
> +
> +               memory_failure(p >> PAGE_SHIFT, MCE_VECTOR, 0);
> +       }
>
> in the copying code is insane, because dammit, that should be done by
> the code that sets X86_TRAP_MC in the first place.

Impossible as such, I think :(

do_machine_check uses IST, the memory failure code can sleep, and you
can't sleep in IST context.  There's a special escape that lets
memory_failure sleep *if* it came from user mode.

Here's the solution I'd prefer.  Change all the copy string to/from
user code to use the new enhanced fixup code.  Have the new fixup
handler (which can be a short C function!) fix up regs->ip to point to
copy_user_handle_tail and add a new parameter to copy_user_handle_tail
indicating the fault type.  Then put whatever fixup logic is needed in
copy_user_handle_tail -- it knows the failing address (obviously), and
it's running in process context with interrupts on (unless we're in a
pagefault_disable section), and it can do whatever it needs to do.
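
To make the shape of that concrete, here is a toy userspace model, not
kernel code.  Every name in it (toy_copy, toy_fixup_handler,
toy_handle_tail, struct toy_fault, the TOY_TRAP_* values) is made up for
illustration, and the "fault" is simulated by an offset argument rather
than a real trap.  It only shows the division of labor: the fixup handler
records the fault type and hands off, and the tail, standing in for
copy_user_handle_tail with its extra parameter, decides what to do.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical trap labels; the numbers mirror the x86 vectors. */
enum toy_trap { TOY_TRAP_NONE = 0, TOY_TRAP_PF = 14, TOY_TRAP_MC = 18 };

struct toy_fault {
	enum toy_trap trap_nr;	/* fault type passed on to the tail */
	const void *addr;	/* failing source address */
	int recovery_ran;	/* set if the tail did machine-check work */
};

/*
 * Stand-in for the fixup handler.  In this model it represents code
 * running in trap context, so it does nothing heavy: it just records
 * what happened before control moves to the tail.
 */
static void toy_fixup_handler(struct toy_fault *f, enum toy_trap trap_nr,
			      const void *addr)
{
	f->trap_nr = trap_nr;
	f->addr = addr;
}

/*
 * Stand-in for the tail handler with the fault type available to it.
 * In this model it represents ordinary process context, so this is
 * where memory_failure()-style work would be placed.
 */
static size_t toy_handle_tail(struct toy_fault *f, size_t bytes_left)
{
	if (f->trap_nr == TOY_TRAP_MC)
		f->recovery_ran = 1;
	return bytes_left;
}

/*
 * Copy n bytes.  A fault of type 'trap' is simulated when the source
 * index reaches bad_off.  Returns the number of bytes not copied.
 */
size_t toy_copy(void *to, const void *from, size_t n, size_t bad_off,
		enum toy_trap trap, struct toy_fault *f)
{
	unsigned char *d = to;
	const unsigned char *s = from;
	size_t i;

	memset(f, 0, sizeof(*f));
	for (i = 0; i < n; i++) {
		if (i == bad_off && trap != TOY_TRAP_NONE) {
			toy_fixup_handler(f, trap, s + i);
			return toy_handle_tail(f, n - i);
		}
		d[i] = s[i];
	}
	return 0;
}
```

In this model the caller only sees the usual "bytes not copied" return
value; the machine-check-specific work happens in the tail, keyed off the
fault type the fixup handler passed along.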

Linus, it's kind of like yours, except with the trap info explicitly
passed to the fixup handler instead of having the fixup handler fish
it out of some per-thread structure.

Here are some different ideas I don't like:

1. The machine check does an IPI-to-self and the failure code runs in
IRQ context.

2. The machine check code rewrites the return stack to inject a
function call.  I don't love this.

3. Drop the idea of sending an immediate sigbus and do it with
task_work.  Maybe this is bad for some reason other than code
messiness.

4. Change the entry code so machine check runs on the normal stack if
it hits with IRQs on.

>
> And if there is hardware that raises a machine check without actually
> telling you why - including the address - then it's laughable to talk
> about "recoverability" and "hardening" and things like that. Then the
> hardware is just broken.
>
>                       Linus
