linux-kernel - Re: [PATCH v13] x86, mce: Add memcpy

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CALCETrWY+fBPgc+bKHp3BzAsT7bvSbdLJf-h5fEV+_rj+JwxCA@mail.gmail.com>
Date:	Thu, 25 Feb 2016 14:11:11 -0800
From:	Andy Lutomirski <luto@...capital.net>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Tony Luck <tony.luck@...el.com>,
	Borislav Petkov <bp@...en8.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Tony Luck <tony.luck@...il.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Ingo Molnar <mingo@...nel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH v13] x86, mce: Add memcpy_trap()

On Feb 25, 2016 12:39 PM, "Linus Torvalds"
<torvalds@...ux-foundation.org> wrote:
>
> But doing things like
>
> +       if (r.trap_nr == X86_TRAP_MC) {
> +               volatile void *fault_addr = (volatile void *)from + n
> - r.bytes_left;
> +               phys_addr_t p = virt_to_phys(fault_addr);
> +
> +               memory_failure(p >> PAGE_SHIFT, MCE_VECTOR, 0);
> +       }
>
> in the copying code is insane, because dammit, that should be done by
> the codethat sets X86_TRAP_MC in the first place.

Impossible as such, I think :(

do_machine_check uses IST, the memory failure code can sleep, and you
can't sleep in IST context.  There's a special escape that lets
memory_failure sleep *if* it came from user mode.

Here's the solution I'd prefer.  Change all the copy string to/from
user code to use the new enhanced fixup code.  Have the new fixup
handler (which can be a short C function!) fix up regs->ip to point to
copy_user_handle_tail and add a new parameter to copy_user_handle_tail
indicating the fault type.  Then put whatever fixup logic is needed in
copy_user_handle_tail -- it knows the failing address (obviously), and
it's running in process context with interrupts on (unless we're in a
pagefault_disable section), and it can do whatever it needs to do.

Linus, it's kind of like yours, except with the trap info explicitly
passed to the fixup handler instead of having the fixup handler fish
it out of some per-thread structure.

Here are different some ideas I don't like.:

1. The machine check does an IPI-to-self and the failure code runs in
IRQ context.

2. The machine check code rewrites the return stack to inject a
function call.  I don't love this.

3. Drop the idea of sending an immediate sigbus and do it with
task_work.  Maybe this is bad for some reason other than code
messiness.

4. Change the entry code so machine check runs on the normal stack if
it hits with IRQs on.

>

> And if there is hardware that raises a machine check without actually
> telling you why - including the address - then it's laugable to talk
> about "recoverability" and "hardening" and things like that. Then the
> hardware is just broken.
>
>                       Linus