linux-kernel - Re: [PATCH 5/8] x86/mce: Avoid tail copy when machine check terminated a copy from user

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20200917170406.GN31960@zn.tnic>
Date:   Thu, 17 Sep 2020 19:04:06 +0200
From:   Borislav Petkov <bp@...en8.de>
To:     "Luck, Tony" <tony.luck@...el.com>
Cc:     Youquan Song <youquan.song@...el.com>, x86@...nel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 5/8] x86/mce: Avoid tail copy when machine check
 terminated a copy from user

Replying a bit out of order:

> Both the code and the commit comment need much more of this
> description.

Hell yeah!

On Wed, Sep 16, 2020 at 12:26:59PM -0700, Luck, Tony wrote:
> So we take a another machine check on the same address when
> fault_in_pages_readable() does __get_user().  That ought to break
> us out ... but for some reason I still don't understand didn't.
> But even if it did ... the second machine check is not at all
> a good idea.

And I think this is the important point: for MCEs you absolutely don't
want to take another MCE and even walk into those fields. So what
fault_in_pages_readable() does normally, would be totally wrong. Imagine
you're playing minesweeper - you can't just pre-fault blocks without
counting the mines. :-P

So actually, I'm thinking:

.LMCE_during_user_access:
	mov $-ENODEV, %eax
	ASM_CLAC
	ret

I have no clue which error code we should put there but it should be an
error code which tells you not to retry and to back off immediately.

> Returning zero bytes left to say we completed avoids that. The
> user is guaranteed a SIGBUS when the task_work does fire. So whatever
> system call was in progress is not going to see the apparent
> successful return.

Yes, my only proposition with the error code is in case you're looking
at traces, to recognize that the copying encountered an MCE. In addition
to the "back off immediately" semantics, if there even is such defined
for users of copy_*_user().

> Unless you have some better way out of the dilemmma that the
> real fixup is only scheduled at the point that the extable
> fixup just arranges for a simple local return from the copy.

Right, see above: I think it is imperative *not* to walk into that area
again and not do any retrying.

> When the return to user happens the task_work that was scheduled
> in the machine check handler takes care of the error return to the
> user.

Yeah, let's write that whole flow down somewhere - not in a commit
message - so that it is clear what happens.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette