linux-kernel - Re: [PATCH v4 06/10] x86/alternative: use temporary mm for text poking

Message-ID: <20181112004140.GF3056@worktop>
Date:   Mon, 12 Nov 2018 01:41:40 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Nadav Amit <namit@...are.com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
        "H. Peter Anvin" <hpa@...or.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Andy Lutomirski <luto@...nel.org>,
        Kees Cook <keescook@...omium.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Masami Hiramatsu <mhiramat@...nel.org>
Subject: Re: [PATCH v4 06/10] x86/alternative: use temporary mm for text
 poking

On Mon, Nov 12, 2018 at 12:09:32AM +0000, Nadav Amit wrote:
> > On Sun, Nov 11, 2018 at 08:53:07PM +0000, Nadav Amit wrote:
> > 
> >>>> +	/*
> >>>> +	 * The lock is not really needed, but this allows to avoid open-coding.
> >>>> +	 */
> >>>> +	ptep = get_locked_pte(poking_mm, poking_addr, &ptl);
> >>>> +
> >>>> +	/*
> >>>> +	 * If we failed to allocate a PTE, fail. This should *never* happen,
> >>>> +	 * since we preallocate the PTE.
> >>>> +	 */
> >>>> +	if (WARN_ON_ONCE(!ptep))
> >>>> +		goto out;
> >>> 
> >>> Since we hard rely on init getting that right; can't we simply get rid
> >>> of this?

> I understand. So the question is - what would you prefer: something like
> PARANOID_WARN_ON_ONCE() or should I just remove the assertion?

Something like:

	/*
	 * @ptep cannot be NULL per construction in poking_init().
	 */

And then leave it at that. If it ever comes unstuck we'll get the NULL
deref, which is just as good as a BUG_ON().
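
To illustrate the pattern (a userspace sketch, not code from the patch): an init routine establishes the pointer once, and later users rely on that invariant with only a comment instead of a runtime check. The names `slot_sketch`, `poking_init_sketch()` and `use_slot_sketch()` are hypothetical stand-ins, not kernel APIs.

```c
#include <stddef.h>

/* Hypothetical stand-in for the preallocated PTE. */
struct slot_sketch {
	int uses;
};

static struct slot_sketch backing_store;
static struct slot_sketch *poking_slot;

/* Hypothetical stand-in for poking_init(): sets the pointer up once. */
static void poking_init_sketch(void)
{
	poking_slot = &backing_store;
}

static int use_slot_sketch(void)
{
	struct slot_sketch *s = poking_slot;

	/*
	 * @s cannot be NULL per construction in poking_init_sketch().
	 * If that ever breaks, the dereference below faults right here.
	 */
	s->uses++;
	return s->uses;
}
```

The caller must run `poking_init_sketch()` before the first `use_slot_sketch()`; skipping it makes the dereference fault at the point of use, which is the failure mode described above.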

> >>>> +out:
> >>>> +	if (memcmp(addr, opcode, len))
> >>>> +		r = -EFAULT;
> >>> 
> >>> How could this ever fail? And how can we reliably recover from that?
> >> 
> >> This code has been there before (with slightly uglier code). Before this
> >> patch, a BUG_ON() was used here. However, I noticed that kgdb actually
> >> checks that text_poke() succeeded after calling it and fails gracefully.
> >> However, this was useless, since text_poke() would panic before kgdb gets
> >> the chance to do anything (see patch 7).
> > 
> > Yes, I know it was there before, and I did see kgdb do it too. But aside
> > from that out-label case, which we also should never hit, how can we
> > realistically ever fail that memcmp()?
> > 
> > If we fail here, something is _seriously_ buggered.
> 
> I agree. But it may be useful at least to warn in such a case, as
> debugging of SMC/CMC is hard. For example, if there is some sort of a race
> between module (un)loading and static-keys - such a check might be
> beneficial to indicate so. Having said that, changing it into VM_BUG_ON() or
> something similar may make more sense.
> 
> Personally, I don’t care much - I’m just worried that I made some intrusive
> changes *and* you want me to remove the assertion that checks that I didn’t
> screw up.

Ah, so I'm perfectly fine with something like:

	VM_BUG_ON(memcmp(addr, opcode, len));

I just don't see value in the whole return code here. If this comes
unstuck, we're buggered beyond repair.
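
A rough userspace sketch of that shape, assuming `assert()` as a stand-in for VM_BUG_ON() (in the kernel, VM_BUG_ON() only performs the check when CONFIG_DEBUG_VM is set). `SKETCH_VM_BUG_ON()` and `poke_sketch()` are hypothetical names for illustration; the real text_poke() does far more than a memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for VM_BUG_ON(): checked in debug builds only.
 * With NDEBUG defined, the condition is not evaluated at all.
 */
#define SKETCH_VM_BUG_ON(cond) assert(!(cond))

/*
 * Hypothetical stand-in for text_poke(): no return code. The write is
 * verified by a debug-only assertion instead of an error the caller
 * would have to handle.
 */
static void poke_sketch(void *addr, const void *opcode, size_t len)
{
	memcpy(addr, opcode, len);

	/* If this fires, nothing the caller could do would recover. */
	SKETCH_VM_BUG_ON(memcmp(addr, opcode, len));
}
```

Callers then have no error path to maintain; a mismatch halts a debug build at the point of failure rather than propagating -EFAULT upward.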