Message-ID: <201906142012.C18377C5@keescook>
Date:   Fri, 14 Jun 2019 20:19:38 -0700
From:   Kees Cook <keescook@...omium.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>, x86@...nel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Dave Hansen <dave.hansen@...el.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/2] x86/asm: Pin sensitive CR4 bits

On Fri, Jun 14, 2019 at 04:57:36PM +0200, Thomas Gleixner wrote:
> Wouldn't it make sense to catch situations where a regular caller provides
> @val with pinned bits unset? I.e. move the OR into this code path after
> storing bits_missing.

I mentioned this in the commit log, but maybe I wasn't very clear:

> > Since these bits are global state (once established by the boot CPU
> > and kernel boot parameters), they are safe to write to secondary CPUs
> > before those CPUs have finished feature detection. As such, the bits are
> > written with an "or" performed before the register write as that is both
> > easier and uses a few bytes less storage in a location we don't have:
> > read-only per-CPU data. (Note that initialization via cr4_init_shadow()
> > isn't early enough to avoid early native_write_cr4() calls.)

Basically, there are calls to native_write_cr4() made very early in
secondary CPU init that would hit the "eek missing bit" case, and the
cr4_init_shadow() trick mentioned by Linus still wasn't early enough.
So, since feature detection for these bits is "done" (i.e. secondary
CPUs will end up matching the boot CPU for these bits), it's safe to
set them "early". To avoid the unconditional OR, we'd need a per-cpu
"am I ready to have these bits set?" state, and it'd need to be
read-only-after-init, which is not a section that exists and seems
wasteful to create (4095 bytes unused) for this feature alone.
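
For illustration, the v2 approach is roughly the following (a minimal
sketch assuming the cr4_pinned_bits/cr_pinning names from this series,
not the literal patch):

	/* Minimal sketch, not the literal patch: cr4_pinned_bits is
	 * established once by the boot CPU, and cr_pinning is a static
	 * key enabled once pinning is set up. */
	unsigned long cr4_pinned_bits __ro_after_init;
	DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);

	void native_write_cr4(unsigned long val)
	{
		/* The pinned bits are global state, so OR them in
		 * unconditionally; this is safe even for the very
		 * early secondary-CPU callers mentioned above. */
		if (static_branch_likely(&cr_pinning))
			val |= cr4_pinned_bits;

		asm volatile("mov %0,%%cr4" : : "r" (val) : "memory");
	}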

> Something like this:
> 
> 	unsigned long bits_missing = 0;
> 
> again:
> 	asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits));
> 
> 	if (static_branch_likely(&cr_pinning)) {
> 		if (unlikely((val & cr4_pinned_bits) != cr4_pinned_bits)) {
> 			bits_missing = ~val & cr4_pinned_bits;
> 			val |= cr4_pinned_bits;
> 			goto again;
> 		}
> 		/* Warn after we've set the missing bits. */
> 		WARN_ONCE(bits_missing, "CR4 bits went missing: %lx!?\n",
> 			  bits_missing);
> 	}

Yup, that's exactly what I had tried. Should I try to track down the
very early callers and OR in the bits at the call sites instead? (That
had occurred to me too, but it seemed operationally equivalent to ORing
at the start of native_write_cr4(), so I didn't bother trying it.)
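
(For concreteness, that call-site variant would be something like the
following hypothetical sketch; early_secondary_cr4_setup() is a
stand-in name, not an actual caller:)

	/* Each early caller would OR in the pinned bits itself before
	 * the write, instead of native_write_cr4() doing it. */
	static void early_secondary_cr4_setup(unsigned long cr4)
	{
		native_write_cr4(cr4 | cr4_pinned_bits);
	}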

-- 
Kees Cook
