Message-ID: <5223311D.2040608@zytor.com>
Date:	Sun, 01 Sep 2013 05:20:45 -0700
From:	"H. Peter Anvin" <hpa@...or.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Randy Dunlap <rdunlap@...otime.net>,
	Ingo Molnar <mingo@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Arjan van de Ven <arjan@...ux.intel.com>
Subject: On the correctness of dbe3ed1c078c193be34326728d494c5c4bc115e2

A truly ancient commit (v2.6.23), dbe3ed1c078c193be34326728d494c5c4bc115e2:

    x86-64: page faults from user mode are always user faults

    Randy Dunlap noticed an interesting "crashme" behaviour on his dual
    Prescott Xeon setup, where he gets page faults with the error code
    having a zero "user" bit, but the register state points back to user
    mode.

    This may be a CPU microcode buglet triggered by some strange
    instruction pattern that crashme generates, and loading a microcode
    update seems to possibly have fixed it.

    Regardless, we really should trust the register state more than the
    error code, since it's really the register state that determines
    whether we can actually send a signal, or whether we're in kernel
    mode and need to oops/kill the process in the case of a page fault.

... introduced the following code (since slightly modified):

+	/*
+	 * User-mode registers count as a user access even for any
+	 * potential system fault or CPU buglet.
+	 */
+	if (user_mode_vm(regs))
+		error_code |= PF_USER;
+

This has the end result that we treat a user space instruction which
touches a privileged data structure that then page faults (e.g. a
segment load which causes #PF on the GDT) as a user-space fault.

This seems very wrong to me, since such a #PF would indicate a serious
error in the kernel.
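
To make the scenario concrete, here is a minimal user-space sketch of the
classification described above. It is not kernel code: struct fake_regs,
fake_user_mode() and classify() are hypothetical stand-ins invented for
illustration, and fake_user_mode() only approximates user_mode_vm() by
checking the privilege level in the low two bits of the saved CS. The PF_*
values follow the x86 page-fault error code layout (bit 0 present/protection,
bit 1 write, bit 2 user).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Page-fault error code bits, following the x86 hardware layout. */
#define PF_PROT   (1UL << 0)  /* 0: page not present, 1: protection fault */
#define PF_WRITE  (1UL << 1)  /* 0: read access,      1: write access */
#define PF_USER   (1UL << 2)  /* 0: supervisor access, 1: user access */

/* Hypothetical stand-in for struct pt_regs: only the saved CS matters here. */
struct fake_regs {
	unsigned long cs;
};

enum fault_class { FAULT_KERNEL, FAULT_USER };

/* Simplified stand-in for user_mode_vm(): CPL 3 in the low bits of CS. */
static bool fake_user_mode(const struct fake_regs *regs)
{
	return (regs->cs & 3) == 3;
}

/*
 * Classify a fault. With trust_regs set, this mirrors the quoted hunk:
 * user-mode registers force PF_USER regardless of what the CPU reported.
 */
static enum fault_class classify(const struct fake_regs *regs,
				 unsigned long error_code, bool trust_regs)
{
	assert(regs != NULL);

	if (trust_regs && fake_user_mode(regs))
		error_code |= PF_USER;

	return (error_code & PF_USER) ? FAULT_USER : FAULT_KERNEL;
}
```

For a segment load executed at CPL 3 that faults on the GDT, the CPU reports
an implicit supervisor access, so the user bit in the error code is clear.
With a user-mode CS (0x33 is used here purely as an example value) and an
error code of 0, classify(&regs, 0, false) returns FAULT_KERNEL, while
classify(&regs, 0, true) returns FAULT_USER. The second result is the
behaviour being questioned.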

If this was a buglet introduced by a specific processor ("Prescott Xeon"
I presume means Nocona) and then even fixed in a microcode update, I'm
concerned that we are putting the cart before the horse with this change.

I went through the errata sheets for the CPUs of the time, but nothing
jumped out at me as causing this kind of problem.  There is a mention of
a couple of undefined opcodes, which ought to #UD, being able to generate
a "load to an incorrect address", but that is kind of a stretch.

	-hpa