Date:	Tue, 24 May 2011 16:31:45 -0400
From:	Dan Rosenberg <drosenberg@...curity.com>
To:	Dan Rosenberg <drosenberg@...curity.com>,
	Tony Luck <tony.luck@...il.com>, linux-kernel@...r.kernel.org,
	davej@...hat.com, kees.cook@...onical.com, davem@...emloft.net,
	eranian@...gle.com, torvalds@...ux-foundation.org,
	adobriyan@...il.com, penberg@...nel.org, hpa@...or.com,
	Arjan van de Ven <arjan@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Valdis.Kletnieks@...edu, Ingo Molnar <mingo@...e.hu>,
	pageexec@...email.hu
Subject: [RFC][PATCH] Randomize kernel base address on boot

This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
which the kernel is decompressed at boot as a security feature that
deters exploit attempts relying on knowledge of the location of kernel
internals.  The default values of the kptr_restrict and dmesg_restrict
sysctls are set to (1) when this is enabled, since hiding kernel
pointers is necessary to preserve the secrecy of the randomized base
address.
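
For reference, the kptr_restrict levels described in the
Documentation/sysctl/kernel.txt hunk below can be summarized in a small
C sketch. This is only an illustration of the documented behavior, not
kernel code; the helper name kptr_hidden() and its has_cap_syslog
parameter are hypothetical.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not a kernel function): returns true when a
 * pointer printed with %pK would be replaced with 0's, following the
 * three documented kptr_restrict levels.
 */
static bool kptr_hidden(int kptr_restrict, bool has_cap_syslog)
{
	switch (kptr_restrict) {
	case 0:
		return false;		/* (0): no restrictions */
	case 1:
		return !has_cap_syslog;	/* (1): hidden unless CAP_SYSLOG */
	default:
		return true;		/* (2): hidden regardless of privileges */
	}
}
```

With CONFIG_RANDOMIZE_BASE enabled the default level becomes (1), so
an unprivileged reader would see zeroed pointers.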

This feature also uses a fixed mapping to move the IDT (if not already
done as a fix for the F00F bug), to avoid exposing the location of
kernel internals relative to the original IDT.  This has the additional
security benefit of marking the new virtual address of the IDT
read-only.

Entropy is generated using the RDRAND instruction if it is supported. If
not, RDTSC is used, if supported. If neither RDRAND nor RDTSC is
supported, no randomness is introduced. Support for the CPUID
instruction is required to check for the availability of these two
instructions.
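
The selection order above can be sketched in C as follows. This is a
minimal illustration rather than the patch's assembly: the function
name, the enum, and the idea of passing the CPUID results in as
parameters are assumptions made so the logic can be read (and tested)
on its own. The feature bits are the architectural ones: RDRAND is
CPUID leaf 1, ECX bit 30, and TSC is CPUID leaf 1, EDX bit 4.

```c
#include <stdint.h>

/* Hypothetical names, used only for this sketch. */
enum entropy_source { ENTROPY_NONE, ENTROPY_RDTSC, ENTROPY_RDRAND };

#define CPUID1_ECX_RDRAND	(UINT32_C(1) << 30)	/* 0x40000000 */
#define CPUID1_EDX_TSC		(UINT32_C(1) << 4)	/* 0x00000010 */

/*
 * has_cpuid: nonzero if the CPUID instruction exists (EFLAGS.ID toggles).
 * max_leaf:  EAX returned by CPUID leaf 0.
 * ecx, edx:  feature words returned by CPUID leaf 1.
 */
static enum entropy_source pick_entropy_source(int has_cpuid,
					       uint32_t max_leaf,
					       uint32_t ecx, uint32_t edx)
{
	if (!has_cpuid || max_leaf < 1)
		return ENTROPY_NONE;	/* cannot query features at all */
	if (ecx & CPUID1_ECX_RDRAND)
		return ENTROPY_RDRAND;	/* preferred source */
	if (edx & CPUID1_EDX_TSC)
		return ENTROPY_RDTSC;	/* fallback source */
	return ENTROPY_NONE;		/* no randomness is introduced */
}
```

The sketch leaves out the retry loop: in the patch, RDRAND is retried a
bounded number of times and the code falls through to RDTSC if it never
reports success.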

Thanks to everyone who contributed helpful suggestions and feedback so
far.

Comments/Questions:

* Since RDRAND is relatively new, only the most recent version of
binutils supports assembling it.  To avoid breaking builds for people
who use older toolchains but want this feature, I hardcoded the opcodes.
If anyone has a better approach, please let me know.

* I chose to mimic the F00F bugfix behavior for moving the IDT, since it
required very little code and has the additional benefit of making the
IDT read-only. Ingo Molnar's suggestion of allocating per-cpu IDTs
instead is still on the table, and I'd like to get feedback on this.

* In order to increase the entropy for the randomized base, I changed
the default value of CONFIG_PHYSICAL_ALIGN back to 2MB.  It had
previously been raised to 16MB as a hack so that relocatable kernels
wouldn't load below that minimum.  I address this by changing the
meaning of CONFIG_PHYSICAL_START: it now represents the minimum
address at which relocatable kernels can be loaded (rather than being
ignored by relocatable kernels).  So, if a relocatable kernel determines
it should be loaded at an address below CONFIG_PHYSICAL_START (which
defaults to 16MB), I just bump it up to CONFIG_PHYSICAL_START.

* I would appreciate guidance on the highest addresses at which we can
safely load the kernel, on both 32-bit and 64-bit. This version caps
the random offset at 64MB (0x4000000) for 32-bit, which worked well in
testing.

* CONFIG_RANDOMIZE_BASE automatically sets the default value of
kptr_restrict and dmesg_restrict to 1, since it's nonsensical to use
this without the other two.  I considered removing
CONFIG_SECURITY_DMESG_RESTRICT altogether (it currently sets the default
value for dmesg_restrict), but just in case distros want to keep the
CONFIG as a toggle switch but don't want to use CONFIG_RANDOMIZE_BASE, I
kept it around.  So, now CONFIG_RANDOMIZE_BASE sets the default value
for CONFIG_SECURITY_DMESG_RESTRICT.

* x86-64 is still "to-do". Because it calculates the kernel text address
twice, this may be a little trickier.

* Future work: finding a middle ground for kptr_restrict, instead of its
current "all-or-nothing" behavior, so that perf users can still use this
feature.

* Tested by repeatedly booting and observing kallsyms output on i386.
Passed the "looks random to me" test, and saw no bad behavior.  Also
tested that changing CONFIG_PHYSICAL_ALIGN to 2MB still boots and runs
fine on amd64.

* Is it worth bothering to look for alternate sources of entropy if
RDTSC isn't available?

* Could use testing of CPU hotplugging and suspend/resume.
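
To make the address arithmetic discussed in the points above easier to
follow, here is a minimal C sketch of it: mask the entropy to an offset
below 64MB, add it to the load address, round up to the alignment, then
clamp to the minimum address. It is an illustration only, not the
patch's assembly; the function and macro names are hypothetical, and
align is assumed to be a power of two.

```c
#include <stdint.h>

#define MAX_OFFSET_MASK	UINT32_C(0x3ffffff)	/* offsets below 64MB */

/* Hypothetical helper mirroring the steps described above. */
static uint32_t randomized_base(uint32_t load_addr, uint32_t entropy,
				uint32_t align, uint32_t min_addr)
{
	uint32_t addr = load_addr + (entropy & MAX_OFFSET_MASK);

	/* Round up to the next multiple of align (a power of two). */
	addr = (addr + align - 1) & ~(align - 1);

	/* Never decompress below the minimum safe address. */
	if (addr < min_addr)
		addr = min_addr;
	return addr;
}

/* Number of aligned offsets that fit in the 64MB window. */
static uint32_t slot_count(uint32_t align)
{
	return (MAX_OFFSET_MASK + 1) / align;
}
```

Under these assumptions, slot_count(0x200000) is 32 while
slot_count(0x1000000) is only 4, which is the motivation for lowering
the default alignment from 16MB to 2MB.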

Signed-off-by: Dan Rosenberg <drosenberg@...curity.com>
---
 Documentation/sysctl/kernel.txt    |   13 ++++---
 arch/x86/Kconfig                   |   32 ++++++++++++++++--
 arch/x86/boot/compressed/head_32.S |   63 ++++++++++++++++++++++++++++++++++++
 arch/x86/boot/compressed/head_64.S |   16 ++++++++-
 arch/x86/include/asm/fixmap.h      |    4 ++
 arch/x86/kernel/traps.c            |    7 ++++
 kernel/printk.c                    |    4 +-
 lib/vsprintf.c                     |    4 ++
 security/Kconfig                   |    2 +-
 9 files changed, 132 insertions(+), 13 deletions(-)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 36f0075..ed91ae3 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -267,11 +267,14 @@ kptr_restrict:
 This toggle indicates whether restrictions are placed on
 exposing kernel addresses via /proc and other interfaces.  When
 kptr_restrict is set to (0), there are no restrictions.  When
-kptr_restrict is set to (1), the default, kernel pointers
-printed using the %pK format specifier will be replaced with 0's
-unless the user has CAP_SYSLOG.  When kptr_restrict is set to
-(2), kernel pointers printed using %pK will be replaced with 0's
-regardless of privileges.
+kptr_restrict is set to (1), kernel pointers printed using the
+%pK format specifier will be replaced with 0's unless the user
+has CAP_SYSLOG.  When kptr_restrict is set to (2), kernel
+pointers printed using %pK will be replaced with 0's regardless
+of privileges.
+
+Enabling the CONFIG_RANDOMIZE_BASE kernel config sets the default
+kptr_restrict value to (1).  Otherwise, the default is (0).
 
 ==============================================================
 
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 880fcb6..999ea82 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1548,8 +1548,8 @@ config PHYSICAL_START
 	  If kernel is a not relocatable (CONFIG_RELOCATABLE=n) then
 	  bzImage will decompress itself to above physical address and
 	  run from there. Otherwise, bzImage will run from the address where
-	  it has been loaded by the boot loader and will ignore above physical
-	  address.
+	  it has been loaded by the boot loader, using the above physical
+	  address as a lower bound.
 
 	  In normal kdump cases one does not have to set/change this option
 	  as now bzImage can be compiled as a completely relocatable image
@@ -1595,7 +1595,31 @@ config RELOCATABLE
 
 	  Note: If CONFIG_RELOCATABLE=y, then the kernel runs from the address
 	  it has been loaded at and the compile time physical address
-	  (CONFIG_PHYSICAL_START) is ignored.
+	  (CONFIG_PHYSICAL_START) is solely used as a lower bound.
+
+config RANDOMIZE_BASE
+	bool "Randomize the address of the kernel image"
+	depends on X86_32 && RELOCATABLE
+	default n
+	---help---
+	  Randomizes the address at which the kernel image is decompressed, as
+	  a security feature that deters exploit attempts relying on knowledge
+	  of the location of kernel internals. The default values of the
+	  kptr_restrict and dmesg_restrict sysctls are set to (1) when this is
+	  enabled, since hiding kernel pointers is necessary to preserve the
+	  secrecy of the randomized base address.
+
+	  This feature also uses a fixed mapping to move the IDT (if not
+	  already done as a fix for the F00F bug), to avoid exposing the
+	  location of kernel internals relative to the original IDT. This has
+	  the additional security benefit of marking the new virtual address of
+	  the IDT read-only.
+
+	  Entropy is generated using the RDRAND instruction if it is supported.
+	  If not, then RDTSC is used, if supported. If neither RDRAND nor RDTSC
+	  is supported, then no randomness is introduced. Support for the
+	  CPUID instruction is required to check for the availability of these
+	  two instructions.
 
 # Relocation on x86-32 needs some additional build support
 config X86_NEED_RELOCS
@@ -1604,7 +1628,7 @@ config X86_NEED_RELOCS
 
 config PHYSICAL_ALIGN
 	hex "Alignment value to which kernel should be aligned" if X86_32
-	default "0x1000000"
+	default "0x200000"
 	range 0x2000 0x1000000
 	---help---
 	  This value puts the alignment restrictions on physical address
diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 67a655a..2680db0 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -69,12 +69,75 @@ ENTRY(startup_32)
  */
 
 #ifdef CONFIG_RELOCATABLE
+#ifdef CONFIG_RANDOMIZE_BASE
+
+	/* Standard check for cpuid */
+	pushfl
+	popl	%eax
+	movl	%eax, %ebx
+	xorl	$0x200000, %eax
+	pushl	%eax
+	popfl
+	pushfl
+	popl	%eax
+	cmpl	%eax, %ebx
+	jz	4f
+
+	/* Check for cpuid 1 */
+	movl	$0x0, %eax
+	cpuid
+	cmpl	$0x1, %eax
+	jb	4f
+
+	movl	$0x1, %eax
+	cpuid
+	xor	%eax, %eax
+
+	/* RDRAND is bit 30 */
+	testl	$0x40000000, %ecx
+	jnz	1f
+
+	/* RDTSC is bit 4 */
+	testl	$0x10, %edx
+	jnz	3f
+
+	/* Nothing is supported */
+	jmp	4f
+1:
+	/* RDRAND sets carry bit on success, otherwise we should try
+	 * again. */
+	movl	$0x10, %ecx
+2:
+	/* rdrand %eax */
+	.byte	0x0f, 0xc7, 0xf0
+	jc	4f
+	loop	2b
+
+	/* Fall through: if RDRAND is supported but fails, use RDTSC,
+	 * which is guaranteed to be supported. */
+3:
+	rdtsc
+	shll	$0xc, %eax
+4:
+	/* Maximum offset at 64mb to be safe */
+	andl	$0x3ffffff, %eax
+	movl	%ebp, %ebx
+	addl	%eax, %ebx
+#else
 	movl	%ebp, %ebx
+#endif
 	movl	BP_kernel_alignment(%esi), %eax
 	decl	%eax
 	addl    %eax, %ebx
 	notl	%eax
 	andl    %eax, %ebx
+
+	/* LOAD_PHYSICAL_ADDR is the minimum safe address we can
+	 * decompress at. */
+	cmpl	$LOAD_PHYSICAL_ADDR, %ebx
+	jae	1f
+	movl	$LOAD_PHYSICAL_ADDR, %ebx
+1:
 #else
 	movl	$LOAD_PHYSICAL_ADDR, %ebx
 #endif
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 35af09d..6a05219 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -90,6 +90,13 @@ ENTRY(startup_32)
 	addl	%eax, %ebx
 	notl	%eax
 	andl	%eax, %ebx
+
+	/* LOAD_PHYSICAL_ADDR is the minimum safe address we can
+	 * decompress at. */
+	cmpl	$LOAD_PHYSICAL_ADDR, %ebx
+	jae	1f
+	movl	$LOAD_PHYSICAL_ADDR, %ebx
+1:
 #else
 	movl	$LOAD_PHYSICAL_ADDR, %ebx
 #endif
@@ -191,7 +198,7 @@ no_longmode:
 	 * it may change in the future.
 	 */
 	.code64
-	.org 0x200
+	.org 0x300
 ENTRY(startup_64)
 	/*
 	 * We come here either from startup_32 or directly from a
@@ -232,6 +239,13 @@ ENTRY(startup_64)
 	addq	%rax, %rbp
 	notq	%rax
 	andq	%rax, %rbp
+
+	/* LOAD_PHYSICAL_ADDR is the minimum safe address we can
+	 * decompress at. */
+	cmpq	$LOAD_PHYSICAL_ADDR, %rbp
+	jae	1f
+	movq	$LOAD_PHYSICAL_ADDR, %rbp
+1:
 #else
 	movq	$LOAD_PHYSICAL_ADDR, %rbp
 #endif
diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index 4729b2b..d1fabba 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -100,6 +100,10 @@ enum fixed_addresses {
 #endif
 #ifdef CONFIG_X86_F00F_BUG
 	FIX_F00F_IDT,	/* Virtual mapping for IDT */
+#else
+#ifdef CONFIG_RANDOMIZE_BASE
+	FIX_RANDOM_IDT, /* Virtual mapping for IDT */
+#endif
 #endif
 #ifdef CONFIG_X86_CYCLONE_TIMER
 	FIX_CYCLONE_TIMER, /*cyclone timer register*/
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index b9b6716..5672ad0 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -872,6 +872,13 @@ void __init trap_init(void)
 	set_bit(SYSCALL_VECTOR, used_vectors);
 #endif
 
+#if defined(CONFIG_RANDOMIZE_BASE) && !defined(CONFIG_X86_F00F_BUG)
+	__set_fixmap(FIX_RANDOM_IDT, __pa(&idt_table), PAGE_KERNEL_RO);
+
+	/* Update the IDT descriptor. It will be reloaded in cpu_init() */
+	idt_descr.address = fix_to_virt(FIX_RANDOM_IDT);
+#endif
+
 	/*
 	 * Should be a barrier for any external CPU state:
 	 */
diff --git a/kernel/printk.c b/kernel/printk.c
index da8ca81..283434f 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -262,9 +262,9 @@ static inline void boot_delay_msec(void)
 #endif
 
 #ifdef CONFIG_SECURITY_DMESG_RESTRICT
-int dmesg_restrict = 1;
+int dmesg_restrict __read_mostly = 1;
 #else
-int dmesg_restrict;
+int dmesg_restrict __read_mostly;
 #endif
 
 static int syslog_action_restricted(int type)
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 1d659d7..0d8da65 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -797,7 +797,11 @@ char *uuid_string(char *buf, char *end, const u8 *addr,
 	return string(buf, end, uuid, spec);
 }
 
+#ifdef CONFIG_RANDOMIZE_BASE
+int kptr_restrict __read_mostly = 1;
+#else
 int kptr_restrict __read_mostly;
+#endif
 
 /*
  * Show a '%p' thing.  A kernel extension is that the '%p' is followed
diff --git a/security/Kconfig b/security/Kconfig
index 95accd4..ffabef0 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -72,7 +72,7 @@ config KEYS_DEBUG_PROC_KEYS
 
 config SECURITY_DMESG_RESTRICT
 	bool "Restrict unprivileged access to the kernel syslog"
-	default n
+	default RANDOMIZE_BASE
 	help
 	  This enforces restrictions on unprivileged users reading the kernel
 	  syslog via dmesg(8).
