Message-ID: <20171115140426.bgvcd3bmegqadm5q@node.shutemov.name>
Date:   Wed, 15 Nov 2017 17:04:26 +0300
From:   "Kirill A. Shutemov" <kirill@...temov.name>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Ingo Molnar <mingo@...hat.com>, x86@...nel.org,
        "H. Peter Anvin" <hpa@...or.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Andy Lutomirski <luto@...capital.net>,
        Nicholas Piggin <npiggin@...il.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCHv2 1/2] x86/mm: Do not allow non-MAP_FIXED mapping across
 DEFAULT_MAP_WINDOW border

On Wed, Nov 15, 2017 at 03:10:42PM +0300, Kirill A. Shutemov wrote:
> On Wed, Nov 15, 2017 at 12:39:40PM +0100, Thomas Gleixner wrote:
> > On Wed, 15 Nov 2017, Kirill A. Shutemov wrote:
> > > On Wed, Nov 15, 2017 at 12:00:46AM +0100, Thomas Gleixner wrote:
> > > > On Wed, 15 Nov 2017, Kirill A. Shutemov wrote:
> > > > > On Tue, Nov 14, 2017 at 09:54:52PM +0100, Thomas Gleixner wrote:
> > > > > > On Tue, 14 Nov 2017, Kirill A. Shutemov wrote:
> > > > > > 
> > > > > > > On Tue, Nov 14, 2017 at 05:01:50PM +0100, Thomas Gleixner wrote:
> > > > > > > > @@ -198,11 +199,14 @@ arch_get_unmapped_area_topdown(struct fi
> > > > > > > >  	/* requesting a specific address */
> > > > > > > >  	if (addr) {
> > > > > > > >  		addr = PAGE_ALIGN(addr);
> > > > > > > > +		if (!mmap_address_hint_valid(addr, len))
> > > > > > > > +			goto get_unmapped_area;
> > > > > > > > +
> > > > > > > 
> > > > > > > Here and in hugetlb_get_unmapped_area(), we should align the addr after
> > > > > > > the check, not before. Otherwise the alignment itself can bring us over
> > > > > > > the borderline as we align up.
> > > > > > 
> > > > > > Hmm, then I wonder whether the next check against vm_start_gap() which
> > > > > > checks against the aligned address is correct:
> > > > > > 
> > > > > >                 addr = PAGE_ALIGN(addr);
> > > > > >                 vma = find_vma(mm, addr);
> > > > > > 
> > > > > >                 if (end - len >= addr &&
> > > > > >                     (!vma || addr + len <= vm_start_gap(vma)))
> > > > > >                         return addr;
> > > > > 
> > > > > I think the check is correct. The check is against resulting addresses
> > > > > that end up in vm_start/vm_end. In our case we want to figure out what
> > > > > user asked for.
> > > > 
> > > > Well, but then checking just against the user supplied addr is only half of
> > > > the story.
> > > > 
> > > >     addr = boundary - PAGE_SIZE - PAGE_SIZE / 2;
> > > >     len = PAGE_SIZE - PAGE_SIZE / 2;
> > > > 
> > > > That fits, but then after alignment we end up with
> > > > 
> > > >     addr = boudary - PAGE_SIZE;
> > > > 
> > > > and due to len > PAGE_SIZE this will result in a mapping which crosses the
> > > > boundary, right? So checking against the PAGE_ALIGN(addr) should be the
> > > > right thing to do.
> > > 
> > > IIUC, this is only the case if 'len' is not aligned, right?
> > > 
> > > > From what I see we expect caller to align it (and mm/mmap.c does this, I
> > > haven't checked other callers).
> > > 
> > > And hugetlb would actively reject non-aligned len.
> > > 
> > > I *think* we should be fine with checking unaligned 'addr'.
> > 
> > I think we should keep it consistent for the normal and the huge case and
> > just check aligned and be done with it.
> 
> Aligned 'addr'? Or 'len'? Both?
> 
> We would have a problem with checking the aligned addr. I stepped on it in
> the hugetlb case:
> 
>   - User asks for mmap((1UL << 47) - PAGE_SIZE, 2 << 20, MAP_HUGETLB);
> 
>   - On 4-level paging machine this gives us invalid hint address as
>     'TASK_SIZE - len' is more than 'addr'. Goto get_unmapped_area.
> 
>   - On 5-level paging machine hint address gets rounded up to next 2MB
>     boundary that is exactly 1UL << 47 and we happily allocate from full
>     address space which may lead to trouble.

Below is updated patch with self-test.

Output on 5-level paging machine:

mmap(NULL): 0x7fbbad1f3000 - OK
mmap(LOW_ADDR): 0x40000000 - OK
mmap(HIGH_ADDR): 0x4000000000000 - OK
mmap(HIGH_ADDR) again: 0xffffbbad1fb000 - OK
mmap(HIGH_ADDR, MAP_FIXED): 0x4000000000000 - OK
mmap(-1): 0xffffbbad1f9000 - OK
mmap(-1) again: 0xffffbbad1f7000 - OK
mmap((1UL << 47), 2 * PAGE_SIZE): 0x7fbbad1f3000 - OK
mmap((1UL << 47), 2 * PAGE_SIZE / 2): 0x7fbbad1f1000 - OK
mmap((1UL << 47) - PAGE_SIZE, 2 * PAGE_SIZE, MAP_FIXED): 0x7ffffffff000 - OK
mmap(NULL, MAP_HUGETLB): 0x7fbbac400000 - OK
mmap(LOW_ADDR, MAP_HUGETLB): 0x40000000 - OK
mmap(HIGH_ADDR, MAP_HUGETLB): 0x4000000000000 - OK
mmap(HIGH_ADDR, MAP_HUGETLB) again: 0xffffbbace00000 - OK
mmap(HIGH_ADDR, MAP_FIXED | MAP_HUGETLB): 0x4000000000000 - OK
mmap(-1, MAP_HUGETLB): (nil) - OK
mmap(-1, MAP_HUGETLB) again: 0x7fbbac400000 - OK
mmap((1UL << 47), 4UL << 20, MAP_HUGETLB): 0x800000000000 - FAILED
mmap((1UL << 47) - (2UL << 20), 4UL << 20, MAP_FIXED | MAP_HUGETLB): 0x7fffffe00000 - OK

So, only hugetlb is problematic. For regular mappings, mmap() aligns the
hint to PAGE_SIZE in round_hint_to_min(). There the address is rounded
*down*, and it works fine.

Replacing 'addr = ALIGN(addr, huge_page_size(h))' in hugetlbpage.c with
'addr &= huge_page_mask(h)' fixes the issue.

From 8645d0052b5919ee682a04f705f1668c2b281425 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Date: Wed, 8 Nov 2017 12:55:32 +0300
Subject: [PATCH] x86/selftests: Add test for mapping placement for 5-level
 paging

With 5-level paging, we have a 56-bit virtual address space available for
userspace. But we don't want to expose userspace to addresses above
47 bits, unless it asked specifically for it.

We use the mmap(2) hint address as a way for the kernel to know if it's
okay to allocate virtual memory above 47 bits.

Let's add a self-test that covers a few corner cases of the interface.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
---
 tools/testing/selftests/x86/5lvl.c   | 179 +++++++++++++++++++++++++++++++++++
 tools/testing/selftests/x86/Makefile |   2 +-
 2 files changed, 180 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/x86/5lvl.c

diff --git a/tools/testing/selftests/x86/5lvl.c b/tools/testing/selftests/x86/5lvl.c
new file mode 100644
index 000000000000..6c396f0c869d
--- /dev/null
+++ b/tools/testing/selftests/x86/5lvl.c
@@ -0,0 +1,179 @@
+#include <stdio.h>
+#include <sys/mman.h>
+
+#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
+
+#define PAGE_SIZE	4096
+#define SIZE		(2 * PAGE_SIZE)
+#define LOW_ADDR	((void *) (1UL << 30))
+#define HIGH_ADDR	((void *) (1UL << 50))
+#define TASK_SIZE	((void *) (1UL << 47))
+
+struct testcase {
+	void *addr;
+	unsigned long size;
+	unsigned long flags;
+	const char *msg;
+	unsigned int low_addr_required:1;
+	unsigned int keep_mapped:1;
+};
+
+static struct testcase testcases[] = {
+	{
+		.addr = NULL,
+		.size = 2 * PAGE_SIZE,
+		.flags = MAP_PRIVATE | MAP_ANONYMOUS,
+		.msg = "mmap(NULL)",
+		.low_addr_required = 1,
+	},
+	{
+		.addr = LOW_ADDR,
+		.size = 2 * PAGE_SIZE,
+		.flags = MAP_PRIVATE | MAP_ANONYMOUS,
+		.msg = "mmap(LOW_ADDR)",
+		.low_addr_required = 1,
+	},
+	{
+		.addr = HIGH_ADDR,
+		.size = 2 * PAGE_SIZE,
+		.flags = MAP_PRIVATE | MAP_ANONYMOUS,
+		.msg = "mmap(HIGH_ADDR)",
+		.keep_mapped = 1,
+	},
+	{
+		.addr = HIGH_ADDR,
+		.size = 2 * PAGE_SIZE,
+		.flags = MAP_PRIVATE | MAP_ANONYMOUS,
+		.msg = "mmap(HIGH_ADDR) again",
+		.keep_mapped = 1,
+	},
+	{
+		.addr = HIGH_ADDR,
+		.size = 2 * PAGE_SIZE,
+		.flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED,
+		.msg = "mmap(HIGH_ADDR, MAP_FIXED)",
+	},
+	{
+		.addr = (void*) -1,
+		.size = 2 * PAGE_SIZE,
+		.flags = MAP_PRIVATE | MAP_ANONYMOUS,
+		.msg = "mmap(-1)",
+		.keep_mapped = 1,
+	},
+	{
+		.addr = (void*) -1,
+		.size = 2 * PAGE_SIZE,
+		.flags = MAP_PRIVATE | MAP_ANONYMOUS,
+		.msg = "mmap(-1) again",
+	},
+	{
+		.addr = (void *)((1UL << 47) - PAGE_SIZE),
+		.size = 2 * PAGE_SIZE,
+		.flags = MAP_PRIVATE | MAP_ANONYMOUS,
+		.msg = "mmap((1UL << 47), 2 * PAGE_SIZE)",
+		.low_addr_required = 1,
+		.keep_mapped = 1,
+	},
+	{
+		.addr = (void *)((1UL << 47) - PAGE_SIZE / 2),
+		.size = 2 * PAGE_SIZE,
+		.flags = MAP_PRIVATE | MAP_ANONYMOUS,
+		.msg = "mmap((1UL << 47), 2 * PAGE_SIZE / 2)",
+		.low_addr_required = 1,
+		.keep_mapped = 1,
+	},
+	{
+		.addr = (void *)((1UL << 47) - PAGE_SIZE),
+		.size = 2 * PAGE_SIZE,
+		.flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED,
+		.msg = "mmap((1UL << 47) - PAGE_SIZE, 2 * PAGE_SIZE, MAP_FIXED)",
+	},
+	{
+		.addr = NULL,
+		.size = 2UL << 20,
+		.flags = MAP_HUGETLB | MAP_PRIVATE | MAP_ANONYMOUS,
+		.msg = "mmap(NULL, MAP_HUGETLB)",
+		.low_addr_required = 1,
+	},
+	{
+		.addr = LOW_ADDR,
+		.size = 2UL << 20,
+		.flags = MAP_HUGETLB | MAP_PRIVATE | MAP_ANONYMOUS,
+		.msg = "mmap(LOW_ADDR, MAP_HUGETLB)",
+		.low_addr_required = 1,
+	},
+	{
+		.addr = HIGH_ADDR,
+		.size = 2UL << 20,
+		.flags = MAP_HUGETLB | MAP_PRIVATE | MAP_ANONYMOUS,
+		.msg = "mmap(HIGH_ADDR, MAP_HUGETLB)",
+		.keep_mapped = 1,
+	},
+	{
+		.addr = HIGH_ADDR,
+		.size = 2UL << 20,
+		.flags = MAP_HUGETLB | MAP_PRIVATE | MAP_ANONYMOUS,
+		.msg = "mmap(HIGH_ADDR, MAP_HUGETLB) again",
+		.keep_mapped = 1,
+	},
+	{
+		.addr = HIGH_ADDR,
+		.size = 2UL << 20,
+		.flags = MAP_HUGETLB | MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED,
+		.msg = "mmap(HIGH_ADDR, MAP_FIXED | MAP_HUGETLB)",
+	},
+	{
+		.addr = (void*) -1,
+		.size = 2UL << 20,
+		.flags = MAP_HUGETLB | MAP_PRIVATE | MAP_ANONYMOUS,
+		.msg = "mmap(-1, MAP_HUGETLB)",
+		.keep_mapped = 1,
+	},
+	{
+		.addr = (void*) -1,
+		.size = 2UL << 20,
+		.flags = MAP_HUGETLB | MAP_PRIVATE | MAP_ANONYMOUS,
+		.msg = "mmap(-1, MAP_HUGETLB) again",
+	},
+	{
+		.addr = (void *)((1UL << 47) - PAGE_SIZE),
+		.size = 4UL << 20,
+		.flags = MAP_HUGETLB | MAP_PRIVATE | MAP_ANONYMOUS,
+		.msg = "mmap((1UL << 47), 4UL << 20, MAP_HUGETLB)",
+		.low_addr_required = 1,
+		.keep_mapped = 1,
+	},
+	{
+		.addr = (void *)((1UL << 47) - (2UL << 20)),
+		.size = 4UL << 20,
+		.flags = MAP_HUGETLB | MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED,
+		.msg = "mmap((1UL << 47) - (2UL << 20), 4UL << 20, MAP_FIXED | MAP_HUGETLB)",
+	},
+};
+
+int main(int argc, char **argv)
+{
+	int i;
+	void *p;
+
+	for (i = 0; i < ARRAY_SIZE(testcases); i++) {
+		struct testcase *t = testcases + i;
+
+		p = mmap(t->addr, t->size, PROT_NONE, t->flags, -1, 0);
+
+		printf("%s: %p - ", t->msg, p);
+
+		if (p == MAP_FAILED) {
+			printf("FAILED\n");
+			continue;
+		}
+
+		if (t->low_addr_required && p >= (void *)(1UL << 47))
+			printf("FAILED\n");
+		else
+			printf("OK\n");
+		if (!t->keep_mapped)
+			munmap(p, t->size);
+	}
+	return 0;
+}
diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index 7b1adeee4b0f..939a337128db 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -11,7 +11,7 @@ TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt ptrace_sysc
 TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault test_syscall_vdso unwind_vdso \
 			test_FCMOV test_FCOMI test_FISTTP \
 			vdso_restorer
-TARGETS_C_64BIT_ONLY := fsgsbase sysret_rip
+TARGETS_C_64BIT_ONLY := fsgsbase sysret_rip 5lvl
 
 TARGETS_C_32BIT_ALL := $(TARGETS_C_BOTHBITS) $(TARGETS_C_32BIT_ONLY)
 TARGETS_C_64BIT_ALL := $(TARGETS_C_BOTHBITS) $(TARGETS_C_64BIT_ONLY)
-- 
 Kirill A. Shutemov
