lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 25 Dec 2009 10:51:25 +0900 (JST)
From:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	kosaki.motohiro@...fujitsu.com,
	Hugh Dickins <hugh.dickins@...cali.co.uk>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Nick Piggin <npiggin@...e.de>, Ingo Molnar <mingo@...e.hu>,
	LKML <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Darren Hart <dvhltc@...ibm.com>,
	Ulrich Drepper <drepper@...il.com>
Subject: Re: [PATCH] futex: Fix ZERO_PAGE cause infinite loop

> > diff --git a/kernel/futex.c b/kernel/futex.c
> > index 8e3c3ff..79b89cc 100644
> > --- a/kernel/futex.c
> > +++ b/kernel/futex.c
> > @@ -254,6 +254,8 @@ again:
> >  
> >  	page = compound_head(page);
> >  	lock_page(page);
> > +	if (is_zero_pfn(page_to_pfn(page)))
> > +		goto anon_key;
> >  	if (!page->mapping) {
> >  		unlock_page(page);
> >  		put_page(page);
> > @@ -268,6 +270,7 @@ again:
> >  	 * the object not the particular process.
> >  	 */
> >  	if (PageAnon(page)) {
> > + anon_key:
> >  		key->both.offset |= FUT_OFF_MMSHARED; /* ref taken on mm */
> >  		key->private.mm = mm;
> >  		key->private.address = address;
> 
> Doesn't it make more sense to simply fail the futex op?
> 
> What I mean is, all (?) futex ops assume userspace actually wrote
> something to the address in question to begin with, therefore it can
> never be the zero page.
> 
> So it being the zero page means someone goofed up, and we should bail,
> no?

Hm. probably we need to discuss more.

Firstly, if we assume current glibc implimentation, you are right,
we can assume userland always initialize the page explicitly before using
futex. then we never seen zero page in futex.

but, I don't think futex itself assume it now. at least man page
doesn't describe such limilation. so, if you prefer bail and man fix,
I'm acceptable maybe.

I don't know any library except glibc use futex directly, or not. 
but we don't have any reason to assume futex is used only glibc.
(plus, glibc might change its implementation in future release, perhaps)

following testcase is more practice than last mail's one.
if we add zero page bail logic, this testcase mihgt be fail.

----------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <syscall.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <linux/futex.h>
#include <pthread.h>

void * futex_wait(void *arg)
{
	void *addr = arg;
	int ret;

	fprintf(stderr, "futex wait\n");
	ret = syscall( SYS_futex, addr, FUTEX_WAIT, 0, NULL, NULL, NULL);
	if (ret != 0 && errno != EWOULDBLOCK) {
		perror("futex error.\n");
		exit(1);
	}
	fprintf(stderr, "futex_wait: ret = %d, errno = %d\n", ret, errno);

	return NULL;
}

int main(int argc, char **argv)
{
	long page_size;
	pthread_t thr;
	int ret;
	void *buf;

	page_size = sysconf(_SC_PAGESIZE);

	buf = mmap(NULL, page_size, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
	if (buf == (void *)-1) {
		perror("mmap error.\n");
		exit(1);
	}

	ret = pthread_create(&thr, NULL, futex_wait, buf);
	if (ret < 0) {
		fprintf(stderr, "pthread_create error\n");
		exit(1);
	}

	sleep(10);

	fprintf(stderr, "futex wake\n");
	*((int*)buf) = 1;
	ret = syscall( SYS_futex, buf, FUTEX_WAKE, 1, NULL, NULL, NULL);
	fprintf(stderr, "futex_wake: ret = %d, errno = %d\n", ret, errno);	

	fprintf(stderr, "join\n");
	pthread_join(thr, NULL);

	return 0;
}
--------------------------------------------------------------------------

man page fix has another difficulty. in past days, zero pages was perfectly
transparent from userland. then any man page don't describe "what's zero page".
then some administrator don't know about zero page at all.

So, if your main disliked point is goto statement, following patch is ok
to me.

Finally, I agree this is really corner case issue. I can agree man page fix
too. but I hope to know which point you dislike in my patch.

Thanks.




>From 09b0a7426c707e2fab5ac8e1ef8feec14415ee62 Mon Sep 17 00:00:00 2001
From: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Date: Thu, 24 Dec 2009 18:07:42 +0900
Subject: [PATCH] futex: Fix ZERO_PAGE cause infinite loop

commit a13ea5b7 (mm: reinstate ZERO_PAGE) made the unfortunate regression.
following test program never finish and waste 100% cpu time.

At the making commit 38d47c1b7 (rely on get_user_pages() for shared
futexes). There isn't zero page in linux kernel. then, futex developers
thought gup retry is safe. but we reinstated zero page later...

This patch fixes it.

futex-zero.c
---------------------------------------------------------------------
 #include <stdio.h>
 #include <stdlib.h>
 #include <sys/mman.h>
 #include <syscall.h>
 #include <sys/time.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <fcntl.h>
 #include <unistd.h>
 #include <errno.h>
 #include <linux/futex.h>
 #include <pthread.h>

int main(int argc, char **argv)
{
        long page_size;
        int ret;
        void *buf;

        page_size = sysconf(_SC_PAGESIZE);

        buf = mmap(NULL, page_size, PROT_READ|PROT_WRITE,
			MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
        if (buf == (void *)-1) {
                perror("mmap error.\n");
                exit(1);
        }

        fprintf(stderr, "futex wait\n");
        ret = syscall( SYS_futex, buf, FUTEX_WAIT, 1, NULL, NULL, NULL);
        if (ret != 0 && errno != EWOULDBLOCK) {
                perror("futex error.\n");
                exit(1);
        }
        fprintf(stderr, "futex_wait: ret = %d, errno = %d\n", ret, errno);

        return 0;
}
---------------------------------------------------------------------

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Cc: Hugh Dickins <hugh.dickins@...cali.co.uk>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc: Nick Piggin <npiggin@...e.de>
Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: Ingo Molnar <mingo@...e.hu>
---
 include/linux/mm.h |   16 ++++++++++++++++
 kernel/futex.c     |    6 ++++--
 mm/memory.c        |   14 --------------
 3 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2265f28..dd755ea 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -751,6 +751,22 @@ struct zap_details {
 	unsigned long truncate_count;		/* Compare vm_truncate_count */
 };
 
+#ifndef is_zero_pfn
+extern unsigned long zero_pfn;
+static inline int is_zero_pfn(unsigned long pfn)
+{
+	return pfn == zero_pfn;
+}
+#endif
+
+#ifndef my_zero_pfn
+extern unsigned long zero_pfn;
+static inline unsigned long my_zero_pfn(unsigned long addr)
+{
+	return zero_pfn;
+}
+#endif
+
 struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
 		pte_t pte);
 
diff --git a/kernel/futex.c b/kernel/futex.c
index 8e3c3ff..ad72989 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -222,6 +222,7 @@ get_futex_key(u32 __user *uaddr, int fshared, union futex_key *key, int rw)
 	struct mm_struct *mm = current->mm;
 	struct page *page;
 	int err;
+	int is_zero_page;
 
 	/*
 	 * The futex address must be "naturally" aligned.
@@ -253,8 +254,9 @@ again:
 		return err;
 
 	page = compound_head(page);
+	is_zero_page = is_zero_pfn(page_to_pfn(page));
 	lock_page(page);
-	if (!page->mapping) {
+	if (!is_zero_page && !page->mapping) {
 		unlock_page(page);
 		put_page(page);
 		goto again;
@@ -267,7 +269,7 @@ again:
 	 * it's a read-only handle, it's expected that futexes attach to
 	 * the object not the particular process.
 	 */
-	if (PageAnon(page)) {
+	if (is_zero_page || PageAnon(page)) {
 		key->both.offset |= FUT_OFF_MMSHARED; /* ref taken on mm */
 		key->private.mm = mm;
 		key->private.address = address;
diff --git a/mm/memory.c b/mm/memory.c
index 09e4b1b..3743fb5 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -457,20 +457,6 @@ static inline int is_cow_mapping(unsigned int flags)
 	return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
 }
 
-#ifndef is_zero_pfn
-static inline int is_zero_pfn(unsigned long pfn)
-{
-	return pfn == zero_pfn;
-}
-#endif
-
-#ifndef my_zero_pfn
-static inline unsigned long my_zero_pfn(unsigned long addr)
-{
-	return zero_pfn;
-}
-#endif
-
 /*
  * vm_normal_page -- This function gets the "struct page" associated with a pte.
  *
-- 
1.6.5.2



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ