lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 22 Mar 2011 20:09:29 +0900 (JST)
From:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
To:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Cc:	kosaki.motohiro@...fujitsu.com, linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	David Rientjes <rientjes@...gle.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Rik van Riel <riel@...hat.com>,
	Oleg Nesterov <oleg@...hat.com>, linux-mm <linux-mm@...ck.org>,
	Andrey Vagin <avagin@...nvz.org>,
	Hugh Dickins <hughd@...gle.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Subject: [PATCH 5/5] x86,mm: make pagefault killable

When oom killer occured, almost processes are getting stuck following
two points.

	1) __alloc_pages_nodemask
	2) __lock_page_or_retry

1) is not much problematic because TIF_MEMDIE lead to make allocation
failure and get out from page allocator. 2) is more problematic. When
OOM situation, Zones typically don't have page cache at all and Memory
starvation might lead to reduce IO performance largely. When fork bomb
occur, TIF_MEMDIE task don't die quickly mean fork bomb may create
new process quickly rather than oom-killer kill it. Then, the system
may become livelock.

This patch makes pagefault interruptible by SIGKILL.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
---
 arch/x86/mm/fault.c |    9 +++++++++
 include/linux/mm.h  |    1 +
 mm/filemap.c        |   22 +++++++++++++++++-----
 3 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 20e3f87..797c7d0 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1035,6 +1035,7 @@ do_page_fault(struct pt_regs *regs, unsigned long error_code)
 	if (user_mode_vm(regs)) {
 		local_irq_enable();
 		error_code |= PF_USER;
+		flags |= FAULT_FLAG_KILLABLE;
 	} else {
 		if (regs->flags & X86_EFLAGS_IF)
 			local_irq_enable();
@@ -1138,6 +1139,14 @@ good_area:
 	}
 
 	/*
+	 * Pagefault was interrupted by SIGKILL. We have no reason to
+	 * continue pagefault.
+	 */
+	if ((flags & FAULT_FLAG_KILLABLE) && (fault & VM_FAULT_RETRY) &&
+	    fatal_signal_pending(current))
+		return;
+
+	/*
 	 * Major/minor page fault accounting is only done on the
 	 * initial attempt. If we go through a retry, it is extremely
 	 * likely that the page will be found in page cache at that point.
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0716517..9e7c567 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -152,6 +152,7 @@ extern pgprot_t protection_map[16];
 #define FAULT_FLAG_MKWRITE	0x04	/* Fault was mkwrite of existing pte */
 #define FAULT_FLAG_ALLOW_RETRY	0x08	/* Retry fault if blocking */
 #define FAULT_FLAG_RETRY_NOWAIT	0x10	/* Don't drop mmap_sem and wait when retrying */
+#define FAULT_FLAG_KILLABLE	0x20	/* The fault task is in SIGKILL killable region */
 
 /*
  * This interface is used by x86 PAT code to identify a pfn mapping that is
diff --git a/mm/filemap.c b/mm/filemap.c
index f5f9ac2..affba94 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -719,15 +719,27 @@ void __lock_page_nosync(struct page *page)
 int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 			 unsigned int flags)
 {
-	if (!(flags & FAULT_FLAG_ALLOW_RETRY)) {
-		__lock_page(page);
-		return 1;
-	} else {
+	int ret;
+
+	if (flags & FAULT_FLAG_ALLOW_RETRY) {
 		if (!(flags & FAULT_FLAG_RETRY_NOWAIT)) {
 			up_read(&mm->mmap_sem);
-			wait_on_page_locked(page);
+			if (flags & FAULT_FLAG_KILLABLE)
+				wait_on_page_locked_killable(page);
+			else
+				wait_on_page_locked(page);
 		}
 		return 0;
+	} else {
+		if (flags & FAULT_FLAG_KILLABLE) {
+			ret = __lock_page_killable(page);
+			if (ret) {
+				up_read(&mm->mmap_sem);
+				return 0;
+			}
+		} else
+			__lock_page(page);
+		return 1;
 	}
 }
 
-- 
1.6.5.2



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ