Date:	Mon, 8 Aug 2011 22:23:18 +0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Jan Kara <jack@...e.cz>, Christoph Hellwig <hch@....de>,
	Dave Chinner <david@...morbit.com>,
	Greg Thelen <gthelen@...gle.com>,
	Minchan Kim <minchan.kim@...il.com>,
	Vivek Goyal <vgoyal@...hat.com>,
	Andrea Righi <arighi@...eler.com>,
	linux-mm <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 4/5] writeback: per task dirty rate limit

On Mon, Aug 08, 2011 at 09:47:14PM +0800, Peter Zijlstra wrote:
> On Sat, 2011-08-06 at 16:44 +0800, Wu Fengguang wrote:
> > Add two fields to task_struct.
> > 
> > 1) account dirtied pages in the individual tasks, for accuracy
> > 2) per-task balance_dirty_pages() call intervals, for flexibility
> > 
> > The balance_dirty_pages() call interval (ie. nr_dirtied_pause) will
> > scale near-sqrt with the safety gap between the number of dirty pages
> > and the dirty threshold.
> > 
> > XXX: The main problem with per-task nr_dirtied is that if 10k tasks
> > start dirtying pages at exactly the same time, each task will be
> > assigned a large initial nr_dirtied_pause, so the dirty threshold will
> > be exceeded long before each task reaches its nr_dirtied_pause and
> > hence calls balance_dirty_pages().
> > 
> > Signed-off-by: Wu Fengguang <fengguang.wu@...el.com>
> > ---
> >  include/linux/sched.h |    7 ++
> >  mm/memory_hotplug.c   |    3 -
> >  mm/page-writeback.c   |  106 +++++++++-------------------------------
> >  3 files changed, 32 insertions(+), 84 deletions(-) 
> 
> No fork() hooks? This way tasks inherit their parent's dirty count on
> clone().
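
To illustrate the "near-sqrt" interval mentioned in the quoted
changelog: it can be computed with an integer log2 and a shift, no
sqrt() needed. A standalone sketch, not the exact code from that patch
(ilog2_u() stands in for the kernel's ilog2()):

	#include <stdio.h>

	/* floor(log2(x)) for x > 0; the kernel has ilog2() for this */
	static unsigned int ilog2_u(unsigned long x)
	{
		unsigned int r = 0;

		while (x >>= 1)
			r++;
		return r;
	}

	/*
	 * Near-sqrt poll interval: 2^(floor(log2(gap)) / 2) stays within
	 * a factor of 2 of sqrt(gap) while using only shifts.
	 */
	static unsigned long poll_interval(unsigned long dirty,
					   unsigned long thresh)
	{
		if (thresh > dirty)
			return 1UL << (ilog2_u(thresh - dirty) >> 1);
		return 1;
	}

	int main(void)
	{
		/* gap 2^20 -> poll every 1024 pages; gap 100 -> every 8 */
		printf("%lu %lu\n", poll_interval(0, 1UL << 20),
		       poll_interval(900, 1000));
		return 0;
	}

So a task checks in rarely while far below the threshold, and more and
more often as the safety gap closes.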

btw, I do have another patch queued for improving the "leaked dirties
on exit" case :)

Thanks,
Fengguang
---
Subject: writeback: charge leaked page dirties to active tasks
Date: Tue Apr 05 13:21:19 CST 2011

It's a years-old problem that a large number of short-lived dirtiers
(eg. gcc instances in a fast kernel build) may starve long-run dirtiers
(eg. dd) as well as push the number of dirty pages up to the global
hard limit: each short-lived task exits before its nr_dirtied reaches
the ratelimit, so the pages it dirtied never get throttled in
balance_dirty_pages().

The solution is to charge the pages dirtied by an exited gcc to the
other random gcc/dd instances. It's not perfect, but it should behave
well enough in practice.

CC: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Signed-off-by: Wu Fengguang <fengguang.wu@...el.com>
---
 include/linux/writeback.h |    2 ++
 kernel/exit.c             |    2 ++
 mm/page-writeback.c       |   15 +++++++++++++++
 3 files changed, 19 insertions(+)

--- linux-next.orig/include/linux/writeback.h	2011-08-08 21:45:58.000000000 +0800
+++ linux-next/include/linux/writeback.h	2011-08-08 21:45:58.000000000 +0800
@@ -7,6 +7,8 @@
 #include <linux/sched.h>
 #include <linux/fs.h>
 
+DECLARE_PER_CPU(int, dirty_leaks);
+
 /*
  * The 1/4 region under the global dirty thresh is for smooth dirty throttling:
  *
--- linux-next.orig/mm/page-writeback.c	2011-08-08 21:45:58.000000000 +0800
+++ linux-next/mm/page-writeback.c	2011-08-08 22:21:50.000000000 +0800
@@ -190,6 +190,7 @@ int dirty_ratio_handler(struct ctl_table
 	return ret;
 }
 
+DEFINE_PER_CPU(int, dirty_leaks) = 0;
 
 int dirty_bytes_handler(struct ctl_table *table, int write,
 		void __user *buffer, size_t *lenp,
@@ -1150,6 +1151,7 @@ void balance_dirty_pages_ratelimited_nr(
 {
 	struct backing_dev_info *bdi = mapping->backing_dev_info;
 	int ratelimit;
+	int *p;
 
 	if (!bdi_cap_account_dirty(bdi))
 		return;
@@ -1158,6 +1160,19 @@ void balance_dirty_pages_ratelimited_nr(
 	if (bdi->dirty_exceeded)
 		ratelimit = 8;
 
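+	/*
+	 * Pick up pages dirtied by tasks that have since exited, so that
+	 * short-lived dirtiers cannot escape the throttling entirely.
+	 */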
+	preempt_disable();
+	p = &__get_cpu_var(dirty_leaks);
+	if (*p > 0 && current->nr_dirtied < ratelimit) {
+		nr_pages_dirtied = min(*p, ratelimit - current->nr_dirtied);
+		*p -= nr_pages_dirtied;
+		current->nr_dirtied += nr_pages_dirtied;
+	}
+	preempt_enable();
+
 	if (unlikely(current->nr_dirtied >= ratelimit))
 		balance_dirty_pages(mapping, current->nr_dirtied);
 }
--- linux-next.orig/kernel/exit.c	2011-08-08 21:43:37.000000000 +0800
+++ linux-next/kernel/exit.c	2011-08-08 21:45:58.000000000 +0800
@@ -1039,6 +1039,8 @@ NORET_TYPE void do_exit(long code)
 	validate_creds_for_do_exit(tsk);
 
 	preempt_disable();
+	if (tsk->nr_dirtied)
+		__this_cpu_add(dirty_leaks, tsk->nr_dirtied);
 	exit_rcu();
 	/* causes final put_task_struct in finish_task_switch(). */
 	tsk->state = TASK_DEAD;
--
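
For completeness, a minimal single-threaded userspace model of the
bookkeeping above. Names are hypothetical: RATELIMIT stands in for the
computed per-task ratelimit, and a plain global replaces the per-CPU
dirty_leaks counter.

	#include <stdio.h>

	#define RATELIMIT 32		/* stand-in for the per-task ratelimit */

	static int dirty_leaks;		/* models the per-CPU dirty_leaks counter */

	struct task {
		const char *name;
		int nr_dirtied;		/* pages dirtied since the last throttle */
	};

	static int min_int(int a, int b)
	{
		return a < b ? a : b;
	}

	/* Models the do_exit() hook: donate the unthrottled count to the pool. */
	static void task_exit(struct task *t)
	{
		dirty_leaks += t->nr_dirtied;
		printf("%s exits, leaking %d dirtied pages\n", t->name,
		       t->nr_dirtied);
		t->nr_dirtied = 0;
	}

	/*
	 * Models balance_dirty_pages_ratelimited_nr(): account new dirties,
	 * absorb leaked ones up to the ratelimit, then throttle if needed.
	 */
	static void dirty_pages(struct task *t, int nr)
	{
		t->nr_dirtied += nr;

		if (dirty_leaks > 0 && t->nr_dirtied < RATELIMIT) {
			int absorbed = min_int(dirty_leaks,
					       RATELIMIT - t->nr_dirtied);

			dirty_leaks -= absorbed;
			t->nr_dirtied += absorbed;
			printf("%s absorbs %d leaked pages\n", t->name, absorbed);
		}

		if (t->nr_dirtied >= RATELIMIT) {
			printf("%s throttled at %d pages\n", t->name,
			       t->nr_dirtied);
			t->nr_dirtied = 0;	/* balance_dirty_pages() resets */
		}
	}

	int main(void)
	{
		struct task gcc = { "gcc", 0 };
		struct task dd = { "dd", 0 };

		dirty_pages(&gcc, 10);	/* below the ratelimit, no throttling */
		task_exit(&gcc);	/* its 10 pages land in the pool */
		dirty_pages(&dd, 25);	/* 25 dirtied + 7 absorbed = 32, throttled */
		return 0;
	}

Note the min() cap: an absorbing task is only topped up to its next
throttle point, so a large pool drains gradually across many active
dirtiers instead of stalling whichever task touches it first.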