Message-Id: <1235728154.24401.55.camel@laptop>
Date:	Fri, 27 Feb 2009 10:49:14 +0100
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Lin Ming <ming.m.lin@...el.com>
Cc:	npiggin@...e.de, linux-kernel <linux-kernel@...r.kernel.org>,
	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
Subject: Re: iozone regression with 2.6.29-rc6

On Fri, 2009-02-27 at 17:13 +0800, Lin Ming wrote:
> Bisecting points to the commit below:
> 
> commit 1cf6e7d83bf334cc5916137862c920a97aabc018
> Author: Nick Piggin <npiggin@...e.de>
> Date:   Wed Feb 18 14:48:18 2009 -0800
> 
>     mm: task dirty accounting fix
> 
>     YAMAMOTO-san noticed that task_dirty_inc doesn't seem to be called properly for
>     cases where set_page_dirty is not used to dirty a page (eg. mark_buffer_dirty).
> 
>     Additionally, there is some inconsistency about when task_dirty_inc is
>     called.  It is used for dirty balancing, however it even gets called for
>     __set_page_dirty_no_writeback.
> 
>     So rather than increment it in a set_page_dirty wrapper, move it down to
>     exactly where the dirty page accounting stats are incremented.
> 
>     Cc: YAMAMOTO Takashi <yamamoto@...inux.co.jp>
>     Signed-off-by: Nick Piggin <npiggin@...e.de>
>     Acked-by: Peter Zijlstra <a.p.zijlstra@...llo.nl>
>     Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
>     Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>
> 
> 
> The figures in parentheses below are the results with the above commit reverted.
> For example, -10% (+2%) means:
> iozone shows a ~10% regression with 2.6.29-rc6 compared with 2.6.29-rc5,
> and
> iozone shows a ~2% improvement with 2.6.29-rc6-revert-1cf6e7d compared with 2.6.29-rc5.
> 
> 
>                         4P dual-core HT   2P quad-core    2P quad-core HT
>                         tulsa             stockley        Nehalem
>                         -------------------------------------------------
> iozone-rewrite          -10% (+2%)        -8% (0%)        -10% (-7%)
> iozone-rand-write       -50% (0%)         -20% (+10%)
> iozone-read                               -13% (0%)
> iozone-write                              -28% (-1%)
> iozone-reread                                             -5% (-1%)
> iozone-mmap-read                                          -7% (+2%)
> iozone-mmap-reread                                        -7% (+2%)
> iozone-mmap-rand-read                                     -7% (+3%)
> iozone-mmap-rand-write                                    -5% (0%)

Ugh, that's unexpected..

So 'better' accounting leads to worse performance, which would indicate
we throttle more.

I take it your machine has gobs of memory.

Does something like the below help any?

---
Subject: mm: bdi: tweak task dirty penalty
From: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Date: Fri Feb 27 10:41:22 CET 2009

Penalizing heavy dirtiers with 1/8-th the total dirty limit might be rather
excessive on large memory machines. Use sqrt to scale it sub-linearly.

Update the comment while we're there.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@...llo.nl>
---
 mm/page-writeback.c |   12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

Index: linux-2.6/mm/page-writeback.c
===================================================================
--- linux-2.6.orig/mm/page-writeback.c
+++ linux-2.6/mm/page-writeback.c
@@ -293,17 +293,21 @@ static inline void task_dirties_fraction
 }
 
 /*
- * scale the dirty limit
+ * Task specific dirty limit:
  *
- * task specific dirty limit:
+ *   dirty -= 8 * sqrt(dirty) * p_{t}
  *
- *   dirty -= (dirty/8) * p_{t}
+ * Penalize tasks that dirty a lot of pages by lowering their dirty limit. This
+ * avoids infrequent dirtiers from getting stuck in this other guys dirty
+ * pages.
+ *
+ * Use a sub-linear function to scale the penalty, we only need a little room.
  */
 static void task_dirty_limit(struct task_struct *tsk, long *pdirty)
 {
 	long numerator, denominator;
 	long dirty = *pdirty;
-	u64 inv = dirty >> 3;
+	u64 inv = 8*int_sqrt(dirty);
 
 	task_dirties_fraction(tsk, &numerator, &denominator);
 	inv *= numerator;
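
To get a feel for how the two penalties differ on a large-memory machine,
here is a small user-space sketch (not kernel code). The helper names
isqrt_u64, penalty_linear and penalty_sqrt are made up for illustration,
isqrt_u64 only stands in for the kernel's int_sqrt(), and dividing by the
denominator is an assumption based on p_{t} being numerator/denominator;
the patch context above stops at the multiply.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Floor of the integer square root. User-space stand-in for the
 * kernel's int_sqrt(); the name isqrt_u64 is hypothetical.
 */
static uint64_t isqrt_u64(uint64_t x)
{
	uint64_t r = 0;
	uint64_t bit = (uint64_t)1 << 62;

	/* Start from the highest power of four that is <= x. */
	while (bit > x)
		bit >>= 2;

	while (bit != 0) {
		if (x >= r + bit) {
			x -= r + bit;
			r = (r >> 1) + bit;
		} else {
			r >>= 1;
		}
		bit >>= 2;
	}
	return r;
}

/* Old penalty: (dirty / 8) * p_t, with p_t = numerator / denominator. */
static long penalty_linear(long dirty, long numerator, long denominator)
{
	uint64_t inv;

	assert(dirty >= 0 && numerator >= 0 && denominator > 0);
	inv = (uint64_t)dirty >> 3;
	inv *= (uint64_t)numerator;
	/* Assumption: the scaled value is divided by the denominator. */
	return (long)(inv / (uint64_t)denominator);
}

/* Proposed penalty: 8 * sqrt(dirty) * p_t. */
static long penalty_sqrt(long dirty, long numerator, long denominator)
{
	uint64_t inv;

	assert(dirty >= 0 && numerator >= 0 && denominator > 0);
	inv = 8 * isqrt_u64((uint64_t)dirty);
	inv *= (uint64_t)numerator;
	/* Assumption: the scaled value is divided by the denominator. */
	return (long)(inv / (uint64_t)denominator);
}
```

With dirty = 1048576 pages and p_{t} = 1/2, penalty_linear() gives 65536
pages and penalty_sqrt() gives 4096. The two meet at dirty = 4096 pages;
below that the sqrt form is the larger of the two.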


