lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 29 Sep 2011 19:05:25 +0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Jan Kara <jack@...e.cz>, Christoph Hellwig <hch@....de>,
	Dave Chinner <david@...morbit.com>,
	Greg Thelen <gthelen@...gle.com>,
	Minchan Kim <minchan.kim@...il.com>,
	Vivek Goyal <vgoyal@...hat.com>,
	Andrea Righi <arighi@...eler.com>,
	linux-mm <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 10/18] writeback: dirty position control - bdi reserve
 area

On Thu, Sep 29, 2011 at 04:49:57PM +0800, Peter Zijlstra wrote:
> On Thu, 2011-09-29 at 11:32 +0800, Wu Fengguang wrote:
> > > Now I guess the only problem is when nr_bdi * MIN_WRITEBACK_PAGES ~
> > > limit, at which point things go pear shaped.
> > 
> > Yes. In that case the global @dirty will always be drove up to @limit.
> > Once @dirty dropped reasonably below, whichever bdi task wakeup first
> > will take the chance to fill the gap, which is not fair for bdi's of
> > different speed.
> > 
> > Let me retry the thresh=1M,10M test cases without MIN_WRITEBACK_PAGES.
> > Hopefully the removal of it won't impact performance a lot. 
> 
> 
> Right, so alternatively we could try an argument that this is
> sufficiently rare and shouldn't happen. People with lots of disks tend
> to also have lots of memory, etc.

Right.

> If we do find it happens we can always look at it again.

Sure.  Now I got the results for single disk thresh=1M,8M,100M cases
and find no big differences if removing MIN_WRITEBACK_PAGES:

    3.1.0-rc4-bgthresh3+      3.1.0-rc4-bgthresh4+
------------------------  ------------------------
                 3988742        +1.9%      4063217  thresh=100M/ext4-10dd-4k-8p-4096M-100M:10-X
                 4758884        +1.5%      4829320  thresh=100M/ext4-1dd-4k-8p-4096M-100M:10-X
                 4621240        +1.6%      4693525  thresh=100M/ext4-2dd-4k-8p-4096M-100M:10-X
                 3420717        +0.1%      3423712  thresh=100M/xfs-10dd-4k-8p-4096M-100M:10-X
                 4361830        +1.4%      4423554  thresh=100M/xfs-1dd-4k-8p-4096M-100M:10-X
                 3964043        +0.2%      3972057  thresh=100M/xfs-2dd-4k-8p-4096M-100M:10-X
                 2937926        +0.6%      2956870  thresh=1M/ext4-10dd-4k-8p-4096M-1M:10-X
                 4472552        -1.9%      4387457  thresh=1M/ext4-1dd-4k-8p-4096M-1M:10-X
                 4085707        -3.0%      3961155  thresh=1M/ext4-2dd-4k-8p-4096M-1M:10-X
                 2206897        +2.1%      2253839  thresh=1M/xfs-10dd-4k-8p-4096M-1M:10-X
                 4207336        -2.1%      4119821  thresh=1M/xfs-1dd-4k-8p-4096M-1M:10-X
                 3739888        -3.6%      3604315  thresh=1M/xfs-2dd-4k-8p-4096M-1M:10-X
                 3279302        -0.2%      3273310  thresh=8M/ext4-10dd-4k-8p-4096M-8M:10-X
                 4834878        +1.6%      4912372  thresh=8M/ext4-1dd-4k-8p-4096M-8M:10-X
                 4511120        -1.7%      4435193  thresh=8M/ext4-2dd-4k-8p-4096M-8M:10-X
                 2443874        -0.5%      2432188  thresh=8M/xfs-10dd-4k-8p-4096M-8M:10-X
                 4308416        -0.6%      4283110  thresh=8M/xfs-1dd-4k-8p-4096M-8M:10-X
                 3739810        +0.6%      3763320  thresh=8M/xfs-2dd-4k-8p-4096M-8M:10-X

Or lowering the largest promotion ratio from 128 to 8:

    3.1.0-rc4-bgthresh4+      3.1.0-rc4-bgthresh5+
------------------------  ------------------------
                 4063217        -0.0%      4062022  thresh=100M/ext4-10dd-4k-8p-4096M-100M:10-X
                 4829320        +1.1%      4882829  thresh=100M/ext4-1dd-4k-8p-4096M-100M:10-X
                 4693525        +0.1%      4700537  thresh=100M/ext4-2dd-4k-8p-4096M-100M:10-X
                 3423712        +0.2%      3431603  thresh=100M/xfs-10dd-4k-8p-4096M-100M:10-X
                 4423554        -0.3%      4408912  thresh=100M/xfs-1dd-4k-8p-4096M-100M:10-X
                 3972057        -0.1%      3968535  thresh=100M/xfs-2dd-4k-8p-4096M-100M:10-X
                 2956870        -0.9%      2929605  thresh=1M/ext4-10dd-4k-8p-4096M-1M:10-X
                 4387457        -0.2%      4378233  thresh=1M/ext4-1dd-4k-8p-4096M-1M:10-X
                 3961155        -0.5%      3940075  thresh=1M/ext4-2dd-4k-8p-4096M-1M:10-X
                 2253839        -0.9%      2232976  thresh=1M/xfs-10dd-4k-8p-4096M-1M:10-X
                 4119821        -2.1%      4031983  thresh=1M/xfs-1dd-4k-8p-4096M-1M:10-X
                 3604315        -3.1%      3493042  thresh=1M/xfs-2dd-4k-8p-4096M-1M:10-X
                 3273310        -1.1%      3237060  thresh=8M/ext4-10dd-4k-8p-4096M-8M:10-X
                 4912372        -0.0%      4911287  thresh=8M/ext4-1dd-4k-8p-4096M-8M:10-X
                 4435193        +0.1%      4441581  thresh=8M/ext4-2dd-4k-8p-4096M-8M:10-X
                 2432188        +1.1%      2459249  thresh=8M/xfs-10dd-4k-8p-4096M-8M:10-X
                 4283110        +0.1%      4289456  thresh=8M/xfs-1dd-4k-8p-4096M-8M:10-X
                 3763320        -0.1%      3758938  thresh=8M/xfs-2dd-4k-8p-4096M-8M:10-X

As for the thresh=100M JBOD cases, I don't see much occurrences of promotion
ratio > 2. So the simplification should make no difference, too.

Thus the finalized code will be:

+       x_intercept = bdi_thresh / 2;
+       if (bdi_dirty < x_intercept) {
+               if (bdi_dirty > x_intercept / 8) {
+                       pos_ratio *= x_intercept;
+                       do_div(pos_ratio, bdi_dirty);
+               } else
+                       pos_ratio *= 8;
+       }

Thanks,
Fengguang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ