Date:	Wed, 16 Sep 2015 11:16:21 -0400
From:	Chris Mason <clm@...com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
CC:	Dave Chinner <david@...morbit.com>, Josef Bacik <jbacik@...com>,
	LKML <linux-kernel@...r.kernel.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	Neil Brown <neilb@...e.de>, Jan Kara <jack@...e.cz>,
	Christoph Hellwig <hch@....de>
Subject: Re: [PATCH] fs-writeback: drop wb->list_lock during blk_finish_plug()

On Mon, Sep 14, 2015 at 01:06:25PM -0700, Linus Torvalds wrote:
> On Sun, Sep 13, 2015 at 4:12 PM, Dave Chinner <david@...morbit.com> wrote:
> >
> > Really need to run these numbers on slower disks where block layer
> > merging makes a difference to performance.
> 
> Yeah. We've seen plugging and io schedulers not make much difference
> for high-performance flash (although I think the people who argued
> that noop should generally be used for non-rotating media were wrong
> - the elevator ends up still being critical to merging, and while
> merging isn't a life-or-death situation, it tends to still help).


Yeah, my big concern was that holding the plug longer would result in
lower overall perf because we weren't keeping the flash busy.  So I
started with the flash boxes to make sure we at least weren't regressing
below v4.2 levels.

I'm still worried about that, but this probably isn't the right
benchmark to show it.  And if it's really a problem, it'll happen
everywhere we plug and not just here.
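
For reference, the elevator and merge behaviour Linus mentions above can
be checked from userspace.  A quick sketch (sdb is a placeholder device,
not one of the boxes used here):

    # Show which IO scheduler (elevator) is active; the entry in
    # brackets is the one currently in use.
    cat /sys/block/sdb/queue/scheduler

    # Switch to the noop scheduler for a comparison run.
    echo noop > /sys/block/sdb/queue/scheduler

    # Watch merging while a workload runs: the rrqm/s and wrqm/s columns
    # count read/write requests merged by the block layer.
    iostat -x 5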

> 
> For rotating rust with nasty seek times, the plugging is likely to
> make the biggest difference.

For rotating storage, I grabbed a big box and did the fs_mark run
against 8 spindles.  These are all behind a MegaRAID card as JBODs, so I
flipped the card's cache to write-through.
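
For anyone reproducing the setup: the write-through flip is normally
done with the RAID controller's CLI.  A rough sketch assuming LSI's
MegaCli tool (the exact binary name and option spelling vary by install,
and this may not match the commands actually used here):

    # Sketch only: switch every logical drive on every adapter from
    # write-back to write-through caching.
    MegaCli64 -LDSetProp WT -Lall -aALL

    # Confirm the resulting cache policy for each logical drive.
    MegaCli64 -LDInfo -Lall -aALL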

I changed the run around a bit, making enough files for fs_mark to run
for ~10 minutes, and I took out the sync.  I ran only xfs to cut down on
the iterations, and after the fs_mark run I did a short 30-second run
with blktrace in the background to capture the IO sizes.
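
A rough sketch of that kind of run - the mount points, file counts, and
device name below are placeholders, not the exact parameters used:

    # fs_mark: lots of zero-length files, no sync (-S 0), spread across
    # the 8 spindles (placeholder mount points).
    fs_mark -S 0 -s 0 -n 100000 -t 16 \
            -d /mnt/xfs0 -d /mnt/xfs1 -d /mnt/xfs2 -d /mnt/xfs3 \
            -d /mnt/xfs4 -d /mnt/xfs5 -d /mnt/xfs6 -d /mnt/xfs7

    # For the shorter follow-up run: capture 30 seconds of block IO
    # events from one of the disks for later size analysis.
    blktrace -d /dev/sdb -w 30 -o fsmark_trace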

v4.2:    178K files/sec
Chinner: 192K files/sec
Mason:   192K files/sec
Linus:   193K files/sec

I added support to iowatcher to graph IO size, and attached the graph.

Short version: Linus' patch still gives bigger IOs and perf similar to
Dave's original.  I should have done the blktrace runs for 60 seconds
instead of 30; I suspect that would even out the average sizes between
the three patches.
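
For reference, the graphing step with iowatcher looks roughly like this
(trace basename and output file are placeholders, and the IO-size graph
mentioned above was a local addition, so no option for it is shown):

    # Render the standard graphs from the blktrace capture above.
    iowatcher -t fsmark_trace -o fsmark.svg -T "fs_mark, 8 spindles"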

-chris

Download attachment "fs_mark.png" of type "application/octet-stream" (34363 bytes)
