linux-kernel - Re: XFS vs Elevators (was Re: [PATCH RFC] nilfs2: continuous snapshotting file system)

Message-Id: <200808211700.30380.Martin@lichtvoll.de>
Date:	Thu, 21 Aug 2008 17:00:29 +0200
From:	Martin Steigerwald <Martin@...htvoll.de>
To:	linux-xfs@....sgi.com
Cc:	Szabolcs Szakacsits <szaka@...s-3g.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	xfs@....sgi.com
Subject: Re: XFS vs Elevators (was Re: [PATCH RFC] nilfs2: continuous snapshotting file system)

On Thursday 21 August 2008, Martin Steigerwald wrote:
> On Thursday 21 August 2008, Dave Chinner wrote:
> > On Thu, Aug 21, 2008 at 04:04:18PM +1000, Dave Chinner wrote:
> > > On Thu, Aug 21, 2008 at 03:15:08PM +1000, Dave Chinner wrote:
> > > > On Thu, Aug 21, 2008 at 05:46:00AM +0300, Szabolcs Szakacsits wrote:
> > > > > On Thu, 21 Aug 2008, Dave Chinner wrote:
> > > > > Everything is default.
> > > > >
> > > > >   % rpm -qf =mkfs.xfs
> > > > >   xfsprogs-2.9.8-7.1
> > > > >
> > > > > which, according to ftp://oss.sgi.com/projects/xfs/cmd_tars, is
> > > > > the latest stable mkfs.xfs. Its output is
> > > > >
> > > > > meta-data=/dev/sda8              isize=256    agcount=4, agsize=1221440 blks
> > > > >          =                       sectsz=512   attr=2
> > > > > data     =                       bsize=4096   blocks=4885760, imaxpct=25
> > > > >          =                       sunit=0      swidth=0 blks
> > > > > naming   =version 2              bsize=4096
> > > > > log      =internal log           bsize=4096   blocks=2560, version=2
> > > > >          =                       sectsz=512   sunit=0 blks, lazy-count=0
> > > > > realtime =none                   extsz=4096   blocks=0, rtextents=0
> > > >
> > > > Ok, I thought it might be the tiny log, but it didn't improve
> > > > anything here when I increased the log size or the log buffer
> > > > size.
> > >
> > > One thing I just found out - my old *laptop* is 4-5x faster than
> > > the 10krpm scsi disk behind an old cciss raid controller.  I'm
> > > wondering if the long delays in dispatch are caused by an
> > > interaction with CTQ, but I can't change it on the cciss raid
> > > controllers. Are you using ctq/ncq on your machine?  If so, can you
> > > reduce the depth to something less than 4 and see what difference
> > > that makes?
> >
> > Just to point out - this is not a new problem - I can reproduce
> > it on 2.6.24 as well as 2.6.26. Likewise, my laptop shows XFS
> > being faster than ext3 on both 2.6.24 and 2.6.26. So the difference
> > is something related to the disk subsystem on the server....
>
> Interesting. I switched from cfq to deadline some time ago, due to
> abysmal XFS performance on parallel IO - aptitude upgrade while doing
> desktop stuff. Just my subjective perception, but I have seen it crawl,
> even stall for 5-10 seconds easily at times. I found deadline to be way
> faster initially, but then, on rare occasions, IO for desktop tasks
> basically stalled for even longer, say 15 seconds or more, on
> parallel IO. However, I can't remember having this problem with the
> latest kernel, 2.6.26.2.
>
> I am now testing with cfq again, on a ThinkPad T42 with an internal
> 160 GB harddisk and barriers enabled. But you say it only happens on
> certain servers, so I might have seen something different.
>
> Thus I had the rough feeling that something is wrong with at least CFQ
> and XFS together, but I couldn't prove it back then. I have no idea how
> to easily build a reproducible test case. Maybe having a script that
> unpacks kernel source archives while I try to use the desktop...

Okay, some numbers attached:

- On XFS: Barrier versus nobarrier makes quite a difference with 
compilebench, and also when rm -rf'ing the large directory tree it leaves 
behind. While I did not measure the first barrier-related compilebench 
directory deletion, I am pretty sure it took way longer. Throughput as 
shown by vmstat is also higher without barriers.
 
- On XFS: CFQ versus NOOP does not seem to make that much of a difference, 
at least not with barriers enabled (I didn't test without). With NOOP, 
responsiveness was even worse than with CFQ. Opening a context menu on a 
webpage link displayed in Konqueror could easily take a minute or more. I 
think it should never ever take that long for the OS to respond to user 
input.

- Ext3, NILFS, BTRFS with CFQ: Perform quite well, especially btrfs. The 
nilfs test isn't complete because, likely due to checkpoints, the 4G I 
dedicated to it were not enough for the compilebench test to finish.

So at least here, performance degradation with XFS seems more related to 
barriers than to the scheduler decision - at least when it comes to the 
two choices CFQ and NOOP. But no, I won't switch barriers off permanently 
on my laptop. ;) It would be nice if the performance impact of barriers 
could be reduced a bit, though.

At least I appear to see something different from the I/O scheduler issue 
discussed here.

Anyway, subjectively I am quite happy with XFS performance nonetheless. But 
then, since I can't switch from XFS to ext3 or btrfs in a second, I can't 
really compare subjective impressions. Maybe the desktop would respond 
faster with ext3 or btrfs? Who knows?

I think a script which does extensive automated testing would be nice:

- have some basic settings like

SCRATCH_DEV=/dev/sda8 (this should be a real partition in order to be able 
to test barriers which do not work over LVM / device mapper)

SCRATCH_MNT=/mnt/test

- have an array of pre-pre-test setups like

[ echo "cfq" >/sys/block/sda/queue/scheduler ]
[ echo "deadline" >/sys/block/sda/queue/scheduler ]
[ echo "anticipatory" >/sys/block/sda/queue/scheduler ]
[ echo "noop" >/sys/block/sda/queue/scheduler ]

- have an array of pre-test setups like

[ mkfs.xfs -f $SCRATCH_DEV
mount $SCRATCH_DEV $SCRATCH_MNT ]
[ mkfs.xfs -f $SCRATCH_DEV
mount -o nobarrier $SCRATCH_DEV $SCRATCH_MNT ]
[ mkfs.xfs -f $SCRATCH_DEV
mount -o logbsize=256k $SCRATCH_DEV $SCRATCH_MNT ]
[ mkfs.btrfs $SCRATCH_DEV
mount $SCRATCH_DEV $SCRATCH_MNT ]

- have an array of tests like

[ ./compilebench -D /mnt/zeit-btrfs -i 5 -r 10 ]
[ postmark whatever ]
[ iozone whatever ]

- and let it run every combination of those array elements unattended 
(over night;-)

- have all results collected, with the settings for each patch and basic 
machine info, in one easy to share text file

- then as an additional feature let it test responsiveness during each 
running test. Let it make sure there are some files that are not in the 
cache, and let it access one of those files once in a while and measure 
how long it takes the filesystem to respond
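
A minimal sketch of how the scheduler / setup / test loop could look, 
assuming plain POSIX sh. The variables DRY_RUN and DISK, the helper run() 
and the counter COMBOS are made up for illustration. compilebench is 
pointed at $SCRATCH_MNT here instead of the path in the list above, and 
the postmark and iozone invocations are left out because no arguments are 
given for them. It only prints the plan unless DRY_RUN=0 is set, since 
mkfs wipes the scratch partition. The responsiveness probe is not covered.

```sh
#!/bin/sh
# Sketch only: prints the planned commands unless DRY_RUN=0 is set.
# mkfs destroys all data on SCRATCH_DEV, so dry-run is the default.
DRY_RUN=${DRY_RUN:-1}
SCRATCH_DEV=${SCRATCH_DEV:-/dev/sda8}   # real partition, not LVM/dm
SCRATCH_MNT=${SCRATCH_MNT:-/mnt/test}
DISK=${DISK:-sda}                       # assumed: disk backing SCRATCH_DEV

# Pre-pre-test setups: the I/O schedulers listed in the mail.
SCHEDULERS="cfq deadline anticipatory noop"

# Pre-test setups, one per line: "<mkfs command>|<mount options>".
# An empty second field means default mount options.
SETUPS="mkfs.xfs -f|
mkfs.xfs -f|nobarrier
mkfs.xfs -f|logbsize=256k
mkfs.btrfs|"

# Tests, one command per line. Add further benchmarks as extra lines.
TESTS="./compilebench -D $SCRATCH_MNT -i 5 -r 10"

# Echo every command; execute it only when not in dry-run mode.
# stdin is redirected so commands cannot eat the loop input below.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@" </dev/null
}

uname -a    # basic machine info for the shared result file

COMBOS=0
for sched in $SCHEDULERS; do
    while IFS='|' read -r mkfs opts; do
        while read -r bench; do
            COMBOS=$((COMBOS + 1))
            echo "== run $COMBOS: sched=$sched mkfs='$mkfs' opts='${opts:-defaults}'"
            run sh -c "echo $sched > /sys/block/$DISK/queue/scheduler"
            # $mkfs and $bench are left unquoted on purpose so that
            # they split into command and arguments.
            run $mkfs "$SCRATCH_DEV"
            if [ -n "$opts" ]; then
                run mount -o "$opts" "$SCRATCH_DEV" "$SCRATCH_MNT"
            else
                run mount "$SCRATCH_DEV" "$SCRATCH_MNT"
            fi
            run $bench
            run umount "$SCRATCH_MNT"
        done <<EOF
$TESTS
EOF
    done <<EOF
$SETUPS
EOF
done
echo "planned $COMBOS combinations"
```

Saved under some name such as matrix.sh (hypothetical), a real run could 
be started as root with "DRY_RUN=0 sh matrix.sh > results.txt 2>&1" to 
get everything into one text file.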

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

View attachment "filesystem-benchmarks-compilebench-2008-08-21.txt" of type "text/plain" (20701 bytes)