Date:	Mon, 29 Jan 2007 15:22:27 -0800
From:	Andrew Morton <akpm@...l.org>
To:	"Matthew Kirk" <mkirk01@....com>
Cc:	<linux-kernel@...r.kernel.org>
Subject: Re: fsync occasionally very slow

On Mon, 29 Jan 2007 17:02:14 -0500
"Matthew Kirk" <mkirk01@....com> wrote:

> Regarding the long fsyncs, here's a trace...
> 
> I upgraded to a more recent kernel - 2.6.18.6 - and ran it on a workstation.
> This particular box has In this case the elevator is CFQ.
> 
> This sample came from a stall that lasted about 2.5 minutes(!) - the longest
> one I've seen yet.  The box is a bit more memory constrained than the
> original system but exhibits similar behavior.  It doesn't page.  Also,
> there is no raid card - simply striped PATA drives.

Using your little test app, the longest fsync() stall I can demonstrate on
2.6.20-rc4-mm1 on plain-old-sata-disk is 1.2 seconds.

What's the max stall you're able to see with the test app?

Perhaps the file is just super-fragmented.  If your production app does
something like: 

	for (a lot) {
		fd = open(name);
		write(fd, a little bit);
		close(fd);
	}

in multiple threads, or against a lot of different files then you might be
fragmenting the files a lot.  This is because ext3 discards its in-core
anti-fragmentation data structures on close().  So

- Please check your app, see whether or not it is frequently opening and
  closing the output files.

- Using `bmap' from
  http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz,
  determine whether the offending files are highly fragmented.

- Tell us how much dirty data is being written out by these fsyncs.

- Try mounting the filesystem with `-o data=writeback'.  Probably won't
  help much if it's also demonstrable on ext2.

- Can you reproduce the stalls on a plain-old-disk?  Get RAID out of the
  picture?

- You've seen the stalls with both CFQ and AS.  I guess you could try
  deadline and no-op, but it sounds like that won't help.
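
To make the first check above concrete, here is a rough sketch of the two
write patterns side by side: reopening the file for every small append (the
loop shown earlier) versus holding one descriptor open for the whole run.
The function names, the append-mode flags and the 0644 mode are assumptions
made for illustration; they are not taken from the production app.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Pattern 1: open, append a little, close; repeated `count` times.
 * Every close() gives ext3 a chance to drop its in-core
 * anti-fragmentation state for the file.
 * Returns the total number of bytes written, or -1 on error.
 */
static long append_reopen(const char *name, const void *buf,
			  size_t len, int count)
{
	long total = 0;

	assert(len > 0 && count > 0);
	for (int i = 0; i < count; i++) {
		int fd = open(name, O_WRONLY | O_CREAT | O_APPEND, 0644);
		if (fd < 0)
			return -1;
		ssize_t n = write(fd, buf, len);
		close(fd);
		if (n != (ssize_t)len)
			return -1;
		total += n;
	}
	return total;
}

/*
 * Pattern 2: the same writes, but with a single descriptor held open
 * for the whole loop, so close() happens once at the end.
 * Returns the total number of bytes written, or -1 on error.
 */
static long append_keep_open(const char *name, const void *buf,
			     size_t len, int count)
{
	long total = 0;
	int fd;

	assert(len > 0 && count > 0);
	fd = open(name, O_WRONLY | O_CREAT | O_APPEND, 0644);
	if (fd < 0)
		return -1;
	for (int i = 0; i < count; i++) {
		ssize_t n = write(fd, buf, len);
		if (n != (ssize_t)len) {
			close(fd);
			return -1;
		}
		total += n;
	}
	if (close(fd) != 0)
		return -1;
	return total;
}
```

If the app looks like the first function, switching to something like the
second and re-running `bmap' on the output files should show whether the
close() behaviour is what is fragmenting them.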

