Message-ID: <20090326181106.GC17159@duck.suse.cz>
Date:	Thu, 26 Mar 2009 19:11:06 +0100
From:	Jan Kara <jack@...e.cz>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Theodore Tso <tytso@....edu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	Arjan van de Ven <arjan@...radead.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Nick Piggin <npiggin@...e.de>,
	Jens Axboe <jens.axboe@...cle.com>,
	David Rees <drees76@...il.com>, Jesper Krogh <jesper@...gh.cc>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Roland McGrath <roland@...hat.com>
Subject: Re: ext3 IO latency measurements (was: Linux 2.6.29)

On Thu 26-03-09 10:06:30, Ingo Molnar wrote:
> 
> * Jan Kara <jack@...e.cz> wrote:
> 
> > > So tell me again how the VM can rely on the filesystem not 
> > > blocking at random points.
> >
> >   I can write a patch to make writepage() in the non-"mmapped 
> > creation" case non-blocking on journal. But I'll also have to find 
> > out whether it really helps something. But it's probably worth 
> > trying...
> 
> _all_ the problems i ever had with ext3 were 'collateral damage' 
> type of things: simple writes (sometimes even reads) getting 
> serialized on some large [but reasonable] dirtying activity 
> elsewhere - even if the system was still well within its 
> hard-dirty-limit threshold.
> 
> So it sure sounds like an area worth improving, and it's not that 
> hard to reproduce either. Take a system with enough RAM but only a 
> single disk, and do this in a kernel tree:
> 
>   sync
>   echo 3 > /proc/sys/vm/drop_caches
> 
>   while :; do
>     date
>     make mrproper      2>/dev/null >/dev/null
>     make defconfig     2>/dev/null >/dev/null
>     make -j32 bzImage  2>/dev/null >/dev/null
>   done &
  I've played with it a bit. I don't have a machine fast enough for a
compile to feed my SATA drive (and I also have just 2 GB of memory), but
copying a kernel tree there and back seemed to load it reasonably.
  I've tried kernels with and without the attached patch, which makes
writepage not start a transaction when it isn't needed.
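  For the record, the load I used was roughly of this shape - a sketch only,
with throwaway temp dirs standing in for the real kernel tree and a token
pass count; substitute a real tree and a larger bound for an actual run:

```shell
#!/bin/sh
# Hypothetical copy-based load generator: repeatedly copy a tree out,
# force it through writeback, delete it, and repeat.  Everything here
# uses throwaway temp data so the sketch is runnable anywhere.
SRC=$(mktemp -d)
dd if=/dev/zero of="$SRC/blob" bs=1M count=4 2>/dev/null  # stand-in data
WORK=$(mktemp -d)
i=0
while [ "$i" -lt 3 ]; do         # raise the bound for a sustained load
    cp -a "$SRC" "$WORK/copy"    # write the tree out...
    sync                         # ...push it through writeback...
    rm -rf "$WORK/copy"          # ...then throw it away and repeat
    echo "pass $i done"
    i=$((i + 1))
done
rm -rf "$WORK" "$SRC"
```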
 
> Plain old kernel build, no distcc and no icecream. Wait a few 
> minutes for the system to reach equilibrium. There's no tweaking 
> anywhere, kernel, distro and filesystem defaults used everywhere:
> 
>  aldebaran:/home/mingo/linux/linux> ./compile-test 
>  Thu Mar 26 10:33:03 CET 2009
>  Thu Mar 26 10:35:24 CET 2009
>  Thu Mar 26 10:36:48 CET 2009
>  Thu Mar 26 10:38:54 CET 2009
>  Thu Mar 26 10:41:22 CET 2009
>  Thu Mar 26 10:43:41 CET 2009
>  Thu Mar 26 10:46:02 CET 2009
>  Thu Mar 26 10:48:28 CET 2009
> 
> And try to use the system while this workload is going on. Use Vim 
> to edit files in this kernel tree. Use plain _cat_ - and i hit 
> delays all the time - and it's not the CPU scheduler but all IO 
> related.
  So I observed long delays when VIM was saving a file, but in all cases it
was hanging in fsync(), which was committing a large transaction (this
happened both with and without the patch) - not a big surprise. Working on
the machine seemed a bit better with the patch applied - with the patched
kernel, VIM at least didn't hang when just writing into the file.
  Reads are measurably better with the patch - the cat test you describe
below took ~0.5s per file without the patch and always less than 0.02s
with it. So it does seem to help. Can you check whether you see similar
improvements on your machine? Thanks.
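  In case you want to compare numbers, the per-file timing I did was
essentially this - a sketch, with a throwaway stand-in file, dd's own
status line instead of /usr/bin/time, and the cache drop skipped quietly
when not run as root:

```shell
#!/bin/sh
# Hypothetical cold-cache per-file read probe.  FILE defaults to a
# freshly created stand-in; point it at a real file for a real test.
FILE=${FILE:-$(mktemp)}
dd if=/dev/zero of="$FILE" bs=1k count=256 2>/dev/null   # stand-in data
sync
echo 3 > /proc/sys/vm/drop_caches 2>/dev/null || true    # needs root
# dd reports bytes and elapsed time in its final status line
dd if="$FILE" of=/dev/null bs=1M 2>&1 | tail -1
```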

> I have such an ext3-based system where i can do such tests and where
> i don't mind crashes and data corruption either, so if you send me
> experimental patches against latest -git i can try them immediately.
> The system has 16 CPUs, 12GB of RAM and a single disk.
> 
> Btw., i had this test going on that box while i wrote some simple 
> scripts in Vim - and it was a horrible experience. The worst wait 
> was well above one minute - Vim just hung there indefinitely. Not 
> even Ctrl-Z was possible. I captured one such wait, it was hanging 
> right here:
> 
>  aldebaran:~/linux/linux> cat /proc/3742/stack
>  [<ffffffff8034790a>] log_wait_commit+0xbd/0x110
>  [<ffffffff803430b2>] journal_stop+0x1df/0x20d
>  [<ffffffff8034421f>] journal_force_commit+0x28/0x2d
>  [<ffffffff80331c69>] ext3_force_commit+0x2b/0x2d
>  [<ffffffff80328b56>] ext3_write_inode+0x3e/0x44
>  [<ffffffff802ebb9d>] __sync_single_inode+0xc1/0x2ad
>  [<ffffffff802ebed6>] __writeback_single_inode+0x14d/0x15a
>  [<ffffffff802ebf0c>] sync_inode+0x29/0x34
>  [<ffffffff80327453>] ext3_sync_file+0xa7/0xb4
>  [<ffffffff802ef17d>] vfs_fsync+0x78/0xaf
>  [<ffffffff802ef1eb>] do_fsync+0x37/0x4d
>  [<ffffffff802ef228>] sys_fsync+0x10/0x14
>  [<ffffffff8020bd1b>] system_call_fastpath+0x16/0x1b
>  [<ffffffffffffffff>] 0xffffffffffffffff
> 
> It took about 120 seconds for it to recover.
> 
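  (A hypothetical way to watch the same stall without Vim: dd with
conv=fsync does an fsync() before exiting, so its elapsed time in the final
status line includes any journal commit wait. The path is throwaway.)

```shell
#!/bin/sh
# Hypothetical fsync-latency probe: conv=fsync makes dd call fsync()
# before reporting, so under a heavy writeback load the elapsed time
# shown includes the full journal commit wait.
OUT=$(mktemp)
dd if=/dev/zero of="$OUT" bs=1M count=8 conv=fsync 2>&1 | tail -1
rm -f "$OUT"
```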
> And it's not just sys_fsync(). The script i wrote tests file read 
> latencies. I have created 1000 files with the same size (all copies 
> of kernel/sched.c ;-), and tested their cache-cold plain-cat 
> performance via:
> 
>   for ((i=0;i<1000;i++)); do
>     printf "file #%4d, plain reading it took: " $i
>     /usr/bin/time -f "%e seconds."  cat $i >/dev/null
>   done
> 
> I.e. plain, supposedly high-prio reads. The result is very common 
> hiccups in read latencies:
> 
> file # 579 (253560 bytes), reading it took: 0.08 seconds.
> file # 580 (253560 bytes), reading it took: 0.05 seconds.
> file # 581 (253560 bytes), reading it took: 0.01 seconds.
> file # 582 (253560 bytes), reading it took: 0.01 seconds.
> file # 583 (253560 bytes), reading it took: 4.61 seconds.
> file # 584 (253560 bytes), reading it took: 1.29 seconds.
> file # 585 (253560 bytes), reading it took: 3.01 seconds.
> file # 586 (253560 bytes), reading it took: 7.74 seconds.
> file # 587 (253560 bytes), reading it took: 3.22 seconds.
> file # 588 (253560 bytes), reading it took: 0.05 seconds.
> file # 589 (253560 bytes), reading it took: 0.36 seconds.
> file # 590 (253560 bytes), reading it took: 7.39 seconds.
> file # 591 (253560 bytes), reading it took: 7.58 seconds.
> file # 592 (253560 bytes), reading it took: 7.90 seconds.
> file # 593 (253560 bytes), reading it took: 8.78 seconds.
> file # 594 (253560 bytes), reading it took: 8.01 seconds.
> file # 595 (253560 bytes), reading it took: 7.47 seconds.
> file # 596 (253560 bytes), reading it took: 11.52 seconds.
> file # 597 (253560 bytes), reading it took: 10.33 seconds.
> file # 598 (253560 bytes), reading it took: 8.56 seconds.
> file # 599 (253560 bytes), reading it took: 7.58 seconds.
  <snip>

								Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR

View attachment "0001-ext3-Avoid-starting-a-transaction-in-writepage-when.patch" of type "text/x-patch" (2269 bytes)
