Message-ID: <20090324133011.GB21720@elte.hu>
Date: Tue, 24 Mar 2009 14:30:11 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Theodore Tso <tytso@....edu>, Alan Cox <alan@...rguk.ukuu.org.uk>,
Arjan van de Ven <arjan@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Nick Piggin <npiggin@...e.de>,
Jens Axboe <jens.axboe@...cle.com>,
David Rees <drees76@...il.com>, Jesper Krogh <jesper@...gh.cc>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 2.6.29
* Theodore Tso <tytso@....edu> wrote:
> More recently (as in this past weekend), I went back to the ext3
> problem, and found a better solution, here:
>
> http://lkml.org/lkml/2009/3/21/304
> http://lkml.org/lkml/2009/3/21/302
> http://lkml.org/lkml/2009/3/21/303
>
> These patches cause the synchronous writes caused by an fsync() to
> be submitted using WRITE_SYNC, instead of WRITE, which definitely
> helps in the case where there is a heavy read workload in the
> background.
>
> They don't solve the problem where there is a *huge* amount of
> writes going on, though --- if something is dirtying pages at a
> rate far greater than the local disk can write it out, say, either
> "dd if=/dev/zero of=/mnt/make-lots-of-writes" or a massive distcc
> cluster driving a huge amount of data towards a single system or a
> wget over a local 100 megabit ethernet from a massive NFS server
> where everything is in cache, then you can have a major delay with
> the fsync().
Nice, thanks for the update! The situation isn't nearly as bleak as i
feared it was :)
> However, what I've found, though, is that if you're just doing a
> local copy from one hard drive to another, or downloading a huge
> iso file from an ftp server over a wide area network, the fsync()
> delays really don't get *that* bad, even with ext3. At least, I
> haven't found a workload that doesn't involve either dd
> if=/dev/zero or a massive amount of data coming in over the
> network that will cause fsync() delays in the > 1-2 second
> category. Ext3 has been around for a long time, and it's only
> been the last couple of years that people have really complained
> about this; my theory is that it was the rise of > 10 megabit
> ethernets and the use of systems like distcc that really made this
> problem really become visible. The only realistic workload I've
> found that triggers this requires a fast network dumping data to a
> local filesystem.
i think the problem became visible through the rise in memory size,
combined with the lack of improvement in the performance of
rotational disks.
The disk speed versus RAM size ratio has become dramatically worse -
and our "5% of RAM" dirty ratio on a 32 GB box is 1.6 GB - which
takes an eternity to write out if you happen to sync on that. When
we had 1 GB of RAM, 5% meant 51 MB - one or two seconds to flush
out. And worse than that, chances are that the dirty data is spread
out widely on the disk, so the whole thing becomes seek-limited as
well.
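To put rough numbers on the above, here is a minimal sketch in
Python. The 40 MB/s sustained write rate is an assumption, picked
only so that 51 MB lands in the "one or two seconds" range; it is
not a measurement. The function names are made up for illustration,
and seek-limited writeout would be considerably slower than this
sequential estimate.

```python
def dirty_limit_mb(ram_gb: float, dirty_ratio_pct: float = 5.0) -> float:
    """Amount of dirty data allowed, in MB, for a given RAM size.

    Uses 1 GB = 1024 MB and the "5% of RAM" ratio quoted in the mail.
    """
    return ram_gb * 1024 * dirty_ratio_pct / 100


def flush_seconds(dirty_mb: float, disk_mb_per_s: float) -> float:
    """Best-case (purely sequential) time to write out dirty_mb."""
    return dirty_mb / disk_mb_per_s


# ASSUMPTION: hypothetical sustained write rate of a rotational disk.
ASSUMED_DISK_MB_PER_S = 40.0

if __name__ == "__main__":
    for ram_gb in (1, 4, 8, 32):
        limit = dirty_limit_mb(ram_gb)
        secs = flush_seconds(limit, ASSUMED_DISK_MB_PER_S)
        print(f"{ram_gb:>2} GB RAM: {limit:7.1f} MB dirty, "
              f"~{secs:5.1f} s to flush sequentially")
```

With that assumed rate, the 1 GB box needs a bit over a second,
while the 32 GB box needs around 40 seconds even before any seeking
is taken into account.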
That's where the main difference in perception of this problem comes
from, i believe. The problem was always there, but only in the last
1-2 years did 4G/8G systems become common enough for people to
notice.
SSDs will save us eventually, but they will take up to a decade to
trickle through before we can forget about this problem altogether.
Ingo