linux-kernel - Re: Linux 2.6.29

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <49C90B91.9050002@krogh.cc>
Date:	Tue, 24 Mar 2009 17:34:25 +0100
From:	Jesper Krogh <jesper@...gh.cc>
To:	Theodore Tso <tytso@....edu>, Ingo Molnar <mingo@...e.hu>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	Arjan van de Ven <arjan@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Nick Piggin <npiggin@...e.de>,
	Jens Axboe <jens.axboe@...cle.com>,
	David Rees <drees76@...il.com>, Jesper Krogh <jesper@...gh.cc>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 2.6.29

Theodore Tso wrote:
> On Tue, Mar 24, 2009 at 02:30:11PM +0100, Ingo Molnar wrote:
>> i think the problem became visible via the rise in memory size, 
>> combined with the non-improvement of the performance of rotational 
>> disks.
>>
>> The disk speed versus RAM size ratio has become dramatically worse - 
>> and our "5% of RAM" dirty ratio on a 32 GB box is 1.6 GB - which 
>> takes an eternity to write out if you happen to sync on that. When 
>> we had 1 GB of RAM 5% meant 51 MB - one or two seconds to flush out 
>> - and worse than that, chances are that it's spread out widely on 
>> the disk, the whole thing becoming seek-limited as well.
> 
> That's definitely a problem too, but keep in mind that by default the
> journal gets committed every 5 seconds, so the data gets flushed out
> that often.  So the question is how quickly can you *dirty* 1.6GB of
> memory?

Say it's a file that you allready have in memory cache read in.. there
is plenty of space in 16GB for that.. then you can dirty it at 
memory-speed.. that about ½sec. (correct me if I'm wrong).

Ok, this is probably unrealistic, but memory grows the largest we have
at the moment is 32GB and its steadily growing with the core-counts.
Then the available memory is used to cache the "active" portion of the
filsystems. I would even say that in the NFS-servers I depend on it to
do this efficiently. (2.6.29-rc8 delivered 1050MB/s over af 10GbitE 
using nfsd - send speed to multiple clients).

The current workload is based of an active dataset of 600GB where
index'es are being generated and written back to the same disk. So
there is a fairly high read/write load on the machine (as you said was 
required). The majority (perhaps 550GB ) is only read once where the
rest of the time it is stuff in the last 50GB being rewritten.

> "dd if=/dev/zero of=/u1/dirty-me-harder" will certainly do it, but
> normally we're doing something useful, and so you're either copying
> data from local disk, at which point you're limited by the read speed
> of your local disk (I suppose it could be in cache, but how common of
> a case is that?), 

Increasingly the case as memory sizes grows.

 > *or*, you're copying from the network, and to copy
> in 1.6GB of data in 5 seconds, that means you're moving 320
> megabytes/second, which if we're copying in the data from the network,
> requires a 10 gigabit ethernet.

or just around being processed on the 16-32 cores on the system.

Jesper
-- 
Jesper
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/