linux-kernel - Re: -mm merge plans for 2.6.23

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Tue, 24 Jul 2007 21:46:51 -0700 (PDT)
From:	david@...g.hm
To:	Ray Lee <ray-lk@...rabbit.org>
cc:	Nick Piggin <nickpiggin@...oo.com.au>,
	Jesper Juhl <jesper.juhl@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	ck list <ck@....kolivas.org>, Ingo Molnar <mingo@...e.hu>,
	Paul Jackson <pj@....com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: -mm merge plans for 2.6.23

On Tue, 24 Jul 2007, Ray Lee wrote:

> On 7/23/07, Nick Piggin <nickpiggin@...oo.com.au> wrote:
>>  Ray Lee wrote:
>
>>  Looking at your past email, you have a 1GB desktop system and your
>>  overnight updatedb run is causing stuff to get swapped out such that
>>  swap prefetch makes it significantly better. This is really
>>  intriguing to me, and I would hope we can start by making this
>>  particular workload "not suck" without swap prefetch (and hopefully
>>  make it even better than it currently is with swap prefetch because
>>  we'll try not to evict useful file backed pages as well).
>
> updatedb is an annoying case, because one would hope that there would
> be a better way to deal with that highly specific workload. It's also
> pretty stat dominant, which puts it roughly in the same category as a
> git diff. (They differ in that updatedb does a lot of open()s and
> getdents on directories, git merely does a ton of lstat()s instead.)
>
> Anyway, my point is that I worry that tuning for an unusual and
> infrequent workload (which updatedb certainly is), is the wrong way to
> go.

updatedb pushing out program data may be able to be improved on with drop 
behind or similar.

however another scenerio that causes a similar problem is when a user is 
busy useing one of the big memory hogs and then switches to another (think 
switching between openoffice and firefox)

>>  After that we can look at other problems that swap prefetch helps
>>  with, or think of some ways to measure your "whole day" scenario.
>>
>>  So when/if you have time, I can cook up a list of things to monitor
>>  and possibly a patch to add some instrumentation over this updatedb
>>  run.
>
> That would be appreciated. Don't spend huge amounts of time on it,
> okay? Point me the right direction, and we'll see how far I can run
> with it.

you could make a synthetic test by writing a memory hog that allocates 3/4 
of your ram then pauses waiting for input and then randomly accesses the 
memory for a while (say randomly accessing 2x # of pages allocated) and 
then pausing again before repeating

run two of these, alternating which one is running at any one time. time 
how long it takes to do the random accesses.

the difference in this time should be a fair example of how much it would 
impact the user.

by the way, I've also seen comments on the Postgres performance mailing 
list about how slow linux is compared to other OS's in pulling data back 
in that's been pushed out to swap (not a factor on dedicated database 
machines, but a big factor on multi-purpose machines)

>>  Anyway, I realise swap prefetching has some situations where it will
>>  fundamentally outperform even the page replacement oracle. This is
>>  why I haven't asked for it to be dropped: it isn't a bad idea at all.
>
> <nod>
>
>>  However, if we can improve basic page reclaim where it is obviously
>>  lacking, that is always preferable. eg: being a highly speculative
>>  operation, swap prefetch is not great for power efficiency -- but we
>>  still want laptop users to have a good experience as well, right?
>
> Absolutely. Disk I/O is the enemy, and the best I/O is one you never
> had to do in the first place.

almost always true, however there is some amount of I/O that is free with 
todays drives (remember, they read the entire track into ram and then 
give you the sectors on the track that you asked for). and if you have a 
raid array this is even more true.

if you read one sector in from a raid5 array you have done all the same 
I/O that you would have to do to read in the entire stripe, but I don't 
believe that the current system will keep it all around if it exceeds the 
readahead limit.

so in many cases readahead may end up being significantly cheaper then you 
expect.

David Lang
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/