linux-kernel - Re: [patch 0/5] refault distance-based file cache sizing

Date:	Wed, 16 May 2012 20:56:54 +0800
From:	"nai.xia" <nai.xia@...il.com>
To:	Johannes Weiner <hannes@...xchg.org>
CC:	linux-mm@...ck.org, Rik van Riel <riel@...hat.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Mel Gorman <mgorman@...e.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Minchan Kim <minchan.kim@...il.com>,
	Hugh Dickins <hughd@...gle.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [patch 0/5] refault distance-based file cache sizing

Hi,

On 2012/05/16 14:51, Johannes Weiner wrote:
> Hi Nai,
>
> On Wed, May 16, 2012 at 01:25:34PM +0800, nai.xia wrote:
>> Hi Johannes,
>>
>> Just out of curiosity (since I haven't studied the reclaim
>> algorithms in depth): I recall that around 2005 there was an
>> implementation (or several?) of the "Clock-pro" algorithm, which
>> also has the idea of "reuse distance", but it seems that algorithm
>> did not work well enough to get merged? Does this patch series finally
>> solve the problem(s) with "Clock-pro", or does it not have to worry
>> about similar problems at all?
>
> As far as I understood, clock-pro set out to solve more problems than
> my patch set and it failed to satisfy everybody.
>
> The main error case was that it could not partially cache data of a
> set that was bigger than memory.  Instead, looping over the file
> repeatedly always has to read every single page because the most
> recent page allocations would push out the pages needed in the nearest
> future.  I never promised to solve this problem in the first place.
> But giving more memory to the big looping load is not useful in our
> current situation, and at least my code protects smaller sets of
> active cache from these loops.  So it's not optimal, but it sucks only
> half as much :)

Yep, I see ;)
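
The looping case quoted above can be illustrated with a toy simulation.
This is only a sketch under stated assumptions: it models a strict LRU
cache in user space, not the kernel's active/inactive lists, and the
names lru_hits and looping_trace are hypothetical helpers invented for
this example.

```python
from collections import OrderedDict


def lru_hits(trace, capacity):
    """Count cache hits for a page-access trace under strict LRU.

    Toy model only: one LRU list, no active/inactive split.
    """
    cache = OrderedDict()  # oldest entry first, most recent last
    hits = 0
    for page in trace:
        if page in cache:
            hits += 1
            cache.move_to_end(page)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict least recently used
            cache[page] = True
    return hits


def looping_trace(file_pages, loops):
    """Read pages 0..file_pages-1 in order, repeated `loops` times."""
    return [p for _ in range(loops) for p in range(file_pages)]


# A file that fits in the cache hits on every pass after the first.
fits = lru_hits(looping_trace(8, 5), capacity=8)
# A file just one page larger than the cache never hits at all.
too_big = lru_hits(looping_trace(9, 5), capacity=8)
print(fits, too_big)
```

In this model, a file only one page larger than the cache gets no hits
at all, because each new page pushes out exactly the page that will be
needed next.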

>
> There may have been improvements from clock-pro, but it's hard to get
> code merged that does not behave as expected in theory with nobody
> understanding what's going on.
>
> My code is fairly simple, works for the tests I've done and the
> behaviour observed so far is understood (at least by me).

OK, I assume you are aware that the system you constructed with
this simple and understandable idea looks like a so-called "feedback
system"? In other words, I think that in theory the refault distance
of a page is not the same before and after your algorithm is applied,
and this changed refault-distance pattern is then fed back as input
into your algorithm. A feedback system may be hard (or may be simple)
to analyze, but it may also work well, as if by magic.
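
To make the feedback point concrete, here is a toy sketch, not the
code from the patch series. It assumes a simplified definition in which
a page's refault distance is the number of evictions that happened
between its own eviction and its next fault, and it uses a strict LRU
cache; refault_distances is a hypothetical helper name.

```python
from collections import OrderedDict


def refault_distances(trace, capacity):
    """Return the refault distance observed at each refault.

    Assumed definition (a simplification for this sketch): the number
    of evictions between a page's eviction and its next fault.
    """
    cache = OrderedDict()  # strict LRU, oldest entry first
    evictions = 0          # global eviction counter
    evicted_at = {}        # page -> counter value when it was evicted
    distances = []
    for page in trace:
        if page in cache:
            cache.move_to_end(page)
            continue
        if page in evicted_at:  # the page was cached before: a refault
            distances.append(evictions - evicted_at.pop(page))
        if len(cache) >= capacity:
            victim, _ = cache.popitem(last=False)
            evicted_at[victim] = evictions
            evictions += 1
        cache[page] = True
    return distances


# The same workload: a 6-page file read three times in a row.
trace = [p for _ in range(3) for p in range(6)]
small = refault_distances(trace, capacity=4)
large = refault_distances(trace, capacity=5)
print(small[0], large[0])
```

The workload is identical in both runs, yet the measured distances
differ with the capacity. If an algorithm resizes the cache based on
these distances, its own decisions change the values it will measure
next, which is the loop described above.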

Well, again, I confess I have not studied this area enough. I just hope
that my words can help you think about it more comprehensively. :)


Thanks,

Nai