linux-kernel - Re: [PATCH 3/3] [RFC] tmpfs: Add FALLOC_FL_MARK_VOLATILE/UNMARK


Message-ID: <CAO6Zf6A_vbuEjPtZEKoUXK83Y_TwE426k-gz41hDJXSvjuwUkw@mail.gmail.com>
Date:	Sun, 10 Jun 2012 08:35:20 +0200
From:	Dmitry Adamushko <dmitry.adamushko@...il.com>
To:	John Stultz <john.stultz@...aro.org>
Cc:	KOSAKI Motohiro <kosaki.motohiro@...il.com>,
	Dave Hansen <dave@...ux.vnet.ibm.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Android Kernel Team <kernel-team@...roid.com>,
	Robert Love <rlove@...gle.com>, Mel Gorman <mel@....ul.ie>,
	Hugh Dickins <hughd@...gle.com>,
	Rik van Riel <riel@...hat.com>,
	Dave Chinner <david@...morbit.com>, Neil Brown <neilb@...e.de>,
	Andrea Righi <andrea@...terlinux.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
	Taras Glek <tgek@...illa.com>, Mike Hommey <mh@...ndium.org>,
	Jan Kara <jack@...e.cz>
Subject: Re: [PATCH 3/3] [RFC] tmpfs: Add FALLOC_FL_MARK_VOLATILE/UNMARK_VOLATILE
 handlers

>
> So maybe the right approach is to give up the per-fs volatile range lru, and
> try a variant of what DaveC and DaveH have suggested: Letting the page based lru
> reclamation handle the selection on a physical page basis, but then zapping
> the entirety of the neighboring range if any one page is reclaimed.  In
> order to try to preserve the range based LRU behavior, activate all the
> pages in the range together when the range is marked volatile.  Since we
> assume ranges are un-touched when volatile, that should preserve LRU purging
> behavior on single node systems and on multi-node systems it will
> approximate fairly closely.
>
> My main concern with this approach is marking and unmarking volatile ranges
> needs to be fast, so I'm worried about the additional overhead of activating
> each of the containing pages on mark_volatile.

(for my education) just to be sure that I got it right: what you suggest is either

(1) to 'deactivate-page' for all the pages in the range upon
mark_volatile. Hence, the pages from the same volatile range are
placed in clusters within their original LRU lists [a] and so

(1.1) the standard per-page reclaim mechanism is more likely to
discard them together;
(1.2) they are also (LRU-style) ordered wrt other volatile ranges (clusters)

[a] it's LRU_INACTIVE_FILE for tmpfs, right? also, the pages can be
from different zones (otoh, at least on x86 HIGH_MEM is likely).

or

(2) somehow remove all the pages from the standard LRU lists (or do
something else) to make sure that the normal per-page reclaim
procedure can't see them. Then we introduce LRU_VOLATILE (where we
keep whole volatile ranges, not pages) and find the appropriate place
to process it in the reclaim code.
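
To make (1) concrete for myself, here is a toy userspace model (not
kernel code; every toy_* name is made up for illustration). It assumes
that mark_volatile requeues the range's pages together at the young end
of a single inactive list, that reclaim takes pages from the old end,
and that hitting any one volatile page zaps its whole range, as in the
quoted proposal. Zones and the real LRU machinery are ignored.

```c
#include <stdbool.h>

#define TOY_NPAGES   8
#define TOY_NO_RANGE (-1)

/* Toy inactive list, not the kernel's: order[0] is reclaimed first. */
struct toy_lru {
	int order[TOY_NPAGES];    /* page ids, oldest first */
	int count;                /* number of pages still on the list */
	int range_of[TOY_NPAGES]; /* volatile range id per page, or TOY_NO_RANGE */
};

static void toy_lru_init(struct toy_lru *lru)
{
	lru->count = TOY_NPAGES;
	for (int i = 0; i < TOY_NPAGES; i++) {
		lru->order[i] = i;
		lru->range_of[i] = TOY_NO_RANGE;
	}
}

/* Remove a page from the list; returns false if it was not on it. */
static bool toy_lru_del(struct toy_lru *lru, int page)
{
	for (int i = 0; i < lru->count; i++) {
		if (lru->order[i] != page)
			continue;
		for (int j = i; j + 1 < lru->count; j++)
			lru->order[j] = lru->order[j + 1];
		lru->count--;
		return true;
	}
	return false;
}

/*
 * Option (1): requeue pages [first, last] at the young end, so the
 * range forms one cluster and ranges marked earlier are reached earlier.
 */
static void toy_mark_volatile(struct toy_lru *lru, int first, int last,
			      int range_id)
{
	for (int page = first; page <= last; page++) {
		if (!toy_lru_del(lru, page))
			continue; /* already reclaimed */
		lru->order[lru->count++] = page;
		lru->range_of[page] = range_id;
	}
}

/*
 * Reclaim the oldest page. If it belongs to a volatile range, purge
 * the rest of that range too. Returns the number of pages freed.
 */
static int toy_reclaim_one(struct toy_lru *lru)
{
	if (lru->count == 0)
		return 0;

	int victim = lru->order[0];
	int range = lru->range_of[victim];
	int freed = 1;

	toy_lru_del(lru, victim);
	if (range == TOY_NO_RANGE)
		return freed;

	for (int page = 0; page < TOY_NPAGES; page++) {
		if (lru->range_of[page] != range)
			continue;
		if (page != victim && toy_lru_del(lru, page))
			freed++;
		lru->range_of[page] = TOY_NO_RANGE;
	}
	return freed;
}
```

In this model the cost that worries you is visible in
toy_mark_volatile: it touches every page of the range.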

Also, I had another idea (it looks quite hacky though). For (1) above,
we don't necessarily need to touch all the pages... what we can do is
as follows:
- take the first page of the range (or even create a (hacky-hacky) virtual one);
- we need to mark it somehow as belonging to the volatile-reclaim
(modifying page->mapping ?);
- we place it at the beginning of the corresponding LRU_INACTIVE_*
list (hm, more complex if different zones);
  the idea here is that the standard per-page reclaim code should see
this page before seeing any other page from its range
- once the per-page reclaim code encounters such a page (heh, should
be a low cost check though) - we call into volatile-reclaim...

now, this volatile-reclaim can even purge another volatile range,
because by placing "the page at the beginning of the corresponding
LRU_INACTIVE_* list" we broke LRU-like behavior for volatile ranges.
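
A rough userspace sketch of that hack, only to show the control flow
(all sent_* names are hypothetical; the sentinel_for field stands in
for whatever marker we would really use, e.g. something hung off
page->mapping). It assumes the volatile ranges are kept in marking
order on the side, so the sentinel only triggers the purge and does not
decide which range goes.

```c
#include <stdbool.h>

#define SENT_MAX_RANGES   4
#define SENT_NOT_SENTINEL (-1)

/* Hypothetical bookkeeping for one volatile range. */
struct sent_range {
	int first_page;
	int npages;
	bool purged;
};

/* Ranges in marking order: index 0 is the least recently marked. */
struct sent_state {
	struct sent_range ranges[SENT_MAX_RANGES];
	int nranges;
};

/* Toy page: only carries the "I am a sentinel for range N" marker. */
struct sent_page {
	int sentinel_for; /* range index, or SENT_NOT_SENTINEL */
};

/* Record a new volatile range; returns its index, or -1 if full. */
static int sent_mark_volatile(struct sent_state *st, int first_page,
			      int npages)
{
	if (st->nranges == SENT_MAX_RANGES)
		return -1;
	st->ranges[st->nranges].first_page = first_page;
	st->ranges[st->nranges].npages = npages;
	st->ranges[st->nranges].purged = false;
	return st->nranges++;
}

/* Purge the least recently marked range that is still resident. */
static int sent_volatile_reclaim(struct sent_state *st)
{
	for (int i = 0; i < st->nranges; i++) {
		if (!st->ranges[i].purged) {
			st->ranges[i].purged = true;
			return i;
		}
	}
	return -1;
}

/*
 * Per-page reclaim hook. An ordinary page frees one page; a sentinel
 * hands over to volatile-reclaim, which may pick a different range
 * than the one the sentinel belongs to. Returns pages freed.
 */
static int sent_reclaim_page(struct sent_state *st,
			     const struct sent_page *page)
{
	if (page->sentinel_for == SENT_NOT_SENTINEL)
		return 1;

	int victim = sent_volatile_reclaim(st);

	return victim < 0 ? 0 : st->ranges[victim].npages;
}
```

What the toy leaves open is what happens to the sentinel page itself
when another range gets purged in its place.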

>
> The other question I have with this approach is if we're on a system that
> doesn't have swap, it *seems* (not totally sure I understand it yet) the
> tmpfs file pages will be skipped over when we call shrink_lruvec.  So it
> seems we may need to add a new lru_list enum and nr[] entry (maybe
> LRU_VOLATILE?).   So then it may be that when we mark a range as volatile,
> instead of just activating it, we move it to the volatile lru, and then when
> we shrink from that list, we call back to the filesystem to trigger the
> entire range purging.
>

Kind of what I meant with (2) above?
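
Roughly, as a userspace toy (all vol_* names are invented here, and the
purge callback is only a stand-in for "call back to the filesystem";
this is not a proposal for the actual interface): whole ranges sit on
their own list in marking order, unmark takes a range off the list, and
shrinking purges whole ranges from the old end.

```c
#include <stddef.h>

struct vol_range;
typedef int (*vol_purge_fn)(struct vol_range *range);

/* One volatile range queued on the toy "LRU_VOLATILE" list. */
struct vol_range {
	struct vol_range *next; /* toward more recently marked ranges */
	vol_purge_fn purge;     /* stand-in for the filesystem callback */
	int npages;
	int purged;
};

struct vol_lru {
	struct vol_range *oldest;
	struct vol_range *newest;
};

/* mark_volatile: queue the whole range at the young end. */
static void vol_lru_add(struct vol_lru *lru, struct vol_range *r)
{
	r->next = NULL;
	if (lru->newest)
		lru->newest->next = r;
	else
		lru->oldest = r;
	lru->newest = r;
}

/* unmark_volatile: returns 1 if the range was still queued, else 0. */
static int vol_lru_del(struct vol_lru *lru, struct vol_range *r)
{
	struct vol_range **link = &lru->oldest;
	struct vol_range *prev = NULL;

	while (*link) {
		if (*link == r) {
			*link = r->next;
			if (lru->newest == r)
				lru->newest = prev;
			r->next = NULL;
			return 1;
		}
		prev = *link;
		link = &(*link)->next;
	}
	return 0;
}

/* Purge whole ranges, oldest first, until nr_to_free pages are freed. */
static int vol_lru_shrink(struct vol_lru *lru, int nr_to_free)
{
	int freed = 0;

	while (freed < nr_to_free && lru->oldest) {
		struct vol_range *r = lru->oldest;

		vol_lru_del(lru, r);
		freed += r->purge(r);
	}
	return freed;
}

/* Example callback: pretend the filesystem dropped every page. */
static int vol_toy_purge(struct vol_range *r)
{
	r->purged = 1;
	return r->npages;
}
```

Here mark and unmark touch one list entry per range rather than one per
page, which is the property that makes (2) attractive.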

[ I was in a bit of a hurry while writing this, so I apologize for
possible confusion... I can elaborate on it in more detail later on ]

Thanks,

-- Dmitry