Message-ID: <20121129033253.GA5554@lizard.sbx05977.paloaca.wayport.net>
Date: Wed, 28 Nov 2012 19:32:54 -0800
From: Anton Vorontsov <anton.vorontsov@...aro.org>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: David Rientjes <rientjes@...gle.com>,
Pekka Enberg <penberg@...nel.org>,
Mel Gorman <mgorman@...e.de>,
Glauber Costa <glommer@...allels.com>,
Michal Hocko <mhocko@...e.cz>,
"Kirill A. Shutemov" <kirill@...temov.name>,
Luiz Capitulino <lcapitulino@...hat.com>,
Greg Thelen <gthelen@...gle.com>,
Leonid Moiseichuk <leonid.moiseichuk@...ia.com>,
KOSAKI Motohiro <kosaki.motohiro@...il.com>,
Minchan Kim <minchan@...nel.org>,
Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>,
John Stultz <john.stultz@...aro.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, linaro-kernel@...ts.linaro.org,
patches@...aro.org, kernel-team@...roid.com,
Robert Love <rlove@...gle.com>,
Colin Cross <ccross@...roid.com>,
Arve Hjønnevåg <arve@...roid.com>
Subject: Re: [RFC] Add mempressure cgroup

On Wed, Nov 28, 2012 at 05:27:51PM -0800, Anton Vorontsov wrote:
> On Wed, Nov 28, 2012 at 03:14:32PM -0800, Andrew Morton wrote:
> [...]
> > Compare this with the shrink_slab() shrinkers. With these, the VM can
> > query and then control the clients. If something goes wrong or is out
> > of balance, it's the VM's problem to solve.
> >
> > So I'm thinking that a better design would be one which puts the kernel
> > VM in control of userspace scanning and freeing. Presumably with a
> > query-and-control interface similar to the slab shrinkers.
>
> Thanks for the ideas, Andrew.
>
> The query-and-control scheme looks very attractive, and it actually
> resembles my "balance" level idea, where userland tells the kernel how much
> reclaimable memory it has. Except that your scheme works in the reverse
> direction, i.e. the kernel is in charge.
>
> But there is one rather major issue: we're crossing the kernel-userspace
> boundary. And with this scheme we'll have to cross the boundary four times:
> query / reply-available / control / reply-shrunk / (and repeat if
> necessary, every SHRINK_BATCH pages). Plus, it has to be done somewhat
> synchronously (all four stages), and/or we have to run a "userspace
> shrinker" thread in parallel with the normal shrinker, and here,
> I'm afraid, we'll see more strange interactions. :)
>
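
To put rough numbers on the crossings, here is a minimal Python sketch. It
is only a back-of-the-envelope model: it assumes SHRINK_BATCH is 128 pages,
that every batch needs all four stages, and that a volatile range costs one
call to mark it and one to unmark it. The function names are made up for
illustration; none of this is a real kernel interface.

```python
import math

# Assumption: batch size in pages, used here only for illustration.
SHRINK_BATCH = 128

# The four stages of one query-and-control round, as listed above.
STAGES = ("query", "reply-available", "control", "reply-shrunk")


def query_control_crossings(pages_to_reclaim, batch=SHRINK_BATCH):
    """Boundary crossings if every batch of pages needs all four stages."""
    rounds = math.ceil(pages_to_reclaim / batch)
    return rounds * len(STAGES)


def volatile_range_crossings(ranges):
    """Assumed cost of volatile ranges: one call to mark a range and one
    to unmark it (which also reports whether it was recycled). Reclaim
    itself stays inside the kernel and adds no crossings."""
    return 2 * ranges


if __name__ == "__main__":
    pages = 4096
    print("query-and-control:", query_control_crossings(pages))
    print("volatile ranges:  ", volatile_range_crossings(1))
```

In this model the query-and-control cost grows with the number of pages
reclaimed, while the volatile-range cost grows only with the number of
ranges userland chose to mark.
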
> But there is good news: for this kind of fine-grained control we have a
> better interface, where we don't have to communicate [very often] w/ the
> kernel. These are "volatile ranges", where userland itself marks chunks of
> data as "I might need it, but I won't cry if you recycle it; but when I
> access it next time, let me know if you actually recycled it". Yes,
> userland is no longer able to decide exactly which pages may be recycled,
> but we don't have use cases where we actually care that much. And if we do,
> we'd rather introduce volatile LRUs with different priorities, or
> something similar.
>
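
The "let me know if you actually recycled it" contract can be sketched as a
toy userspace model. This is not the proposed kernel interface: the class
and method names (VolatileBuffer, mark_volatile, mark_nonvolatile, reclaim)
are hypothetical, and reclaim() merely stands in for the in-kernel shrinker.

```python
class VolatileBuffer:
    """Toy model of one volatile range; purely illustrative."""

    def __init__(self, data):
        self._data = bytearray(data)
        self._volatile = False
        self._purged = False

    def mark_volatile(self):
        """Userland: 'I won't cry if you recycle this.'"""
        self._volatile = True

    def reclaim(self):
        """Stand-in for the shrinker: may purge only volatile data.
        Returns True if this call dropped the contents."""
        if self._volatile and not self._purged:
            self._data = bytearray()
            self._purged = True
            return True
        return False

    def mark_nonvolatile(self):
        """Userland wants the data again. Returns True if the contents
        were recycled while the range was volatile."""
        was_purged = self._purged
        self._volatile = False
        self._purged = False
        return was_purged

    def store(self, data):
        """Refill the buffer, e.g. after regenerating purged contents."""
        self._data = bytearray(data)

    def read(self):
        if self._volatile:
            raise RuntimeError("unmark the range before reading it")
        return bytes(self._data)


if __name__ == "__main__":
    cache = VolatileBuffer(b"decoded image")
    cache.mark_volatile()
    cache.reclaim()                  # memory pressure hits
    if cache.mark_nonvolatile():     # told that the data is gone
        cache.store(b"decoded image (regenerated)")
    print(cache.read())
```

Note that userland talks to the "kernel" only when marking and unmarking;
which range gets purged, and when, is decided entirely on the other side.
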
> So, we really don't need a full-fledged userland shrinker, since we can
> just let the in-kernel shrinker do its job. If we work at bytes/pages
> granularity, it is just easier (and more efficient in terms of
> communication) to use volatile ranges.
>
> For the pressure-notification use cases, we don't even have bytes/pages
> information: "activity managers" are separate processes looking after
> overall system performance.
>
> So, we're not trying to make userland too smart, quite the contrary: we
> realized that for this interface we don't want to mess with the bytes and
> pages, and that's why we cut this stuff down to only three levels. Before
> this, we were actually trying to count bytes, we did not like it and we
> ran away screaming.
>
> OTOH, your scheme makes volatile ranges unneeded, since a thread might
> register a shrinker hook and free stuff by itself. But again, I believe
> this involves more communication with the kernel.

Btw, I believe your idea is something completely new, and I surely cannot
fully evaluate it on my own -- I might be wrong here. So I invite folks to
express their opinions too.

Guys, it's about Andrew's idea of exposing shrinker-like logic to
userland (and I framed it as 'vs. volatile ranges'):

http://lkml.org/lkml/2012/11/28/607

Thanks,
Anton.