Message-ID: <20121201080131.GB21747@lizard.sbx14280.paloaca.wayport.net>
Date: Sat, 1 Dec 2012 00:01:31 -0800
From: Anton Vorontsov <anton.vorontsov@...aro.org>
To: Luiz Capitulino <lcapitulino@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
David Rientjes <rientjes@...gle.com>,
Pekka Enberg <penberg@...nel.org>,
Mel Gorman <mgorman@...e.de>,
Glauber Costa <glommer@...allels.com>,
Michal Hocko <mhocko@...e.cz>,
"Kirill A. Shutemov" <kirill@...temov.name>,
Greg Thelen <gthelen@...gle.com>,
Leonid Moiseichuk <leonid.moiseichuk@...ia.com>,
KOSAKI Motohiro <kosaki.motohiro@...il.com>,
Minchan Kim <minchan@...nel.org>,
Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>,
John Stultz <john.stultz@...aro.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, linaro-kernel@...ts.linaro.org,
patches@...aro.org, kernel-team@...roid.com, aquini@...hat.com,
riel@...hat.com
Subject: Re: [RFC] Add mempressure cgroup
Hi Luiz,
Thanks for your email!
On Fri, Nov 30, 2012 at 03:47:25PM -0200, Luiz Capitulino wrote:
[...]
> > But there is one, rather major issue: we're crossing kernel-userspace
> > boundary. And with the scheme we'll have to cross the boundary four times:
> > query / reply-available / control / reply-shrunk / (and repeat if
> > necessary, every SHRINK_BATCH pages). Plus, it has to be done somewhat
> > synchronously (all four stages), and/or we have to run a "userspace
> > shrinker" thread in parallel with the normal shrinker, and here,
> > I'm afraid, we'll see more strange interactions. :)
>
> Wouldn't this be just like kswapd?
Sure, this is similar, but only for indirect reclaim (obviously).
How we'd do this for direct reclaim I honestly have no idea: with
Andrew's idea it must all be synchronous, so playing ping-pong with
userland during direct reclaim will be hard.
So the best thing to do for direct reclaim, IMHO, is to just send a
notification.
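
(Not the RFC's final API, but just to illustrate the listener side:
the sketch below assumes an eventfd-based registration along the
lines of memcg's existing cgroup.event_control threshold events. The
mount path, the "memory.pressure_level" file and the "low" level are
made up for this example.)

#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/eventfd.h>

int main(void)
{
	char buf[64];
	uint64_t cnt;
	int efd = eventfd(0, 0);
	int pfd = open("/sys/fs/cgroup/mempressure/memory.pressure_level",
		       O_RDONLY);
	int cfd = open("/sys/fs/cgroup/mempressure/cgroup.event_control",
		       O_WRONLY);

	if (efd < 0 || pfd < 0 || cfd < 0) {
		perror("open");
		return 1;
	}

	/* Register: "<event fd> <pressure fd> <level>". */
	snprintf(buf, sizeof(buf), "%d %d low", efd, pfd);
	if (write(cfd, buf, strlen(buf)) < 0) {
		perror("cgroup.event_control");
		return 1;
	}

	for (;;) {
		/* Blocks until the kernel signals pressure... */
		if (read(efd, &cnt, sizeof(cnt)) != sizeof(cnt))
			break;
		/* ...then the app drops caches, deflates a balloon, etc. */
	}
	return 0;
}
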
> > But there is good news: for this kind of fine-grained control we have a
> > better interface, where we don't have to communicate [very often] w/ the
> > kernel. These are "volatile ranges", where userland itself marks chunks of
> > data as "I might need it, but I won't cry if you recycle it; but when I
> > access it next time, let me know if you actually recycled it". Yes,
> > userland is no longer able to decide which exact page it permits to
> > recycle, but we don't have use-cases where we actually care that much.
> > And if we do, we'd rather introduce volatile LRUs with different
> > priorities, or something alike.
>
> I'm new to this stuff so please take this with a grain of salt, but I'm
> not sure volatile ranges would be a good fit for our use case: we want to
> make (kvm) guests reduce their memory usage when the host is under memory
> pressure.
Yes, for this kind of thing you want a simple notification.
I wasn't saying that volatile ranges must be a substitute for
notifications, quite the opposite: I was saying that you can do volatile
ranges in userland by using "userland-shrinker".
It could even be wrapped into a library, with the same mmap() libc
interface, but it would be inefficient.
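
Purely for illustration, such a wrapper could look roughly like the
sketch below. Every name in it (v_alloc, v_mark_volatile,
v_purge_some, v_unmark) is invented, and the "recycling" relies on
MADV_DONTNEED dropping private anonymous pages so that the next touch
faults in zeroes:

#include <sys/mman.h>
#include <stddef.h>

struct v_chunk {
	void	*addr;
	size_t	len;
	int	purgeable;	/* safe to recycle? */
	int	purged;		/* did we recycle it? */
};

static struct v_chunk chunks[128];
static int nchunks;

struct v_chunk *v_alloc(size_t len)
{
	void *p;

	if (nchunks >= 128)
		return NULL;
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;
	chunks[nchunks] = (struct v_chunk){ p, len, 0, 0 };
	return &chunks[nchunks++];
}

/* "I might need it, but I won't cry if you recycle it." */
void v_mark_volatile(struct v_chunk *c)
{
	c->purgeable = 1;
}

/* Called from the pressure handler: recycle whatever is purgeable.
 * MADV_DONTNEED frees private anonymous pages, so the library has to
 * remember that the data is gone. */
void v_purge_some(void)
{
	int i;

	for (i = 0; i < nchunks; i++) {
		if (chunks[i].purgeable && !chunks[i].purged) {
			madvise(chunks[i].addr, chunks[i].len,
				MADV_DONTNEED);
			chunks[i].purged = 1;
		}
	}
}

/* Unmark before touching the data again; returns 1 if the contents
 * were recycled ("let me know if you actually recycled it"). */
int v_unmark(struct v_chunk *c)
{
	int was_purged = c->purged;

	c->purgeable = 0;
	c->purged = 0;
	return was_purged;
}

And the inefficiency is exactly the round trip: nothing gets purged
until the pressure notification makes it out to userland and back.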
Thanks,
Anton.