Date:	Wed, 7 Nov 2012 03:43:46 -0800
From:	Anton Vorontsov <anton.vorontsov@...aro.org>
To:	"Kirill A. Shutemov" <kirill@...temov.name>
Cc:	Mel Gorman <mgorman@...e.de>, Pekka Enberg <penberg@...nel.org>,
	Leonid Moiseichuk <leonid.moiseichuk@...ia.com>,
	KOSAKI Motohiro <kosaki.motohiro@...il.com>,
	Minchan Kim <minchan@...nel.org>,
	Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>,
	John Stultz <john.stultz@...aro.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, linaro-kernel@...ts.linaro.org,
	patches@...aro.org, kernel-team@...roid.com,
	linux-man@...r.kernel.org
Subject: Re: [RFC v3 0/3] vmpressure_fd: Linux VM pressure notifications

On Wed, Nov 07, 2012 at 01:21:36PM +0200, Kirill A. Shutemov wrote:
[...]
> Sorry, I didn't follow previous discussion on this, but could you
> explain what's wrong with memory notifications from memcg?
> As I can see you can get pretty similar functionality using memory
> thresholds on the root cgroup. What's the point?

There are a few reasons we don't use cgroup notifications:

1. We're not interested in the absolute number of pages/KB of available
   memory, as provided by the cgroup memory controller. What we're
   interested in is the amount of easily reclaimable memory and the cost
   of new memory allocations.

   We can have plenty of "free" memory, of which, say, 90% is caches and
   10% is idle. But we do want to differentiate between these kinds of
   memory (without going into all the details here), i.e. we want to be
   notified when the kernel is reclaiming. And we also want to know when
   new memory comes from swapping others' pages out (well, we don't
   actually call it swap; it's "the cost of new allocations becomes high"
   -- it can be the result of many factors (swapping, fragmentation, etc.)
   -- and userland can analyze the situation when this happens).

   Exposing all the VM details to userland is not an option -- it is not
   possible to build a stable ABI on top of them. Plus, it makes it really
   hard for userland to deal with all the low-level details of Linux VM
   internals.

   So, no, raw numbers of "free/used KBs" are not interesting at all.
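
   Just for reference, the memcg threshold interface Kirill mentions is
   registered roughly as in the sketch below (assuming the v1 memory
   controller is mounted at /sys/fs/cgroup/memory, and a made-up 512M
   threshold). All it can ever deliver is "usage_in_bytes crossed N" --
   i.e. exactly the kind of raw number described above:

/* Sketch only: register an eventfd-based memcg usage threshold. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/eventfd.h>

int main(void)
{
	char buf[64];
	uint64_t hits;
	int efd = eventfd(0, 0);
	int ufd = open("/sys/fs/cgroup/memory/memory.usage_in_bytes", O_RDONLY);
	int cfd = open("/sys/fs/cgroup/memory/cgroup.event_control", O_WRONLY);

	if (efd < 0 || ufd < 0 || cfd < 0) {
		perror("open");
		return 1;
	}

	/* Format: "<eventfd> <fd of memory.usage_in_bytes> <threshold>" */
	snprintf(buf, sizeof(buf), "%d %d %llu", efd, ufd, 512ULL << 20);
	if (write(cfd, buf, strlen(buf)) < 0) {
		perror("cgroup.event_control");
		return 1;
	}

	/* Blocks until usage crosses the 512M threshold (either direction). */
	if (read(efd, &hits, sizeof(hits)) == sizeof(hits))
		printf("usage crossed the threshold %llu time(s)\n",
		       (unsigned long long)hits);
	return 0;
}

   Note that the notification fires on the crossing itself; it says
   nothing about whether the memory behind it is cheap-to-drop cache or
   whether the kernel is already struggling to reclaim.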

1.5. That said, it is important to understand that vmpressure_fd() is not
     meant to be orthogonal to cgroups (the way vmevent_fd() was). We want
     it to be "cgroup'able" too. :) But optionally.

2. The last time I checked, the cgroups memory controller did not (and I
   guess still does not) account for kernel-owned slabs. I asked several
   times why that is, but nobody answered.
   
   But no, this is not the main issue -- per "1.", we're not interested in
   kilobytes.

3. Some folks don't like cgroups: they come with a cost in kernel size,
   performance and memory overhead. But again, this is not the main issue
   with memcg.

Thanks,
Anton.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
