Message-ID: <6599ad830803280737lf6882bapd9707c02bf26ef12@mail.gmail.com>
Date:	Fri, 28 Mar 2008 07:37:21 -0700
From:	"Paul Menage" <menage@...gle.com>
To:	balbir@...ux.vnet.ibm.com
Cc:	"Andrew Morton" <akpm@...ux-foundation.org>,
	"Pavel Emelianov" <xemul@...nvz.org>,
	"Hugh Dickins" <hugh@...itas.com>,
	"Sudhir Kumar" <skumar@...ux.vnet.ibm.com>,
	"YAMAMOTO Takashi" <yamamoto@...inux.co.jp>, lizf@...fujitsu.com,
	linux-kernel@...r.kernel.org, taka@...inux.co.jp,
	linux-mm@...ck.org, "David Rientjes" <rientjes@...gle.com>,
	"KAMEZAWA Hiroyuki" <kamezawa.hiroyu@...fujitsu.com>
Subject: Re: [RFC][0/3] Virtual address space control for cgroups (v2)

On Thu, Mar 27, 2008 at 8:59 PM, Balbir Singh <balbir@...ux.vnet.ibm.com> wrote:
>  > Java (or at least, Sun's JRE) is an example of a common application
>  > that does this. It creates a huge heap mapping at startup, and faults
>  > it in as necessary.
>  >
>
>  Isn't this controlled by the java -Xms/-Xmx options?
>

Probably - that was just an example, and the behaviour of Java isn't
exactly unreasonable. A different example would be an app that maps a
massive database file, but only pages small amounts of it in at any
one time.
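
To make that concrete, such an app does something like this (sketch
only - mine for illustration, error handling trimmed): it maps the
whole file, so the full size counts against any virtual address space
limit, but only the pages it actually touches ever become resident.

/* Sketch: map a huge file, touch a handful of pages. */
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	struct stat st;
	int fd;

	if (argc < 2 || (fd = open(argv[1], O_RDONLY)) < 0 ||
	    fstat(fd, &st) < 0)
		return 1;

	/* The whole file is charged against a VA-space limit here... */
	char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	if (p == MAP_FAILED)
		return 1;

	/* ...but only the pages we actually touch become resident. */
	volatile char sum = 0;
	for (off_t off = 0; off < st.st_size; off += st.st_size / 8 + 1)
		sum += p[off];

	printf("mapped %lld bytes, resident only a few pages\n",
	       (long long)st.st_size);
	return 0;
}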

>
>  I understand, but
>
>  1. The system enforces overcommit by default on most distros, so why
>  shouldn't we have something similarly flexible for cgroups?

Right, I guess I should make it clear that I'm *not* arguing that we
shouldn't have a virtual address space limit subsystem.

My main arguments in this and my previous email were to back up my
assertion that there is a significant set of real-world cases where
it doesn't help, and hence that it should be a separate subsystem that
can be turned on or off as desired.

It strikes me that when split into its own subsystem, this is going to
be very simple - basically just a resource counter and some file
handlers. We should probably have something like
include/linux/rescounter_subsys_template.h, so you can do:

#define SUBSYS_NAME va                  /* registers a "va" cgroup subsystem */
#define SUBSYS_UNIT_SUFFIX in_bytes     /* control files like va.limit_in_bytes */
#include <linux/rescounter_subsys_template.h>

then all you have to add are the hooks to call the rescounter
charge/uncharge functions and you're done. It would be nice to have a
separate trivial subsystem like this for each of the rlimit types, not
just virtual address space.
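
For illustration, the hand-written part might boil down to something
like the sketch below. The names are invented (no such template or
struct exists today); res_counter_charge()/res_counter_uncharge() are
the existing rescounter primitives. These hooks would sit in the
mmap/brk/munmap paths, wherever mm->total_vm is adjusted.

/* Sketch only: assumes the template above generated a per-cgroup
 * struct va_cgroup containing a res_counter "res", plus a
 * va_cgroup_from_task() lookup helper. */
static int va_charge(unsigned long nr_pages)
{
	struct va_cgroup *va = va_cgroup_from_task(current);

	/* Fails when usage would exceed the va.limit_in_bytes setting. */
	return res_counter_charge(&va->res, nr_pages << PAGE_SHIFT);
}

static void va_uncharge(unsigned long nr_pages)
{
	struct va_cgroup *va = va_cgroup_from_task(current);

	res_counter_uncharge(&va->res, nr_pages << PAGE_SHIFT);
}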

>   And specifying
>  > them manually requires either unusually clueful users (most of whom
>  > have enough trouble figuring out how much physical memory they'll
>  > need, and would just set very high virtual address space limits) or
>  > sysadmins with way too much time on their hands ...
>  >
>
>  It's a one-time thing to set up for sysadmins
>

Sure, it's a one-time thing to set up *if* your cluster workload is
completely static.

>
>  > As I said, I think focussing on ways to tell apps that they're running
>  > low on physical memory would be much more productive.
>  >
>
>  We intend to do that as well; we plan to have user-space OOM notification.

We've been playing with a user-space OOM notification system at Google
- it's on my TODO list to push it to mainline (as an independent
subsystem, since either cpusets or the memory controller can be used
to cause OOMs that are localized to a cgroup). What we have works
pretty well but I think our interface is a bit too much of a kludge at
this point.
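
To sketch the shape of the thing (the file name and semantics below
are made up for illustration - they're not our actual interface), the
user-space side is just a daemon that blocks until the kernel reports
a cgroup-local OOM and then reacts:

/* Sketch only: "memory.oom_notify" is a hypothetical per-cgroup file
 * that becomes readable when the group hits OOM. */
#include <poll.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/dev/cgroup/batch/memory.oom_notify", O_RDONLY);
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	char buf[64];

	if (fd < 0)
		return 1;

	for (;;) {
		if (poll(&pfd, 1, -1) > 0 && (pfd.revents & POLLIN)) {
			(void)read(fd, buf, sizeof(buf)); /* consume the event */
			fprintf(stderr, "cgroup OOM: shrink, notify or kill\n");
		}
	}
}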

Paul
