Date:	Fri, 4 Apr 2008 12:11:40 -0700
From:	"Paul Menage" <menage@...gle.com>
To:	balbir@...ux.vnet.ibm.com
Cc:	"Pavel Emelianov" <xemul@...nvz.org>,
	"Hugh Dickins" <hugh@...itas.com>,
	"Sudhir Kumar" <skumar@...ux.vnet.ibm.com>,
	"YAMAMOTO Takashi" <yamamoto@...inux.co.jp>, lizf@...fujitsu.com,
	linux-kernel@...r.kernel.org, taka@...inux.co.jp,
	linux-mm@...ck.org, "David Rientjes" <rientjes@...gle.com>,
	"Andrew Morton" <akpm@...ux-foundation.org>,
	"KAMEZAWA Hiroyuki" <kamezawa.hiroyu@...fujitsu.com>
Subject: Re: [-mm] Add an owner to the mm_struct (v8)

On Fri, Apr 4, 2008 at 2:25 AM, Balbir Singh <balbir@...ux.vnet.ibm.com> wrote:
>  >>  For other controllers,
>  >>  they'll need to monitor exit() callbacks to know when the leader is dead :( (sigh).
>  >
>  > That sounds like a nightmare ...
>  >
>
>  Yes, it would be, but worth the trouble. Is it really critical to move a dead
>  cgroup leader to init_css_set in cgroup_exit()?

It struck me that this whole group-leader optimization is broken as
it stands, since there could (in strange configurations) be multiple
thread groups sharing the same mm.
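
For the record, that "strange configuration" is trivial to construct
from userspace: clone() with CLONE_VM but without CLONE_THREAD gives
you a child that shares the parent's mm yet leads its own thread
group. A minimal (untested) sketch:

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

static int child_fn(void *arg)
{
	/* Same mm as the parent, but a distinct tgid: this child
	 * is the leader of its own single-thread thread group. */
	printf("child tgid: %d\n", getpid());
	return 0;
}

int main(void)
{
	char *stack = malloc(64 * 1024);

	/* CLONE_VM alone: share the address space, but don't join
	 * the parent's thread group (no CLONE_THREAD). */
	pid_t pid = clone(child_fn, stack + 64 * 1024,
			  CLONE_VM | SIGCHLD, NULL);

	printf("parent tgid: %d, child pid: %d\n", getpid(), pid);
	waitpid(pid, NULL, 0);
	return 0;
}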

I wonder if we can't just delay the exit_mm() call of a group leader
until all its threads have exited?

>
>  > As long as we find someone to pass the mm to quickly, it shouldn't be
>  > too bad - I think we're already optimized for that case. Generally the
>  > group leader's first child will be the new owner, and any subsequent
>  > times the owner exits, they're unlikely to have any children so
>  > they'll go straight to the sibling check and pass the mm to the
>  > parent's first child.
>  >
>  > Unless they all exit in strict sibling order and hence pass the mm
>  > along the chain one by one, we should be fine. And if that exit
>  > ordering does turn out to be common, then simply walking the child and
>  > sibling lists in reverse order to find a victim will minimize the
>  > amount of passing.
>  >
>
>
>  Finding the next mm might not be all that bad, but doing it each time a task
>  exits can be an overhead, especially for large multi-threaded programs.

Right, but we only have that overhead if we actually end up passing
the mm from one task to another each time one exits. It would be
interesting to know in what order the threads of a large
multi-threaded process typically exit (when the main process exits
and all the threads die).

I guess it's likely to be one of:

- in thread creation order (i.e. in the order of the parent->children
list), in which case we should try to throw the mm to the parent's
last child (see the sketch after this list)
- in reverse creation order, in which case we should try to throw the
mm to the parent's first child
- in random order depending on which threads the scheduler runs first
(in which case we can expect that a small fraction of the threads will
have to throw the mm whichever end we start from)
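
FWIW, the reverse walk could look something like this (untested
sketch, not the actual v8 patch; find_new_owner() is a made-up name,
and the caller would hold tasklist_lock):

static struct task_struct *find_new_owner(struct task_struct *p,
					  struct mm_struct *mm)
{
	struct task_struct *c;

	/* Our own children, youngest first: if threads exit in
	 * creation order, the last child lives the longest. */
	list_for_each_entry_reverse(c, &p->children, sibling)
		if (c->mm == mm)
			return c;

	/* Then our siblings, i.e. the parent's other children. */
	list_for_each_entry_reverse(c, &p->real_parent->children, sibling)
		if (c->mm == mm)
			return c;

	return NULL;	/* caller falls back to scanning the tasklist */
}

If exits turn out to be mostly in reverse creation order instead, the
same two loops with plain list_for_each_entry() cover that case.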

>  This can
>  get severe if the new mm->owner belongs to a different cgroup, in which case we
>  need to use callbacks as well.
>
>  If half the threads belonged to a different cgroup and the new mm->owner kept
>  switching between cgroups, the overhead would be really high, with the callbacks
>  and the mm->owner changing frequently.

To me, setting up a *virtual address space* cgroup hierarchy and then
putting half your threads in one group and half in another is asking
for trouble. We need to not break in that situation, but I'm not sure
it's a case worth optimizing for.

Paul
