Date:	Fri,  9 Oct 2015 23:29:27 -0400
From:	Tejun Heo <tj@...nel.org>
To:	lizefan@...wei.com, hannes@...xchg.org
Cc:	cgroups@...r.kernel.org, cyphar@...har.com,
	linux-kernel@...r.kernel.org, kernel-team@...com
Subject: [PATCHSET cgroup/for-4.4] cgroup: make zombies retain cgroup membership and fix pids controller

Hello,

cgroup currently disassociates a task from its cgroups on exit and
reassigns it to the root cgroup.  This behavior turns out to be
problematic for several reasons.

* Resources can't be tracked for zombies.  This breaks pids controller
  as zombies escape resource restriction.  A cgroup can easily go way
  above its limits by creating a bunch of zombies.

* It's difficult to tell where zombies came from.  /proc/PID/cgroup
  gets reset to / on exit, so given a zombie there is no easy way to
  tell which cgroup it belonged to.

* It creates extra work for controllers for no reason.  cpu and
  perf_events controllers implement exit callbacks to switch the
  exiting task's membership to root when just leaving it as-is is
  enough.
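
To make the first point concrete, here is a toy userspace model of a
pids-style limit.  It is only a sketch: ToyPidsCgroup,
fork_and_exit_loop and the uncharge_on switch are hypothetical names
made up for illustration and do not correspond to anything in
kernel/cgroup_pids.c.  It just contrasts dropping the charge when a
task exits with dropping it when the zombie is finally reaped.

```python
class ToyPidsCgroup:
    """Toy model of a pids-style limit.  Not kernel code."""

    def __init__(self, limit, uncharge_on):
        # uncharge_on is a hypothetical switch: "exit" models a task that
        # stops being counted once it exits, "free" models a task that
        # stays counted until it is reaped.
        assert uncharge_on in ("exit", "free")
        self.limit = limit
        self.uncharge_on = uncharge_on
        self.charged = 0   # what the controller believes is in use
        self.running = 0   # live tasks
        self.zombies = 0   # exited but not yet reaped

    def fork(self):
        """Try to create a task; refuse if the limit is reached."""
        if self.charged >= self.limit:
            return False
        self.charged += 1
        self.running += 1
        return True

    def exit(self):
        """A running task exits and becomes a zombie."""
        self.running -= 1
        self.zombies += 1
        if self.uncharge_on == "exit":
            self.charged -= 1

    def reap(self):
        """The parent reaps one zombie."""
        self.zombies -= 1
        if self.uncharge_on == "free":
            self.charged -= 1

    def pids_in_use(self):
        # Zombies still occupy PIDs even though they no longer run.
        return self.running + self.zombies


def fork_and_exit_loop(cg, attempts):
    """Fork children that exit immediately and are never reaped."""
    for _ in range(attempts):
        if cg.fork():
            cg.exit()
    return cg.pids_in_use()
```

With a limit of 10 and 100 attempts, the "exit" variant ends up with
100 PIDs in use while the "free" variant stops at 10.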

Unfortunately, fixing this involves opening a few cans of worms.

* Decoupling tasks being on a css_set from its reference counting so
  that css_set can be pinned w/o tasks being on it and decoupling
  css_set existence from whether a cgroup is populated so that pinning
  a css_set doesn't confuse populated state tracking and populated
  state can be used to decide whether certain operations are allowed.

* Making css task iteration drop css_set_rwsem between iteration steps
  so that internal locking is not exposed to iterator users and
  css_set_rwsem can be converted to a spinlock which can be grabbed
  from task free path.
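
The second point can be sketched in userspace as follows.  This is a
minimal illustration of the general idea (take the lock only while
advancing the cursor, never while the caller is looking at an
element), not the css_task_iter implementation.  TaskSet, its
sequence-number cursor and the linear scan in iter_tasks() are all
assumptions chosen to keep the example short.

```python
import threading


class TaskSet:
    """Toy container whose iterator drops its lock between steps."""

    def __init__(self):
        self._lock = threading.Lock()   # non-reentrant on purpose
        self._tasks = {}                # seq -> name
        self._next_seq = 0

    def add(self, name):
        with self._lock:
            seq = self._next_seq
            self._next_seq += 1
            self._tasks[seq] = name
            return seq

    def remove(self, seq):
        with self._lock:
            self._tasks.pop(seq, None)

    def iter_tasks(self):
        """Yield (seq, name) pairs in insertion order.

        The lock is held only while picking the next element.  It is
        released before yielding, so the caller may block or modify
        the set between steps.
        """
        cursor = -1
        while True:
            with self._lock:
                later = [s for s in self._tasks if s > cursor]
                if not later:
                    return
                cursor = min(later)
                name = self._tasks[cursor]
            # Lock is not held here.
            yield cursor, name
```

Because the lock is not reentrant, calling add() or remove() from
inside the loop body would deadlock if the iterator still held it.
Elements removed between steps are skipped and elements added between
steps are picked up later in the walk.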

Besides fixing the pids controller, this patchset doesn't change the
visible behavior on traditional hierarchies.  On the default
hierarchy, a zombie now reports the cgroup it belonged to at the time
of exit in /proc/PID/cgroup.  If the cgroup gets removed before the
task is reaped, " (deleted)" is appended to the reported path.
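
For userspace tools that want to pick up the new information, a small
parser could look like the sketch below.  It assumes the usual
"hierarchy-ID:controller-list:path" line format of /proc/PID/cgroup.
parse_proc_cgroup and DELETED_SUFFIX are names made up for this
example.

```python
DELETED_SUFFIX = " (deleted)"


def parse_proc_cgroup(text):
    """Parse the contents of /proc/PID/cgroup into a list of dicts.

    Each line is expected to look like "ID:controllers:path".  A path
    ending in " (deleted)" is reported with deleted=True and the
    suffix stripped.
    """
    entries = []
    for line in text.splitlines():
        if not line:
            continue
        # Split on the first two colons only; the path may contain ':'.
        hier_id, controllers, path = line.split(":", 2)
        deleted = path.endswith(DELETED_SUFFIX)
        if deleted:
            path = path[: -len(DELETED_SUFFIX)]
        entries.append({
            "hierarchy_id": int(hier_id),
            "controllers": controllers.split(",") if controllers else [],
            "path": path,
            "deleted": deleted,
        })
    return entries
```

On a live system this would be fed something like
open("/proc/self/cgroup").read().  Note that a cgroup whose name
really ends in " (deleted)" can't be told apart from a removed one by
the string alone.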

This patchset contains the following 14 patches.

 0001-cgroup-remove-an-unused-parameter-from-cgroup_task_m.patch
 0002-cgroup-make-cgroup-nr_populated-count-the-number-of-.patch
 0003-cgroup-replace-cgroup_has_tasks-with-cgroup_is_popul.patch
 0004-cgroup-move-check_for_release-invocation.patch
 0005-cgroup-relocate-cgroup_-try-get-put.patch
 0006-cgroup-make-css_sets-pin-the-associated-cgroups.patch
 0007-cgroup-make-cgroup_destroy_locked-test-cgroup_is_pop.patch
 0008-cgroup-keep-css_set-and-task-lists-in-chronological-.patch
 0009-cgroup-factor-out-css_set_move_task.patch
 0010-cgroup-reorganize-css_task_iter-functions.patch
 0011-cgroup-don-t-hold-css_set_rwsem-across-css-task-iter.patch
 0012-cgroup-make-css_set_rwsem-a-spinlock-and-rename-it-t.patch
 0013-cgroup-keep-zombies-associated-with-their-original-c.patch
 0014-cgroup-add-cgroup_subsys-free-method-and-use-it-to-f.patch

0001-0007 decouple populated state tracking from css_set existence and
allow css_sets to be pinned without tasks on them.
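
As a rough illustration of what that decoupling means, consider the
toy model below.  It is an assumption-laden sketch, not the kernel
data structures: ToyCssSet and ToyCgroup are hypothetical, and the
only point is that a pin (reference) on a css_set and the presence of
tasks on it are tracked separately, so a pinned but empty css_set
doesn't make its cgroup look populated.

```python
class ToyCssSet:
    """Toy css_set: reference count and task list are independent."""

    def __init__(self):
        self.refcount = 0   # pins, e.g. held on behalf of a zombie
        self.tasks = []     # tasks currently on this css_set

    def populated(self):
        return bool(self.tasks)


class ToyCgroup:
    """Toy cgroup whose populated state ignores empty css_sets."""

    def __init__(self):
        self.css_sets = []

    def is_populated(self):
        # Populated means some linked css_set has tasks.  The mere
        # existence of a pinned css_set is not enough.
        return any(cs.populated() for cs in self.css_sets)

    def can_remove(self):
        return not self.is_populated()
```

In this model a cgroup whose only css_set is pinned but has no tasks
on it can still be removed.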

0008-0012 update css_set task iterator to not hold lock across
iteration steps and replace css_set_rwsem with a spinlock.

0013 makes zombies keep their cgroup associations.  0014 introduces
->free() method and fixes pids controller.

The patchset is pretty lightly tested and I need to verify that the
corner cases behave as expected.

This patchset is on top of cgroup/for-4.4 a3e72739b7a7 ("cgroup: fix
too early usage of static_branch_disable()") and available in the
following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-zombies

diffstat follows.  Thanks.

 Documentation/cgroups/cgroups.txt           |    4 
 Documentation/cgroups/unified-hierarchy.txt |    4 
 include/linux/cgroup-defs.h                 |   16 
 include/linux/cgroup.h                      |   14 
 kernel/cgroup.c                             |  522 +++++++++++++++++-----------
 kernel/cgroup_pids.c                        |    8 
 kernel/cpuset.c                             |    2 
 kernel/events/core.c                        |   16 
 kernel/fork.c                               |    1 
 kernel/sched/core.c                         |   16 
 mm/memcontrol.c                             |    2 
 11 files changed, 354 insertions(+), 251 deletions(-)

--
tejun