Message-ID: <20161005003833.GA29239@mail.hallyn.com>
Date: Tue, 4 Oct 2016 19:38:33 -0500
From: "Serge E. Hallyn" <serge@...lyn.com>
To: John Stultz <john.stultz@...aro.org>
Cc: lkml <linux-kernel@...r.kernel.org>, Tejun Heo <tj@...nel.org>,
Li Zefan <lizefan@...wei.com>,
Jonathan Corbet <corbet@....net>, cgroups@...r.kernel.org,
Android Kernel Team <kernel-team@...roid.com>,
Rom Lemarchand <romlem@...roid.com>,
Colin Cross <ccross@...roid.com>,
Dmitry Shmidt <dimitrysh@...gle.com>,
Todd Kjos <tkjos@...gle.com>,
Christian Poetzsch <christian.potzsch@...tec.com>,
Amit Pundir <amit.pundir@...aro.org>,
"Serge E. Hallyn" <serge@...lyn.com>
Subject: Re: [RFC][PATCH] cgroup: Add new capability to allow a process to
migrate other tasks between cgroups
Quoting John Stultz (john.stultz@...aro.org):
> This patch adds CAP_CGROUP_MIGRATE_TASK and logic to allow a process
> to migrate other tasks between cgroups.
>
> In Android (where this feature originated), the ActivityManager tracks
> various application states (TOP_APP, FOREGROUND, BACKGROUND, SYSTEM,
> etc), and then as applications change states, the SchedPolicy logic
> will migrate the application tasks between different cgroups used
> to control the different application states (for example, there is a
> background cpuset cgroup which can limit background tasks to stay
> on one low-power cpu, and the bg_non_interactive cpuctrl cgroup can
> then further limit those background tasks to a small percentage of
> that one cpu's cpu time).
>
> However, for security reasons, Android doesn't want to make the
> system_server (the process that runs the ActivityManager and
> SchedPolicy logic), run as root. So in the Android common.git
> kernel, they have some logic to allow cgroups to loosen their
> permissions so CAP_SYS_NICE tasks can migrate other tasks between
> cgroups.
>
> The approach taken there overloads CAP_SYS_NICE a bit much, and
> is maybe more complicated than needed.
>
> So this patch, as suggested by Tejun, simply adds a new process
> capability flag (CAP_CGROUP_MIGRATE_TASK), and uses it when checking
So realistically, what all can this mean? Freezing tasks, changing
cpu/memory limits, changing network and disk throughput, forbidding forking,
and (most importantly) forbidding access to certain devices.
I think that's all ok. (And we still separately check for inode write
perms.)
If anything I'd say the GLOBAL_ROOT_UID check could be taken out since
otherwise a host-root task effectively cannot drop this capability.
> if a task can migrate other tasks between cgroups.
>
> I've tested this with AOSP master (though it's a bit hacked in as I
> still need to properly get the selinux bits aware of the new
> capability bit) with selinux set to permissive and it seems to be
> working well.
>
> Thoughts and feedback would be appreciated!
>
> Cc: Tejun Heo <tj@...nel.org>
> Cc: Li Zefan <lizefan@...wei.com>
> Cc: Jonathan Corbet <corbet@....net>
> Cc: cgroups@...r.kernel.org
> Cc: Android Kernel Team <kernel-team@...roid.com>
> Cc: Rom Lemarchand <romlem@...roid.com>
> Cc: Colin Cross <ccross@...roid.com>
> Cc: Dmitry Shmidt <dimitrysh@...gle.com>
> Cc: Todd Kjos <tkjos@...gle.com>
> Cc: Christian Poetzsch <christian.potzsch@...tec.com>
> Cc: Amit Pundir <amit.pundir@...aro.org>
> Cc: Serge E. Hallyn <serge@...lyn.com>
Acked-by: Serge Hallyn <serge@...lyn.com>
> Signed-off-by: John Stultz <john.stultz@...aro.org>
> ---
> include/uapi/linux/capability.h | 5 ++++-
> kernel/cgroup.c | 3 ++-
> 2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
> index 49bc062..e199ea0 100644
> --- a/include/uapi/linux/capability.h
> +++ b/include/uapi/linux/capability.h
> @@ -349,8 +349,11 @@ struct vfs_cap_data {
>
> #define CAP_AUDIT_READ 37
>
> +/* Allow migrating tasks between cgroups */
>
> -#define CAP_LAST_CAP CAP_AUDIT_READ
> +#define CAP_CGROUP_MIGRATE_TASK 38
> +
> +#define CAP_LAST_CAP CAP_CGROUP_MIGRATE_TASK
>
> #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
>
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 9ba28310..a318956 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -2847,7 +2847,8 @@ static int cgroup_procs_write_permission(struct task_struct *task,
> */
> if (!uid_eq(cred->euid, GLOBAL_ROOT_UID) &&
> !uid_eq(cred->euid, tcred->uid) &&
> - !uid_eq(cred->euid, tcred->suid))
> + !uid_eq(cred->euid, tcred->suid) &&
> + !ns_capable(tcred->user_ns, CAP_CGROUP_MIGRATE_TASK))
> ret = -EACCES;
>
> if (!ret && cgroup_on_dfl(dst_cgrp)) {
> --
> 1.9.1
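
For illustration only (not part of the patch): a small userspace model of
the check in the kernel/cgroup.c hunk above, with a switch to show what
happens if the GLOBAL_ROOT_UID test is taken out. Everything named model_*
and the keep_root_check flag are made up for this sketch; the cap_migrate
field merely stands in for the result of
ns_capable(tcred->user_ns, CAP_CGROUP_MIGRATE_TASK), and plain integers
stand in for kuid_t. This does not run in or talk to the kernel.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parts of struct cred the check reads. */
struct model_cred {
	unsigned int euid;	/* effective UID (read for the writer) */
	unsigned int uid;	/* real UID (read for the target) */
	unsigned int suid;	/* saved UID (read for the target) */
	bool cap_migrate;	/* assumed result of the ns_capable() call */
};

#define MODEL_ROOT_UID 0u	/* stands in for GLOBAL_ROOT_UID */

/*
 * Mirrors the patched condition: the write is refused only when none of
 * the root, matching-UID, or capability tests passes. With
 * keep_root_check == false the root test is skipped, so a root writer
 * without the capability is treated like any other user.
 */
static inline int model_procs_write_permission(const struct model_cred *cred,
					       const struct model_cred *tcred,
					       bool keep_root_check)
{
	bool is_root = keep_root_check && cred->euid == MODEL_ROOT_UID;

	if (!is_root &&
	    cred->euid != tcred->uid &&
	    cred->euid != tcred->suid &&
	    !cred->cap_migrate)
		return -EACCES;

	return 0;
}
```

In this model a root writer with cap_migrate == false still passes while
keep_root_check is true, which is the "cannot effectively drop this
capability" situation described earlier in this reply; with the flag false
the same writer gets -EACCES.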