lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150209112237.GR5029@twins.programming.kicks-ass.net>
Date:	Mon, 9 Feb 2015 12:22:37 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Zefan Li <lizefan@...wei.com>
Cc:	Ingo Molnar <mingo@...nel.org>,
	Mike Galbraith <umgwanakikbuti@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Stefan Bader <stefan.bader@...onical.com>
Subject: Re: [PATCH] sched, autogroup: Fix failure when writing to
 cpu.rt_runtime_us

On Sat, Feb 07, 2015 at 03:02:11PM +0800, Zefan Li wrote:
> Before 8323f26ce342 "sched: Fix race in task_group()", task_group() of
> those RT tasks always return root_task_group, but the escape can't happen.

Ah, yes, I'm an idiot. I'm not sure what I was thinking, but I seemed
to have confused myself very well indeed.

> After commit 8323f26ce342, task_group always return root_task_group except
> for the case I showed:
> 
> 1. Change scheduling policy before setsid():
> 
>  # cat /proc/sched_debug | grep test
>  R           test  4194     24851.893077       945   120     24851.893077     11196.482331         0.000000 /
> 
> 2. Change policy after setsid():
> 
>  R           test  4142      4962.517723       420   120      4962.517723      4974.126149         0.000000 /autogroup-44

Yes, which of course is inconsistent as well, it really is in the
autogroup, regardless of its class.

> I think we can fix it with:
> 
> diff --git a/kernel/sched/auto_group.c b/kernel/sched/auto_group.c
> index 8a2e230..8c3a169 100644
> --- a/kernel/sched/auto_group.c
> +++ b/kernel/sched/auto_group.c
> @@ -115,9 +115,6 @@ bool task_wants_autogroup(struct task_struct *p, struct task_group *tg)
>  	if (tg != &root_task_group)
>  		return false;
>  
> -	if (p->sched_class != &fair_sched_class)
> -		return false;
> -

Yes.

> This is exactly what I did at first, but besides the issue described above,
> seems it might lead to starving RT tasks.
> 
> If there's some rt task in autogroups but none in root cgroup, it's allowed
> to set rt_runtime to 0, so I think we have to disallow this setting, like what
> we already do with global rt_runtime.

> @@ -7540,6 +7543,9 @@ static int __rt_schedulable(struct task_group *tg, u64 period, u64 runtime)
>  		.rt_runtime = runtime,
>  	};
>  
> +	if (tg == &root_task_group && runtime == 0)
> +		return -EINVAL;
> +

Indeed, setting runtime=0 for the root group is a very bad thing
regardless of this patch. It would disallow the kernel from creating RT
threads, which it needs for 'correct' operation in a number of cases.

But lets make that a separate patch.

So how about this?

---
Subject: sched, autogroup: Fix failure to set cpu.rt_runtime_us
From: Peter Zijlstra <peterz@...radead.org>
Date: Mon Feb  9 11:53:18 CET 2015

Because task_group() uses a cache of autogroup_task_group(), whoes
output depends on sched_class, switching classes can generate
problems.

In particular, when started as fair, the cache points to the
autogroup, so when switching to RT the tg_rt_schedulable() test fails
for every cpu.rt_{runtime,period}_us change because now the autogroup
has tasks and no runtime.

Furthermore, going back to the previous semantics of varying
task_group() with sched_class has the down-side that the sched_debug
output varies as well, even though the task really is in the
autogroup.

Therefore add an autogroup exception to tg_has_rt_tasks() -- such that
both (all) task_group() usages in sched/core now have one. And remove
all the remnants of the variable task_group() output.

Cc: Mike Galbraith <umgwanakikbuti@...il.com>
Cc: Stefan Bader <stefan.bader@...onical.com>
Reported-by: Zefan Li <lizefan@...wei.com>
Fixes: 8323f26ce342 ("sched: Fix race in task_group()")
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
---
 kernel/sched/auto_group.c |    6 +-----
 kernel/sched/core.c       |    6 ++++++
 2 files changed, 7 insertions(+), 5 deletions(-)

--- a/kernel/sched/auto_group.c
+++ b/kernel/sched/auto_group.c
@@ -87,8 +87,7 @@ static inline struct autogroup *autogrou
 	 * so we don't have to move tasks around upon policy change,
 	 * or flail around trying to allocate bandwidth on the fly.
 	 * A bandwidth exception in __sched_setscheduler() allows
-	 * the policy change to proceed.  Thereafter, task_group()
-	 * returns &root_task_group, so zero bandwidth is required.
+	 * the policy change to proceed.
 	 */
 	free_rt_sched_group(tg);
 	tg->rt_se = root_task_group.rt_se;
@@ -115,9 +114,6 @@ bool task_wants_autogroup(struct task_st
 	if (tg != &root_task_group)
 		return false;
 
-	if (p->sched_class != &fair_sched_class)
-		return false;
-
 	/*
 	 * We can only assume the task group can't go away on us if
 	 * autogroup_move_group() can see us on ->thread_group list.
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7644,6 +7644,12 @@ static inline int tg_has_rt_tasks(struct
 {
 	struct task_struct *g, *p;
 
+	/*
+	 * Autogroups do not have RT tasks; see autogroup_create().
+	 */
+	if (task_group_is_autogroup(tg))
+		return 0;
+
 	for_each_process_thread(g, p) {
 		if (rt_task(p) && task_group(p) == tg)
 			return 1;

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ