linux-kernel - Re: [RFC][PATCH 06/22] sched: SCHED_DEADLINE handles spacial kthreads

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20101113195857.GA11411@redhat.com>
Date:	Sat, 13 Nov 2010 20:58:57 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Raistlin <raistlin@...ux.it>, Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	Steven Rostedt <rostedt@...dmis.org>,
	Chris Friesen <cfriesen@...tel.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Darren Hart <darren@...art.com>,
	Johan Eker <johan.eker@...csson.com>,
	"p.faure" <p.faure@...tech.ch>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Claudio Scordino <claudio@...dence.eu.com>,
	michael trimarchi <trimarchi@...is.sssup.it>,
	Fabio Checconi <fabio@...dalf.sssup.it>,
	Tommaso Cucinotta <cucinotta@...up.it>,
	Juri Lelli <juri.lelli@...il.com>,
	Nicola Manica <nicola.manica@...i.unitn.it>,
	Luca Abeni <luca.abeni@...tn.it>,
	Dhaval Giani <dhaval@...is.sssup.it>,
	Harald Gustafsson <hgu1972@...il.com>,
	paulmck <paulmck@...ux.vnet.ibm.com>
Subject: Re: [RFC][PATCH 06/22] sched: SCHED_DEADLINE handles spacial
	kthreads

On 11/13, Peter Zijlstra wrote:
>
> Something like so?.. hasn't even seen a compiler yet but one's got to do
> something to keep the worst bore of saturday night telly in check ;-)

Yes, I _think_ this all can work (and imho makes a lot of sense
if it works).

quick and dirty review below ;)

>  struct take_cpu_down_param {
> -	struct task_struct *caller;
>  	unsigned long mod;
>  	void *hcpu;
>  };
> @@ -208,11 +207,6 @@ static int __ref take_cpu_down(void *_pa
>
>  	cpu_notify(CPU_DYING | param->mod, param->hcpu);
>
> -	if (task_cpu(param->caller) == cpu)
> -		move_task_off_dead_cpu(cpu, param->caller);
> -	/* Force idle task to run as soon as we yield: it should
> -	   immediately notice cpu is offline and die quickly. */
> -	sched_idle_next();

Yes. but we should remove "while (!idle_cpu(cpu))" from _cpu_down().

> @@ -2381,18 +2381,15 @@ static int select_fallback_rq(int cpu, s
>  		return dest_cpu;
>
>  	/* No more Mr. Nice Guy. */
> -	if (unlikely(dest_cpu >= nr_cpu_ids)) {
> -		dest_cpu = cpuset_cpus_allowed_fallback(p);
> -		/*
> -		 * Don't tell them about moving exiting tasks or
> -		 * kernel threads (both mm NULL), since they never
> -		 * leave kernel.
> -		 */
> -		if (p->mm && printk_ratelimit()) {
> -			printk(KERN_INFO "process %d (%s) no "
> -			       "longer affine to cpu%d\n",
> -			       task_pid_nr(p), p->comm, cpu);
> -		}
> +	dest_cpu = cpuset_cpus_allowed_fallback(p);
> +	/*
> +	 * Don't tell them about moving exiting tasks or
> +	 * kernel threads (both mm NULL), since they never
> +	 * leave kernel.
> +	 */
> +	if (p->mm && printk_ratelimit()) {
> +		printk(KERN_INFO "process %d (%s) no longer affine to cpu%d\n",
> +				task_pid_nr(p), p->comm, cpu);
>  	}

Hmm. I was really puzzled until I realized this is just cleanup,
we can't reach this point if dest_cpu < nr_cpu_ids.

> +static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *p)
>  {
>  	struct rq *rq = cpu_rq(dead_cpu);
> +	int needs_cpu, uninitialized_var(dest_cpu);
>
> -	/* Must be exiting, otherwise would be on tasklist. */
> -	BUG_ON(!p->exit_state);
> -
> -	/* Cannot have done final schedule yet: would have vanished. */
> -	BUG_ON(p->state == TASK_DEAD);
> -
> -	get_task_struct(p);
> +	needs_cpu = (task_cpu(p) == dead_cpu) && (p->state != TASK_WAKING);
> +	if (needs_cpu)
> +		dest_cpu = select_fallback_rq(dead_cpu, p);
> +	raw_spin_unlock(&rq->lock);

Probably we do not need any checks. This task was picked by
->pick_next_task(), it should have task_cpu(p) == dead_cpu ?

But. I think there is a problem. We should not migrate current task,
stop thread, which does the migrating. At least, sched_stoptask.c
doesn't implement ->enqueue_task() and we can never wake it up later
for kthread_stop().

Perhaps migrate_tasks() should do for_each_class() by hand to
ignore stop_sched_class. But then _cpu_down() should somewhow
ensure the stop thread on the dead CPU is already parked in
schedule().

> -	case CPU_DYING_FROZEN:
>  		/* Update our root-domain */
>  		raw_spin_lock_irqsave(&rq->lock, flags);
>  		if (rq->rd) {
>  			BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
>  			set_rq_offline(rq);
>  		}
> +		migrate_tasks(cpu);
> +		BUG_ON(rq->nr_running != 0);
>  		raw_spin_unlock_irqrestore(&rq->lock, flags);

Probably we don't really need rq->lock. All cpus run stop threads.

I am not sure about rq->idle, perhaps it should be deactivated.
I don't think we should migrate it.


What I never understood is the meaning of play_dead/etc. If we
remove sched_idle_next(), who will do that logic? And how the
idle thread can call idle_task_exit() ?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/