lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sun, 15 Jul 2007 00:23:56 +0400
From:	Oleg Nesterov <oleg@...sign.ru>
To:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
Cc:	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
	Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [RFC] Thread Migration Preemption - v4

On 07/14, Mathieu Desnoyers wrote:
>
> @@ -4891,10 +4948,42 @@ static int migration_thread(void *data)
>  		list_del_init(head->next);
>  
>  		spin_unlock(&rq->lock);
> -		__migrate_task(req->task, cpu, req->dest_cpu);
> +		migrated = __migrate_task(req->task, cpu, req->dest_cpu);
>  		local_irq_enable();
> -
> -		complete(&req->done);
> +		if (!migrated) {
> +			/*
> +			 * If the process has not been migrated, let it run
> +			 * until it reaches a migration_check() so it can
> +			 * wake us up.
> +			 */
> +			spin_lock_irq(&rq->lock);
> +			head = &rq->migration_queue;
> +			list_add(&req->list, head);
> +			if (req->task->se.on_rq
> +					|| !task_migrate_count(req->task)) {
> +				/*
> +				 * The process is on the runqueue, it could
> +				 * exit its critical section at any moment,
> +				 * don't race with it and retry actively.
> +				 * Also, if the thread is not on the runqueue
> +				 * and has a zero migration count
> +				 * (__migrate_task failed because cpus allowed
> +				 * changed), just retry.
> +				 */
> +				spin_unlock_irq(&rq->lock);
> +				continue;

Again, this can deadlock. migration_thread() is SCHED_FIFO, and it shares the
same CPU with req->task. We are doing a busy-wait loop, req->task may have no
chance to finish its critical section.

> +			}
> +			set_tsk_thread_flag(req->task, TIF_NEED_MIGRATE);
> +			set_current_state(TASK_INTERRUPTIBLE);
> +			spin_unlock_irq(&rq->lock);
> +			/*
> +			 * Wait for the process currently in its critical
> +			 * section.
> +			 */
> +			wake_up_process(req->task);
> +			schedule();

As Peter pointed out, we hold up all other migration requests.

And worse, this can hang forever (until another req comes). do_check_migrate()
can wake_up() the wrong rq.

set_current_state(TASK_INTERRUPTIBLE) is bogus, we can miss a wakeup from
(say) sched_migrate_task().

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ