lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1259245261.31676.73.camel@laptop>
Date:	Thu, 26 Nov 2009 15:21:01 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Mike Galbraith <efault@....de>
Cc:	Ingo Molnar <mingo@...e.hu>, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [patch] sched: fix set_task_cpu() and provide an unlocked
 runqueue variant

On Thu, 2009-11-26 at 15:09 +0100, Peter Zijlstra wrote:
> On Thu, 2009-11-26 at 11:16 +0100, Mike Galbraith wrote:
> > > min_vruntime should only ever be poked at when holding the respective
> > > rq->lock, even with a barrier a 64bit read on a 32bit machine can go all
> > > funny.
> > 
> > Yeah, but we're looking at an unlocked runqueue.  But never mind...
> 
> The patch is also poking at rq->clock without rq->lock held... not very
> nice.
> 
> Gah, I hate that we're doing migration things without holding both rq's,
> this is making live so very interesting ;-)

so the problem is this bit in kernel/fork.c, which is ran with
tasklist_lock held for writing:

	/*
	 * The task hasn't been attached yet, so its cpus_allowed mask will
	 * not be changed, nor will its assigned CPU.
	 *
	 * The cpus_allowed mask of the parent may have changed after it was
	 * copied first time - so re-copy it here, then check the child's CPU
	 * to ensure it is on a valid CPU (and if not, just force it back to
	 * parent's CPU). This avoids alot of nasty races.
	 */
	p->cpus_allowed = current->cpus_allowed;
	p->rt.nr_cpus_allowed = current->rt.nr_cpus_allowed;
	if (unlikely(!cpu_isset(task_cpu(p), p->cpus_allowed) ||
			!cpu_online(task_cpu(p))))
		set_task_cpu(p, smp_processor_id());


The problem is that that doesn't close any races at all since
tasklist_lock doesn't fully serialize changes to ->cpus_allowed.

In fact, there is nothing that protects that mask at all.

The second problem is that set_task_cpu() is accessing data from both
the old and the new rq, which basically requires is being ran with both
rq's locked, and the regular migration paths do so.

However things like ttwu() try to be cute and do not, opening the doors
to all kinds of funny.

Clearly we don't really want to do double_rq_lock() in ttwu(), that's
one of the hotter paths around (and looking at it we ought to seriously
look at trimming some of it).

Which sort of leaves us in a predicament all-right...

/me goes puzzle.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ