Date:	Wed, 21 Dec 2011 09:37:33 -0800
From:	Tejun Heo <tj@...nel.org>
To:	tip-bot for Daisuke Nishimura <nishimura@....nes.nec.co.jp>
Cc:	linux-tip-commits@...r.kernel.org, linux-kernel@...r.kernel.org,
	hpa@...or.com, mingo@...hat.com, a.p.zijlstra@...llo.nl,
	pjt@...gle.com, tglx@...utronix.de, mingo@...e.hu,
	Frederic Weisbecker <fweisbec@...il.com>
Subject: Re: [tip:sched/core] sched: Fix cgroup movement of forking process

(cc'ing Frederic)

On Wed, Dec 21, 2011 at 09:26:32AM -0800, Tejun Heo wrote:
> Hello, guys.
> 
> On Wed, Dec 21, 2011 at 03:44:14AM -0800, tip-bot for Daisuke Nishimura wrote:
> > sched: Fix cgroup movement of forking process
> > 
> > There is a small race between task_fork_fair() and sched_move_task(),
> > which is trying to move the parent.
> > 
> >         task_fork_fair()                 sched_move_task()
> > --------------------------------+---------------------------------
> >   cfs_rq = task_cfs_rq(current)
> >     -> cfs_rq is the "old" one.
> >   curr = cfs_rq->curr
> >     -> curr is set to the parent.
> >                                     task_rq_lock()
> >                                     dequeue_task()
> >                                       ->parent.se.vruntime -= (old)cfs_rq->min_vruntime
> >                                     enqueue_task()
> >                                       ->parent.se.vruntime += (new)cfs_rq->min_vruntime
> >                                     task_rq_unlock()
> >   raw_spin_lock_irqsave(rq->lock)
> >   se->vruntime = curr->vruntime
> >     -> vruntime of the child is set to that of the parent
> >        which has already been updated by sched_move_task().
> >   se->vruntime -= (old)cfs_rq->min_vruntime.
> >   raw_spin_unlock_irqrestore(rq->lock)
> > 
> > As a result, vruntime of the child becomes far bigger than expected,
> > if (new)cfs_rq->min_vruntime >> (old)cfs_rq->min_vruntime.
> > 
> > This patch fixes this problem by setting "cfs_rq" and "curr" after
> > holding the rq->lock.
> 
> The race shouldn't happen with threadgroup locking scheduled to be
> merged for the coming merge window.  sched_fork() and cgroup migration
> become exclusive and won't happen concurrently.  Would still make
> sense for -stable tho.

I retract that.  sched_move_task() can also be called from
cgroup_exit() which is outside of threadgroup locking.
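
To make the arithmetic in the quoted race concrete, here is a minimal
userspace sketch.  It is not kernel code: the toy_* types and helpers
are hypothetical stand-ins, the min_vruntime values are made up, and the
concurrent move is modelled as a plain function call placed between the
unlocked sample and the child's vruntime computation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct toy_cfs_rq { uint64_t min_vruntime; };
struct toy_se { uint64_t vruntime; struct toy_cfs_rq *cfs_rq; };

/* Models the dequeue/enqueue rebasing done by sched_move_task(). */
static void toy_move_task(struct toy_se *se, struct toy_cfs_rq *to)
{
    assert(se && se->cfs_rq && to);
    se->vruntime -= se->cfs_rq->min_vruntime;   /* dequeue from old */
    se->cfs_rq = to;
    se->vruntime += to->min_vruntime;           /* enqueue on new */
}

/* Models the child's vruntime setup in task_fork_fair(). */
static uint64_t toy_fork_child(const struct toy_se *parent,
                               const struct toy_cfs_rq *sampled)
{
    return parent->vruntime - sampled->min_vruntime;
}

/* cfs_rq is sampled before the lock, so the move slips in between. */
uint64_t racy_child_vruntime(void)
{
    struct toy_cfs_rq old_rq = { 1000 }, new_rq = { 1000000 };
    struct toy_se parent = { 1050, &old_rq };
    struct toy_cfs_rq *sampled = parent.cfs_rq; /* the "old" one */

    toy_move_task(&parent, &new_rq);            /* concurrent move */
    return toy_fork_child(&parent, sampled);
}

/* cfs_rq is sampled after the lock is taken, i.e. after the move. */
uint64_t fixed_child_vruntime(void)
{
    struct toy_cfs_rq old_rq = { 1000 }, new_rq = { 1000000 };
    struct toy_se parent = { 1050, &old_rq };

    toy_move_task(&parent, &new_rq);            /* move finished first */
    struct toy_cfs_rq *sampled = parent.cfs_rq; /* the "new" one */
    return toy_fork_child(&parent, sampled);
}
```

With these made-up numbers the racy ordering leaves the child at 999050
while the fixed ordering leaves it at 50, which is the "far bigger than
expected" effect described in the changelog.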

Frederic, so, it seems we actually have race conditions here.  I
really wish cgroup made sure that things like this can't happen, even
if we pay a bit of overhead in relatively cold paths.  I may be
unrealistic, though.  Any ideas?

Thanks.

-- 
tejun